admin管理员组

文章数量:1323377

I have large -- several hundred files -- directory that have the file names in the general format of a unique user name (all A-Z and a-z, underlines, no spaces), followed by 2 or 3 numbers, and then one letter from a to z. I'd like to be able to list each unique user's name followed by the number of files that start with user's name. For example:

  • JosephB01a.txt
  • JosephB02a.jpg
  • JosephB02b.jpg
  • JosephB03a.png
  • JoeM01a.jpg
  • JoeM02a.zip
  • SamJ01a.jpg
  • SamJ01b.txt
  • LisaW10a.tif

Would return:
JosephB 4
JoeM 2
SamJ 2
LisaW 1

Since I haven't done much for coding outside of excel for 20-some years, I tried doing this in excel first. No problem importing the file list into a spread sheet. However, I can't think of a way of automating the counting. Manually creating a COUNT for each name is still too time consuming.
I tried grep and wc with a little refresher from some youtube videos but ran into the same problem. Doing grep with each name is too much.
I spent a day relearning some perl with some help from youtube and here. Regular expressions seem to only return true or false and not the string that matches the PATTERN. I couldn't figure out a way to create an array for each unique USERNAME to then count the grep() matches with.
I'm sure there's C++ solution to this but I haven't had a compiler for over a decade now.

A Solution in excel is preferred, but I mention the others if there isn't a solution in excel.

I have large -- several hundred files -- directory that have the file names in the general format of a unique user name (all A-Z and a-z, underlines, no spaces), followed by 2 or 3 numbers, and then one letter from a to z. I'd like to be able to list each unique user's name followed by the number of files that start with user's name. For example:

  • JosephB01a.txt
  • JosephB02a.jpg
  • JosephB02b.jpg
  • JosephB03a.png
  • JoeM01a.jpg
  • JoeM02a.zip
  • SamJ01a.jpg
  • SamJ01b.txt
  • LisaW10a.tif

Would return:
JosephB 4
JoeM 2
SamJ 2
LisaW 1

Since I haven't done much for coding outside of excel for 20-some years, I tried doing this in excel first. No problem importing the file list into a spread sheet. However, I can't think of a way of automating the counting. Manually creating a COUNT for each name is still too time consuming.
I tried grep and wc with a little refresher from some youtube videos but ran into the same problem. Doing grep with each name is too much.
I spent a day relearning some perl with some help from youtube and here. Regular expressions seem to only return true or false and not the string that matches the PATTERN. I couldn't figure out a way to create an array for each unique USERNAME to then count the grep() matches with.
I'm sure there's C++ solution to this but I haven't had a compiler for over a decade now.

A Solution in excel is preferred, but I mention the others if there isn't a solution in excel.

Share Improve this question edited Jan 12 at 18:41 Beanyurza asked Jan 12 at 5:31 Beanyurza Beanyurza 31 silver badge2 bronze badges 4
  • sed -E 's/^([a-zA-Z_]+).*/\1/' files.txt | sort | uniq -c close enough? – Shawn Commented Jan 12 at 5:43
  • So what do you want? A solution in Excel, a solution in PERL, a solution in c++, or some other solution in, say, bash with use of grep or sed? – Tangentially Perpendicular Commented Jan 12 at 5:44
  • @Blackcat Probably. Details matter so hopefully OP will edit the question to clarify this. – Ted Lyngmo Commented Jan 12 at 6:34
  • 1 If you have full fat Excel 365, =GROUPBY(REGEXEXTRACT(A.:.A,"^[A-Za-z_]+"),ROW(A.:.A),COUNT) – Tom Sharpe Commented Jan 12 at 16:50
Add a comment  | 

1 Answer 1

Reset to default 1

With Excel 365

The formula D1

=LET(nam,BYROW(A1:A9,LAMBDA(x,MATCH(1,IF(ISNUMBER(VALUE(MID(x,SEQUENCE(LEN(x)),1))),1,0),0))), gr,LEFT(A1:A9,nam-1), res,BYROW(UNIQUE(gr),LAMBDA(x,SUM(IF(gr=x,1,0)))), res)

Take the text until the first number in the string, and then count the unique values of them.

本文标签: excelAnalyzing file namesStack Overflow