I'd like to

  • (1) search the files of a git repository (not the commit messages!) for a certain string,
  • and then (2) list all occurrences (filename only, but one line per result),
  • but (3) also have each occurrence annotated with the date of the last commit that touched it (or potentially other commit info, like the message).

Is that possible?

What I've tried
I can do (1) and (2) easily with, for example, grep -r "somestring" ./src | cut -d ":" -f1 (the cut pipe keeps only the filename but still lists every occurrence, which grep's -l flag would not).
But how to fit in (3) is unclear to me. It's something like git blame, but applied to all matching files.

The overall goal is to get an idea of how recently and how frequently certain constructs (tags) are still being added to the code.

My current manual workaround is to do a global search in VS Code, click each result, and wait until GitLens shows the last commit date for that line.

For example, output like the following would help me find all occurrences of, e.g., "print(" in the codebase, with a blame date after each filename. It shows that the string occurs 4 times, twice in the same file, and when each of those lines was last inserted or changed:

$ magiccommand "print("
/src/file1.py   2024-05-06
/src/file5.py   2021-02-13
/src/file5.py   2021-10-01
/src/subdir/core.py   2025-01-03
...

asked 2 days ago by Rabarberski, edited yesterday

Comments:
  • Isn't git log -S"your-string-here" --name-only good enough? If not, what's missing? It's a bit messy without any format, but --pretty can probably fix it, depending on your exact needs. – Romain Valeri, 2 days ago
  • I was thinking more of git grep -n some-text, and then taking that output, parsing the file name and the line number, and blaming that. – eftshift0, 2 days ago
  • @RomainValeri: the output does not match my expectation, even after fiddling with some options. My main problem is that the number of instances/lines returned is not the same as with a normal grep or a global search in my IDE. I've added some more info (my current workaround with GitLens, example output, and a corrected grep command, since the earlier one was not 100% right). – Rabarberski, 2 days ago

1 Answer

Would the following suit your needs?

  1. Find all files that contain your search pattern
  2. Blame each file from step 1
  3. Filter lines containing your search pattern

It's slow, because every matching file is blamed in full, but it does what you want. The output is not exactly as specified in your question, but maybe it works for you.

Example from git.git:

$ git grep -l 'printf(' | while IFS= read -r file; do git blame -f "$file"; done | grep 'printf('
d7d850e2b97 Documentation/CodingGuidelines (Ævar Arnfjörð Bjarmason 2022-10-10 13:37:59 -0700 303)    . %z and %zu as a printf() argument for a size_t (the %z being for
d7d850e2b97 Documentation/CodingGuidelines (Ævar Arnfjörð Bjarmason 2022-10-10 13:37:59 -0700 305)      printf("%"PRIuMAX, (uintmax_t)v).  These days the MSVC version we
76644e3268b Documentation/MyFirstContribution.txt (Emily Shaffer           2019-05-17 12:07:02 -0700  179)  printf(_("Pony saying hello goes here.\n"));
2656fb16ddb Documentation/MyFirstContribution.txt (Emily Shaffer           2019-05-29 13:18:09 -0700  291) existing `printf()` calls in place:
76644e3268b Documentation/MyFirstContribution.txt (Emily Shaffer           2019-05-17 12:07:02 -0700  298)  printf(Q_("Your args (there is %d):\n",
76644e3268b Documentation/MyFirstContribution.txt (Emily Shaffer           2019-05-17 12:07:02 -0700  303)      printf("%d: %s\n", i, argv[i]);
76644e3268b Documentation/MyFirstContribution.txt (Emily Shaffer           2019-05-17 12:07:02 -0700  305)  printf(_("Your current working directory:\n<top-level>%s%s\n"),

You might want to have a look at git blame's -p/--porcelain option and its relatives (such as --line-porcelain), which produce output "in a format designed for machine consumption", so that you can post-process it to your liking.
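As a rough sketch of that kind of post-processing (and of the approach suggested in the comments: git grep -n, then blame each reported line), the function below prints one "file, tab, date" line per occurrence, close to the output asked for in the question. The function name grep_blame_dates is made up for this example, the search string is treated as a fixed string rather than a regular expression, and the date shown is the author date of the commit that last touched the line. Treat it as an illustration under those assumptions, not a polished tool.

```shell
# Hypothetical helper; the function name is an assumption, not a git command.
# Prints "<file><TAB><YYYY-MM-DD>" once per line that contains the fixed string $1.
grep_blame_dates() {
  # -n: show line numbers, -I: skip binary files, -F: fixed string, not a regex.
  # Each output line has the form "file:line:content".
  git grep --no-color -n -I -F -e "$1" |
  while IFS=: read -r file line rest; do
    # Blame only that one line. The first line of porcelain output is
    # "<sha> <orig-line> <final-line> <count>"; keep just the sha.
    sha=$(git blame -L "$line,$line" --porcelain -- "$file" | sed -n '1s/ .*//p')
    case $sha in
      # A real commit: look up its author date in YYYY-MM-DD form.
      *[!0]*) date=$(git log -1 --format=%ad --date=short "$sha") ;;
      # All-zero (or missing) sha: the line is not committed yet.
      *)      date=uncommitted ;;
    esac
    printf '%s\t%s\n' "$file" "$date"
  done
}
```

Once the function is defined in the current shell, running grep_blame_dates 'print(' from inside the repository lists every occurrence with its date. It starts one git blame per match, so it can be slow when there are many results, and paths that contain a colon or a line break would be split incorrectly.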

You can store it as an executable file named git-magic somewhere in your PATH, with the following content:

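#!/bin/sh
# Usage: git magic <pattern>
# List the files containing the pattern, blame each one in full (-f adds the
# file name), then keep only the blamed lines that match the pattern.
git grep -l -e "$1" | while IFS= read -r file; do
  git blame -f -- "$file"
done | grep -e "$1"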

and then simply run git magic 'printf(' to execute it.

NB. File names with line breaks are not supported.
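
For the stated goal of seeing how recently a construct is still being added, lines in the shape sketched in the question (a path, whitespace, then a YYYY-MM-DD date) can be tallied per year. The following is only a sketch: the function name tally_by_year is made up, and it assumes the date is the last whitespace-separated field on every input line.

```shell
# Hypothetical helper; the function name is an assumption.
# Reads lines ending in a YYYY-MM-DD date on stdin and prints "<year> <count>".
tally_by_year() {
  # Take the first four characters of the last field as the year and count them.
  # awk's "for (y in count)" order is unspecified, so sort the result afterwards.
  awk '{ count[substr($NF, 1, 4)]++ }
       END { for (y in count) print y, count[y] }' | sort
}
```

Fed the four example lines from the question, this would report two occurrences for 2021 and one each for 2024 and 2025.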
