This week's Git Tip of the Week is about grepping to find content. You can subscribe to the feed if you want to receive new instalments automatically.
Sometimes, when investigating the contents of a repository, it's not obvious where to find the definition of a function (or method). Unix tools such as grep allow you to find the content easily enough, but if there's a large amount of generated data (such as compiled code), simply looking through all files can be time-consuming.
To find a file with a specific content element, you could use something like find . -exec grep pattern '{}' ';'. This will run a separate grep process for every file under the current directory, including everything inside .git. However, there's a faster way of achieving this with git grep instead. Let's say we wanted to find the occurrences of the variable newPushURI in the EGit repository. We could use find to achieve this:
EGit (master)$ find . -exec grep newPushURI '{}' ';'
URIish newPushURI = uri;
newPushURI = newPushURI.setPort(GERRIT_DEFAULT_SSH_PORT);
newPushURI = newPushURI.setScheme(Protocol.SSH.getDefaultScheme());
newPushURI = newPushURI.setPort(GERRIT_DEFAULT_SSH_PORT);
newPushURI = prependGerritHttpPathPrefix(newPushURI);
uriText.setText(newPushURI.toString());
scheme.select(scheme.indexOf(newPushURI.getScheme()));
OK, we've found some occurrences, but the output doesn't include the names of the files, which isn't too helpful. We could print the file name afterwards if we wanted to, but this wouldn't help for files which don't have the match. Or you could write some kind of script or alias to handle the scan-and-test. It's also not particularly fast:
EGit (master)$ time find . -exec grep newPushURI '{}' ';' > /dev/null
real 0m1.605s
user 0m0.541s
sys 0m0.849s
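As a rough sketch of the "script or alias" route, one option is to have grep print the file name itself with -H and to hand find's results to grep in batches with + rather than ';'. This assumes a grep that supports -H (GNU and BSD grep do, but it is not part of POSIX). The scratch directory and file names are made up for illustration.

```shell
# Build a throwaway directory with two hypothetical files.
scratch=$(mktemp -d)
mkdir -p "$scratch/src"
printf 'URIish newPushURI = uri;\n' > "$scratch/src/Page.java"
printf 'nothing to see here\n' > "$scratch/src/Other.java"

# -type f skips directories; -H prefixes every match with its file name;
# '+' passes many files to one grep process instead of one process per file.
matches=$(cd "$scratch" && find . -type f -exec grep -H newPushURI '{}' +)
printf '%s\n' "$matches"

rm -rf "$scratch"
```

This still walks every file on disk, tracked or not, so it addresses the missing file names rather than the speed.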
An alternative is to use git grep to scan the tracked files in the current working tree:
EGit (master)$ git grep newPushURI
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: URIish newPushURI = uri;
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setPort(GERRIT_DEFA
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setScheme(Protocol.
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setPort(GERRIT_DEFA
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = prependGerritHttpPathPrefix(ne
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: uriText.setText(newPushURI.toString());
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: scheme.select(scheme.indexOf(newPushURI.getScheme()
Not only does it show which files the matches are located in, it's also well over an order of magnitude faster:
EGit (master)$ time git grep newPushURI > /dev/null
real 0m0.024s
user 0m0.014s
sys 0m0.033s
The arguments that git grep takes are much the same as those of grep itself; for example, -l lists files with matches (and -L lists files without), -E allows an extended regexp, -i ignores case and -w matches the pattern only as a whole word.
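To see these options side by side, here is a small sketch against a scratch repository. The repository, file names and contents are invented for the example, and it assumes a Git recent enough to support git -C (1.8.5 or later).

```shell
# Scratch repository with hypothetical files; git grep only needs them tracked.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'int counter = 0;\nint counterTotal = 1;\nint COUNTER = 2;\n' > "$repo/a.txt"
printf 'nothing here\n' > "$repo/b.txt"
git -C "$repo" add a.txt b.txt

with_match=$(git -C "$repo" grep -l counter)       # names of files with a match
without_match=$(git -C "$repo" grep -L counter)    # names of files without one
whole_word=$(git -C "$repo" grep -w counter)       # skips counterTotal
any_case=$(git -C "$repo" grep -i -w counter)      # also picks up COUNTER
extended=$(git -C "$repo" grep -E 'counter|COUNTER')  # extended regexp alternation

printf '%s\n' "$with_match" "$without_match" "$whole_word" "$any_case" "$extended"
rm -rf "$repo"
```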
There are also some options which are specific to git. The --no-index option searches files in the current directory even when they are not managed by Git, whilst --cached searches the blobs registered in the index instead of the files in the working tree.
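The difference is easiest to see when the index and the working tree disagree. The following is a minimal sketch using two scratch directories with made-up file names: one is a repository whose only file is staged and then modified, the other is a plain directory that Git knows nothing about.

```shell
# A repository whose staged content differs from the working tree.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'alpha\n' > "$repo/f.txt"
git -C "$repo" add f.txt             # the index now holds "alpha"
printf 'beta\n' > "$repo/f.txt"      # the working tree now holds "beta"

worktree_hit=$(git -C "$repo" grep beta)            # default: working tree
index_hit=$(git -C "$repo" grep --cached alpha)     # staged blob
index_miss=$(git -C "$repo" grep --cached beta)     # not staged, so empty

# A plain directory that is not a Git repository at all.
plain=$(mktemp -d)
printf 'beta\n' > "$plain/notes.txt"
no_index_hit=$(git -C "$plain" grep --no-index beta)

printf '%s\n' "$worktree_hit" "$index_hit" "$no_index_hit"
rm -rf "$repo" "$plain"
```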
As well as searching the current working tree, git grep can search a given treeish (tag, branch, commit) and can be restricted to subfolders within a repository. If we wanted to look for the regex extension.point (where the . matches any character, including a space) in the stable-1.0 branch, limited to matches located in the org.eclipse.egit.core folder, we could do:
EGit (master)$ git grep extension.point stable-1.0 -- org.eclipse.egit.core
stable-1.0:org.eclipse.egit.core/plugin.xml: <extension point="org.eclipse.core.runtime.preferences">
stable-1.0:org.eclipse.egit.core/plugin.xml: <extension point="org.eclipse.team.core.repository">
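The same pattern can be tried on a scratch repository. In this sketch the branch name stable-demo, the folder names and the file contents are all hypothetical, and the committer identity is passed on the command line only so that the commit works on a machine with no Git configuration.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
mkdir -p "$repo/core" "$repo/ui"
printf '<extension point="a.b">\n' > "$repo/core/plugin.xml"
printf '<extension point="c.d">\n' > "$repo/ui/plugin.xml"
git -C "$repo" add .
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    -c commit.gpgsign=false commit -q -m 'Initial commit'
git -C "$repo" branch stable-demo      # hypothetical branch to search later

# Change the working tree so that it no longer matches in core/.
printf '<nothing/>\n' > "$repo/core/plugin.xml"

# Search the branch rather than the working tree, limited to core/.
branch_hits=$(git -C "$repo" grep 'extension.point' stable-demo -- core)
# The same search against the working tree now finds nothing.
worktree_hits=$(git -C "$repo" grep 'extension.point' -- core)

printf '%s\n' "$branch_hits"
rm -rf "$repo"
```

Matches from a treeish are prefixed with its name, which is why the output lines begin with the branch name followed by a colon.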
Finally, it's possible to use -p to print out the name of the function in which a match occurs:
EGit (master)$ git grep -p newPushURI
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java= private void setDefaults(RepositorySelection selection) {
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: URIish newPushURI = uri;
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setPort(GERRIT_DEFA
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setScheme(Protocol.
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = newPushURI.setPort(GERRIT_DEFA
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: newPushURI = prependGerritHttpPathPrefix(ne
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: uriText.setText(newPushURI.toString());
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java: scheme.select(scheme.indexOf(newPushURI.getScheme()
Note that the line naming the function is marked with = after the file name, instead of the : used for matching lines. This can be used to quickly find which functions contain a reference to a given pattern:
EGit (master)$ git grep -p newPushURI | grep java=
org.eclipse.egit.ui/src/org/eclipse/egit/ui/internal/clone/GerritConfigurationPage.java= private void setDefaults(RepositorySelection selection) {
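The same filtering trick can be sketched on a scratch repository. The file demo.c and its contents are invented for the example. It relies on Git's default rule for spotting a function line, which, with no diff driver configured for the file, is assumed here to be a line starting in the first column with a letter, _ or $.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
cat > "$repo/demo.c" <<'EOF'
int helper(void) {
    return 1;
}

int compute(void) {
    int needle = 2;
    return needle;
}
EOF
git -C "$repo" add demo.c

# -p adds the enclosing function line, marked with '=' after the file name.
context=$(git -C "$repo" grep -p needle)
# Keep only the function lines, mirroring the "grep java=" filter above.
functions=$(printf '%s\n' "$context" | grep 'demo\.c=')

printf '%s\n' "$functions"
rm -rf "$repo"
```

Only compute is reported, since helper contains no match.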
Come back next week for another instalment in the Git Tip of the Week series.