Using Regular Expressions to Exclude GitHub Directories

GitHub file paths do not contain the GitHub org name or the repository name. They contain only the folder name(s) and the file name. Hence, a regular expression that matches GitHub directories must match only the folder name(s) and the file name.

Matching Files at the Repository Level

These are files located directly under a repository, not nested in any folder. Suppose the file abcd.py is in a repository named Python repository, in a GitHub org called Python Project. Its full location would be Python Project/Python repository/abcd.py. However, as mentioned above, GitHub file paths exclude the org name and repository name, so the file path in this case is just abcd.py.

To exclude all such Python files (.py extension), use the following regular expression.

.*\.py

Similarly, to exclude any other file type, replace py in the above pattern with the relevant file extension.
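The following is a minimal sketch for checking the pattern locally, assuming the pattern must match the whole file path (modeled here with Python's re.fullmatch). How the product itself anchors patterns is not stated here, so treat this as an approximation. The sample paths and the stricter [^\/]* variant are illustrative assumptions. The sketch shows that in standard regex syntax a dot also matches a forward slash, so .*\.py matches nested .py files as well.

```python
import re

# Hypothetical sample paths, relative to the repository root.
paths = ["abcd.py", "efgh.java", "first/abcd.py"]

# Pattern from the text: any path ending in .py.
any_py = re.compile(r".*\.py")

# Assumed stricter variant: [^\/]* cannot cross a forward slash,
# so only files directly under the repository match.
root_py = re.compile(r"[^\/]*\.py")

any_py_matches = [p for p in paths if any_py.fullmatch(p)]
root_py_matches = [p for p in paths if root_py.fullmatch(p)]

print(any_py_matches)   # ['abcd.py', 'first/abcd.py']
print(root_py_matches)  # ['abcd.py']
```

If only repository-level files should be excluded, the stricter variant may be the safer choice under these assumptions.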

Matching Files Nested in Directories

You can match files nested in folders by escaping each forward slash (/) in the path with the escape character, a backslash (\). One escaped forward slash is needed for every level of nesting.

For instance, to match the file abcd.py under the folder first (effective GitHub file path first/abcd.py), you can start with the following regular expression.

first\/.*

The above expression matches every file under the first folder, not just abcd.py. To match only Python files (.py extension), use the following regular expression.

first\/.*\.py

To match only the abcd.py file under the first folder, use the following regular expression.

first\/abcd\.py
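The three patterns above can be compared side by side with a short sketch. It assumes whole-path matching (Python's re.fullmatch) and uses hypothetical sample paths. The product's own regex engine may behave differently.

```python
import re

# Hypothetical sample paths, relative to the repository root.
paths = ["abcd.py", "first/abcd.py", "first/efgh.java", "second/abcd.py"]

# The three patterns from this section, keyed by what they target.
patterns = {
    "all files under first": r"first\/.*",
    "python files under first": r"first\/.*\.py",
    "only first/abcd.py": r"first\/abcd\.py",
}

# For each pattern, collect the paths it matches in full.
matches = {
    label: [p for p in paths if re.fullmatch(rx, p)]
    for label, rx in patterns.items()
}

for label, hits in matches.items():
    print(f"{label}: {hits}")
```

Under these assumptions, only the first pattern also picks up first/efgh.java. Neither abcd.py at the repository level nor second/abcd.py is matched by any of the three.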

Matching Nested Directories

To exclude all the files under a directory, you must match the entire directory. Consider the directory first/second. To exclude all the files under it, escape both forward slashes, since there are two levels of nesting.

first\/second\/.*

This regex matches and excludes all the files under the directory.

Similarly, to exclude files nested at deeper levels, add one escaped forward slash (\/) for each additional directory level.
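As an illustration of the one-escape-per-level rule, the sketch below builds such a pattern from a directory path. The helper name directory_pattern is hypothetical and not part of any product API. It also escapes other special characters in folder names through re.escape, which is an added assumption.

```python
import re


def directory_pattern(directory: str) -> str:
    """Build a regex matching every file under `directory` (hypothetical helper)."""
    parts = directory.strip("/").split("/")
    # re.escape protects characters such as '.' or '+' inside folder names;
    # the levels are then joined with an escaped forward slash.
    return r"\/".join(re.escape(part) for part in parts) + r"\/.*"


pattern = directory_pattern("first/second")
print(pattern)  # first\/second\/.*

# Whole-path check against two hypothetical paths.
print(bool(re.fullmatch(pattern, "first/second/abcd.py")))  # True
print(bool(re.fullmatch(pattern, "first/abcd.py")))         # False
```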

Cheat Sheet

This cheat sheet displays the regex to be used for various scenarios.

Match | Regex | Comments
--- | --- | ---
first/abcd.py | first\/abcd\.py | Match a file called abcd.py under a directory called first.
first/abcd.py, first/efgh.py, first/ijkq.py | first\/.*\.py | Match any file with .py extension under a directory called first.
first/abcd.py, first/abcd.java, first/abcd.cpp | first\/.* | Match any file under a directory called first.
first/second/abcd.py | first\/second\/abcd\.py | Match a file called abcd.py under a directory called second, which is nested under another directory called first.
first/second/abcd.py, first/second/efgh.py, first/second/ijkq.py | first\/second\/.*\.py | Match any file with .py extension under a directory called second, which is nested under another directory called first.
first/second/abcd.py, first/second/efgh.java, first/second/ijkq.cpp | first\/second\/.* | Match any file under a directory called second, which is nested under another directory called first.
abcd.py | .*\.py | Match a file with .py extension, such as abcd.py, which is located directly under the repository and not under any folder.
abcd.py, efgh.java, ijkq.cpp | .* | Match any file which is located directly under the repository and not under any folder.
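To try cheat sheet patterns before using them, a small filter like the one below can help. It is a sketch under stated assumptions: the function apply_exclusions and the sample paths are hypothetical, and whole-path matching with Python's re.fullmatch stands in for whatever matching the product performs.

```python
import re


def apply_exclusions(paths, exclusion_patterns):
    """Return the paths not fully matched by any exclusion regex (hypothetical helper)."""
    compiled = [re.compile(p) for p in exclusion_patterns]
    return [
        path for path in paths
        if not any(rx.fullmatch(path) for rx in compiled)
    ]


# Hypothetical repository contents, relative to the repository root.
paths = [
    "abcd.py",
    "first/abcd.py",
    "first/efgh.java",
    "first/second/abcd.py",
    "first/second/ijkq.cpp",
    "docs/readme.md",
]

# Exclude everything under first/second, plus .java files under first.
exclusions = [r"first\/second\/.*", r"first\/.*\.java"]

kept = apply_exclusions(paths, exclusions)
print(kept)  # ['abcd.py', 'first/abcd.py', 'docs/readme.md']
```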

You can use this link to generate a regular expression that exactly matches your requirements.
