As a project grows, the many changes made to its codebase become hard to keep track of, which is why version control is an important principle to adhere to in one’s workflow.
Version Control with Git
Git is an open-source, command-line version control system: it keeps track of all the changes made to the files or codebase of a project, whether an empirical research project or an app. This facilitates efficient collaboration, because all team members can work simultaneously on the latest version of the project and can access previous versions of any file to make changes.
Git vs. GitHub
| |Git|GitHub|
|---|---|---|
|Environment|Installed locally on a system; source code history can be managed on local machines.|Hosted online as a web service.|
|Functional scope|Git focuses exclusively on Source Code Management (SCM) tasks like push, pull, commit, fetch and merge.|It serves as a centralised location for uploading copies of a Git repository. The GitHub GUI also offers access control, collaboration features and other project-management tools.|
How does it work?
In a nutshell, a standard Git workflow involves Branch-Commit-Push-Pull request-Merge (BCPPM).
All work on an issue happens in a separate repository branch.
When work is done, whoever is assigned to the issue commits the changes and creates a pull request which may include a request for peer review.
Once review (if any) is complete, the changes are merged back into the master branch and the final comment / deliverable is posted.
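As a rough sketch of the cycle described above, the hypothetical helper below prints (it does not run) the commands for one pass through it. The function name `print_issue_workflow` and the remote name `origin` are assumptions made for illustration; the pull request and merge steps happen on GitHub, so they appear only as comments.

```shell
# Hypothetical dry run: print the commands for one pass through the
# Branch-Commit-Push-Pull request-Merge cycle. Nothing is executed.
# Assumes the GitHub remote is called "origin".
print_issue_workflow() {
    branch=$1
    subject=$2
    cat <<EOF
git checkout -b $branch          # Branch: start the issue branch
git add -A                       # stage the changes made for the issue
git commit -m "$subject"         # Commit
git push -u origin $branch       # Push the branch to GitHub
# Pull request: open one for $branch on the repository's GitHub page
# Merge: once review (if any) is complete, merge it into master
EOF
}

print_issue_workflow issue123-update-appx-figures "#123 Add first appendix figure"
```

Printing the plan rather than executing it keeps the sketch safe to run outside a repository.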
By default, we follow the GitHub flow workflow model, which is explained below. This information is also summarized in a GitHub cheatsheet, so you can easily follow along or refresh your memory if you forget a step in the workflow.
The assignee should create a new branch using `git checkout -b` on the command line, by clicking the “add branch” icon in the GitHub Desktop client, or by creating the branch on the repository’s GitHub page.
If you are resolving an issue, you can create a new branch most easily by going to the issue page and clicking “Create a branch” under Development. This automatically links the branch, and its pull requests, to the issue.
The issue branch should be named `issueXX-description`, where `XX` is the GitHub issue number and `description` is a version of the issue title (e.g., `issue123-update-appx-figures`). The description can be more compact than the issue title itself, but should be descriptive enough that other team members can understand what is being worked on in the branch.
For complex issues, additional branches can be made off of the main issue branch. These should be named by appending a suffix to the main issue branch name (e.g., `issue123-update-appx-figures/refactor`). These sub-issue branches should be merged into the main issue branch before the main issue branch is merged back to the master branch.
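The naming convention for the main issue branch can be sketched as a small shell helper. The function name `issue_branch_name` and the slug rules (lowercase, hyphens between words) are assumptions inferred from the example branch name above, not an official tool.

```shell
# Hypothetical helper: build an issue branch name from an issue number
# and a short description of the issue.
issue_branch_name() {
    issue_number=$1
    description=$2
    # Lowercase the description and replace every run of characters
    # other than a-z and 0-9 with a single hyphen.
    slug=$(printf '%s' "$description" \
        | tr '[:upper:]' '[:lower:]' \
        | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//')
    printf 'issue%s-%s\n' "$issue_number" "$slug"
}

branch=$(issue_branch_name 123 "Update appx figures")
echo "$branch"

# Inside a clone of the repository, the branch would then be created with:
#   git checkout -b "$branch"
```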
All commits related to an issue should be made to the issue branch(es).
Every commit must have a commit message whose first line has the form `#X Description of commit`, where `X` is the GitHub issue number (e.g., “#123 Add first appendix figure”).
Any commit that is made to master, that is merged into master, or that defines an issue deliverable should follow a complete run of the build scripts for the relevant modules / directories.
Good commit messages keep the history of work on a project clear and readable. A good commit message:
- should describe the purpose of the commit
- should not be redundant with what Git is already recording (“Update code” or “Modify slides.lyx” are redundant; “Refactor estimate() function” and “Add robustness figure to slides” are better)
- should be written in sentence caps, use the imperative mood, and not end in a period (“#123 Revise abstract”, not “#123 abstract.”)
This post by Chris Beams has an excellent discussion of what makes a good commit message.
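The mechanical parts of these rules can be checked automatically. The sketch below is a hypothetical checker (the name `check_commit_subject` is an assumption) that tests only the `#X` prefix, the capitalised first word, and the missing final period; whether a message is imperative and non-redundant still needs a human reader.

```shell
# Hypothetical checker for the first line of a commit message.
# Returns 0 when the line looks like "#X Description of commit".
check_commit_subject() {
    subject=$1
    case $subject in
        *.) return 1 ;;    # must not end in a period
    esac
    # "#", the issue number, one space, then a capitalised first word.
    printf '%s\n' "$subject" | grep -Eq '^#[0-9]+ [[:upper:]]'
}

for candidate in "#123 Revise abstract" "#123 abstract." "Update code"; do
    if check_commit_subject "$candidate"; then
        echo "ok:     $candidate"
    else
        echo "reject: $candidate"
    fi
done
```

One possible use is a `commit-msg` hook, which Git invokes with the path of the message file as its first argument; the hook could pass the first line of that file to the checker.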
When work on an issue is complete, the assignee should create a pull request by selecting the issue branch on the `Code` tab of the repository’s GitHub page, then clicking `New Pull Request`.
The title of the pull request should be `PR for #X: original_issue_title`, where `X` is the GitHub issue number (e.g., “PR for #123: Update appendix figures”).
The description / first comment of the pull request should begin with a line that says `Closes #X`, where `X` is the GitHub issue number (e.g., “Closes #123”). This creates a link in the original GitHub issue to the pull request and closes that issue when the pull request is merged. Subsequent lines of the description can be used to provide instructions (if any) to the peer reviewer.
The pull request should be assigned to the assignee of the original issue.
If an issue requires peer review, the assignee should assign the reviewer(s) in the pull request. In this case the description / first comment of the pull request should provide instructions that define the scope of the peer review along with any information the reviewer will need to execute it efficiently.
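The title and description conventions above can be assembled with two small helpers. This is only a sketch: the function names `pr_title` and `pr_description` are assumptions, and the reviewer instruction in the example call is made up for illustration.

```shell
# Hypothetical helpers that assemble the pull request title and description.
pr_title() {
    # $1 = issue number, $2 = original issue title
    printf 'PR for #%s: %s\n' "$1" "$2"
}

pr_description() {
    issue_number=$1
    shift
    # The first line links the pull request to the issue.
    printf 'Closes #%s\n' "$issue_number"
    # Any remaining arguments become instruction lines for the peer reviewer.
    for line in "$@"; do
        printf '%s\n' "$line"
    done
}

pr_title 123 "Update appendix figures"
pr_description 123 "Please check the axis labels in the new figures."
```

The printed text can be pasted into the title and description fields when opening the pull request on GitHub.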
Any issue that involves substantial changes to code should be peer reviewed by at least one other lab member (e.g. a research assistant), though it is ultimately up to the assignee’s discretion whether or not to send the issue for peer review.
The job of the peer reviewer IS to verify that:
- The deliverable is clear, complete, and conforms to the standards from the deliverables section here
- Files committed to the repository conform to our organizational and code style rules
- Empirical and theoretical results are clear and appear correct
It is NOT typically the job of the peer reviewer to:
- go over every detail of the output and every line of code.
For example, occasionally commenting on fine points of code style is fine, but this should not be the primary content of the peer review. When requesting peer review, the assignee can ask for feedback beyond the points above, e.g., a careful check of particular code or results.
All peer review comments should be made on the pull request itself, not on the original issue.
Revisions to the code and other files in the repository as part of the peer review process should be made in the original issue branch (`issueXXX_description`), since the pull request automatically tracks changes made in the branch.
When peer review is complete, the output is finalized, and issue-specific content like the `/issue/` subdirectory has been deleted, the issue branch should be merged back to `master` using a squash merge. You can normally perform this merge automatically from the pull request page on GitHub.
Once the merge is complete, you should delete the issue branch.
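If the merge is done on the command line instead of from the pull request page, the squash merge and branch clean-up could look roughly like the sketch below. The function name `squash_merge_issue_branch` is an assumption; it expects to be run inside a clone of the repository with a clean working tree.

```shell
# Hypothetical helper: squash-merge an issue branch into master, then delete it.
# $1 = issue branch name, $2 = commit message for the squashed commit.
squash_merge_issue_branch() {
    branch=$1
    message=$2
    git checkout master &&
        git merge --squash "$branch" &&   # stage all branch changes without committing
        git commit -m "$message" &&       # record them as a single commit on master
        git branch -D "$branch"           # -D: after a squash, Git still sees the branch as unmerged
}
```

A call such as `squash_merge_issue_branch issue123-update-appx-figures "#123 Update appendix figures"` would leave one new commit on `master` and remove the local issue branch; a copy of the branch on GitHub has to be deleted separately.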
Exceptions to the Standard Workflow
In some cases we may use simpler workflows, for example skipping the branch/merge steps and just committing directly to `master`. Here are a few exceptions:
- No issue branch: It is permissible to skip the step of creating an issue branch and commit changes directly to `master` when all of the following are true:
  - The issue is small in scope and will involve no more than a few commits
  - No one else is likely to be working on the same content at the same time
  - All commits follow complete runs of the relevant build scripts
Separately, for some projects or repositories we may decide to use a simplified workflow where we commit everything to `master` by default. This could happen, for example, if some co-authors are unfamiliar with `git` and prefer the simpler workflow. In such cases we need to take care to avoid situations where many people work on the same content at the same time. We also impose a strict rule that all commits follow complete runs of the relevant build scripts.
- No pull request: It is permissible to skip the step of creating a pull request if an issue does not require peer review and no changes will be merged back to the master branch.
- No peer review: It is permissible to skip the peer review step when the assignee is confident the output is correct and the issue involves no changes or only minor changes to code that is being merged back to the repository.