Version Control
Your digital lab notebook
In this section, we will learn about version control, and using git with GitHub to track our research.
Presentation content
What is version control?
- Often described as “track changes”, although that’s not quite right
- A super-powerful “undo” button
- A way of working collaboratively without over-writing each other’s work
- A way of recording exactly what you did and when
What is version control?
- We’re going to use git, which is a strangely named piece of version control software
- We’ll use this to explain how version control works, and why it matters
- There are lots of other ways to do version control, but git is very widely used
  - If version control was word processing, git would be Microsoft Word
Part 1: local git
What does git do?
- git allows you to bundle up changes to various files, and give the group of changes a unique commit hash and an explanatory message.
- git works on a project level, so you can make a bunch of changes to different files in a folder, and then commit all those changes with a descriptive message
  - It’s recorded that you made those changes, and there’s a unique commit hash that you can quote to point at the exact state of your folder when you added those changes.
What does git do?
- git is a command line program
  - There are actually only a few commands you’ll really use regularly
- But before we move on to learning what commands are needed, let’s try to build a mental model of git
- Hopefully this will be useful to those of you who are already using git too!
What does git do?
Old version of python-file.py
1 # This is a comment
2 import matplotlib.pyplot as plt
3 x = [1, 2, 3, 4, 5]
4 y = [3, 4, 5, 6, 7]
5 plt.scatter(x, y)
New version of python-file.py
1 # This is a comment
2 import matplotlib.pyplot as plt
3 import numpy as np
4 x = [1, 2, 3, 4, 5]
5 y = [3, 4, 5, 6, 7]
6 plt.scatter(x, y)
6 plt.plot(x, y)
Line 3 added, line 6 removed, line 6 added.
What does git do?
New version of python-file.py
1 # This is a comment
2 import matplotlib.pyplot as plt
3 import numpy as np
4 x = [1, 2, 3, 4, 5]
5 y = [3, 4, 5, 6, 7]
6 plt.scatter(x, y)
6 plt.plot(x, y)
Line 3 added, line 6 removed, line 6 added.
Associated git commit
File: python-file.py
Commit hash: u87wy9o2
Commit message: change plotting method
+++ 3 import numpy as np
--- 6 plt.scatter(x, y)
+++ 6 plt.plot(x, y)
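To see where the added and removed lines come from, here is a minimal sketch that compares the two versions of python-file.py using Python's standard difflib module. This is only an illustration of the idea of a line-by-line comparison; it is not how git itself is implemented, and the variable names are our own.

```python
import difflib

# The old and new versions of python-file.py, one string per line.
old_version = [
    "# This is a comment",
    "import matplotlib.pyplot as plt",
    "x = [1, 2, 3, 4, 5]",
    "y = [3, 4, 5, 6, 7]",
    "plt.scatter(x, y)",
]
new_version = [
    "# This is a comment",
    "import matplotlib.pyplot as plt",
    "import numpy as np",
    "x = [1, 2, 3, 4, 5]",
    "y = [3, 4, 5, 6, 7]",
    "plt.plot(x, y)",
]

# ndiff marks each line with "+ " (added), "- " (removed) or "  " (unchanged).
diff = list(difflib.ndiff(old_version, new_version))
added = [line[2:] for line in diff if line.startswith("+ ")]
removed = [line[2:] for line in diff if line.startswith("- ")]

print("Added:", added)
print("Removed:", removed)
```

The added and removed lists match the summary on the slide: one import added, and the scatter call swapped for a plot call.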
What does git do?
- When we work with git, we bundle up changes in our project folder (/directory) and commit our changes.
- Each commit (bundle of changes) gets a unique id - a hexadecimal hash that’s 40 digits long (we’re just going to abbreviate to the first 7)
- The commits are made to a “branch” - park any thinking about that for a moment
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
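To get a feel for what a 40-digit hexadecimal id looks like, here is a small sketch using Python's standard hashlib module. By default git's ids are SHA-1 hashes, but the text being hashed below is a made-up stand-in: real git hashes a structured commit object, not a string like this one.

```python
import hashlib

# Hypothetical stand-in for "everything that describes a commit".
# Real git hashes a structured commit object, not a plain string like this.
commit_description = "change plotting method\npython-file.py: plt.scatter -> plt.plot"

full_id = hashlib.sha1(commit_description.encode("utf-8")).hexdigest()
short_id = full_id[:7]  # the abbreviated form: the first 7 digits

print(len(full_id))  # 40
print(short_id)

# Changing even one character of the description gives a different id.
other_id = hashlib.sha1((commit_description + "!").encode("utf-8")).hexdigest()
print(full_id != other_id)
```

Because the id is calculated from the content, any change to what goes in produces a different id, which is why a hash can be quoted to point at one exact state of the project.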
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
- Because each “bundle” of changes has been saved with a unique id, we can roll back our changes to a previous version if we want
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
- But let’s say we’re happy with how our code is working, but we want to try out a different way of doing something
- Or say we’ve written the conclusion section of our paper in a certain way, but our supervisor has some ideas for structuring it differently
How can we try this out without risking our current work that we’re happy with?
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
The answer is branching
- We mentioned earlier that your bundled changes (commits) were on the main branch
- We can create other branches to try out experimental changes while keeping our main branch safe
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
The answer is branching
- When we are happy with the changes we have made on the experimental branch, we can decide to mix them back in with our main branch
- We can merge the changes with the main branch
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
The answer is branching
- When we are happy with the changes we have made on the experimental branch, we can decide to mix them back in with our main branch
- We can merge the changes with the main branch
- This merge gets its own unique id
Remember that you can always revert to a previous commit, even across different branches!
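One way to picture the branch and merge in the diagram is as a small graph in which every commit remembers its parent(s). The sketch below uses the commit ids from the diagram; the dictionary layout and the history function are our own illustration, and it assumes the merge is recorded as its own commit, as drawn.

```python
# Each commit id maps to the ids of its parent commit(s).
# Ids come from the diagram; this data layout is an illustrative assumption.
parents = {
    "a1b2c3d": [],
    "4e5f678": ["a1b2c3d"],
    "90abcde": ["4e5f678"],             # first commit on experimental-1
    "7835cd3": ["90abcde"],
    "df37ba1": ["4e5f678", "7835cd3"],  # the merge commit has two parents
}


def history(commit_id):
    """Return every commit reachable from commit_id, visiting each one once."""
    seen = []
    to_visit = [commit_id]
    while to_visit:
        current = to_visit.pop()
        if current not in seen:
            seen.append(current)
            to_visit.extend(parents[current])
    return seen


main_before_merge = history("4e5f678")
main_after_merge = history("df37ba1")

print(sorted(main_before_merge))
print(sorted(main_after_merge))
```

Before the merge, the main branch only knows about its own two commits. After the merge, the history of main also includes the two experimental commits.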
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc"
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
- We can continue committing bundles of changes, and making new branches that support us taking risks with our work
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc"
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
- We can create branches from other branches, if we want to noodle around with changes
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc"
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
- We can create branches from other branches, if we want to noodle around with changes
- We can then abandon those branches if we realise we made terrible choices, and keep working on the original branch like nothing happened…
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc"
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
checkout main
merge experimental-2 id: "be34af1"
commit id: "def134a"
commit id: "563bdef"
- And we usually bring everything back to the main branch once we are happy with it
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc" tag:"v0.1.0 - preprint" type: HIGHLIGHT
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
checkout main
merge experimental-2 id: "be34af1"
commit id: "def134a"
commit id: "563bdef" tag:"v1.0.0 - published paper" type: HIGHLIGHT
- We can also tag specific commits if the code at that point in time is important!
- For example, the version of the code you used to generate results for a preprint or the final paper!
What does git do?
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc" tag:"v0.1.0 - preprint" type: HIGHLIGHT
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
checkout main
merge experimental-2 id: "be34af1"
commit id: "def134a"
commit id: "563bdef" tag:"v1.0.0 - published paper" type: HIGHLIGHT
- These tags can also be called releases: (hopefully!) fairly complete, working, nice versions of your code
- Commits in between releases (merged to the main branch) are like patches to video games
- The commit notes are like patch notes, telling you what’s changed
Part 2: the git cycle
The git cycle
So we’ve looked at the idea of bundling up changes as commits, but what does that actually involve?
I think of it like packing a picnic basket:
- I make a sandwich and wrap it up
- I add it to the basket
- I chop up some fruit and put it in a lunchbox
- I add that to the basket
- I make a smoothie and bottle it
- I add that to the basket too
- Finally, I close over the top of the picnic basket and secure the latch
The git cycle
How does this have anything to do with git?
- I make a sandwich and wrap it up
- I add it to the basket
- I chop up some fruit and put it in a lunchbox
- I add that to the basket
- I make a smoothie and bottle it
- I add that to the basket too
- Finally, I close over the top of the picnic basket and secure the latch
The git cycle
Let’s introduce the concept of add as well as commit.
- I make a sandwich and wrap it up -> I make some edits to files/create new files in my project folder
- I add it to the basket -> I add my changes
- I chop up some fruit and put it in a lunchbox -> I make some more edits to files
- I add that to the basket -> I add my changes
- I make a smoothie and bottle it -> I make some more edits to files
- I add that to the basket too -> I add my changes
- I close over the top of the picnic basket and secure the latch -> I commit all these changes that I previously added
The git cycle
Let’s introduce the concept of add as well as commit.
- You create or edit files, then
- You add those changes, then
- You either create/edit more files and repeat add, or you commit all the added changes with a little message
I think of git add as a quick save, whereas git commit is a full, proper save.
The git cycle
create/edit -> add -> commit
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Untracked -->|**git add**| Staged
Staged -->|**git commit**| Committed
Committed -.->|*edit files*| Untracked
- We need to add everything to our picnic basket
- When we are happy with our bundle of changes, we close up the basket and commit the changes, and add a nice little label to it in the form of a commit message
The git cycle
Some new jargon - the state of the files in your repository
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Untracked -->|**git add**| Staged
Staged -->|**git commit**| Committed
Committed -.->|*edit files*| Untracked
- Untracked/modified: files that have been created or edited since the last cycle, that haven’t been added or committed
- Staged: files that have been added (put in the basket) but not committed yet.
- Committed: files that have been added and committed and now have a unique id attached to their most recent changes (and have not been edited since the last commit)
As well as being able to “undo” entire commits, you can undo individual stages of this cycle (e.g. you can unstage files so they go from being staged back to untracked/modified)
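The three states above can be sketched as a toy model in Python. This is only a way of picturing the cycle, not how git actually stores anything; the ToyRepo class, its method names and the commit message are all made up for this example.

```python
class ToyRepo:
    """A toy model of the create/edit -> add -> commit cycle (not real git)."""

    def __init__(self):
        self.working = {}  # file name -> current contents in the project folder
        self.staged = {}   # changes that have been added (put in the basket)
        self.commits = []  # list of (message, snapshot of all files) pairs

    def edit(self, name, contents):
        self.working[name] = contents

    def add(self, name):
        self.staged[name] = self.working[name]

    def commit(self, message):
        # Start from the previous snapshot and apply everything in the basket.
        snapshot = dict(self.commits[-1][1]) if self.commits else {}
        snapshot.update(self.staged)
        self.commits.append((message, snapshot))
        self.staged = {}

    def status(self, name):
        committed = self.commits[-1][1] if self.commits else {}
        if name in self.staged and self.staged[name] == self.working[name]:
            return "staged"
        if name in committed and committed[name] == self.working.get(name):
            return "committed"
        return "untracked/modified"


repo = ToyRepo()
repo.edit("python-file.py", "plt.scatter(x, y)")
states = [repo.status("python-file.py")]
repo.add("python-file.py")
states.append(repo.status("python-file.py"))
repo.commit("add scatter plot")  # hypothetical commit message
states.append(repo.status("python-file.py"))
repo.edit("python-file.py", "plt.plot(x, y)")
states.append(repo.status("python-file.py"))

print(states)
# ['untracked/modified', 'staged', 'committed', 'untracked/modified']
```

The file goes once around the cycle and then, as soon as it is edited again, drops back to the start.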
Where does the git cycle fit in?
Every node on this graph…
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc" tag:"v0.1.0 - preprint" type: HIGHLIGHT
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
checkout main
merge experimental-2 id: "be34af1"
commit id: "def134a"
commit id: "563bdef" tag:"v1.0.0 - published paper" type: HIGHLIGHT
contains this cycle
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Untracked -->|**git add**| Staged
Staged -->|**git commit**| Committed
Committed -.->|*edit files*| Untracked
If you understand this, then actually using git is just a matter of googling the right commands.
Part 3: remote git
Back up your work!
When doing research, or back in undergrad, we all heard the common refrain “back up your work!”
- This sometimes involved floppy disks, USB sticks, or external hard drives
- It is now also likely to include cloud platforms and storage options (Dropbox, OneDrive, etc.)
Back up your work!
So, we have the history of our work saved in our git repository (which is just the folder that our files are stored in) - but it’s just as vulnerable to loss as any other files on our PC.
We need a back-up!
The “backup” of our local git repository is called a remote repository.
Remote repository
The remote repository can be:
- on a different computer
- on an external hard drive
- on a cloud service like GitHub
Remote repository
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Untracked -->|**git add**| Staged
Staged -->|**git commit**| Committed
Committed -.->|*edit files*| Untracked
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
gitGraph
commit id: "a1b2c3d"
commit id: "4e5f678"
branch experimental-1
checkout experimental-1
commit id: "90abcde"
commit id: "7835cd3"
checkout main
merge experimental-1 id: "df37ba1"
commit id: "34efc1a"
commit id: "32753bc" tag:"v0.1.0 - preprint" type: HIGHLIGHT
branch experimental-2
checkout experimental-2
commit id: "cb45ad1"
branch experimental-3
checkout experimental-3
commit id: "456abc1"
checkout experimental-2
commit id: "ad1cb45"
checkout main
merge experimental-2 id: "be34af1"
commit id: "def134a"
commit id: "563bdef" tag:"v1.0.0 - published paper" type: HIGHLIGHT
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Local -->|**git push**| Remote
Remote -.->|**pull**| Local
- You can push bundles of commits to your remote repository
- You can also pull changes from the remote repository to your local…
  - You can have different “local” repositories on different machines…
Remote repository
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Local-1 -->|**push**| Remote
Remote -.->|**pull**| Local-1
Remote repository
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Local-1 -->|**push**| Remote
Remote -.->|**pull**| Local-1
Remote -.->|**pull**| Local-2
Remote repository
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
}
}
}%%
flowchart TD
Local-1 -->|**push**| Remote
Remote -.->|**pull**| Local-1
Remote -.->|**pull**| Local-2
Local-2 -->|**push**| Remote
- You can use a remote repository to sync repositories across different machines
- For yourself, or for collaborators
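The syncing shown in the diagram can be sketched with plain Python lists standing in for repositories. This is a deliberately simplified picture: the push and pull functions below are our own toy versions that just copy missing commit ids, and they ignore everything that makes real syncing interesting (branches, conflicts, and so on).

```python
def push(local, remote):
    """Copy any commits the remote is missing from the local copy (toy model)."""
    for commit_id in local:
        if commit_id not in remote:
            remote.append(commit_id)


def pull(local, remote):
    """Copy any commits the local copy is missing from the remote (toy model)."""
    for commit_id in remote:
        if commit_id not in local:
            local.append(commit_id)


remote = []
local_1 = ["a1b2c3d", "4e5f678"]  # e.g. the machine you started work on
local_2 = []                      # e.g. a second machine, or a collaborator's

push(local_1, remote)      # back up the work from Local-1
pull(local_2, remote)      # Local-2 now has the same history
local_2.append("90abcde")  # a new commit is made on Local-2
push(local_2, remote)
pull(local_1, remote)      # Local-1 catches up

print(local_1 == local_2 == remote)  # True
```

After the final pull, all three copies hold the same list of commits, which is the "sync across machines" idea in the bullets above.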
Remote git
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
},
'gitGraph': {
'showBranches': true,
'showCommitLabel': false,
'mainBranchName': 'main (remote)',
'mainBranchOrder': 2}
}
}%%
gitGraph
commit
commit
branch 'mmq-patch-01 (remote)' order: 1
branch 'mmq-patch-01 (local)' order: 0
checkout 'mmq-patch-01 (local)'
commit
commit
checkout 'mmq-patch-01 (remote)'
merge 'mmq-patch-01 (local)'
checkout 'main (remote)'
merge 'mmq-patch-01 (remote)'
branch 'pt-patch-01 (local)' order:4
branch 'pt-patch-01 (remote)' order:3
checkout 'pt-patch-01 (local)'
commit
commit
checkout 'pt-patch-01 (remote)'
merge 'pt-patch-01 (local)'
checkout 'main (remote)'
merge 'pt-patch-01 (remote)'
This allows us to collaborate with others
Remote git
%%{
init: {
'theme': 'base',
'themeVariables': {
'primaryColor': '#9fe1ff',
'primaryTextColor': '#470044',
'primaryBorderColor': '#000000',
'lineColor': '#9158A2',
'secondaryColor': '#e79aff',
'tertiaryColor': '#fffc58'
},
'gitGraph': {
'showBranches': true,
'showCommitLabel': false,
'mainBranchName': 'main (remote)',
'mainBranchOrder': 2}
}
}%%
gitGraph
commit
commit
branch 'mmq-patch-01 (remote)' order: 1
branch 'mmq-patch-01 (local)' order: 0
checkout 'mmq-patch-01 (local)'
commit
commit
checkout 'mmq-patch-01 (remote)'
merge 'mmq-patch-01 (local)'
checkout 'main (remote)'
merge 'mmq-patch-01 (remote)'
branch 'pt-patch-01 (local)' order:4
branch 'pt-patch-01 (remote)' order:3
checkout 'pt-patch-01 (local)'
commit
commit
checkout 'pt-patch-01 (remote)'
merge 'pt-patch-01 (local)'
checkout 'main (remote)'
merge 'pt-patch-01 (remote)'
commit tag: 'v1.0.0'
When using a cloud-based remote, we can mint a DOI for a release, which acts as a snapshot of the code
Remote git
- If we compare version control to a tool such as a word processor
- Then git is Microsoft Word (but there are other options)
- And GitHub is Microsoft 365 (and again, there are other options!)
(funnily enough, GitHub is owned by Microsoft)
GitHub
In the same way that we are focussing on Git, we are going to focus on GitHub for this course
- There are lots of things we can get GitHub to do that are not “version control” specific; we will link to these in the “extended reading” section at the end of the course.
- For now, let’s think of it as a remote repository for git.
Learn by doing
The best way to get used to using git is by actually using it, which brings us on to our next practical…