Workshop: Introduction to version control
(Generally, this workshop is offered at least once every week on a rotating basis. Check the DaBL calendar for up-to-date availability!)
Version control is a technique for managing the complexity of a digital project - especially one which is done collaboratively.
What is version control?
When a project is in its infancy, it is fairly straightforward to keep track of the files contained within; it can, however, become unmanageable very quickly - especially if revisions happen frequently. Version control can be imagined as a series of snapshots of a project's changes as time progresses; it effectively stores the entire history of the project. Each snapshot allows you to see exactly the state of the project at that time; you can travel back to any of these snapshots at any time.
The version control information (the files, the snapshot information, etc.) is held locally in a folder called a repository, or repo. Additionally, you can sync this repo to a server somewhere else - this copy of the repo is called a remote. The remote can serve as a backup, as well as a way to share the repo with others who may also be working on the project.
A version control system (VCS) is a tool that handles all of this complexity: the repo itself, the snapshots, the remote(s), etc. There are many VCSs; the most popular is called git.
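To make the snapshot idea concrete, here is a minimal sketch using git commands that are explained later in this workshop. The file name essay.txt, the temporary folder, and the user name and email are all placeholders chosen for illustration.

```shell
set -e
# Work in a throwaway folder so nothing on your system is affected.
workdir=$(mktemp -d)
cd "$workdir"
git init -q
# Placeholder identity, stored in this demo repo only.
git config user.name "Example User"
git config user.email "user@example.com"

# Snapshot 1
echo "first draft" > essay.txt
git add essay.txt
git commit -q -m "first draft"
first=$(git rev-parse HEAD)   # remember the hash that identifies this snapshot

# Snapshot 2
echo "second draft" > essay.txt
git add essay.txt
git commit -q -m "second draft"
branch=$(git symbolic-ref --short HEAD)   # name of the branch we are on

# Travel back to the first snapshot...
git checkout -q "$first"
cat essay.txt    # prints: first draft

# ...and return to the latest one.
git checkout -q "$branch"
cat essay.txt    # prints: second draft
```

Nothing is lost by visiting an old snapshot: both versions of the file remain in the repo's history.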
Why use version control?
Version control provides two main advantages:
- You can travel back in time to any point in a project's history and resume work from that state; and
- Management of a collaborative project is streamlined.
Using version control
Required tools
- A VCS - preferably git.
- A web-based system to host your remotes.
- Git is accessed through the command line interface (CLI) of your choice:
- Windows: you can use Git Bash (which is installed along with git) or PowerShell; Bash is preferred because its output is nicely color-coded.
- Linux/MacOS: terminal (which is installed with the operating system).
Acquiring a repo
There are two main ways to work with a repo: create one yourself, or use someone else's.
I want to create one myself!
- If you are only going to be working on this repository locally, then:
- In your CLI, navigate to the place you'd like to save the repo folder:
cd [path]
- Create the repo folder:
mkdir [repository folder name]
- Navigate into the repo folder:
cd [repository folder name]
- Initialize the repository:
git init
- If instead, you are going to be working with a remote, then:
- Create a new repo on your remote of choice.
- Locate and copy the url for the git remote. Typically, this is in the top-right corner of the repo's main web page.
- In your CLI, navigate to the place you'd like to save the repo folder:
cd [path]
- Use the clone function of git to download a copy of the repo locally:
git clone [the_link_you_copied_in_step_2.git]
- You can now begin working with your repo!
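The local-only steps above can be sketched as follows. This is an illustration under assumptions: a temporary folder stands in for [path], and my-project is a made-up folder name.

```shell
set -e
# A throwaway temp folder stands in for [path].
workdir=$(mktemp -d)
cd "$workdir"

# Create the repo folder and navigate into it.
mkdir my-project
cd my-project

# Initialize the repository; git creates a hidden .git folder to hold the history.
git init -q

# Ask git whether this folder is now inside a repository.
git rev-parse --is-inside-work-tree
```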
I want to work with someone else's!
- This process is also simple: find a repo that you'd like to work with on a remote, and clone it onto your system:
- Find a remote repository you'd like to work with.
- Locate and copy the url for the git remote. Typically, this is in the top-right corner of the repo's main web page.
- In your CLI, navigate to the place you'd like to save the repo folder:
cd [path]
- Use the clone function of git to download a copy of the repo locally:
git clone [the_link_you_copied_in_step_2.git]
- You can now begin working with your repo!
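As a rough sketch of cloning, the example below first builds a small stand-in for "someone else's" repo on the local disk, then clones it by path instead of by a copied URL. The folder names upstream-project and my-copy and the identity values are hypothetical; with a hosted repo you would pass the URL you copied.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for an existing repo: a local repository with one commit.
# (Normally this would already exist on a hosting service.)
git init -q upstream-project
cd upstream-project
echo "hello" > README.md
git add README.md
git -c user.name="Example User" -c user.email="user@example.com" commit -q -m "add README"
cd ..

# Clone it into a new folder called my-copy.
git clone -q upstream-project my-copy

# The clone contains the files and remembers its source as a remote named "origin".
cd my-copy
ls
git remote -v
```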
Working with a repo
Typical workflow
When working on a project locally, your workflow will typically consist of the following:
- Modify files within the repository's folder (e.g. changing lines of code).
- Stage those changes:
git add [filename]
- Commit those changes to the repository:
git commit -m "[a message describing the changes]"
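A minimal run-through of this local workflow might look like the sketch below. The file name, commit message, and identity values are placeholders, and a temporary folder is used so the example does not touch a real project.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q
# Placeholder identity, stored in this demo repo only.
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Modify files within the repository's folder.
echo "# My project" > README.md

# 2. Stage those changes.
git add README.md

# 3. Commit those changes to the repository.
git commit -q -m "add README"

# Nothing is left to stage or commit, so the short status prints nothing.
git status --short

# The history now contains the new snapshot.
git log --oneline
```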
If you have a remote set up, then two steps bookend the workflow above:
- Pull any changes from the remote repository to your local copy:
git pull
- Modify files within the repository's folder (e.g. changing lines of code).
- Stage those changes:
git add [filename]
- Commit those changes to the repository:
git commit -m "[a message describing the changes]"
- Push the new commit to the remote repository:
git push [remote name] [branch name]
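The remote workflow can be tried without a hosting account by using a bare repository on the local disk as a stand-in for the remote. This is only a sketch: the folder names seed, remote.git, and local-copy are hypothetical, and a real remote would be addressed by URL.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# --- Setup: build a stand-in remote that already holds one commit ---
git init -q seed
cd seed
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Shared project" > README.md
git add README.md
git commit -q -m "initial commit"
cd ..
git clone -q --bare seed remote.git    # a bare repo plays the role of the server
git clone -q remote.git local-copy     # your local copy, with "origin" set up

cd local-copy
git config user.name "Example User"
git config user.email "user@example.com"
branch=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on your git setup

# --- The workflow itself ---
# Pull any changes from the remote (there are none yet in this demo).
git pull -q

# Modify, stage, and commit.
echo "A new line" >> README.md
git add README.md
git commit -q -m "add a line to README"

# Push the new commit to the remote.
git push -q origin "$branch"

# The remote now has both commits.
git --git-dir="$workdir/remote.git" log --oneline "$branch"
```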
Terminology
repository: the location and contents of a project, including all version control information
commit (n): a particular snapshot of a project
commit (v): to take a snapshot of a project
local: a copy of the project stored on your system
remote: a copy of the project stored on a server somewhere else
branch: a particular timeline of snapshots - the main branch is typically called "master"; other branches may be used to develop new features without risking the master branch's stability
Common git commands
Command | Syntax | Function | Example usage |
---|---|---|---|
add | git add [filename] | stages files; these files will then be included in the next commit. | git add README.md |
branch | git branch [new branch name] | creates a new branch without switching to it; to create a branch and immediately check it out, use git checkout -b [new branch name]. | git branch experimental |
checkout | git checkout [branch or snapshot name] | switches the repository over to a particular snapshot or branch; typically snapshots are referenced by a hash. | git checkout experimental |
commit | git commit -m "[commit message]" | writes the staged changes to the repo, otherwise known as "taking a snapshot". | git commit -m "changed header" |
fetch | git fetch [remote name] [branch name] | downloads the remote's snapshots that are not yet on the local machine and stores them locally; does not merge. Both arguments are optional. | git fetch |
log | git log [options] | shows a list of all the previous commits in the current branch. | git log |
merge | git merge [branch to be merged] | brings the snapshots of the branch to be merged into the current branch's history. | git checkout master, then git merge experimental (merges experimental into master) |
pull | git pull [options] | checks the remote for any new commits not on your local timeline; if there are any, downloads them to your local repo, then updates your current branch to include them (a fast-forward when you have no new local commits). This is effectively a fetch followed by a merge. | git pull |
push | git push [remote name] [branch name] | connects to the remote and copies any new snapshots on your system to it. | git push origin master |
rebase | git rebase [-i] [snapshot or branch name] | replays the current branch's snapshots on top of the target snapshot or branch; with the -i (interactive) option, you can also fold them together. This rewrites history, but is very useful on larger teams: condensing your changes into one snapshot before they are merged keeps the target branch from being flooded with hundreds of commits from each collaborator. | git rebase -i master |
status | git status [options] | checks the status of the repo; shows which files have not yet been included in the repo, which have been modified, which have been staged, etc. | git status |
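Several of these commands are easiest to understand together. The sketch below creates a branch, commits on it, and merges it back; the file name notes.txt, the commit messages, and the identity values are placeholders. Because the default branch may be called "master" or "main" depending on your git configuration, its name is read into a variable instead of being hard-coded.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q
# Placeholder identity, stored in this demo repo only.
git config user.name "Example User"
git config user.email "user@example.com"

# A first snapshot on the default branch.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first snapshot"
main_branch=$(git symbolic-ref --short HEAD)

# Create a branch and switch to it.
git branch experimental
git checkout -q experimental

# Work on the branch without touching the default branch.
echo "experimental idea" >> notes.txt
git add notes.txt
git commit -q -m "try an idea"

# Switch back and merge the experimental work in.
git checkout -q "$main_branch"
git merge -q experimental

# The default branch now contains both snapshots.
git log --oneline
```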
Sample git workflow
The typical git workflow will depend on whether you're working on a personal repository or on a team.
[Diagrams: simple personal workflow; team workflow]
External Links
Git's own reference manual
Roger Dudler's simple git guide
Git cheat sheet
A very thorough reference guide
Git on a team
Another git on a team source