A Git Walkthrough

By Zach Dennis on 21 June 2012

When interacting closely with a client's internal team, we make a special effort to set standards for a consistent workflow. A clear, visible process works wonders for keeping the entire team on the same page and working in the same direction. It also quickens the pace for new team members to get up to speed so they can begin contributing earlier.

One thing I've noticed in trying to communicate a consistent workflow is the varying level of knowledge of the tools being used. One tool in particular is git.

Workflows are often high level and assume everyone participating has a certain knowledge of the tools to employ the workflow successfully. Experience has shown me this is not always the case. There's no fault to be assigned here; it's just a fact of life.

Git is huge and there are a million things to learn and do. The makeup of our teams will likely involve someone who revels in exploring the nooks and crannies of git as well as those who want to learn just what's necessary to use it productively.

This post walks through the common git commands used in our workflow, roughly in the order we use them. It takes time to expand on each command but not to get carried away with git's internals. This post isn't an in-depth post on our workflow or an in-depth post on git. Rather, this post is dedicated to providing a shared foundation of knowledge for any team looking to use git.

Along the way I'll share links to other resources if you wish to dive further into a particular topic or git command. My intention is that this post will not only be informational, but a good resource to refer back to.

If you don't have git installed you'll need to install it first. The Installing Git section of the online Pro Git book covers installation on different platforms.

With the introduction out of the way, let's start where we all start when encountering a project using git: cloning a repository.

Cloning a repository

The first step in working on any project that uses git is to have a copy of it locally on your machine. If we wanted to work on Jeremy Ashkenas's CoffeeScript project we would use git's clone command to do just that:

git clone https://github.com/jashkenas/coffee-script.git

This will check out the latest version of the project into a coffee-script/ directory in your current working directory. The name of the directory created defaults to the name of the repository being cloned. You can override the default name by specifying a directory name after the URL, like this:

git clone https://github.com/jashkenas/coffee-script.git jeremys-coffee

This will check out the latest version of the project like before, but this time it will be placed in the jeremys-coffee/ directory.

Cloning isn't the only way to get started with git; it's just the most common. You can also initialize an existing directory with git, making it a git repository. You can do this with the init command:

cd my-proj/
git init

Now the my-proj/ directory is a full-fledged git repository and is ready to be committed to. For more information on initializing repositories check out the Getting a Git Repository section in the Pro Git book.

Having a basic understanding of what cloning does

Now that you've got this freshly cloned project it may help to understand what has happened locally. Within the project directory git has done four important things:

  • created a hidden .git/ directory with the entire project's history
  • placed a git config in .git/config for this project
  • created an origin remote
  • checked out a local copy of the master branch
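If you'd like to see these four things for yourself, here is a minimal sketch. It assumes a throwaway local repository standing in for the remote project (the source-proj and my-clone names, the temporary directory, and the user details are all made up for illustration). Against a real clone only the last few commands matter.

```shell
# Build a tiny stand-in "remote" repository (hypothetical names throughout).
work=$(mktemp -d)
cd "$work"
git init -q source-proj
cd source-proj
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
cd "$work"

# Clone it, then inspect what the clone created.
git clone -q source-proj my-clone
cd my-clone
test -d .git && echo ".git/ directory exists"
test -f .git/config && echo ".git/config exists"
remotes=$(git remote)                        # the remote created by the clone
branch=$(git rev-parse --abbrev-ref HEAD)    # the branch that was checked out
echo "remotes: $remotes"
echo "branch:  $branch"
```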

Let's briefly look into each of these four areas starting with the .git/ directory.

.git/

You will likely never spend time in the .git/ directory except to edit the .git/config file. Often you won't even do that, since command-line and GUI-based tools for git usually edit it for you.

Since you won't spend much time in the .git/ directory I'm not going to spend time on its contents. Know that it exists and that it contains everything git needs to do its job. If you delete it your project will no longer have a history, as that was contained in the .git/ directory.

For more information on what's in the .git/ directory check out what's inside your git directory on gitready.

The Pro Git book by Scott Chacon, published by Apress, is freely available online and is a great resource for becoming more familiar with git.

gitready.com is another great reference which has been collecting information and tips for git since 2009. It's organized them all into beginner, intermediate, and advanced sections, and it provides a great list of resources.

gitimmersion.com is an online learning lab dedicated to teaching you git through a step-by-step set of labs and tutorials.

.git/config

The .git/config is a plain text file that houses the configuration for your git repository. You can edit it directly, but most of the time it's easier to interact with it through different command-line or GUI-based tools. For example, to see the config for a project you can issue the following command:

git config --list

This will not only list out the project-specific configs but also any inherited configuration that comes from any global or user-based git settings.
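To look at only the project-specific settings, git config accepts a --local option that restricts it to the repository's own .git/config file. Here is a small sketch, assuming a throwaway repository in a temporary directory and a made-up user.name value:

```shell
# Throwaway repository so nothing in a real project is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# --local writes to this repository's .git/config only.
git config --local user.name "Project Specific Name"

# Read the value back, restricted to the repository's own file.
local_name=$(git config --local --get user.name)
echo "local user.name: $local_name"

# The setting is stored as plain text in .git/config.
grep "Project Specific Name" .git/config

# List only this repository's settings (no global or system values).
git config --local --list
```

Swapping --local for --global does the same for your user-wide settings.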

For a great reference of available options and how to set them check out the customizing git page from the Pro Git book.

For a more thorough and always up to date version of all possible options you can always refer to the following command, though it's not nearly as user-friendly:

git config --help

Your first remote: origin

Git at heart is a decentralized version control system. This means that there doesn't have to be a single authoritative server that keeps track of everything. One of the ways that git allows for this is with the idea of a remote.

A remote is nothing more than a name that identifies another git repository. When we cloned Jeremy Ashkenas's coffee-script project earlier git automatically set us up with a default remote named origin. We can check to see what remotes git currently knows about with the following command:

> git remote
origin

After issuing the above command git shows us the one remote it currently knows about. Let's go ahead and find out more information about this remote:

git remote show origin

This will print out a lot of information but the part we're interested in at this time is the Fetch URL and the Push URL. They are both set to the URL we used to clone the coffee-script repository. When you try to fetch or pull updates it will get them from the Fetch URL and when you want to push changes it will send them to the Push URL.

You can do a lot of tweaking of remotes, but that's a bit beyond the goal of this post so if you're interested in finding out more about remotes check out working with remotes.

Besides being the default name for the first remote created after cloning, "origin" is also the default for git commands that interact with remote repositories. You can have more than one remote repository configured for your project and you can name them whatever you'd like, but "origin" is the assumed default if you omit the remote repository's name (unless your current branch has been set up to track a branch on a different remote).

Take for example this command:

git fetch

Here we didn't tell git where we wanted to fetch changes from so it assumes we want to fetch them from the origin remote. If we wanted to be specific we could have issued the following command which explicitly states which remote to fetch from:

git fetch origin
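To make the idea of more than one remote concrete, here is a rough sketch. It assumes two throwaway local repositories playing the part of remote servers, and the names first, second, my-clone, and upstream are all made up for illustration.

```shell
work=$(mktemp -d)
cd "$work"

# Two stand-in repositories playing the role of remote servers.
for name in first second; do
  git init -q "$name"
  (
    cd "$name"
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Example User"
    git config user.email "user@example.com"
    echo "$name" > README
    git add README
    git commit -q -m "Initial commit in $name"
  )
done

# Cloning creates the origin remote; the second remote is added by hand.
git clone -q first my-clone
cd my-clone
git remote add upstream "$work/second"

# Name the remote explicitly when it isn't origin.
git fetch -q upstream

remotes=$(git remote | sort | xargs)
echo "remotes: $remotes"
git branch -r    # includes origin/master and upstream/master
```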

Your first branch: master

The last thing I want to touch on related to cloning is the branch that you will initially be on after you clone a project.

By default, git will check out a local copy of the master branch from the remote repository you just cloned. If you navigate into your project directory you can check this out by running the following command:

git branch

You will likely see the following output:

* master

The asterisk represents that master is the currently checked out branch. Additionally, git has automatically set up your local master branch to track the remote master branch. We'll skip what tracking is for now and touch on that later.

Now, everything I just said is mostly true. Most of the time the default branch git will check out for you is master. But, this can be changed. For example, at the time of this post going up Chris Eppstein's Compass repository has changed the default branch to stable. You can check this out by cloning Compass and seeing what the current branch is:

> git clone https://github.com/chriseppstein/compass.git
> cd compass
> git branch
* stable

If you're interested in changing the default branch in your repository check out the setting default git branch post by François Marier.

A common practice among git users is to update the shell prompt to show the current branch so you don't have to type git branch every time. There are a lot of resources available online (just google "git bash prompt"), but as a quick starter check out this gist.
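As a rough idea of what such a prompt involves, here is a minimal bash sketch. It is not the gist mentioned above; the current_git_branch helper name and the prompt layout are assumptions made for illustration.

```shell
# Hypothetical helper: print the current branch name, or nothing when the
# working directory is not inside a git repository (or HEAD is detached).
current_git_branch() {
  git symbolic-ref --quiet --short HEAD 2>/dev/null
}

# The single quotes matter: bash re-runs $(...) every time it draws the
# prompt, so the branch name stays current as you switch branches.
PS1='\w $(current_git_branch)\$ '
```

Putting both pieces in your ~/.bashrc would make the prompt available in every new shell.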

Now that we have a basic understanding of what happens when we clone a project let's look at doing work and putting git to use.

Doing Work

How we work with git is pretty straightforward. We follow a rebase style workflow and in this section we're going to look at the commands we use in that workflow.

Sandbox Yourself

As you saw above, you have one local branch checked out after you clone a repository: master. In our experience it's best to avoid working directly in the master branch and instead to use topic (aka feature) branches to do any work.

Once the work is done and you're happy with it, you merge it back into master and ultimately push those changes back up to the remote repository, which is most likely origin.

There are more detailed explanations of why this is, but I'm going to go out on a limb and forgo those for now so we can focus on the commands used to employ this workflow.

Be up to date before creating your topic branch

Before creating a topic branch, it's good practice to be up to date first as it avoids unnecessary conflicts later. Since we use master as the parent of our topic branches we want to make sure it's up to date. Here's one way we do that:

git checkout master
git pull --rebase

git checkout master will check out your local master branch and make that your current branch.

git pull --rebase will fetch the remote changes and then replay any local commits on your master branch on top of them, rather than creating a merge commit. This command uses the default origin remote as the repository to pull changes from since we're not explicitly telling git what remote to use.

There's more than one way to do anything in Git. Another way to make sure your local master is up to date is the following:

git fetch
git rebase origin/master master

The first command tells git to fetch all remote changes from the default remote (which is origin). The second command tells git to check out our local master branch and rebase it onto origin/master, the master branch from the origin remote.

You will now be on your up to date master branch and you're ready to create a topic branch.

Creating a topic branch

We do all of our work in topic branches. Each topic branch is used for a specific set of work on our projects. Often times it will refer back to a card, feature, bug, issue, or chore in our project management tool.

99% of the time we create our topic branches off from master:

# make sure we're on our local _master_
git checkout master

# create our topic branch
git checkout -b 1234_sales_tax_calculations

The second command above creates a branch named "1234_sales_tax_calculations" off of our current branch (which is master). After it creates our branch git will then check out the branch we just created. This means that our current branch after issuing the above command will be 1234_sales_tax_calculations.

The above checkout -b command uses whatever branch we are currently on as the place to branch from. If you're not on master you can still create your topic branch without first checking out master. A shortcut is to simply append master to the command which would then become this:

git checkout -b 1234_sales_tax_calculations master

This command is equivalent to the two commands we issued earlier. It's just a more succinct way of achieving the same thing.

You can name a branch whatever you'd like. A simple rule of thumb we follow is to have the branch name be meaningful. For us, this often involves tying in a reference number and/or title from our project management tool into the branch name. This makes it easy to cross reference between our various tools.
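If you'd rather not type these names by hand, a small shell function can build them. This is only a sketch: topic_branch_name is a hypothetical helper, and the number-then-title format simply mirrors the example branch used in this post.

```shell
# Hypothetical helper: build a branch name such as 1234_sales_tax_calculations
# from a ticket number and a free-form title.
topic_branch_name() {
  ticket=$1
  shift
  # Lowercase the title, turn every run of non-alphanumerics into a single
  # underscore, then trim underscores from both ends.
  slug=$(printf '%s' "$*" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '_' \
    | sed 's/^_*//; s/_*$//')
  printf '%s_%s\n' "$ticket" "$slug"
}

topic_branch_name 1234 "Sales Tax Calculations"
```

You could then create the branch with git checkout -b "$(topic_branch_name 1234 "Sales Tax Calculations")" master.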

Now that we've got a topic branch let's look at making changes and committing them.

Adding files

Git doesn't track changes you don't explicitly tell it to track so you need to tell it about any new or modified files. The git add command is used to tell git about the changes you want to track.

At any time you want to see what changes git thinks you have you should issue the git status command. And once you're up to speed with the changes you want to commit you can go about adding them for git to commit. Here are a few ways to do that:

# add all files under the current directory
git add .

# add all files under the lib/ directory
git add lib/

# add foo1_spec.rb and foo2_spec.rb specifically
git add spec/foo1_spec.rb
git add spec/foo2_spec.rb

git status will show you the state of things in the repository and I'm going to skip the details of it since there's a great write-up on checking the status of your files in the Pro Git book.

Deleting files

You may also find yourself in a situation where you want to delete a file. You may have even deleted it from the file system, but deletion is another change that you have to tell git about.

There are two common ways to do this: git rm and git add -u.

git rm works like git add: tell it the path of the file(s) you want deleted:

# remove a specific file
git rm path/to/file

# remove an entire directory
git rm -r path/to/directory/

Another way to remove files is to use the -u option that comes with git add. It tells git to stage changes to files it already tracks, which includes removing any files you have deleted on the file system. It does not pick up new files.

# Stage changes to tracked files under the current directory,
# including the removal of any files deleted from it
git add -u .

# Stage changes to tracked files under the lib/ directory,
# including the removal of any files deleted from it
git add -u lib/
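Here is a small sketch of what git add -u does and doesn't pick up. It assumes a throwaway repository in a temporary directory, and the file names are made up for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Start with two tracked files.
echo "one" > keep.txt
echo "two" > remove.txt
git add keep.txt remove.txt
git commit -q -m "Add two files"

# Modify one, delete the other from the file system, and create a new file.
echo "changed" >> keep.txt
rm remove.txt
echo "new" > brand_new.txt

# -u stages the modification and the deletion, but not the untracked file.
git add -u .
staged=$(git diff --cached --name-status | sort)
untracked=$(git ls-files --others --exclude-standard)
echo "$staged"
echo "untracked: $untracked"
```

The staged list shows remove.txt as deleted (D) and keep.txt as modified (M), while brand_new.txt still needs a plain git add.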

Now that we know about telling git what changes to keep or remove let's look at committing.

Committing changes

Once we've told git about all of the changes we want to track or otherwise remove we're ready to commit. Let's say we just implemented the sales tax calculations and we want to commit the changes to our 1234_sales_tax_calculations topic branch. Here's how we'd commit those changes:

git commit -m "Implemented the Michigan sales tax calculations [#1234]"

The -m option supplies the commit message. If you don't supply it on the command line git will open your default text editor (typically determined by the EDITOR environment variable) and expect you to supply a message. By default, git won't let you commit without a commit message.

git commit has an -a option that tells git to automatically add changes (including removing files) from all known files before actually committing. Let's say we modified lib/sales_tax.rb and deleted lib/sales_tax_helper.rb and we want to add both of these changes and commit with one command:

git commit -am "Refactoring SalesTaxHelper into SalesTax module [#1234]"

As you can imagine the -a option is very convenient as it lets you bypass having to add or remove specific changes if you know you want to commit all changes. One behavior to note about the -a option is that it will not work for new files that git doesn't already know about. You still have to issue the git add command for new files.

Now that our changes are committed we could keep making changes (rinse/repeat adding files, committing, etc), call our work done and merge back into master, or publish our branch so it can be backed up and shared with others.

Last year, our fellow human Chris Rittersdorf posted an excellent write-up on crafting good commit messages. Rather than repeat what he's already shared I'll point you to his Clean Commits post.

There are a few other useful commands that we use while working in our topic branch. Before we look ahead to publishing our branch or merging into master let's take a moment to look at amending, rolling back, and doing away with commits.

Amending Commits

It's fairly common to make a bunch of changes and commit only to realize that you forgot to add a new file or remove a file you meant to delete. In these situations you can do another commit for those changes, but you don't have to, as git gives us the ability to amend the previous commit.

Let's say that I forgot to add the README file in my last commit. Here's how I'd go about amending my previous commit:

git add README
git commit --amend

This will add the README file to the previous commit, but it will also open up my default text editor in case I want to update the commit message. If you don't want to be prompted to update the commit message you can tell git what commit message to use. This version of the command is slightly longer:

git add README
git commit --amend -C HEAD

Most of the time you don't want to edit the commit message; you just want to amend the changes. Since git doesn't have an amend command we can create one by aliasing it to the longer version of the amend command above:

git config --global alias.amend "commit --amend -C HEAD"

Now we can amend to the previous commit without being prompted to update the commit message:

git add README
git amend

That's much easier to remember, and since we aliased it globally it will be available in any git repository we're working on, not just our current project.

A cousin to amending commits is rolling them back. Let's look at that next.

Rolling back a commit

Let's say at the end of the day you committed a work in progress (WIP) commit before heading home. Now, it's the next morning and you want to continue working on those changes but you don't want a WIP commit. You could keep working and amend the WIP commit or you could roll back the WIP commit so you keep all of the changes, but remove the commit itself:

# assuming you're on the topic branch with the WIP commit
git reset --soft HEAD~1

The git reset command can be used to roll back one or more commits. The term HEAD refers to the latest commit on our currently checked out branch, which here is our topic branch, and HEAD~1 is the commit just before it. In plain English this just means to roll back 1 commit in our current branch and to keep the changes.
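To see what "keep the changes" means in practice, here is a minimal sketch. It assumes a throwaway repository in a temporary directory, and the file name and commit messages are made up for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "tax = 0" > sales_tax.rb
git add sales_tax.rb
git commit -q -m "Add sales tax stub"

echo "tax = 0.06" > sales_tax.rb
git commit -q -am "WIP"

# Roll back the WIP commit but keep its changes.
git reset -q --soft HEAD~1

last_message=$(git log -1 --format=%s)     # the WIP commit is gone
staged=$(git diff --cached --name-only)    # its changes are still staged
contents=$(cat sales_tax.rb)               # and the file itself is untouched
echo "last commit: $last_message"
echo "staged:      $staged"
echo "contents:    $contents"
```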

An advanced but incredibly useful part of git is its ability to let you rewrite history. While powerful and potentially destructive, when you wield this power for good it can make things that are difficult or impossible in other version control systems quite easy.

If you're interested in learning how to change multiple commit messages, re-order/squash/split commits you'll want to see the Rewriting History page in the Pro Git book.

The --soft flag is what tells git to keep the changes. If you don't want to keep the changes then let's take a look at doing away with the last commit.

Doing away with a commit

Whereas reset --soft rolls back a commit and keeps the changes you can use reset --hard to roll back a commit and delete the changes. Let's say we did another WIP commit but then we realize it's junk and we don't want to keep the commit or any of the changes. We just want it gone. Here's the command to do just that:

# assuming you're on the topic branch with the WIP commit
git reset --hard HEAD~1

The command is identical to the one for rolling back a commit except that we pass --hard instead of --soft.

In the above examples of git reset we saw how it could be used to roll back or undo a single commit. Part of the HEAD~1 madness is that git lets you go back more than one commit. We could actually roll back the last 10 commits, but keep all of those changes on our file system:

git reset --soft HEAD~10

Or do away with the last 6 commits completely:

git reset --hard HEAD~6
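Since --hard throws changes away, it can help to preview which commits a given HEAD~N covers before resetting. Here is a small sketch of that, assuming a throwaway repository in a temporary directory with four made-up commits:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Four commits, each changing the same file.
for n in 1 2 3 4; do
  echo "change $n" > notes.txt
  git add notes.txt
  git commit -q -m "Commit $n"
done

# Preview the commits that a reset to HEAD~2 would roll back.
to_drop=$(git log --format=%s HEAD~2..HEAD)
echo "would drop:"
echo "$to_drop"

# Do away with them, along with their changes.
git reset -q --hard HEAD~2
remaining=$(git log --format=%s)
contents=$(cat notes.txt)
echo "remaining:"
echo "$remaining"
echo "notes.txt now contains: $contents"
```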

There are plenty of other things that you can do related to reverting commits, reverting files, etc which you may be interested in. If so, here's a list of resources for those topics:

We've now seen a number of ways to work with git inside our topic branches. Once we're ready to share those changes we want to publish our topic branch.

Publishing your topic branch

Although git is a decentralized version control system we typically treat the origin remote as the authoritative repository for a project. This impacts our workflow when we have changes locally that don't exist anywhere else. Having changes on one person's machine is risky as those changes will all be lost if anything happens to that machine.

To mitigate this risk we publish our topic branches early even if we aren't planning on someone else working in them. So far in this article we've created a 1234_sales_tax_calculations branch that has changes that don't exist anywhere but our machine. Here's the command we'd issue to push a copy of that branch to the origin remote:

git push --set-upstream origin 1234_sales_tax_calculations

If you're worried about remembering that command, don't be. You can simply issue git push and if your branch has not already been published then git will conveniently tell you what to do. Here's an example of that:

> git push
fatal: The current branch 1234_sales_tax_calculations has no upstream branch.
To push the current branch and set the remote as upstream, use

git push --set-upstream origin 1234_sales_tax_calculations

You can copy and paste the command that git tells you and just run that. From that point forward your local topic branch will be tied to a copy of that branch on the origin remote.
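Here is a sketch of what that tie looks like from the command line. It assumes a throwaway bare repository standing in for the origin remote; the directory names and user details are made up for illustration.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the origin remote.
git init -q --bare origin.git
git clone -q origin.git my-project 2>/dev/null
cd my-project
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > README
git add README
git commit -q -m "Initial commit"
git push -q origin master

# Create the topic branch, commit to it, and publish it.
git checkout -q -b 1234_sales_tax_calculations
echo "rate = 0.06" > sales_tax.rb
git add sales_tax.rb
git commit -q -m "Implemented the Michigan sales tax calculations [#1234]"
git push -q --set-upstream origin 1234_sales_tax_calculations

# Ask git which remote branch the current branch is tied to.
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "upstream: $upstream"

# Later commits need only a plain push.
echo "# other states still to do" >> sales_tax.rb
git commit -q -am "Note remaining work [#1234]"
git push -q
local_tip=$(git rev-parse HEAD)
remote_tip=$(git rev-parse origin/1234_sales_tax_calculations)
```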

Once our local branch has been published remotely we can go on working in our branch and committing changes to it. Whenever we want to push (publish/backup) any more local commits to the origin remote we can now simply issue a git push and git will send our local commits to the remote 1234_sales_tax_calculations branch:

git push

With our branch published let's look at a few other commands we commonly use.

Merging your work into master

Once we've finished all our commits, pushed our changes, and have decided our topic branch is ready for prime time it's time to prepare it for getting merged back into master.

The first thing we start with is making sure our topic branch is up to date with master.

Make sure your topic branch is up to date

# make sure master is up to date
git checkout master
git pull --rebase

# make sure our topic branch is fully pushed
git checkout 1234_sales_tax_calculations
git push

# make sure our topic branch is up to date with master
git rebase master

The above sequence of commands goes a long way toward saving headaches. By first making sure master is up to date we can avoid any unnecessary conflicts later that would have resulted from that not being the case.

After we update master we make sure our topic branch is pushed. The reason I like doing this is it offers me a save point if I decide that I want to revert my topic branch back to before I updated it with the changes in master.

The last step is rebasing master into our topic branch. This will essentially create a new branch based off of master and then replay each of the commits in our topic branch one by one. This lets us fix any conflicts on a commit-by-commit basis.

If there are no conflicts then git will give you a message similar to:

> git rebase master
First, rewinding head to replay your work on top of it...
Fast-forwarded 1234_sales_tax_calculations to master.

At this point you're ready to merge into master, but first let's consider what to do when there are conflicts.

Resolving conflicts from a rebase

Here's an example of our rebase having a conflict:

> git rebase master
First, rewinding head to replay your work on top of it...
Applying: Refactoring SalesTaxHelper into SalesTax module [#1234]
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
Auto-merging lib/sales_tax.rb
CONFLICT (content): Merge conflict in lib/sales_tax.rb
Failed to merge in the changes.
Patch failed at 0001 Refactoring SalesTaxHelper into SalesTax module [#1234]

When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To check out the original branch and stop rebasing run "git rebase --abort".

In this rebase the lib/sales_tax.rb file conflicted with a change in master. If we were to run git status we would see a line similar to:

both modified:      lib/sales_tax.rb

The "both modified" text indicates that there's a conflict in that file. This will be displayed next to each file that had a conflict. We need to go in and resolve the conflict. For resolving conflicts I'm going to point you to the basic merge conflicts section in the Pro Git book.

Once you've resolved the conflict you'll want to tell git about it with git add:

git add lib/sales_tax.rb

Now we're ready to tell git to continue the rebase:

git rebase --continue

Git will continue to apply each of our commits one by one until it's done. If there are other commits that have conflicts we'll have to go through the same steps of resolving the conflicts, adding the files, and continuing the rebase.
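The whole conflict cycle can be rehearsed safely in a scratch repository. Here is a sketch, assuming a throwaway repository in a temporary directory, a topic branch simply named topic, and made-up file contents. Setting GIT_EDITOR=true is only there so the script never waits on an editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "rate = 0.05" > sales_tax.rb
git add sales_tax.rb
git commit -q -m "Add sales tax rate"

# The topic branch and master both change the same line.
git checkout -q -b topic
echo "rate = 0.06" > sales_tax.rb
git commit -q -am "Use Michigan rate"

git checkout -q master
echo "rate = 0.07" > sales_tax.rb
git commit -q -am "Use a different rate"

# The rebase stops at the conflicting commit (non-zero exit status).
git checkout -q topic
git rebase master >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# Resolve by writing the intended contents, then mark the file as resolved.
echo "rate = 0.06" > sales_tax.rb
git add sales_tax.rb

# GIT_EDITOR=true keeps git from opening an editor for the commit message.
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

history=$(git log --format=%s)
echo "$history"
```

Afterwards the topic branch's commit sits on top of master's latest commit, which is exactly the shape we want before merging.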

Once we're done we typically do a sanity check of the project like launching the application and running our test suite.

If everything looks good we force push our topic branch.

Force pushing the topic branch

git push -f

This will forcefully push our local 1234_sales_tax_calculations branch to the origin remote. The reason we force push is that when we rebased master into our topic branch it changed the history of our topic branch. If we had tried to simply push the changes git would have rejected them:

> git push
To git@mygitrepos:zdennis/my-project.git
! [rejected] 1234_sales_tax_calculations -> 1234_sales_tax_calculations (non-fast-forward)
error: failed to push some refs to 'git@mygitrepos:zdennis/my-project.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes (e.g. 'git pull') before pushing again. See the
'Note about fast-forwards' section of 'git push --help' for details.

In our workflow we treat master as the authoritative history. So when it comes down to updating our topic branches we use rebase because we want to maintain master's history. It makes life easier for everyone in the long run.
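Newer versions of git also offer a more cautious variant, git push --force-with-lease, which is not part of the workflow described here. It refuses to overwrite the remote branch if that branch has moved since you last fetched it. Here is a sketch of that behavior, assuming a throwaway bare repository as origin and two made-up clones (ours and teammate) sharing a branch named topic:

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Our clone publishes master and a topic branch.
git clone -q origin.git ours 2>/dev/null
cd ours
git symbolic-ref HEAD refs/heads/master
git config user.name "Us"
git config user.email "us@example.com"
echo "base" > README
git add README
git commit -q -m "Base"
git push -q origin master
git checkout -q -b topic
echo "ours" > ours.txt
git add ours.txt
git commit -q -m "Our work"
git push -q --set-upstream origin topic

# A teammate adds a commit to the same topic branch on origin.
cd "$work"
git clone -q origin.git teammate
cd teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
git checkout -q topic
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "Teammate's work"
git push -q origin topic

# We rewrite our history without fetching first, then try to force push.
cd "$work/ours"
git commit -q --amend -m "Our work, reworded"
if git push -q --force-with-lease origin topic 2>/dev/null; then
  result="pushed"
else
  result="rejected"
fi
echo "force-with-lease: $result"
```

A plain git push -f at that last step would have silently discarded the teammate's commit.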

Finally, merging into master

Now that our topic branch is up to date with master and it's been pushed we're ready to merge it into master:

# get back on master
git checkout master

# merge our topic branch into master
git merge 1234_sales_tax_calculations

# push the latest changes to master
git push

95% of the time this works flawlessly. However, the other 5% of the time someone may have pushed to master while you were rebasing master into the topic branch. If the git push fails after you merge into master DO NOT force push over master. Instead, pull and rebase any changes that may have been pushed:

git pull --rebase

Using pull --rebase is consistent with our view on treating master as the authoritative branch and the origin remote as the authoritative repository. We want to keep the history of master on the origin remote intact, so we use pull --rebase to rebase any new changes that someone else may have pushed to master into our local master.

Part of our reasoning for doing this is that there are likely more people that have already fetched the changes on origin master than who have the changes from our topic branch (likely just 1 person: us). It's easier for us to pull --rebase and resolve any conflicts on our end than it is to change master and cause everybody else pain and suffering along the way.

If there are conflicts with any new changes the process for resolving them is the same as before.
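Here is a sketch of that rejected-push scenario from start to finish. It assumes a throwaway bare repository as origin and two made-up clones (ours and teammate). To keep it short, our commit is made directly on master, standing in for the merged topic branch.

```shell
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Our clone seeds master on the stand-in origin.
git clone -q origin.git ours 2>/dev/null
cd ours
git symbolic-ref HEAD refs/heads/master
git config user.name "Us"
git config user.email "us@example.com"
echo "base" > README
git add README
git commit -q -m "Base"
git push -q --set-upstream origin master

# A teammate clones, commits, and pushes to master before we do.
cd "$work"
git clone -q origin.git teammate
cd teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "teammate" > teammate.txt
git add teammate.txt
git commit -q -m "Teammate's change"
git push -q origin master

# Meanwhile we commit locally; a plain push is now rejected.
cd "$work/ours"
echo "ours" > ours.txt
git add ours.txt
git commit -q -m "Our change"
git push -q origin master 2>/dev/null || echo "push rejected"

# Replay our commit on top of the teammate's, then push again.
git pull -q --rebase origin master
git push -q origin master

history=$(git log --format=%s)
echo "$history"
```

The teammate's commit keeps its place in master's history and ours lands on top of it, with no force push involved.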

Summary

We've just walked through an entire set of commands which are at the core of many git workflows, including ours at Mutually Human. Using them on projects becomes a matter of rinsing and repeating these steps, and often re-arranging the ones in the middle. For example, you may rebase changes from master into your topic branch more than once.

In the context of git's overall capabilities, we visited only a small subset of what git has to offer. I tried to pick the most common commands used in our workflow and expand a little bit on each one of them. Not only are there several commands not covered, but there are many variations on the above commands that I failed to mention.

There are plenty of git resources out there, and I didn't want this post to be the place to expand on the details of each command. Instead, I wanted to use this as a way to see how certain commands are used in a workflow. Rather than break all of these commands up into their own posts, I wanted to keep them together to show how they are connected.

In the days ahead, I look forward to building on this post, exploring more git commands, and expanding the discussion of our workflow. Until then, I hope this post connected some dots which may have been missing before, or reinforced some assumptions you may have had. If you have any questions or comments feel free to hit us up on twitter: @mutuallyhuman / @zachdennis.