Git/Introduction

Here, we will introduce the simplest git commands: creating a new repository, adding and committing files, removing files, reverting bad commits, throwing away uncommitted changes, and viewing a file as of a certain commit.

Creating a git repository

Creating a new git repository is simple. There are two commands that cover this functionality: git-init(1), and git-clone(1). Cloning a pre-existing repository will be covered later. For now, let's create a new repository in a new directory:

$ git init myrepo

Initialized empty Git repository in:

/home/username/myrepo/.git/ on Linux.
C:/Users/username/myrepo/.git/ on Windows.

If you already have a directory you want to turn into a git repository:

$ cd $my_preexisting_repo
$ git init

Taking the first example, let's look at what happened:

$ cd myrepo
$ ls -A
.git

The totality of your repository will be contained within the .git directory. By contrast, some SCMs leave files all over your working directory (e.g., .svn, .cvs, .acodeic, etc.). Git refrains, and puts everything in a single subdirectory of the repository root, aptly named .git.
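As a quick check of the point above, here is a minimal sketch that creates a throwaway repository and looks at what git init produced. The scratch directory and the name myrepo are only illustrative.

```shell
set -eu

# Work in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch"

# Create the repository; -q suppresses the "Initialized empty..." message.
git init -q myrepo
cd myrepo

# The only entry in the new working directory is .git itself.
entries=$(ls -A)
echo "$entries"

# Git can also report where the repository data lives,
# relative to the current directory.
gitdir=$(git rev-parse --git-dir)
echo "$gitdir"
```

Deleting the .git directory is all it takes to turn the directory back into an ordinary one, which is worth remembering before removing it by accident.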

Remark: on Windows, to set the directory in which Git opens each time, right-click its shortcut, open Properties, and change the path in the "Start in" field.

Checking Your Status

To check the status of your repo, use the git-status(1) command. For example, a newly-created repo with no commits in it as yet should show this:

$ git status
On branch master

Initial commit

nothing to commit (create/copy files and use "git add" to track)

Get into the habit of frequent use of git-status, to be sure that you’re doing what you think you’re doing. :)

Adding and committing files

Unlike most other VCSs, git doesn't assume you want to commit every modified file. Instead, the user adds the files they wish to commit to the staging area (also known as the index or cache, depending on which part of the documentation you read). Whatever is in the staging area is what gets committed. You can check what will be committed with git-status(1) or git diff --staged.

To stage files for the next commit, use the command git-add(1).

$ nano file.txt
hack hack hack...
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	file.txt
nothing added to commit but untracked files present (use "git add" to track)

This shows us that we're using the branch called "master" and that there is a file which git is not tracking (does not already have a commit history). Git helpfully notes that the file can be included in our next commit by doing git add file.txt:

$ git add file.txt
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   file.txt
#

After adding the file, it is shown as ready to be committed. Let's do that now:

$ git commit -m 'My first commit'
[master (root-commit) be8bf6d] My first commit
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 file.txt

In most cases, you will not want to use the -m 'Commit message' form - instead, leave it off to have $EDITOR opened so you can write a proper commit message. We will describe that next, but in examples, the -m 'Commit message' form will be used so the reader can easily see what is going on.

You can use git add -A to automatically stage all changed and untracked files for inclusion in the next commit. Once a file is being tracked, git add -u will stage it if it has changed.
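The difference between the two options is easiest to see side by side. The following is a small sketch in a scratch repository; the file names and the user identity are placeholders chosen for the demonstration.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'first' > tracked.txt
git add tracked.txt
git commit -q -m 'Add tracked.txt'

# Change the tracked file and create a brand-new, untracked one.
echo 'second' >> tracked.txt
echo 'new' > untracked.txt

# -u stages changes to files git already tracks, and nothing else.
git add -u
staged_after_u=$(git diff --staged --name-only)
echo "after add -u:"
echo "$staged_after_u"

# -A additionally picks up the untracked file.
git add -A
staged_after_A=$(git diff --staged --name-only)
echo "after add -A:"
echo "$staged_after_A"
```

After git add -u only tracked.txt is staged; after git add -A both files are.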

reset

If you change your mind about staging a file, and you haven’t committed yet, you can unstage it with the simplest form of the git-reset(1) command:

$ git reset file.txt

to unstage just the one file, or

$ git reset

to remove everything in the staging area.

git-reset has many more functions than this, for example:

To cancel the two latest commits without touching the files: git reset 'HEAD~2'.
To cancel the two latest commits and also discard their modifications to the files: git reset HEAD~2 --hard.
To cancel the two last operations on the branch: git reset 'HEAD@{2}' (which uses git reflog). This can be used to cancel an undesired reset.
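To make these forms concrete, here is a sketch that cancels two commits and then uses the reflog to undo that reset. It assumes a scratch repository with three small commits; all names are illustrative. Since only one operation (the reset) needs undoing here, it uses HEAD@{1}, the position of HEAD just before the most recent operation.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

# Three commits, each appending one line to file.txt.
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done
tip=$(git rev-parse HEAD)

# Cancel the two latest commits; the file itself keeps all three lines.
git reset -q 'HEAD~2'
lines_after_reset=$(grep -c '' file.txt)
subject_after_reset=$(git log -1 --format=%s)
echo "branch now ends at: $subject_after_reset"
echo "file.txt still has $lines_after_reset lines"

# Undo the reset: HEAD@{1} is where HEAD pointed just before it.
git reset -q 'HEAD@{1}'
restored=$(git rev-parse HEAD)
```

Running git reflog at any point lists the recorded positions, which is the safest way to pick the right HEAD@{n}.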

To exclude certain untracked files from being seen by git add -A, read on...

restore

git restore returns the file given as a parameter to an earlier version, by default the one in the staging area^[1].
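A short sketch of the two most common uses follows. It assumes Git 2.23 or later (the release that introduced git restore) and a scratch repository; the file name and identity are placeholders.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'version 1' > file.txt
git add file.txt
git commit -q -m 'Version 1'
echo 'version 2' > file.txt
git commit -q -a -m 'Version 2'

# An unwanted local edit...
echo 'oops' > file.txt
# ...is discarded by restoring the file from the staging area.
git restore file.txt
after_plain=$(cat file.txt)
echo "$after_plain"

# --source takes the file from another commit instead.
git restore --source=HEAD~1 file.txt
after_source=$(cat file.txt)
echo "$after_source"
```

Note that the second form only changes the working copy; the older content then shows up as an uncommitted modification.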

Excluding files from Git

Often there are files in your workspace that you don't want to add to the repository. For example, emacs will write a backup copy of any file you edit with a tilde suffix, like filename~. Even though you can manually avoid adding them to the commit (which means never using git add -A), they clutter up the status list.

In order to tell Git to ignore certain files, you can create an ignore file, each line of which represents a specification (with wildcards) of the files to be ignored. Blank lines are skipped, and comments can be added by starting a line with a # character.

For example:

# Ignore emacs backup files:
*~

# Ignore everything in the cache directory:
app/cache

Git looks for an ignore file under two names:

.git/info/exclude — this is specific to your own personal copy of the repository, not a public part of the repository.
.gitignore — since this is outside the .git directory, it will normally be tracked by Git just like any other file in the repository.

What you put in either (or both) of these files depends on your needs. .gitignore is a good place to mention things that everybody working on copies of this repository is likely to want to be ignored, like build products. If you are doing your own personal experiments that are not likely to concern other code contributors, then you can put the relevant ignore lines into .git/info/exclude.

Note that ignore file entries apply only to untracked files: they affect what git status reports and what git add -A picks up. If you explicitly name an ignored file with git add filename, Git refuses to add it unless you pass -f (--force). Once a file has been added to the repository, ignore entries no longer affect it, and changes to it will henceforth be staged by git add -u.
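Here is a sketch showing both ignore files at work in a scratch repository. The file names main.c and notes.txt are made up for the example.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .

# A shared rule goes in .gitignore; a private one in .git/info/exclude.
echo '*~' > .gitignore
mkdir -p .git/info
echo 'notes.txt' >> .git/info/exclude

touch main.c main.c~ notes.txt

# Only files that no rule matches are listed as untracked ("??").
untracked=$(git status --porcelain)
echo "$untracked"

# check-ignore -v reports which file and line caused a path to be ignored.
rule=$(git check-ignore -v main.c~)
echo "$rule"
```

git check-ignore is handy when a file unexpectedly fails to show up in git status and you need to find the responsible rule.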

Good commit messages

Tim Pope writes about what makes a model Git commit message:

Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary.  Wrap it to about 72
characters or so.  In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body.  The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Write your commit message in the present tense: "Fix bug" and not "Fixed
bug."  This convention matches up with commit messages generated by
commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, preceded by a
  single space, with blank lines in between, but conventions vary here

- Use a hanging indent

Let’s start with a few of the reasons why wrapping your commit messages to 72 columns is a good thing.

Git log doesn’t do any special wrapping of the commit messages. With the default pager of less -S, this means your paragraphs flow far off the edge of the screen, making them difficult to read. On an 80 column terminal, if we subtract 4 columns for the indent on the left and 4 more for symmetry on the right, we’re left with 72 columns.
git format-patch --stdout converts a series of commits to a series of emails, using the messages for the message body. Good email netiquette dictates we wrap our plain text emails such that there’s room for a few levels of nested reply indicators without overflow in an 80 column terminal.

Vim users can meet this requirement by installing my vim-git runtime files, or by simply setting the following option in your git commit message file:

:set textwidth=72

For Textmate, you can adjust the “Wrap Column” option under the view menu, then use ^Q to rewrap paragraphs (be sure there’s a blank line afterwards to avoid mixing in the comments). Here’s a shell command to add 72 to the menu so you don’t have to drag to select each time:

$ defaults write com.macromates.textmate OakWrapColumns '( 40, 72, 78 )'

More important than the mechanics of formatting the body is the practice of having a subject line. As the example indicates, you should shoot for about 50 characters (though this isn’t a hard maximum) and always, always follow it with a blank line. This first line should be a concise summary of the changes introduced by the commit; if there are any technical details that cannot be expressed in these strict size constraints, put them in the body instead. The subject line is used all over Git, oftentimes in truncated form if too long of a message was used. The following are just a handful of examples of where it ends up:

git log --pretty=oneline shows a terse history mapping containing the commit id and the summary
git rebase --interactive provides the summary for each commit in the editor it invokes
If the config option merge.summary is set, the summaries from all merged commits will make their way into the merge commit message
git shortlog uses summary lines in the changelog-like output it produces
git-format-patch(1), git-send-email(1), and related tools use it as the subject for emails
git-reflog(1), a local history intended to help you recover from mistakes, gets a copy of the summary
gitk, a graphical interface which has a column for the summary
Gitweb and other web interfaces like GitHub use the summary in various places in their user interface.

The subject/body distinction may seem unimportant but it’s one of many subtle factors that makes Git history so much more pleasant to work with than Subversion.
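To see the subject/body split in practice, here is a sketch that commits with a two-part message and then asks Git for each part separately. The message text, file name, and identity are invented for the example.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'hello' > file.txt
git add file.txt

# -F - reads the message from standard input: subject, blank line, body.
git commit -q -F - <<'EOF'
Add greeting file

Start the project with a one-line greeting so that later commits
have something to modify.
EOF

# %s is the subject (first line); %b is the body after the blank line.
subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
echo "subject: $subject"

# Tools such as shortlog show only the subject.
git shortlog HEAD
```

Without the blank line, Git would fold the following lines into the subject, and every tool in the list above would display an overlong summary.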

Removing files

Let's continue with some more commits, to show you how to remove files:

$ echo 'more stuff' >> file.txt
$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

Although git doesn't force the user to commit all modified files, this is a common scenario. As noted in the last line of git status, use git commit -a to commit all modified files without adding them first:

$ git commit -a -m 'My second commit'
[master e633787] My second commit
 1 files changed, 1 insertions(+), 0 deletions(-)

See the string of seemingly random characters in git's output after committing (e633787 in the above example)? This is the abbreviation of the identifier git uses to track objects (in this case, a commit object). Each object is hashed using SHA-1, and is referred to by that hash. In this case, the full string is e6337879cbb42a2ddfc1a1602ee785b4bfbde518, but you usually only need the first 7 characters or so to uniquely identify the object, so that's all git shows. We'll need to use these identifiers later to refer to specific commits.

To remove files, use the "rm" subcommand of git:

$ git rm file.txt
rm 'file.txt'
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	deleted:    file.txt
#
$ ls -a
.  ..  .git

Note that this deletes the file from your disk. If you only want to remove the file from the git repository but want to leave the file in your working directory, use git rm --cached.

$ git commit -m 'Removed file.txt'
[master b7deafa] Removed file.txt
 1 files changed, 0 insertions(+), 2 deletions(-)
 delete mode 100644 file.txt
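For the git rm --cached variant mentioned above, here is a sketch in a scratch repository. The file local.cfg stands in for any file that was committed by mistake; it is not part of the running example.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'secret' > local.cfg
git add local.cfg
git commit -q -m 'Add local.cfg by mistake'

# Stop tracking the file but keep it on disk.
git rm -q --cached local.cfg
git commit -q -m 'Stop tracking local.cfg'

# Nothing is tracked any more, yet the file is still there, now untracked.
tracked=$(git ls-files)
state=$(git status --porcelain)
echo "$state"
```

The file remains in earlier commits, so this is not a way to erase something from history; it only stops future commits from including it.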

Reverting a commit

To revert a commit with another, use git revert:

$ git revert HEAD
Finished one revert.
[master 47e3b6c] Revert "My second commit"
 1 files changed, 0 insertions(+), 1 deletions(-)
$ ls -a
.  ..  file.txt  .git

You can specify any commit instead of HEAD. For example:

The commit before HEAD: git revert HEAD^
The commit five back: git revert HEAD~5
The commit identified by a given hash: git revert e6337879
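The key property of git revert is that history only grows. A sketch, using a scratch repository with invented commit messages:

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'good' > file.txt
git add file.txt
git commit -q -m 'Good change'
echo 'bad' >> file.txt
git commit -q -a -m 'Bad change'

# --no-edit accepts the generated message instead of opening $EDITOR.
git revert --no-edit HEAD > /dev/null

# Both earlier commits are still there, plus a new one that undoes the last.
count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
content=$(cat file.txt)
echo "$count commits, latest: $subject"
```

Because nothing is rewritten, reverting is safe even for commits that other people have already fetched.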

Resetting a commit

git reset accepts the same ways of naming a commit as git revert, but instead of creating a revert commit, it moves the branch back to the named commit, cancelling the later commit(s) and leaving their changes uncommitted in the working directory.

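To cancel a reset, use git reset 'HEAD@{1}', which moves the branch back to where it was before the last operation; check git reflog first if other operations have happened since.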

Throwing away local, uncommitted changes

To throw away your changes and get back to the most recently-committed state:

$ git reset --hard HEAD

As above, you can specify any other commit:

$ git reset --hard e6337879

If you only want to reset one file (where you have made some stupid mistake since the last commit), you can use

$ git checkout filename

This will delete all changes made to that file since the last commit, but leave the other files untouched.
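A sketch of the single-file case, with two illustrative files so that the effect on the other one is visible:

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'kept' > a.txt
echo 'kept' > b.txt
git add a.txt b.txt
git commit -q -m 'Add a.txt and b.txt'

echo 'mistake' >> a.txt
echo 'wanted' >> b.txt

# The -- separates file names from branch names, avoiding ambiguity
# if a file happens to share its name with a branch.
git checkout -- a.txt

# Only b.txt is still reported as modified.
changed=$(git diff --name-only)
a_content=$(cat a.txt)
echo "still modified: $changed"
```

The discarded edit is gone for good, since it was never committed, so it is worth running git diff on the file before throwing it away.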

Get a specific version of a file

To get a specific version of a file that was committed, you'll need the hash for that commit. You can find it with git-log(1):

$ git log
commit 47e3b6cb6427f8ce0818f5d3a4b2e762b72dbd89
Author: Mike.lifeguard <myemail@example.com>
Date:   Sat Mar 6 22:24:00 2010 -0400

    Revert "My second commit"
    
    This reverts commit e6337879cbb42a2ddfc1a1602ee785b4bfbde518.

commit e6337879cbb42a2ddfc1a1602ee785b4bfbde518
Author: Mike.lifeguard <myemail@example.com>
Date:   Sat Mar 6 22:17:20 2010 -0400

    My second commit

commit be8bf6da4db2ea32c10c74c7d6f366be114d18f0
Author: Mike.lifeguard <myemail@example.com>
Date:   Sat Mar 6 22:11:57 2010 -0400

    My first commit

Then, you can use git show:

$ git show e6337879cbb42a2ddfc1a1602ee785b4bfbde518:file.txt
hack hack hack...
more stuff
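The same syntax works with relative names such as HEAD~1, and the output can be redirected to keep a copy of the old version. A sketch in a scratch repository; the name file.old.txt is arbitrary.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

echo 'hack hack hack...' > file.txt
git add file.txt
git commit -q -m 'My first commit'
echo 'more stuff' >> file.txt
git commit -q -a -m 'My second commit'

# <commit>:<path> names the file as it was in that commit.
old=$(git show HEAD~1:file.txt)
echo "$old"

# Save the old version next to the current one without touching file.txt.
git show HEAD~1:file.txt > file.old.txt
current_lines=$(grep -c '' file.txt)
```

Unlike checking out the old commit, this leaves both the working copy and the branch exactly as they were.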

Git Checkout Is Not Subversion Checkout

If you are coming to Git after having used the Subversion centralized version-control system, you may assume that the checkout operation in Git is similar to that in Subversion. It is not. While both Git and Subversion let you check out older versions of the source tree from the repository, only Subversion keeps track of which revision you have checked out. Git does not. git-status(1) will only show you that the source tree does not correspond to the current branch HEAD; it will not check whether it corresponds to some prior commit in the history.

`diff` and `patch`: The Currency of Open-Source Collaboration

It is important to understand early on the use of the diff(1) and patch(1) utilities. diff is a tool for showing line-by-line differences between two text files. In particular, a unified diff shows added/deleted/changed lines next to each other, surrounded by context lines which are the same in both versions. Assume that the contents of file1.txt are this:

this is the first line.
this is the same line.
this is the last line.

while file2.txt contains this:

this is the first line.
this line has changed.
this is the last line.

Then a unified diff looks like this:

$ diff -u file1.txt file2.txt
--- file1.txt   2014-04-18 11:56:35.307111991 +1200
+++ file2.txt   2014-04-18 11:56:51.611010079 +1200
@@ -1,3 +1,3 @@
 this is the first line.
-this is the same line.
+this line has changed.
 this is the last line.
$

Notice the extra column at the start of each line, containing a “-” for each line that is in the first file but not in the second, a “+” for each line that is in the second file but not the first, or a space for an unchanged line. There are extra lines, in a special format, identifying the files being compared and the numbers of the lines where the differences were found; all this can be understood by the patch utility, in order to change a copy of file1.txt to become exactly like file2.txt:

$ diff -u file1.txt file2.txt >patch.txt
$ patch <patch.txt
patching file file1.txt
$ diff -u file1.txt file2.txt
$

Notice how the second diff command no longer produces any output: the files are now identical!

This is how collaborative software development got started: instead of exchanging entire source files, people would distribute just their changes, as a unified diff or in patch format (same thing). Others could simply apply the patches to their copies. And provided there were no overlaps in the changes, you could even apply a patch to a file that had already been patched with another diff from someone else! And so this way changes from multiple sources could be merged into a common version with all the new features and bug fixes contributed by the community.

Even today, with version-control systems in regular use, such diffs/patches are still the basis for distributing changes. git-diff(1) and git-format-patch(1) both produce output which is compatible with diff -u, and can be correspondingly understood by patch. So even if the recipient of your patches isn’t using Git, they can still accept your patches. Or you might receive a patch from someone who isn’t using Git, and so didn't use git-format-patch, so you can’t feed it to git-am(1) to automatically apply it and save the commit; but that’s OK, you can use git-apply(1), or even patch itself on your source tree, and then make a commit on their behalf.
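The last scenario can be sketched as follows: a contributor without Git produces a patch with plain diff, and a maintainer applies it with git apply and commits it on their behalf. The directory names orig and new, the contributor "Pat Contributor", and the commit messages are all invented for the example.

```shell
set -eu
scratch=$(mktemp -d)
cd "$scratch"

# The contributor's side: two plain directories, no Git involved.
mkdir orig new
printf 'this is the first line.\nthis is the same line.\nthis is the last line.\n' > orig/file1.txt
printf 'this is the first line.\nthis line has changed.\nthis is the last line.\n' > new/file1.txt
# diff exits with status 1 when the files differ, hence the "|| true".
diff -u orig/file1.txt new/file1.txt > change.patch || true

# The maintainer's side: a repository holding the original file.
git init -q repo
cp orig/file1.txt repo/
cd repo
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
git add file1.txt
git commit -q -m 'Add file1.txt'

# Apply the plain patch; by default git apply strips one leading path
# component, which removes the "orig/" and "new/" prefixes.
git apply ../change.patch

# Commit on the contributor's behalf.
git commit -q -a --author='Pat Contributor <pat@example.com>' -m 'Change the second line'
author=$(git log -1 --format=%an)
echo "author recorded as: $author"
```

The commit records Pat as the author and the maintainer as the committer, which keeps credit where it belongs.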

Conclusion

You now know how to create a new repository, or turn your source tree into a git repository. You can add and commit files, and you can revert bad commits. You can remove files, view the state of a file in a certain commit, and you can throw away your uncommitted changes.

Next, we will look at the history of a git project on the command line and with some GUI tools, and learn about git's powerful branching model.

[1] https://git-scm.com/docs/git-restore