Using git for Pintos
You are highly encouraged to use git for the Pintos projects. Git is an open-source industry standard revision control system that makes collaborating on software much easier.
Git principles
A git repository is a directory storing one or more branches of a project. A branch consists of a series of commits, each of which is a snapshot of your source tree accompanied by a commit message and some other metadata including the date of the commit and, most importantly, a list of parent commits. There can be multiple parent commits, because you can merge two branches of a repository. Since each commit specifies its parent or parents, a branch simply needs to identify its most recent commit, as doing so recursively identifies all prior commits in the history.
One of the great things about git is that it is a peer-to-peer system. Git makes it easy to push and fetch branches between repositories. Conceptually, a branch contains a series of copies of versions your source tree, one per commit. However, through the magic of delta compression and naming objects by collision-resistant message digests1, in reality a branch consumes much less space than the sum of all its source trees. Similarly, pushing and fetching branches to and from remote repositories consumes much less bandwidth than would naïvely copying source trees back and forth.
There are two kinds of git repositories. A bare repository just contains git branches and their commits. You can use a bare repository as a place for multiple users to push and fetch branches, but you can’t directly work on a project in a bare repository, because the contents of source trees in commits are stored compressed in git’s internal format. Conversely, you can have a non-bare repository that contains a working tree. A working tree is the source tree of a particular commit that has been checked out into the file system, where you can edit it and create a new commit whose parent is the previously checked out commit.
Git takes a toolkit-based approach to revision control. The
git
command has a bunch of different subcommands for doing
various things. There are low-level commands, known as
plumbing, that directly manipulate tree objects, commits, and
branches. Then there are higher-level, more user-friendly commands known
as porcelain, implemented in terms of the low-level plumbing.
This is a very empowering architecture because, even though most users
employ porcelain commands most of the time, if the porcelain doesn’t do
what you need, you have direct access to the plumbing right from the
command line. However, this approach does mean that git can have an
intimidating number of different subcommands and options, and that
generally there are multiple ways to accomplish a particular task.
This guide will give you one suggested way of approaching the Pintos projects, but it is not the only way to do things. We encourage you to understand how these commands work. Learning git now will pay dividends for the rest of your career.
Git workflows
Because git is peer-to-peer, it enables many different workflows. In one workflow, you use a (usually bare) repository as a centralized, authoritative copy of a project. Individual users clone this repository, develop features in their own local copies, and then push an updated branch to the central repository when they are ready to share changes with other developers.
In another workflow, each user has his or her own repository, and users pull changes from one another’s repositories, but no single repository is the authoritative version.
There are hybrid workflows. For instance, it is common to have an authoritative repository to which only a project maintainer can write, but to have many contributors. In that case, contributors send “pull requests” to the maintainer. In industry, you will often find workflows where a code review is required before merging someone’s pull request. In particular, you may have the authority to review and merge your co-worker’s pull requests, and they may be allowed to review and merge yours, but no one is allowed to merge their own changes into the central repository.
For the Pintos projects, because you only have three developers, we recommend that you use a centralized repository that everyone can push to, and that you don’t bother with pull requests. This will be the simplest, and as long as no one does a force-push (which deletes history), you can always go back and look at history to fix any big problems.
Creating a shared git repository on AFS
Before creating a git repository, you need to create a shared
directory that your team can access. For this, you will want to use the
fs sa
and fs la
commands that let you set and
examine the access control lists on AFS directories. Make sure you are
in AFS, i.e., in your home directory on myth, and in not your NFS home
directory on rice. On rice, you can find your AFS directory under
/afs/ir.stanford.edu/users/a/b/abc
if your sunet ID is
abc
.
Suppose your teammates’ sunet IDs are aaa
and
bbb
. You can set the access control list as follows:
$ cd
$ mkdir pintos.git
$ fs sa -dir . -acl aaa l -acl bbb l
$ fs sa -dir pintos.git -acl aaa rlidwka -acl bbb rlidwka
Line 1 moves to your home directory. Line 2 creates a
pintos.git
directory. Line 3 allows your teammates to look
up (but not read or write) files in your home directory, so they can get
to pintos.git
. Line 4 gives your teammates complete
permissions on the pintos.git
subdirectory. At the end of
the quarter, if you want to remove permissions, you could say the
following in your home directory:
$ fs sa -dir . -acl aaa none -acl bbb none
$ fs la
[verify acl is satisfactory]
Now you can create a bare git repository in pintos.git
as follows:
$ cd pintos.git
$ git --bare init -b master
Initialized empty Git repository in /afs/ir.stanford.edu/users/a/b/abc/pintos.git/
$ cd ..
If you get an error about the -b
flag being unsupported,
then you have an older version of git. No big deal, just run
git --bare init
. (Newer versions of git have a
configuration-dependent name for the default branch, so supplying
-b master
will avoid unexpected behavior and suppress
warning messages.)
You now have a git repository that you and your teammates can clone with the following command:
The pintos
directory will now be a non-bare repository
configured with a remote called origin
that corresponds to
the bare pintos.git
repository you set up for your team.
However, the working tree in the pintos
directory is not
going to be very useful yet because the repository you cloned doesn’t
have anything in it!
Copying the Pintos source tree into your git repository
Since git is so good at fetching and pushing branches back and forth, we are making the Pintos source code available in a git repository. You don’t have to fetch the code this way (you can download the code as an archive file as well). But if for any reason we need to make bug fixes to Pintos in the middle of the quarter, it will be much easier to merge in our changes if you have fetched the starter code with git.
Git allows you to store the locations of other git repositories,
called remotes, in the configuration of your git repository. As
previously mentioned, if you cloned a repository, the new repository has
a remote called origin
corresponding to the source of the
clone. However, you can create a second remote corresponding to
the official CS212 Pintos starter code (let’s call that remote
cs212
) as follows:
$ cd pintos
$ git remote add cs212 http://cs212.scs.stanford.edu/pintos.git
$ git fetch cs212
warning: redirecting to https://web.stanford.edu/class/cs212/pintos.git/
From http://cs212.scs.stanford.edu/pintos
* [new branch] master -> cs212/master
$ git checkout --no-track -B master cs212/master
Reset branch 'master'
Your branch is based on 'origin/master', but the upstream is gone.
(use "git branch --unset-upstream" to fixup)
$ git push -u origin master
Line 10 changes directory into the non-bare repository you created by
cloning pintos.git
as above. Line 11 adds the git
repository through which we are distributing the assignment as a remote
named cs212
. Line 12 copies all the branches in the
cs212
repository into the local repository, under the
namespace cs212
, so that you can refer to the branch called
master
in the remote repository as
cs212/master
in your local repository.2
cs212/master
is known as a remote-tracking branch,
which is just there to supply a cache of the data in a remote
repository, not to contain any local work.
Line 16 is where something useful finally happens. After running this
command, you will see new directory called src
appear that
actually contains the files for the assignments. The git checkout
command copies the contents of a particular git commit (in this case
cs212/master
) into your working tree. Generally, git has a
notion of a currently active branch, named HEAD
3. You can’t point HEAD
to a remote-tracking branch, because that’s just a cache of some remote
repository, not a branch you should be working on locally. The option
-B master
says “overwrite the current meaning of the branch
called master
and set it to the latest commit of the
cs212/master
branch, and also point HEAD
to
master
.” In general, -B
is a slightly
dangerous option so use it carefully. More often, you will want to use
lowercase -b
which will create a branch that doesn’t exist
but refuse to clobber an existing branch. Local branches can have a
notion of a remote branch they are tracking. By default, if you create a
branch from cs212/master
, git will think you will
ultimately want to synchronize your work to the cs212
remote, but you can’t do that since it’s the main class repository. The
--no-track
option says don’t adjust master
to
track cs212/master
.
Finally, after line 16 makes master
identical to
cs212/master
, line 20 copies the contents of the
master
branch into the origin
repository,
which is the one you are sharing with your teammates. This command is
short for git push -u origin master:master
, which says take
the contents of the local branch called master
(to the left
of the colon) and copy it on to the branch master
in the
origin
repository. The -u
flag says configure
origin/master
as the upstream remote-tracking
branch of the local branch master
. (Technically that flag
is redundant here, but you usually want to use it when first pushing to
a new repository.)
At this point, if your teammates clone your shared
pintos.git
repository, they will get the contents of the
project.
Another way to populate pintos.git
Instead of cloning the shared git repository you will be using with
your teammates, another way to get started with git is to clone the
(read-only) class git repository, but then reconfigure it to work with
your shared repository instead of the class repository. To do this,
first create your shared pintos.git
repository with
appropriate AFS permissions as described above, but do not clone it.
Instead, clone the class repository as follows:
$ git clone -o cs212 http://cs212.scs.stanford.edu/pintos.git
Cloning into 'pintos'...
warning: redirecting to https://web.stanford.edu/class/cs212/pintos.git/
$ cd pintos
$ git remote add origin /afs/ir.stanford.edu/users/a/b/abc/pintos.git
$ git push -u origin master
Enumerating objects: 638, done.
Counting objects: 100% (638/638), done.
Delta compression using up to 48 threads
Compressing objects: 100% (548/548), done.
Writing objects: 100% (638/638), 5.32 MiB | 49.48 MiB/s, done.
Total 638 (delta 82), reused 638 (delta 82), pack-reused 0
remote: Resolving deltas: 100% (82/82), done.
To /afs/ir.stanford.edu/users/a/b/abc/pintos.git
* [new branch] master -> master
branch 'master' set up to track 'origin/master'.
Here line 9 creates a new (non-bare) git repository by cloning the
read-only class git repository. The argument -o cs212
says
call the source of the clone cs212
rather than the default
origin
. Line 13 configures a second remote called
origin
and points it to the shared repository you created
on AFS. Note, however, that at this point the master
branch
of your repository is tracking cs212/master
rather than
origin/master
. However, in line 14 you push to origin using
the -u
flag, which tells git to change the “upstream”
branch that master
is tracking to be the branch and origin
you are pushing to, namely origin/master
.
Working with git
Once you have set up a git repository and everyone has cloned it, the following commands will be useful. (We assume most students have seen these, if not there are many introductory git tutorials out there.)
git pull
: A combination ofgit fetch
andgit merge
. Use this to fetch work that has been pushed to the main repository by your teammates. If there’s an update conflict, you may need to edit files manually, then rungit commit -a
to create a merge commit.git add file.c
: Use this to addfile.c
to the repository (or inform git that you have modified the file).git commmit -a
: Create a new git commit (will open an editor to ask you for a log message). The-a
flag says to include all modified files. Without-a
, only changed files on which you have rungit add
will be reflected in the commit, and only the version of the files as of when you rangit add
.git push
: copy your latest commit into the remote repository. With no options, defaults to pushing the current branch (named byHEAD
) to the branch’s upstream remote. This will only succeed if your local branch contains all commits in the branch of the remote repository, as otherwise you would lose work. If that’s not the case, rungit pull
to merge remote changes, then trygit push
again.
Resolving merge conflicts
If you and your teammate edit the same file, you will create a merge
conflict. Things will work fine for the first person who pushes the
changes to origin
. The next person cannot
git push
because their local repository does not contain
all the commits on origin
’s master
branch.
Therefore, that person will first need to do a git pull
,
and may see something like this:
$ git pull
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 4 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (4/4), 332 bytes | 332.00 KiB/s, done.
From /tmp/212/pintos
79574e9..e291cbe master -> origin/master
Auto-merging src/Makefile
CONFLICT (content): Merge conflict in src/Makefile
Automatic merge failed; fix conflicts and then commit the result.
You can see that there’s a conflict in the file
src/Makefile
. You can also verify this by running
git status
and seeing that src/Makefile
is an
unmerged path.
To fix the problem, you will have to edit every file that has a merge conflict. When you open the file to edit it, you will see conflicts marked as follows:
<<<<<<< HEAD
# This is going to create a conflict
=======
# Added this in directory 2
>>>>>>> e291cbe622aed15b1c9cc568c4fe8a8a6e3d684f
What this means is that in your local branch (HEAD
) you
had the first comment
(# This is going to create a conflict
). In the commit that
you are trying to merge from origin/master
(which in this
case has message digest
e291cbe622aed15b1c9cc568c4fe8a8a6e3d684f
) you had a
different comment (# Added this in directory 2
). Since each
change was made in a version of the file that did not reflect changes
from the other, git does not know how to resolve the problem
automatically.
You must manually decide how to resolve each such conflict, for
instance by keeping one comment or the other or both or doing something
else. You must also manually delete the
<<<<<<<
, =======
and
>>>>>>>
markers after fixing the
conflict. Search through the whole file for any occurrence of
=======
(seven equals signs) to find what you need to fix.
When you are done, run git add
on the file. When you have
fixed all conflicts, run git commit
to create a merge
commit with two parent commits (namely the previous commit at
HEAD
, and the most recent commit on
origin/master
, e291cbe622aed15b1c9cc568c4fe8a8a6e3d684f in
this case).
Don’t forget to run git push
after you have created the
merge commit. This time the push will succeed, because the merge commit
is now a descendant of the master
branch in the
origin
repository.
For more information, see the Branching and Merging section of the git book.
Changing AFS permissions on a git repository
A git repository, including bare repositories, contains lots of
subdirectories. If your team changes membership, you will need to edit
the permissions on every single subdirectory of your git repository.
Unfortunately, the fs sa
command doesn’t have any kind of
recursive option. Hence, you will need to use a combination of the
find
and xargs
utilities to do this. To add
the user xxx
to your repository and remove the user
yyy
, you can use the following command:
find pintos.git -type d -exec fs sa -dir {} -acl xxx rlidwka -acl yyy none \;
Don’t forget to add and remove the l
permission on your
home directory as well:
fs sa -dir ~ -acl xxx l -acl yyy none \;