Using Git for HPC

This guide covers using traditional Git on HPCMP systems, in particular, hosting a repository in a personal or group project directory. The HPCMP now offers a Gitlab service which provides both web-based code project management and command-line access to repositories. Gitlab provides several features and advantages over plain Git. For more information on this option, see the Gitlab Service User Guide.

If you cannot or do not wish to use Gitlab, the following guide shows you how to set up Git repositories in your own directories. For example, if you are unable to access the web interface for Gitlab or are working in a classified environment where Gitlab is not available, then the following would allow you to establish Git repositories that can be shared across systems and groups of users.

1. Using Git for the First Time

Do the following on every system on which you plan to use Git, substituting your full name and email address, respectively. This identifies you as the author of the changes you commit to any repository.

$ git config --global user.name "Firstname Lastname" 
$ git config --global user.email "myemail@myhost.mil"

There are other recommended options to set on each system.

Editor (used mostly for editing commit messages); pick one:

$ git config --global core.editor emacs
$ git config --global core.editor vim
$ git config --global core.editor gvim    # Uses a separate window; not always supported, use with care

Merge tool (used to resolve merge conflicts); pick one:

$ git config --global merge.tool vimdiff
$ git config --global merge.tool gvimdiff    # Uses a separate window; not always supported, use with care
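
To confirm that the settings took effect, you can read them back with "git config". The following is a minimal sketch rather than a required step: it points HOME at a temporary directory (an assumption made here so that the demonstration does not alter your real ~/.gitconfig), and the name and email address are the same placeholders used above.

```shell
#!/bin/sh
set -eu

# Use a throwaway HOME so this demonstration leaves your real ~/.gitconfig alone.
demo_home=$(mktemp -d)
export HOME="$demo_home"
unset XDG_CONFIG_HOME

git config --global user.name "Firstname Lastname"
git config --global user.email "myemail@myhost.mil"
git config --global core.editor vim
git config --global merge.tool vimdiff

# Read individual values back.
name=$(git config --global --get user.name)
email=$(git config --global --get user.email)
echo "Commits will be authored by: $name <$email>"

# List everything stored in the global configuration file.
git config --global --list
```

On a real system you would skip the temporary HOME and simply run the two "--get" commands or the "--list" command.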

2. Setting up a Repository

First, you need to make some decisions:

  1. Do you want to use the repository (aka, "repo") only locally, only remotely (via SSH), or both?
  2. Do you want the repo for yourself only or to share with others?
  3. If you share it, with whom do you want to share it?
    • Are you all in the same UNIX group?
  4. Should the repo be "bare"?
    • This depends, but in many cases, if you're sharing a centralized repo, you might want it to be bare.
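
If you are unsure whether an existing repo is bare or full, Git can tell you. The following sketch assumes two scratch repos created under a temporary directory; the names "full" and "central.git" are placeholders chosen for illustration.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)

# A full repo has a working copy; a bare repo holds only the version data.
git init -q "$scratch/full"
git init -q --bare "$scratch/central.git"

# Ask Git which kind each repo is.
full_is_bare=$(cd "$scratch/full" && git rev-parse --is-bare-repository)
central_is_bare=$(cd "$scratch/central.git" && git rev-parse --is-bare-repository)

echo "full:        bare=$full_is_bare"
echo "central.git: bare=$central_is_bare"
```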

2.1. Setting up a full, local repository for yourself only

Assume that you have source files in "mydir" that you want to track with Git:

$ cd mydir
$ git init
$ git add files_and_directories                       # whatever files/dirs you want added to repo
$ git commit -m "My clean, initial copy of source"    # or whatever message you like

Note that this is a "full" repo, meaning that it also provides a "working copy" of all of the source files. You would do all source work within this directory, and commit all changes here. Here is what your repo should look like:

$ ls -a
./  ../  .git/  README

$ ls .git
branches/  config  description  HEAD  hooks/  index  info/  logs/  objects/  packed-refs  refs/

$ git log
commit 4f858c0f52a636a64784489d459f28dff00dc0eb
Author: Firstname Lastname <myemail@myhost.mil>
Date:   Mon Feb 15 20:59:00 2016 +0000
    My clean, initial copy of source

Note the .git directory - this is where the repo "metadata" is stored. Git knows to look here for operations, such as the "log" command shown.
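
A few read-only commands are handy for checking a freshly created repo. The sketch below rebuilds the example in a temporary directory (an assumption, so that it can be run anywhere) and sets the author identity locally with placeholder values.

```shell
#!/bin/sh
set -eu

# Stand-in for "mydir", created in a temporary location.
mydir=$(mktemp -d)
cd "$mydir"
echo "Example project" > README

git init -q
git config user.name "Firstname Lastname"     # placeholder identity, local to this repo
git config user.email "myemail@myhost.mil"
git add README
git commit -q -m "My clean, initial copy of source"

git_dir=$(git rev-parse --git-dir)            # where the repo metadata lives
commit_count=$(git rev-list --count HEAD)     # number of commits reachable from HEAD
tracked=$(git ls-files)                       # files under version control

echo "metadata directory: $git_dir"
echo "commits so far:     $commit_count"
echo "tracked files:      $tracked"

git status --short     # prints nothing when the working copy is clean
git log --oneline      # one line per commit
```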

2.2. Setting up a bare, remote repository for yourself only

Using a "bare" repo means that it contains only the version tracking information; there is no working copy of the source files in that location. Your working copy would be a separate repo (or repos) maintained elsewhere. The bare repo is often used as a centralized location to synchronize revisions. There are numerous ways to start such a repo, but this example will create the remote repo from another system.

First, make sure you have access to the remote system (tickets, proper SSH for HPCMP systems, etc.), log in and create a bare, empty repo there:

$ ssh carpenter.erdc.hpc.mil         # or wherever you want the remote repo
$ git init --bare mycentralrepo.git  # or whatever you want to name it; note absence of --shared
$ chmod 700 mycentralrepo.git        # Optional: Force user-only permissions
$ ls mycentralrepo.git
branches/  config  description  HEAD  hooks/  info/  objects/  refs/

If you use a special Kerberized SSH that is not the default on your path, you can set $GIT_SSH to use it, for example:

$ export GIT_SSH=/usr/local/krb5/bin/ssh     # For sh/ksh/bash
$ setenv GIT_SSH /usr/local/krb5/bin/ssh     # For csh/tcsh

You should probably set this permanently in your .cshrc, .bashrc, .profile, .login, or similar shell initialization file.
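
For sh, ksh, or bash startup files, one cautious way to do this is to set GIT_SSH only when the Kerberized SSH is actually present, so that the same file works on systems that lack it. This is a sketch; the path is the example path used above and may differ at your site.

```shell
# Fragment for .bashrc or .profile (sh/ksh/bash syntax).
# Assumed location of the Kerberized ssh; adjust for your site.
krb_ssh=/usr/local/krb5/bin/ssh

if [ -x "$krb_ssh" ]; then
    GIT_SSH=$krb_ssh
    export GIT_SSH
fi

echo "Git will use: ${GIT_SSH:-the default ssh on your PATH}"
```

You would normally drop the final "echo" line once you have confirmed that the setting works.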

If you don't have one already, set up your local, personal "working" repo, assuming you have a Linux-like system (use another DSRC system, if you like). This is just like setting up the full, local repo in the previous section.

$ cd mydir
$ git init
$ git add files_and_directories                       # whatever files/dirs you want added to repo
$ git commit -m "My clean, initial copy of source"    # or whatever message you like

Then, tell your local repo about the new remote repo, and push the changes to it:

$ git remote add onyxrepo ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/mycentralrepo.git
$ git push onyxrepo master    # 'master' is the default branch name (yours may differ, e.g., 'main')
Counting objects: 3, done.
Writing objects: 100% (3/3), 223 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/mycentralrepo.git
 * [new branch]      master -> master
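
If you would like to rehearse this workflow before involving SSH, the same steps work with a bare repo on the local file system. The sketch below makes that assumption: the "remote" is a directory under a temporary location rather than a host such as carpenter.erdc.hpc.mil, and the author identity is a placeholder. It finishes by comparing the local and remote commit IDs.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)

# The bare repo that stands in for the remote, restricted to the owner.
git init -q --bare "$scratch/mycentralrepo.git"
chmod 700 "$scratch/mycentralrepo.git"

# The local working repo.
mkdir "$scratch/mydir"
cd "$scratch/mydir"
git init -q
git config user.name "Firstname Lastname"     # placeholder identity
git config user.email "myemail@myhost.mil"
echo "Example project" > README
git add README
git commit -q -m "My clean, initial copy of source"

# Use whatever branch name this Git installation created by default.
branch=$(git symbolic-ref --short HEAD)

git remote add onyxrepo "$scratch/mycentralrepo.git"
git push -q onyxrepo "$branch"

# Confirm that the remote now holds the same commit as the local repo.
local_head=$(git rev-parse HEAD)
remote_head=$(git ls-remote onyxrepo "refs/heads/$branch" | cut -f1)
echo "local:  $local_head"
echo "remote: $remote_head"
```

The "git ls-remote" check works the same way against an ssh:// URL once your SSH access is in place.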

2.3. Setting up a bare, remote repository for a UNIX group

This is just like the previous example, except that you're planning to share your repo with the members of a UNIX group. Many of the instructions and information are repeated for convenience. Using a "bare" repo means that it contains only the version tracking information; there is no working copy of the source files in that location. Your working copy would be a separate repo (or repos) maintained elsewhere. The bare repo is often used as a centralized location to synchronize revisions. There are numerous ways to start such a repo, but this example will create the remote repo from another system.

First, make sure you have access to the remote system (tickets, proper SSH for HPCMP systems, etc.), log in and create a bare, empty repo there:

$ ssh carpenter.erdc.hpc.mil                  # or wherever you want the remote repo
$ newgrp mygroup                              # switch my default group to the group with which I want to share
$ git init --bare --shared mycentralrepo.git  # or whatever you want to name it; note --shared
$ ls -ld mycentralrepo.git
drwxrws--- 7 mylogin mygroup 512 Feb 15 22:03 mycentralrepo.git/
    ^^^ (note the group permissions & setgid bit)
$ ls mycentralrepo.git
branches/  config  description  HEAD  hooks/  info/  objects/  refs/

Back on your local system, if you use a special Kerberized SSH that is not the default on your path, you can set $GIT_SSH to use it, for example:

$ export GIT_SSH=/usr/local/krb5/bin/ssh   # For sh/ksh/bash
$ setenv GIT_SSH /usr/local/krb5/bin/ssh   # For csh/tcsh

You should probably set this permanently in your .cshrc, .bashrc, .profile, .login, or similar shell initialization file.

If you don't have one already, set up your local, personal "working" repo, assuming you have a Linux-like system (use another DSRC system, if you like). This is just like setting up the full, local repo in Section 2.1.

$ cd mydir
$ git init
$ git add files_and_directories                     # whatever files/dirs you want added to repo
$ git commit -m "My clean, initial copy of source"  # or whatever message you like

Then, tell your local repo about the new remote repo, and push the changes to it:

$ git remote add onyxrepo ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/gittst/mycentralrepo.git
$ git push onyxrepo master    # 'master' is the default branch name (yours may differ, e.g., 'main')
Counting objects: 3, done.
Writing objects: 100% (3/3), 223 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/gittst/mycentralrepo.git
 * [new branch]      master -> master
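
To see what "--shared" changes, you can inspect the repo's configuration and directory permissions. This sketch assumes a scratch repo under a temporary directory instead of the real shared location; the exact permission bits displayed depend on your platform and file system.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
git init -q --bare --shared "$scratch/mycentralrepo.git"

# --shared records the sharing mode in the repo's own config file.
shared=$(cd "$scratch/mycentralrepo.git" && git config --get core.sharedRepository)
echo "core.sharedRepository = $shared"

# Show the permissions on the repo and on its object store.
ls -ld "$scratch/mycentralrepo.git" "$scratch/mycentralrepo.git/objects"
```

A repo created without "--shared" has no core.sharedRepository entry, which is a quick way to tell the two cases apart.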

2.4. Setting up a bare, remote repository with ACLs

This is just like the previous example, except that you share the repo with individual users (or groups) through access control lists (ACLs) instead of relying on a single UNIX group. Many of the instructions and information are repeated for convenience. Using a "bare" repo means that it contains only the version tracking information; there is no working copy of the source files in that location. Your working copy would be a separate repo (or repos) maintained elsewhere. The bare repo is often used as a centralized location to synchronize revisions. There are numerous ways to start such a repo, but this example will create the remote repo from another system.

First, make sure you have access to the remote system (tickets, proper SSH for HPCMP systems, etc.). Also, make sure ACLs are supported on the system and file system; for example, ACLs may be supported in home directories but not in work directories. After that, log in and create a bare, empty repo there. All of these steps are shown below:

$ ssh carpenter.erdc.hpc.mil            # or wherever you want the remote repo
$ touch testfile                        # Do this once to check for ACL support
$ setfacl -m "u:myfriend:rwX" testfile  # myfriend must be a real user
$ getfacl testfile
# file: testfile
# owner: mylogin
# group: mygroup
user::rw-
user:myfriend:rw-       #<<< This line means that the ACL worked
...
$ rm testfile           # if you want to clean up

$ mkdir mycentralrepo.git
$ setfacl -m "u:mylogin:rwX" mycentralrepo.git       # Don't forget yourself!
$ setfacl -d -m "u:mylogin:rwX" mycentralrepo.git    # "Default" ACL for self
$ setfacl -m "u:myfriend:rwX" mycentralrepo.git
$ setfacl -d -m "u:myfriend:rwX" mycentralrepo.git   # "Default" ACL
# Repeat the above for as many users (or groups) you wish to add
$ git init --bare mycentralrepo.git                  # or whatever you want to name it; note NO --shared
$ ls -ld mycentralrepo.git
drwxrwx---+ 7 mylogin mygroup 512 Feb 15 22:03 mycentralrepo.git/
          ^ (note the '+' indicating an ACL)
$ ls mycentralrepo.git
branches/  config  description  HEAD  hooks/  info/  objects/  refs/

Back on your local system, if you use a special Kerberized SSH that is not the default on your path, you can set $GIT_SSH to use it, for example:

$ export GIT_SSH=/usr/local/krb5/bin/ssh     # For sh/ksh/bash
$ setenv GIT_SSH /usr/local/krb5/bin/ssh     # For csh/tcsh

You should probably set this permanently in your .cshrc, .bashrc, .profile, .login, or similar shell initialization file.

If you don't have one already, set up your local, personal "working" repo, assuming you have a Linux-like system (use another DSRC system, if you like). This is just like setting up the full, local repo in Section 2.1.

$ cd mydir
$ git init
$ git add files_and_directories                     # whatever files/dirs you want added to repo
$ git commit -m "My clean, initial copy of source"  # or whatever message you like

Then, tell your local repo about the new remote repo, and push the changes to it:

$ git remote add onyxrepo ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/gittst/mycentralrepo.git
$ git push onyxrepo master    # 'master' is the default branch name (yours may differ, e.g., 'main')
Counting objects: 3, done.
Writing objects: 100% (3/3), 223 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/gittst/mycentralrepo.git
 * [new branch]      master -> master
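
The ACL support check from the start of this section can be wrapped in a small function so that it is easy to repeat in different directories (home, work, and so on). This is a sketch under stated assumptions: the function name "acl_supported" is hypothetical, and it tests with your own login (via "id -un") rather than "myfriend" so that it does not depend on another account existing.

```shell
#!/bin/sh

# acl_supported DIR: succeed if a user ACL entry can be set on a file in DIR.
acl_supported() {
    dir=$1
    # Without the setfacl command there is nothing to test.
    command -v setfacl >/dev/null 2>&1 || return 1
    probe=$(mktemp "$dir/aclprobe.XXXXXX") || return 1
    if setfacl -m "u:$(id -un):rwX" "$probe" 2>/dev/null; then
        rc=0
    else
        rc=1
    fi
    rm -f "$probe"     # clean up the probe file either way
    return $rc
}

# Directory to test; defaults to the current directory.
target=${1:-.}

if acl_supported "$target"; then
    acl_result="supported"
else
    acl_result="not supported"
fi
echo "ACLs are $acl_result in $target"
```

Run it once in each file system where you are considering placing the repo.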

3. Using a Git Repository

Assume you have followed the directions of one of the approaches above to create a central bare repo. If you're working with a new local system that isn't the same as the remote system, then set up Git for use on that system, including $GIT_SSH and such, as per above.

To work with your remote, central, bare repo elsewhere, you want to "clone" the repo to an alternate working repo:

$ git clone /u/home/mylogin/mycentralrepo.git newworkspace     # This only works on the same system
$ git clone ssh://mylogin@carpenter.erdc.hpc.mil/u/home/mylogin/mycentralrepo.git  newworkspace

This will create a "newworkspace" full repo on your local system. Explore it a bit:

$ cd newworkspace
$ git log
commit 4f858c0f52a636a64784489d459f28dff00dc0eb
Author: Firstname Lastname <myemail@myhost.mil>
Date:   Mon Feb 15 20:59:00 2016 +0000
    My clean, initial copy of source

$ git remote
origin

By default, Git names the remote repo "origin". To make changes to a local repo, then move those changes to the remote repo, you must "commit" your changes locally, then "push" them to the remote repo. Here is an example, starting from the previous example:

$ echo "something" > testfile   # Create some file
$ git add testfile
$ git commit -m "Added testfile." 
$ git push origin master           # Push the master branch to the origin repo
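
Before pushing, it can be useful to see how many local commits the remote does not have yet. The sketch below counts them with "git rev-list" before and after a push. It assumes a bare repo on the local file system as a stand-in for the SSH remote, and a placeholder author identity.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)
git init -q --bare "$scratch/mycentralrepo.git"

# Clone the (still empty) central repo into a working repo.
git clone -q "$scratch/mycentralrepo.git" "$scratch/newworkspace"
cd "$scratch/newworkspace"
git config user.name "Firstname Lastname"     # placeholder identity
git config user.email "myemail@myhost.mil"

# First commit and push, as in the example above.
echo "something" > testfile
git add testfile
git commit -q -m "Added testfile."
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Make a second commit locally, but do not push it yet.
echo "something more" >> testfile
git commit -q -a -m "Extended testfile."

# Count the commits that exist locally but not on the remote branch.
unpushed_before=$(git rev-list --count "origin/$branch..HEAD")
git push -q origin "$branch"
unpushed_after=$(git rev-list --count "origin/$branch..HEAD")

echo "unpushed commits before push: $unpushed_before"
echo "unpushed commits after push:  $unpushed_after"
```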

You might recall from the earlier examples that, when creating the remote repos, you chose the name of the remote inside the local repo yourself, e.g., "onyxrepo". If you prefer that to "origin", you can rename the remote:

$ git remote rename origin onyxrepo

You can attach multiple remote repos to a local repo, which is one of the advantages of Git's distributed version control approach. Suppose an exact copy of "onyxrepo" exists on another system, Gaffney. You can add it as another remote repo in your local repo:

$ git remote add gaffneyrepo ssh://mylogin@gaffney.navydsrc.hpc.mil/u/home/mylogin/mycentralrepo.git
$ git remote
onyxrepo
gaffneyrepo
#... Make some changes to tracked files in the local repo ...
$ git commit -a -m "Made some changes."    # -a stages changes to already-tracked files
$ git push onyxrepo master
$ git push gaffneyrepo master
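
With more than a couple of remotes, a loop over the output of "git remote" saves typing. The sketch below assumes two bare repos on the local file system standing in for the remote systems, and a placeholder author identity; it then checks that both remotes hold the local commit.

```shell
#!/bin/sh
set -eu

scratch=$(mktemp -d)

# Local stand-ins for the two remote systems.
git init -q --bare "$scratch/onyx.git"
git init -q --bare "$scratch/gaffney.git"

mkdir "$scratch/work"
cd "$scratch/work"
git init -q
git config user.name "Firstname Lastname"     # placeholder identity
git config user.email "myemail@myhost.mil"
echo "something" > testfile
git add testfile
git commit -q -m "Made some changes."

git remote add onyxrepo "$scratch/onyx.git"
git remote add gaffneyrepo "$scratch/gaffney.git"

# Push the current branch to every configured remote.
branch=$(git symbolic-ref --short HEAD)
for remote in $(git remote); do
    git push -q "$remote" "$branch"
done

# Each remote should now report the same commit ID as the local repo.
local_head=$(git rev-parse HEAD)
onyx_head=$(git ls-remote onyxrepo "refs/heads/$branch" | cut -f1)
gaffney_head=$(git ls-remote gaffneyrepo "refs/heads/$branch" | cut -f1)
echo "local:       $local_head"
echo "onyxrepo:    $onyx_head"
echo "gaffneyrepo: $gaffney_head"
```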