{Optional Review} Introduction to git

Reading

For an overview of how Git can be used in the biological sciences, please read this excellent article by Ram.

For a practical introduction, please read Chapter 5 in Bioinformatics Data Skills, available from the library here.

The GitHub Handbook is also nice.

After you have some experience with git, this cheat sheet may be helpful (but right now it will probably just be confusing.)

Git: reproducibility and collaboration

This document will introduce you to Git, a version control system that is a great aid in writing software, maintaining documentation, and maintaining reproducibility.

What does Git do? Git keeps track of the changes that you (and your collaborators) make to your documents. By maintaining a record of every change, Git lets you restore your project to an earlier state if needed (for example, if you screw up). Git also allows you to maintain different versions (known as branches in Git) simultaneously, an incredibly useful feature. For example, you can maintain a “main” branch that works correctly and try out changes in a “develop” branch without breaking the working “main” version. Once you know that your changes in “develop” are functioning as intended, you can merge them into “main”.
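The main/develop workflow described above can be tried out safely in a throwaway repository. The sketch below is only an illustration: the temporary directory, the file name analysis.txt, and the name and email values are all placeholders, not part of the tutorial.

```shell
# Work in a throwaway directory so nothing real is touched.
demo="$(mktemp -d)"
cd "$demo"
git init -q
# Placeholder identity, stored only in this demo repository.
git config user.name "Example Student"
git config user.email "student@example.com"

# Make a first commit, then make sure the branch is called "main".
echo "step 1" > analysis.txt
git add analysis.txt
git commit -q -m "Add working analysis"
git branch -M main

# Try out a change on a separate "develop" branch.
git checkout -q -b develop
echo "step 2 (experimental)" >> analysis.txt
git add analysis.txt
git commit -q -m "Try an experimental step"

# Back on main, the file still has only its first line...
git checkout -q main
cat analysis.txt

# ...until develop is merged in.
git merge -q develop
cat analysis.txt
```

Until the merge, anything that goes wrong on develop leaves main untouched, which is the whole point of working on a branch.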

A few key concepts

A project that is tracked by Git is called a repository (repo for short).

To start a new repository you use the git init command.

To add files for Git to track, use git add [FILE] (where [FILE] is replaced by the actual filename).

When you have made some changes to your project and want to commit them to the repository, it is a two-step process. First stage the changes with git add [FILE], then run git commit, typically with the -m option to include a brief message about the changes made.
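Put together, the init, add, and commit steps look roughly like the sketch below. It runs in a temporary directory; the file name notes.txt, the commit messages, and the name and email values are made up for illustration.

```shell
# Create a throwaway project directory and turn it into a repository.
project="$(mktemp -d)"
cd "$project"
git init -q
# Placeholder identity for this demo repository only.
git config user.name "Example Student"
git config user.email "student@example.com"

# A new file starts out untracked ("??" in the short status).
echo "My notes" > notes.txt
git status --short

# Step 1: stage the file. Step 2: commit it with a message.
git add notes.txt
git commit -q -m "Add notes.txt"

# Editing a tracked file requires the same two steps again.
echo "More notes" >> notes.txt
git add notes.txt
git commit -q -m "Extend notes.txt"

# One line per commit, newest first.
git log --oneline
```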

If you are collaborating with others, or just want to share your project, you will want to set up a remote repository. One common (and free!) hosting site is GitHub. When you want to add your changes to the remote repository you push to the repository using git push. When you want to download changes that others have made then you want to pull changes using git pull.
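To see push and pull without needing a GitHub account, a bare repository on your own disk can play the role of the remote. This is a sketch under that assumption: the directories alice and bob, the file shared.txt, and the identities are hypothetical, and with GitHub the remote would be an SSH URL rather than a local path.

```shell
# A bare repository in a temp directory stands in for GitHub here.
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# First working copy: commit, link to the "remote", and push.
mkdir "$work/alice"
cd "$work/alice"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "hello" > shared.txt
git add shared.txt
git commit -q -m "Add shared.txt"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# Second working copy: clone the main branch from the same "remote".
git clone -q -b main "$work/remote.git" "$work/bob"

# The first copy pushes another change...
echo "more from alice" >> shared.txt
git add shared.txt
git commit -q -m "Update shared.txt"
git push -q

# ...and the second copy downloads it with git pull.
cd "$work/bob"
git pull -q
cat shared.txt
```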

Learn about git using a tutorial

Now let’s see some of this in action.

Exercise: Keep track of what each command you learn does by making notes for yourself in a markdown document named gitNotes.md. Save this file; it will be turned in later.

We will next do a tutorial, Git-it.

IMPORTANT: WHEN THE TUTORIAL ASKS YOU TO CLONE YOUR REPOSITORY USING HTTPS, DON’T DO IT. USE SSH INSTEAD.
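The two kinds of URL are easy to tell apart: HTTPS URLs start with https://github.com/, while SSH URLs start with git@github.com:. If you cloned with HTTPS by mistake, the remote can be switched afterwards with git remote set-url. The sketch below uses a throwaway repository, and USER and REPO are placeholders for a real account and repository name.

```shell
# Throwaway repository; USER and REPO below are placeholders, and
# none of these commands contact GitHub.
demo="$(mktemp -d)"
cd "$demo"
git init -q

# What an HTTPS remote looks like.
git remote add origin https://github.com/USER/REPO.git
git remote -v

# Switch the same remote to the SSH form.
git remote set-url origin git@github.com:USER/REPO.git
git remote -v
```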

The tutorial is already installed on your instance. To start the tutorial:

  1. Open the “Git-it” folder on your desktop. Be patient: the first time it runs, it may take about 1 minute to start. If it still hasn’t started after a minute, double-click again. If you do not see the folder, click on the “Files” icon at the bottom first.
  2. In the “Git-it” folder, double-click the “Git-it” icon.

Proceed through the Git-it exercises. It may take a few seconds for the window to open when you click on the “SELECT DIRECTORY” button. Skip the section “Step Install Git” on the first page; Git is already installed. But do configure Git on your instance as described in “Step Configure Git”, also on the first page. You will also need to create a GitHub account (unless you already have one) as instructed on the fourth page of the Git-it tutorial.
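Configuring Git mostly means telling it the name and email address to record with your commits. The sketch below assumes that is what the tutorial's configure step asks for. It stores the values in a throwaway repository only, so it will not change your real settings; the name and email are placeholders.

```shell
# Throwaway repository so the example leaves your real settings alone.
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Placeholder values; substitute your own name and the email address
# you use for GitHub. Adding --global after "config" would store the
# setting for every repository on the machine instead of just this one.
git config user.name "Example Student"
git config user.email "student@example.com"

# Read the values back to confirm they were stored.
git config --get user.name
git config --get user.email
```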

When asked to edit files in the tutorial, you can use nano or RStudio.

If you want an alternative (or additional) tutorial, you can try the one at katacoda (not required).

Now let’s try it in real life.

Make a repository and collaborate

Work with one or two lab partners. Each partner should follow along with what the others are doing so you are versed in all steps.

Designate one of you to create a new repository. This is Partner 1.

There are two ways to make a new repository and get the local and remote versions linked: either create it on GitHub first and clone it down to your computer, or init it on your computer and link it to a GitHub repository.

Partner 1 (only) should create a new repository, using one of the two options below:

1) Create the repository on GitHub first.

Do NOT type git init. Since you have already initialized a repository on GitHub, it is not needed.

  1. From your github.com home page, click on the green “+ New Repository” button.
  2. On the resulting page, give it a name, check the “Initialize this repository with a README” box, and press the “Create Repository” button.
  3. Click on “SSH” and then on the clipboard icon to copy the URL.
  4. Open the terminal on your computer, cd to the parent directory of wherever you want the repository to reside, and then run git clone URL, where URL is the URL that you copied from GitHub.
  5. Next, cd to your repository and begin working on it.
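Steps 4 and 5 can be rehearsed locally. In the sketch below, an ordinary repository in a temporary directory stands in for the one created on GitHub, so the clone source is a local path instead of the SSH URL you would copy from the website. The names github-copy, parent, and my-project are hypothetical.

```shell
work="$(mktemp -d)"

# Stand-in for the repository that GitHub created with a README.
git init -q "$work/github-copy"
cd "$work/github-copy"
git config user.name "Example Student"      # placeholder identity
git config user.email "student@example.com"
echo "# my-project" > README.md
git add README.md
git commit -q -m "Initial commit"

# Steps 4 and 5: cd to the parent directory, clone, then cd into the clone.
# With GitHub, the source would be the SSH URL copied from the website.
mkdir "$work/parent"
cd "$work/parent"
git clone -q "$work/github-copy" my-project
cd my-project

# The clone is already a repository linked to its source, so no git init.
git remote -v
git log --oneline
```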

OR

2) Create the repository on your computer first (this is what you did in the tutorial).

  1. cd to the parent directory of where you want the repository to reside.
  2. mkdir NAME, where NAME is the name you want for your repository.
  3. Very important: cd NAME to move into the repository.
  4. git init to initialize a repository in the current directory.
  5. Add a file to the repository. For example:

    touch README.md
    git add README.md
    git commit -m "Added README.md"

  6. Go to github.com.
  7. From your github.com home page, click on the green “+ New Repository” button (right-hand side of the screen).
  8. On the resulting page, give it a name and press the “Create Repository” button. DO NOT check the “Initialize this repository with a README” box.
  9. Click on the clipboard icon to copy the commands next to the heading “…or push an existing repository from the command line”.
  10. Paste that into the terminal while in the directory of your repository. For example:

    git remote add origin git@github.com:jnmaloof/test2.git
    git branch -M main
    git push -u origin main
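The whole of option 2 can be rehearsed without touching GitHub by letting a bare repository on disk play the part of the empty GitHub repository. This is a sketch under that assumption: my-project and the identity values are placeholders, and on GitHub the remote would be the SSH address shown on the new repository's page.

```shell
work="$(mktemp -d)"
# Stand-in for the empty repository created on GitHub in steps 6-8.
git init -q --bare "$work/remote.git"

# Steps 1-4: make the directory, move into it, and initialize.
cd "$work"
mkdir my-project
cd my-project
git init -q
git config user.name "Example Student"      # placeholder identity
git config user.email "student@example.com"

# Step 5: add and commit a first file.
touch README.md
git add README.md
git commit -q -m "Added README.md"

# Steps 9-10: link the remote, name the branch "main", and push.
git remote add origin "$work/remote.git"
git branch -M main
git push -q -u origin main

# Show the local branch and the remote branch it now tracks.
git branch -vv
```

The -u option on the first push records origin/main as the upstream branch, so later pushes and pulls from this directory need no extra arguments.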

Now let’s collaborate!

  • Partner 1:
    • Add a file to the repository with a bit of text (for example, your plans for the weekend).
    • Commit your change.
    • Push the repository to GitHub.
    • Go to the GitHub website for this repository.
    • Add Partner 2 (and 3) as collaborators.
  • Partner 2 (and 3):
    • Check your email for a message from GitHub. Click on the link to the repo.
    • Clone the repository to your computer. You do NOT need to fork it.
    • Add your information (for example, your plans for the weekend).
    • Commit your change.
    • Push the changes to the repository.
    • Run git log and save the output to a file.
  • Partner 1:
    • Pull the changes back to your computer.
    • Run git log and save the output to a file.
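The command-line side of this exercise can be sketched as follows. The sketch simulates both partners on one machine, with a bare repository standing in for GitHub, so the website steps (adding collaborators, the email invitation) are not shown. The file weekend.txt, the log file names, and the partner identities are hypothetical.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"    # stand-in for the GitHub repo

# Partner 1: add a file, commit, and push.
git init -q "$work/partner1"
cd "$work/partner1"
git config user.name "Partner One"
git config user.email "partner1@example.com"
echo "Partner 1: hiking" > weekend.txt
git add weekend.txt
git commit -q -m "Add Partner 1 weekend plans"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# Partner 2: clone (no fork needed), add a line, commit, and push.
git clone -q -b main "$work/remote.git" "$work/partner2"
cd "$work/partner2"
git config user.name "Partner Two"
git config user.email "partner2@example.com"
echo "Partner 2: baking" >> weekend.txt
git add weekend.txt
git commit -q -m "Add Partner 2 weekend plans"
git push -q
# Redirect the log into a file instead of printing it.
git log > "$work/partner2_log.txt"

# Partner 1: pull the new commit and save the log.
cd "$work/partner1"
git pull -q
git log > "$work/partner1_log.txt"
```

After the pull, both log files should list the same two commits, one from each partner.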

Use GitHub in RStudio

  • Tired of using the command line to commit and push your changes?
  • You can also use the Git module in RStudio, as shown in lecture.
  • If you already have a Git repository cloned onto your computer:
    • In RStudio, go to File > New Project....
    • Choose Existing Directory. Then select the folder that corresponds to your Git repository.
  • If you want to clone a repository from GitHub to your computer:
    • In RStudio, go to File > New Project...
    • Choose Version Control > Git.
    • Paste in the SSH clone URL from GitHub.
    • Optionally check the parent directory.
  • You can stage, commit, and push using the tools on the Git tab in the upper right-hand pane.
  • Each partner should try this out.
  • If you close RStudio, you will need to choose Open Project, either from the File menu or from the right-hand corner, to get the Git menu to come up again.

More resources

Still confused? Or want to go further? Here are some additional resources:

Tutorial

An alternative tutorial

GitHub for beginners Part 1
GitHub for beginners Part 2

Videos

The four part git basics series (the first two were shown in class)

  1. https://www.youtube.com/watch?v=8oRjP8yj2Wo
  2. https://www.youtube.com/watch?v=uhtzxPU7Bz0
  3. https://www.youtube.com/watch?v=wmnSyrRBKTw
  4. https://www.youtube.com/watch?v=7w5Z7LmyLgI

A longer video (50 minutes)

GitHub Video Channel

Online book

The official online git manual