Welcome to BIS180L

Julin Maloof
Lecture 01

Course Personnel

Instructor: Julin Maloof jnmaloof@ucdavis.edu

Teaching Assistant: Bryshal Moore brymoore@ucdavis.edu

What this course is about

The goal of this course is to introduce you to the tools and thinking required for bioinformatics analysis.

This is a computer-based class. No bench work.

Learning Objectives

At the end of this course you should be able to:

  • Use cloud computing resources like Jetstream and AWS
  • Navigate the Linux operating system at the command line for file management and invocation of bioinformatics programs
  • Automate repetitive computational tasks using loops
  • Write scripts in R for data analysis and display
  • Perform genome wide association studies (GWAS)
  • Analyze Illumina sequencing data to find sequence polymorphisms

Learning Objectives, Continued

  • Perform a genome assembly
  • Tentative: Find methylated regions of the genome
  • Analyze RNA-seq data to find differentially expressed genes and pathways
  • Make and deploy Shiny web applications
  • Reconstruct genetic co-expression networks
  • Analyze metagenomics data
  • Apply best practices for reproducible data analyses via git version control and markdown notebooks
  • Use agentic AI tools to help with coding

Amount of Work (Bad News / Good News)

  • This class will require a lot of work outside of lab hours.
  • But you will gain a solid foundation in bioinformatic analysis, focused on second-generation sequencing data.

Why R and Linux?

  • Both R and Linux have command-line interfaces
  • Antiquated?
  • NO! Written code provides flexibility, creativity, and power not available in any other way
  • Linux (or Unix / Mac)
    • Outstanding built-in tools for data crunching
    • Provides access to hundreds of bioinformatics programs
    • Easy to automate repetitive tasks
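
As an illustration of automating a repetitive task, here is a minimal sketch of a shell loop. The file names and sequences are made up for the example; the loop counts the FASTA records (lines starting with ">") in every `.fa` file in a scratch directory.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Three small, hypothetical FASTA files (made up for this example).
printf '>seq1\nACGT\n>seq2\nGGCC\n' > sampleA.fa
printf '>seq1\nACGT\n' > sampleB.fa
printf '>seq1\nTTAA\n>seq2\nCCGG\n>seq3\nATAT\n' > sampleC.fa

# Run the same command on every .fa file instead of typing it three times.
# grep -c counts the lines that match; '^>' matches lines starting with ">".
for f in *.fa; do
  n=$(grep -c '^>' "$f")
  echo "$f $n"
done > counts.txt

# Show the per-file record counts.
cat counts.txt
```

The same loop works unchanged whether there are three files or three thousand, which is the point of writing the command down rather than clicking.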

Why R and Linux?

  • R
    • Powerful scripting, statistical, data processing, and graphical capabilities
    • Many bioinformatics packages are developed in R
    • Easy to automate repetitive tasks.

Some thoughts on AI and coding

  • AI is rapidly changing the way people code
  • Currently it is still important to understand what AI-generated code is doing and how it works
  • Especially true in data analysis applications (how do you trust the output?)
  • You want to use AI to speed up your coding but not replace your thinking

AI and this class

  • You ARE allowed to use AI tools
  • You are NOT allowed to directly paste any text/question/code into any query/search/AI engine
  • You are encouraged to solve the problems on your own without AI…you will learn better
  • There will be a written/in-class portion of both exams that does not allow usage of AI
  • I plan to have an AI module later in the class to show how to use agentic tools (going beyond pasting things into ChatGPT).
    • need to determine what to cut out!

Do you want to succeed in this class?

Secrets for success:

  1. You are learning a new language, treat it as such.
  2. Go slow. If your goal is to get out of this room as soon as possible you will not succeed.
  3. Read the lab instructions carefully. See #2.
  4. Do not cut and paste code. Type it; you will learn it better.
  5. Take the time to understand a command. THINK!
  6. Do ask for help when you are stuck or confused.
  7. Be kind to yourself and your community.

Course Schedule

  • Most lectures pre-recorded
  • Do we want to keep this? To be revisited at the end of this week
  • Lecture Videos and embedded Playposit Quizzes due 9AM Tue/Thur
  • Released at least 24 Hours in advance.
  • To take the quiz be sure to start the video from Canvas > Assignments
  • To watch again: go to Canvas > Media Gallery
  • Tuesdays, Thursdays
    • Lab 1:10 - 5:00
  • Fridays
    • Discussion 1:10 - 2:00
    • Varied use
      • Office hours
      • Q & A (with students helping to A as well as Q)
      • lecture
      • Keep on working

(Tentative) Course Outline

  • Week 1:
    • Linux fundamentals
    • Markdown, git repositories
    • Sequence analysis and BLAST
  • Week 2:
    • Analyze BLAST in R
  • Week 3:
    • Tidyverse
    • Alignment, tree building
  • Week 4:
    • SNPs
    • population structure, GWAS
  • Week 5:
    • Build a web-app
  • Week 6:
    • Genome Assembly
    • Methylation Analysis
  • Weeks 7-9:
    • Illumina reads
    • RNAseq
  • Week 9:
    • Genetic Networks
  • Week 10:
    • Metagenomics

Course Grading

  • 45% Lab assignments
  • 25% Midterm (In class May 01, Take Home Available May 01, Due May 07, 1:10 PM)
  • 25% Final (In Class June 10 3:30, Take Home Available June 07, Due June 09, 5:00 PM)
  • 5% Playposit lecture quizzes + possible discussion Qs + Attendance

Do your own work

Developing code is an interactive process. Both your friends and the web can be excellent resources.

However, any direct copying of text or code from any person (in this class or on the web, etc.) is considered plagiarism in the context of this course. It is okay to use AI-generated code, but give attribution.

If you receive inspiration or ideas from an external source, give attribution.

Course Website

Reference Text

Author: Vince Buffalo

Book cover of Bioinformatics Data Skills by Vince Buffalo

This is an excellent book that covers much of the material that is covered in lab.

You can use this to help with ideas that are not clear from class, or for more in depth coverage of the material.

I particularly recommend reading it in depth for anyone planning to build on the skills learned in this class.

It is available online for free through the UCD library. See links on the course website.

Reference Text 2

Authors: Hadley Wickham and Garrett Grolemund

Book cover of R for Data Science by Hadley Wickham and Garrett Grolemund

This is another excellent book that more generally covers data manipulation and analysis in R.

One of the authors, Hadley Wickham, has built some very nice additions to R that we will make use of in this class.

Available online for free

Bioinformatics best practices

  1. Clear documentation
  2. Reproducible results
  3. Informative names (files, variables, functions)
  4. Logical organization
  5. Documents/Data in open (non-proprietary) formats
    • This is essential for achieving 1 and 2

Chicken and Egg Problem

  • You need some knowledge before you can do anything
  • But you need to do some things to set up your machine before you can gain knowledge
  • As a result the first few days of this class can feel a little disorienting
  • It will get better

Today's Lab

  1. Get a virtual Linux machine running
  2. Clone Assignment 01 repo into RStudio
  3. Practice Markdown
  4. Learn a little Linux command line

Virtual Linux machine

  • The computer lab machines run Windows
  • Bioinformatics on a Windows machine is painful (although getting better) (R is fine)
  • Solution: virtual machine!
  • Use Jetstream2, part of the NSF-funded ACCESS-CI program, to run a virtual Linux machine in the cloud
  • You can connect to your virtual machine from any computer, including your laptop or home computer

What is a virtual machine?

  • Universities have many very large computers (a.k.a. servers) in various places around the country
  • These servers can be split up to be many separate “virtual” machines, each emulating an individual computer
  • You can connect to these virtual machines and it is just like having your own computer, but it is in the “cloud”
  • Terminology:
    • Each virtual machine is an instance of a machine image.
    • You can think of the image as a snapshot (or template) of a machine that captures the OS, the installed programs, etc.
    • The image that you are using is called BIS180L and was created by John Davis and me for this class.

Connecting to the virtual machine

  • There are two ways to connect to your instance:
  1. Using Virtual Network Computing (VNC). This shows the graphical desktop of the instance on your local computer.
  2. Using a secure shell (SSH). This provides a text-only connection to your instance, which is an advantage on slow internet connections.
  • You can also transfer files between your instance and other computers using SFTP (SSH File Transfer Protocol, often called Secure File Transfer Protocol)

Demo VM connections

Quick demo of VNC and SSH

Other virtual machine notes

As detailed in the lab notes for today:

  • We have created a virtual machine instance for each of you
  • The first lab section details how to connect.

FAQ: Can I use my own computer?

  • Can you use your own computer to connect to your Virtual Machine?
    • Yes, this is easy and you will probably want to, to make it easy to work outside of class
    • But for today, please first connect using the lab PCs
    • Also eduroam is slow, so if everyone is in here on their own laptop on wifi, things won't work very well
  • Can you use your own computer to run the analyses directly?
    • For R/Rstudio analyses, probably (But make sure your R is up to date!)
    • For Linux command-line analyses, possibly, if you are running a Mac or Linux machine. The first lab manual links to install notes for the Linux machine. I have a similar, although out-of-date, document for Mac.
    • We will NOT provide troubleshooting help for your own computer (other than helping you connect it to the VM)

Today's Lab

  1. Get a virtual Linux machine running
  2. Clone Assignment 01 repo into RStudio
  3. Practice Markdown
  4. Learn a little Linux command line

Markdown

Markdown is a text-based formatting system for quickly and easily generating nicely formatted output.

This presentation as well as the entire BIS180L website was written in markdown.

It helps achieve three guiding principles:

  1. Clear documentation
  2. Reproducibility
  3. Open formats
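
Because a markdown file is plain text, ordinary command-line tools can create and inspect it. The sketch below is only an illustration: the file name `notes.md` and its contents are made up, and it assumes the common markdown conventions of hash symbols for headers and asterisks for list items.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Write a tiny markdown file (hypothetical contents) using a here-document.
cat > notes.md <<'EOF'
# Lab 01 notes

## Goals

* Practice Markdown
* Learn a little Linux command line
EOF

# The file is human-readable as-is...
cat notes.md

# ...and standard text tools can query it: count header and bullet lines.
headers=$(grep -c '^#' notes.md)
bullets=$(grep -c '^\* ' notes.md)
echo "headers: $headers, bullets: $bullets"
```

Running the same `grep` on a `.docx` file would not give a meaningful answer, because that format is not stored as readable text.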

Markdown vs docx

What if we want to produce this:

Screenshot of formatted text output with headers and lists in a document

The markdown file that generates it is

Screenshot of markdown source code showing simple text formatting with hash symbols for headers and asterisks for lists

Markdown vs docx

What if we want to produce this:

Screenshot of formatted text output with headers and lists in a document

The Word file that generates it is

Screenshot of Microsoft Word document showing complex binary gibberish

Today's Lab

  1. Get a virtual Linux machine running
  2. Clone Assignment 01 repo into RStudio
  3. Practice Markdown
  4. Learn a little Linux command line

Linux Command Line

We will work through a tutorial developed by Ian Korf.

You are learning a new language; treat it as such.

  • Keep notes
  • Be patient
  • Practice and repetition help
  • It will get easier

Quick orientation to Linux command line

  • Instead of clicking on an icon, you type a command
    • command [options] [file-path]
  • directory structure
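
A minimal sketch of the `command [options] [file-path]` pattern and of moving around the directory structure. The directory and file names (`project`, `data`, `reads.txt`) are hypothetical, and everything happens inside a temporary directory.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Command only: print the directory you are currently in.
pwd

# Command + option + file-path: -p lets mkdir create nested directories at once.
mkdir -p project/data

# Command + file-path: create an empty file inside the new directory.
touch project/data/reads.txt

# Command + option + file-path: -l asks ls for the long listing format.
ls -l project/data

# Move down the directory tree with a relative path, look around, and come back.
cd project/data
listing=$(ls)
cd ../..   # ".." means the parent directory, so this climbs back up two levels
echo "$listing"
```

Options (such as `-p` and `-l`) change how a command behaves, while the file path tells it what to act on.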

Assignments to turn in for this lab

  • Assignments will be turned in via GitHub
  • There will be a separate repo for each assignment
  • Detailed instructions are given on the lab page for today
  • We can help if you get stuck
  • For today's lab you will need to turn in four files (two markdown files and their corresponding .html counterparts).
  • Due before class on Tuesday, but there will be new material this Thursday

GitHub in RStudio vs GitHub Desktop

  • If you took BIS 15L you are probably familiar with using GitHub Desktop
  • GitHub Desktop is installed on the instances and you can use it
  • Alternatively you can interact with Git directly in RStudio. This is my personal preference.
  • I'll demo this later in this period once people get going.

Slack Channels

Slack is our main tool for communication (other than lecture and one-on-one lab time).

There are two UC Davis Slack channels that I have added you to:

  • bis180l-student-announcements-2026
    • Posting on this channel should be limited to me or John, although you can reply to posts with follow-up questions
  • bis180l-student-help-2026
    • This is the place to seek help. If you are confused or stuck someone else is as well.
    • If you know the answer to a question please contribute

Slack and Class Etiquette

  • Be kind and respectful to each other
  • Remember that we all come from different backgrounds
  • Promote questions and discussion
  • Think about how your Slack posts could be viewed by others

Refresher: Secrets for success

Website Tour

Let's go have fun!