CNR Bioinformatics Workshop
This is the curriculum for a practical introductory workshop on bioinformatics analysis.
This workshop is a very general and very basic introduction to the tools and practices of bioinformatics analysis, focusing on RNAseq and single cell RNAseq. It is designed for an audience with little or no bioinformatics or coding experience, and will hopefully help them overcome the innate fear of the command line. Since all the contents are, as of now, derived exclusively from my own personal experience, I cannot claim them to be the most common, most efficient or best practices, but they got the job done for me when I, a neuroscientist with no formal training in computer science, needed them. Once again, the goal isn’t to provide a rigorous guide to standard practices, but rather to help those in need get their own analyses moving.
Prerequisite
Miniconda: https://docs.conda.io/en/latest/miniconda.html
Miniconda installation
Refer to the installation page for details.
- On Windows: download the installer from the Miniconda page linked above.
- On Mac or Linux: check the Miniconda page linked above for the installer link.
Download installer with wget and install with bash:
Mac
wget link/to/Miniconda3-latest-MacOSX-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh
Linux
wget link/to/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
Follow the prompts to finish the installation.
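The Mac and Linux commands above differ only in the installer file name, so the name can be picked automatically. The following is a minimal sketch, not part of the workshop material: `installer_name` is a hypothetical helper, and it assumes that installers for other CPU architectures follow the same `Miniconda3-latest-<platform>-<arch>.sh` naming pattern as the two files listed above.

```shell
#!/usr/bin/env bash
# installer_name OS ARCH
# Hypothetical helper: prints the Miniconda installer file name for an
# operating system (as reported by `uname -s`) and a CPU architecture
# (as reported by `uname -m`).
installer_name() {
  local os="$1" arch="$2" platform
  case "$os" in
    Darwin) platform="MacOSX" ;;
    Linux)  platform="Linux" ;;
    *)
      # Anything else (e.g. Windows) is not handled by this sketch.
      echo "Unsupported operating system: $os" >&2
      return 1
      ;;
  esac
  echo "Miniconda3-latest-${platform}-${arch}.sh"
}

# Show the installer name for the current machine.
echo "Installer: $(installer_name "$(uname -s)" "$(uname -m)")"
```

The printed name can then be used in place of the hard-coded file name in the wget and bash commands.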
Required files and recording of workshop
https://www.dropbox.com/sh/s9ylxo4jmwta9yu/AABiCtc3SXlVPynQXhNlNTyBa?dl=0
Topics
- The Mighty CMD: command line interface
- Version control
1. Data, script and environment management
- wget
- fastq-dump
- conda
- Git/github
2. Alignment and read quantification
- cellranger
- The STAR of the show: Alignment with STAR/featureCounts
- Getting started with R
- Load libraries and datasets
- Dimension reduction
- Clustering
- Differential gene expression
- ggplot2
- Cerebro
Not covered but useful
- Access a high-performance computer with PuTTY (on Windows) or ssh (on Mac and Linux);
- Upload and download files with FileZilla;
- Upload and download files with scp;
- Parallelization: run a program with more than one processor.
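As a small taste of the last item, one simple way to use more than one processor is to let `xargs -P` run the same command on several files at once. This is only a sketch under stated assumptions: the three text files are made-up stand-ins for real data such as FASTQ files, `gzip` is just an example command, and the `-P` and `-0` options assume GNU or BSD `xargs`.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical demo data: three small text files standing in for real data.
workdir="$(mktemp -d)"
for sample in a b c; do
  printf 'line1\nline2\n' > "$workdir/sample_$sample.txt"
done

# Number of processors currently available on this machine.
ncpu="$(getconf _NPROCESSORS_ONLN)"

# Compress the files, running up to $ncpu gzip processes at the same time.
# -print0 / -0 keep file names with spaces intact; -n 1 passes one file per process.
find "$workdir" -name 'sample_*.txt' -print0 | xargs -0 -n 1 -P "$ncpu" gzip

# List the compressed results.
ls "$workdir"
```

Many bioinformatics tools also take their own option for the number of threads, so it is worth checking a tool's help page before wrapping it this way.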