In this series of posts, we will see how to perform RNA-Seq analysis on the Windows operating system using Linux tools in WSL (Windows Subsystem for Linux). The parts of the analysis that require R will be done on the Windows side.
We will be using the following tools:
- samtools (WSL / Linux)
- HISAT2 (WSL / Linux)
- Stringtie (WSL / Linux)
- gffcompare (WSL / Linux)
- Ballgown (R)
Install samtools
We need to install three things for this.
- htslib
- bcftools
- samtools
Open WSL. A terminal will open. Remember to open WSL as administrator.
Prepare
To install samtools, we first need a few other things installed on our Linux system. Execute the following commands one by one in the WSL terminal to install them. For those who are not familiar with Linux, these are called bash commands.
sudo apt-get update
sudo apt-get install gcc
sudo apt-get install make
sudo apt-get install libbz2-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libncurses5-dev
sudo apt-get install libncursesw5-dev
sudo apt-get install liblzma-dev
Download samtools, htslib and bcftools from
https://www.htslib.org/download
The downloads will be in /mnt/c/Users/<yourname>/Downloads
Note: replace <yourname> with the Windows username you are currently using. For example, if your username is john, the downloaded packages will be in /mnt/c/Users/john/Downloads
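As an alternative to downloading through the Windows browser, the archives can be fetched from inside WSL. The sketch below assumes that the release files follow the GitHub URL pattern that the htslib.org download page linked to at the time of writing, and that wget is installed. The helper names release_url and fetch_release are made up for this example.

```shell
#!/usr/bin/env bash
# Build the download URL for a samtools-family release archive.
# Assumption: releases are published at
#   https://github.com/samtools/<tool>/releases/download/<version>/<tool>-<version>.tar.bz2
release_url() {
    tool="$1"
    version="$2"
    echo "https://github.com/samtools/${tool}/releases/download/${version}/${tool}-${version}.tar.bz2"
}

# Download one archive into the current directory
# (hypothetical helper; requires wget and network access).
fetch_release() {
    wget "$(release_url "$1" "$2")"
}

# Print the URLs first so they can be checked in a browser before downloading.
for tool in htslib samtools bcftools; do
    release_url "$tool" 1.21
done
```

If the printed URLs open correctly, running fetch_release samtools 1.21 from inside the software directory would place the archive there directly, which skips the move from the Windows Downloads folder.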
Move the downloaded packages to a different folder (directory)
Let’s create a directory where all the required software will be stored. Note that the versions you download may differ from mine; the filenames here are based on the versions I downloaded when this work was done.
cd ~
mkdir bin
export PATH=$HOME/bin/:$PATH
cd bin
mv /mnt/c/Users/<yourname>/Downloads/htslib-1.21.tar.bz2 ./
mv /mnt/c/Users/<yourname>/Downloads/samtools-1.21.tar.bz2 ./
mv /mnt/c/Users/<yourname>/Downloads/bcftools-1.21.tar.bz2 ./
Now that we have moved the software files to this directory, we can install them. Continue in the same terminal window as above. If you have restarted WSL, first change the current directory to bin using:
cd ~
cd bin
tar -vxjf htslib-1.21.tar.bz2
cd htslib-1.21
make
cd ..
tar -vxjf samtools-1.21.tar.bz2
cd samtools-1.21
make
cd ..
tar -vxjf bcftools-1.21.tar.bz2
cd bcftools-1.21
make
cd ..
Make the software available for use by adding its directories to the PATH environment variable.
export PATH=$HOME/bin/htslib-1.21:$PATH
export PATH=$HOME/bin/samtools-1.21:$PATH
export PATH=$HOME/bin/bcftools-1.21:$PATH
Note that these exports last only for the current terminal session.
Samtools installation is now complete.
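One way to make the PATH entries survive a WSL restart is to append the export lines to ~/.bashrc. The following is a minimal sketch of that idea, not part of the original steps; persist_path is a hypothetical helper name, and it assumes bash reads ~/.bashrc when a new WSL terminal starts.

```shell
#!/usr/bin/env bash
# Append an "export PATH=..." line for a directory to a shell startup file,
# unless that exact line is already present (so re-running is harmless).
# The startup file defaults to ~/.bashrc; pass a second argument to use another file.
persist_path() {
    dir="$1"
    rcfile="${2:-$HOME/.bashrc}"
    line="export PATH=\"${dir}:\$PATH\""
    touch "$rcfile"
    if ! grep -qxF "$line" "$rcfile"; then
        echo "$line" >> "$rcfile"
    fi
}

# Example calls for the three directories built above. They are commented out
# so that pasting this block does not modify ~/.bashrc until you choose to:
# persist_path "$HOME/bin/htslib-1.21"
# persist_path "$HOME/bin/samtools-1.21"
# persist_path "$HOME/bin/bcftools-1.21"
```

After uncommenting and running the calls, opening a new WSL terminal (or running source ~/.bashrc) should pick up the new entries. The duplicate check keeps the file from growing if the block is run more than once.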
Install HISAT2
First download hisat2 from https://daehwankimlab.github.io/hisat2/download/
Move it to the bin directory created above.
cd ~/bin
mv /mnt/c/Users/<yourname>/Downloads/hisat2-2.1.0-Linux_x86_64.zip ./
unzip hisat2-2.1.0-Linux_x86_64.zip
If unzip is not installed, install it first and then run the unzip command above again.
sudo apt install unzip
Add to path
cd ~/bin
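cp hisat2-2.1.0/hisat2* hisat2-2.1.0/*.py ./
This will show some warnings about files being specified more than once, because scripts such as hisat2_extract_splice_sites.py match both patterns. Don’t worry about them.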
Install Stringtie
Download the StringTie Linux binary from https://ccb.jhu.edu/software/stringtie/
Make sure to pick the Linux binary, not the OS X one.
cd ~/bin
mv /mnt/c/Users/<yourname>/Downloads/stringtie-2.2.3.Linux_x86_64.tar.gz ./
tar xvzf stringtie-2.2.3.Linux_x86_64.tar.gz
cp stringtie-2.2.3.Linux_x86_64/stringtie ./
Install gffcompare
You need git installed first, since the source code is cloned from GitHub.
cd ~/bin
git clone https://github.com/gpertea/gffcompare
cd gffcompare
make release
Adding this to the path was not so straightforward.
Close WSL and restart in administrator mode. On the terminal type:
nano ~/.bashrc
The terminal will switch to the nano text editor and show the contents of the .bashrc file. Go to the end of the file using the arrow keys and, on a new line, type:
export PATH=$PATH:$HOME/bin/gffcompare
You can save and exit with Ctrl+X, then Y, and then Enter. You will then be taken back to the normal command prompt.
Restart WSL and check that gffcompare is in the path by typing:
which gffcompare
Install Ballgown
Ballgown is an `R` package, so we first need to install R and RStudio on our Windows system.
Once this is done, we will install the tidyverse package in R. Start RStudio and, in its console, execute:
install.packages("tidyverse")
It will take some time for this to complete.
To install ballgown, execute:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ballgown")
Checking that the installations are in the path
Finally, we can check that everything is installed correctly and is in the path.
Restart WSL and type the following commands:
which samtools
which hisat2
which stringtie
which gffcompare
Each command should print the location where the tool is found in the path. If a command prints nothing, that tool is not in the path and you will not be able to use it.
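The four checks can also be run in one go. This is a small sketch using the shell built-in command -v, which does the same lookup as which; check_tools is a hypothetical helper name used only here.

```shell
#!/usr/bin/env bash
# Print where each tool is found, and collect the names of any that are missing.
check_tools() {
    missing=""
    for tool in "$@"; do
        if location="$(command -v "$tool")"; then
            echo "found:   $tool -> $location"
        else
            echo "MISSING: $tool"
            missing="$missing $tool"
        fi
    done
    # Succeed only when nothing is missing.
    [ -z "$missing" ]
}

# The four command-line tools installed in this post.
check_tools samtools hisat2 stringtie gffcompare || echo "Some tools are not in the path."
```

Any tool reported as MISSING needs its directory added to PATH before it can be used.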
Other tools to install in R
Install devtools
install.packages("devtools")
Install rtools
Go to
https://cran.r-project.org/bin/windows/Rtools/
Download the .exe file of the Rtools version compatible with your R version.
Install it.
Install tidyverse
Tidyverse is a collection of R packages for data science. It can be installed by running the following in the R console.
install.packages("tidyverse")
Install RSkittleBrewer
Sys.unsetenv("GITHUB_PAT")
devtools::install_github("alyssafrazee/RSkittleBrewer", auth_token = NULL)
Sys.unsetenv("GITHUB_PAT")
gitcreds::gitcreds_delete() # type 2
library(devtools)
devtools::install_github("alyssafrazee/RSkittleBrewer", auth_token = NULL)
We are now set to do the RNA-Seq analysis. In later posts, we will see how to use these tools.