Having trouble formatting data for DE analysis
zal4002 • 0

I have been trying to run the DESeq2 package on some NGS data from an experiment. I have the raw data as .fastq files, as well as RPKM values stored in an Excel sheet. I have two cell types, two treatments, and two replicates for each group, and I am trying to get a DE results file and generate a volcano plot. I have the end of the process figured out; I just can't get the program to run and produce a DDS.
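For reference, the design described here (two cell types, two treatments, two replicates per group) comes to eight samples. Below is a minimal sketch of the sample table such a design implies, written in Python purely for illustration. The labels `cellA`, `cellB`, `control`, and `treated` are placeholders, not names from the post.

```python
from itertools import product


def build_sample_table(cell_types, treatments, n_replicates):
    """Return one row (a dict) per sample for a fully crossed design."""
    rows = []
    for cell_type, treatment in product(cell_types, treatments):
        for rep in range(1, n_replicates + 1):
            rows.append({
                "sample": f"{cell_type}_{treatment}_rep{rep}",
                "cell_type": cell_type,
                "treatment": treatment,
                "replicate": rep,
            })
    return rows


# Placeholder labels; substitute the real cell types and treatments.
samples = build_sample_table(["cellA", "cellB"], ["control", "treated"], 2)
for row in samples:
    print(row["sample"], row["cell_type"], row["treatment"], sep="\t")
```

Each row would correspond to one column of the count matrix, so the sample names in the table need to match the names of the quantification outputs.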

@mikelove

Bioconductor packages have vignettes as a guide, and DESeq2 also has a workflow that goes through the process more slowly and is designed for users who are trying DE analysis for the first time. You can find the workflow linked from the top of our vignette, or you can search Google for "deseq2 rnaseq workflow" and you'll find it.

I have gone over the workflow a couple of times, but I am getting stuck on the implementation of Salmon. Do I need to run that program in Python, or is there a way to skip that step and do the analysis exclusively in R?

So you already know that you cannot use RPKM as input to DESeq2.
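To illustrate why RPKM values cannot stand in for counts, here is a minimal sketch in Python using the standard RPKM definition (reads per kilobase of transcript per million mapped reads). The gene length and library sizes are made-up numbers chosen for the example. Two libraries with very different raw counts give the same RPKM, so the count-level information that DESeq2 expects as input cannot be recovered from the RPKM value alone.

```python
def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene of 2 kb in two libraries sequenced to different depths.
shallow = rpkm(count=10, gene_length_bp=2_000, total_mapped_reads=1_000_000)
deep = rpkm(count=1_000, gene_length_bp=2_000, total_mapped_reads=100_000_000)

# Both values are 5.0, although one rests on 10 reads and the other on 1,000.
print(shallow, deep)
```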

There are a number of ways to generate counts, but Salmon is very fast and easy. You also know how to run Salmon because it's part of the workflow.

Alternatively, you could collaborate with someone who can help you with quantification from FASTQ, and then load into R using the various importers described in the workflow.

But we can't do all this for you on the support site. The site is for specific questions about Bioconductor software, not a way to avoid the work of trying things out yourself (or finding collaborators to help).

I don't think Salmon is a Python program; you just run it from the command line on Unix. I don't think there is an R implementation. There might be some aligners that are implemented in R, but the most popular ones, which more people use and can help you with, are not.

Thanks! I haven't used Anaconda in a while, so I forgot that "conda ..." commands go in the command window. I'm going to see if I can knock the rust off.

Once you get salmon running in a conda environment (see first section HERE), there is then useful information HERE (see section 7.2.4 Salmon quantification) about how to index the reference transcriptome and run the count abundance step. After all of that, you would then go back to the DESeq2 vignette about how to import the count abundances to DESeq2 for normalisation.

As per Michael, the RPKM data cannot be used as input to DESeq2.

It is important to follow the vignettes and tutorials provided by the authors of these programs so that you can learn how they work and, ultimately, how to apply them to your own data.

The workflow now also covers Salmon indexing and quantification, including a Snakemake example for looping over samples:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#quantifying-with-salmon
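The indexing and per-sample quantification steps mentioned above can be sketched roughly as follows. This is not the workflow's Snakemake example, only a minimal Python illustration that assembles the command lines and prints them without running anything. It assumes paired-end reads, and the sample names, file paths, and `_1`/`_2` FASTQ naming are hypothetical. The flags used (`-t`, `-i`, `-l A`, `-1`, `-2`, `-p`, `-o`) are standard `salmon index` and `salmon quant` options.

```python
import shlex


def salmon_index_cmd(transcripts_fasta, index_dir):
    """Command that builds a Salmon index from a transcriptome FASTA."""
    return ["salmon", "index", "-t", transcripts_fasta, "-i", index_dir]


def salmon_quant_cmd(index_dir, sample, fastq_dir, out_dir, threads=4):
    """Command that quantifies one paired-end sample against the index."""
    return [
        "salmon", "quant",
        "-i", index_dir,
        "-l", "A",  # let Salmon infer the library type
        "-1", f"{fastq_dir}/{sample}_1.fastq.gz",
        "-2", f"{fastq_dir}/{sample}_2.fastq.gz",
        "-p", str(threads),
        "-o", f"{out_dir}/{sample}",
    ]


# Hypothetical sample names and paths; replace with your own.
samples = ["cellA_control_rep1", "cellA_control_rep2"]
commands = [salmon_index_cmd("transcripts.fa.gz", "salmon_index")]
commands += [salmon_quant_cmd("salmon_index", s, "fastq", "quants") for s in samples]

for cmd in commands:
    print(shlex.join(cmd))  # print only; nothing is executed here
```

The printed lines can be pasted into a Unix shell where Salmon is installed. Each sample then gets its own output directory, which is what the importers described in the workflow read from.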

Thanks to you both for your help! I am usually more of a lab tech than a coder, but COVID has me trying to find new ways to help our lab. Unfortunately, I still can't seem to get Salmon working. I have uninstalled and reinstalled Anaconda, installed the binary file from GitHub, and tried several variations of "conda install salmon". I keep getting an error about my channels, or about the absence of salmon from the directory, even though conda-forge and bioconda are at the top of my channels list and the salmon binary file is listed when I run the dir command. I just did my third uninstall/reinstall of Anaconda for today and these steps are still not working. If there is something I may have missed, whether trivial or subtle, please let me know if you have the chance. Thanks so much!

Unfortunately, I’m really swamped right now, so I won’t be able to follow up here.

Again, a good approach is to ask someone local to your institute for help getting started beyond the online documentation. This site is really for reporting specific issues or questions to Bioconductor software maintainers.

zal4002, why not just download the pre-compiled executables ('binaries') and try to run those outside of conda? They are available at https://github.com/COMBINE-lab/salmon/releases (scroll down to find the filename 'salmon-1.2.1_linux_x86_64.tar.gz').

As per Michael, though, this website is for support with Bioconductor packages. You could try to make a post on Biostars, but please link back here when / if you do.

Thanks for the tip about Biostars. I posted there, and someone told me that I cannot run Salmon on Windows (I didn't expect that to be an issue). I'm trying to find a workaround, but at least now I'm going in the right direction.

Lots of bioinformatics software is primarily designed for Linux, and usually also works on Mac (or can be made to work there with some effort) because macOS is Unix-like.

While trying to get some experience doing typical bioinformatics tasks, you end up spending a lot of time and effort dealing with issues that bioinformatics users on a Linux cluster wouldn't encounter, and having to post questions on forums, etc.

The typical experience, by contrast, would be to just download the linux_x86_64 executable and have it run immediately. You end up having a rough experience, but that's because no one is aligning or quantifying reads on a Windows machine; they are doing it on Linux clusters or Mac laptops, so there is little to no support.

If you are trying to recreate this experience on a Windows machine, you could use a virtual machine, or you could just ask someone with access to a Linux cluster, a Linux laptop, or a Mac laptop to do this step for you.
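As a quick check before downloading anything, a small Python sketch like the one below reports whether the current machine could run a linux_x86_64 executable natively. The helper name `can_run_linux_binary` is made up for this example, and the check is deliberately simple: it only looks at the operating system and CPU architecture that Python reports.

```python
import platform


def can_run_linux_binary(system, machine):
    """True if a linux_x86_64 executable should run natively on this OS and CPU."""
    return system == "Linux" and machine.lower() in {"x86_64", "amd64"}


# platform.system() is typically "Linux", "Darwin" (macOS), or "Windows".
if can_run_linux_binary(platform.system(), platform.machine()):
    print("The linux_x86_64 executable should run here directly.")
else:
    print("Not a native Linux x86_64 system; consider a Linux virtual machine "
          "or a machine that someone else has access to.")
```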
