Question: Calculate DESeq2 fold change of 1 gene to 1 of the sample
0
3.4 years ago by
bharata180320
Japan
bharata180320 wrote:

Hello,

I am doing an RNA-seq data analysis and already have several specific genes as my targets. What I actually want to do is compare gene expression levels with SNP mutations, so for now I want to get the gene expression levels from this RNA-seq data. Because my analysis focuses on SNPs and gene expression, I don't have any interest in differential expression between groups. I have already followed the DESeq2 tutorial, but it only gives me the fold change between 2 groups, normal and disease. My question is how to get fold changes relative to 1 of the samples as the base. I imagine it like this: I have 14 samples, and I would get the fold change between samples 1 and 2, 1 and 3, and so on, up to 1 and 14. Is this possible in DESeq2, and could you share your idea of how to do it? Thank you very much for your help.

modified 3.4 years ago by Simon Anders3.5k • written 3.4 years ago by bharata180320
1
3.4 years ago by
United Kingdom
andrew.j.skelton73310 wrote:

Using one sample as a baseline is not a good idea. Check the manual on using single samples in dispersion estimation. What you could do, if this is a SNP for a particular gene, is get the normalised counts and look at that particular gene's entry:

counts(dds,normalized=TRUE)


Hmm. I am a bit confused here. Basically, for this particular gene, the information about possible SNPs is already there. My task now is to check the SNP for each sample and then try to correlate gene expression with those SNPs. For example, for SNP type 1, the gene expression level is reduced by half compared to SNP type 2. So what you mean is that I can just compare the read counts for those comparisons? I'm still confused about whether read counts can be interpreted as gene expression level.

If you already know which samples have the SNP of interest, then in principle you could just visualise the normalised counts of that gene to see the effect of the SNP on gene expression. If you're doing SNP detection, DESeq2 is not going to do that for you; you'll need to do something like follow the GATK pipeline for SNP and indel calling.
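As a rough sketch of the comparison described above: the toy matrix below stands in for the output of counts(dds, normalized=TRUE), and the gene names, sample names and genotype labels are all made up for illustration. It only shows how the normalised counts of one gene could be grouped by SNP type; it is not a statistical test.

```r
# Toy normalized-count matrix (genes x samples). In a real analysis this would
# come from DESeq2: nc <- counts(dds, normalized = TRUE)
nc <- matrix(
  c(100, 120,  90,  50,  55,  45,
    300, 310, 290, 305, 295, 300),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("sample", 1:6))
)

# Hypothetical SNP type per sample, in the same order as the columns of nc
genotype <- factor(c("type1", "type1", "type1", "type2", "type2", "type2"))

# Normalized counts of the gene of interest (one value per sample)
gene_counts <- nc["geneA", ]

# Mean normalized count within each SNP type
group_means <- tapply(gene_counts, genotype, mean)
print(group_means)
```

For a quick look at the individual samples rather than the means, something like stripchart(gene_counts ~ genotype, vertical = TRUE) should plot one point per sample for each SNP type.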

"I'm still confused whether read counts can be interpreted as gene expression level"

Well, that's what DESeq2 uses to quantify gene expression, integer gene level counts. What else could you use as a metric for 'gene expression'?

For SNP detection, I will use another workflow, because the samples are split between RNA-seq and exome-seq. So, in conclusion, I can just use the normalized counts as the basis for comparison between samples. OK. Thank you so much for your explanation.

0
3.4 years ago by
Simon Anders3.5k
Zentrum für Molekularbiologie, Universität Heidelberg
Simon Anders3.5k wrote:

Is "calculating fold changes" really what you want?

If so, just use

nc <- counts( dds, normalized=TRUE )

to obtain a matrix of normalized counts. Now, if you write

nc[,3]/nc[,1]

then you get a vector with the fold changes of sample 3 vs sample 1, one entry per gene. Of course, DESeq2 also offers what we call "shrunken log fold changes", which are log fold changes that are biased towards zero for noisy genes. (See the vignette and paper for details.) For this, use "rlogTransformation" instead of "counts" (and use "-" instead of "/" because it is on the log scale).
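To extend this to every sample at once (1 vs 2, 1 vs 3, and so on), here is a minimal sketch in base R. The toy matrix again stands in for counts(dds, normalized=TRUE), and the pseudocount of 1 is an arbitrary assumption to avoid taking the log of zero. With the rlog-transformed matrix (for example assay(rlog(dds))) the same subtraction would apply without the log2 step, since that matrix is already on the log scale.

```r
# Toy normalized counts (genes x samples). In a real analysis:
# nc <- counts(dds, normalized = TRUE)
nc <- matrix(
  c(100, 200,  50, 400,
     10,  10,  40,   5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("sample", 1:4))
)

# Fold change of every sample relative to sample 1.
# nc[, 1] has one value per gene and is recycled down each column,
# so each gene is divided by its own baseline value.
fc <- nc / nc[, 1]

# The same on the log2 scale, with an assumed pseudocount of 1
pseudo <- 1
lfc <- log2(nc + pseudo) - log2(nc[, 1] + pseudo)

print(fc)
print(round(lfc, 2))
```

The first column of both results is the baseline compared with itself, so it is 1 on the ratio scale and 0 on the log scale. These are plain, unshrunken ratios, so genes with low counts will give noisy values.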

Somehow, though, I feel that you might need more than that for your problem.

And that the question you wanted to ask was not "How to calculate fold changes between specific samples?", but "How to perform a standard eQTL analysis on RNA-Seq data in R/Bioconductor?"

I guess one answer to that would be to use DESeq2's rlogTransformation to get a data matrix to feed to the GGtools package, but maybe somebody else here has a more up-to-date answer. (I have little hands-on experience with this.)
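To give an idea of what the core of an eQTL-style test looks like for a single gene and a single SNP, here is a minimal sketch using plain lm() from base R. This is not the GGtools interface, only a simple additive linear model swapped in for illustration. The expression values and genotype dosages are invented; in practice the expression vector might be one row of the rlog-transformed matrix, and the dosage (number of alternate alleles: 0, 1 or 2) would come from the separate SNP-calling workflow.

```r
# Hypothetical log-scale expression of one gene in 14 samples
expr <- c(8.1, 7.9, 8.3, 8.0,
          7.2, 7.0, 7.4, 7.1, 7.3,
          6.2, 6.0, 6.4, 6.1, 6.3)

# Hypothetical genotype dosage per sample (0, 1 or 2 copies of the alternate allele)
dosage <- c(0, 0, 0, 0,
            1, 1, 1, 1, 1,
            2, 2, 2, 2, 2)

# Additive model: expression changes linearly with the number of alternate alleles
fit <- lm(expr ~ dosage)

# Estimated change in log expression per extra allele, and its p-value
slope <- coef(fit)[["dosage"]]
pval <- summary(fit)$coefficients["dosage", "Pr(>|t|)"]

print(slope)
print(pval)
```

A real analysis would also need covariates (batch, sex, population structure) and multiple-testing correction across genes and SNPs, which is what dedicated eQTL packages take care of.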