Question

Calculate DESeq2 fold change of 1 gene relative to 1 of the samples

0

bharata1803 ▴ 60

@bharata1803-7698

Last seen 5.1 years ago

Japan

Hello,

I am doing an RNA-seq data analysis. I already have several specific genes as my targets. What I actually want to do is compare gene expression levels with SNP mutations. So, for now, I want to get the gene expression levels from this RNA-seq data. Because my analysis focuses on SNPs and gene expression, I don't have any interest in between-group differential expression analysis. I have already followed the DESeq2 tutorial, but I only get the fold change between 2 groups, normal and disease. So, the question is: I want to compute fold changes relative to one of the samples as the base. I imagine it like this: I have 14 samples, and I will have fold changes for the difference between samples 1 and 2, 1 and 3, and so on, up to 1 and 14. Is this possible in DESeq2, and could you share your ideas on how to do it? Thank you very much for your help.

deseq2 • 3.3k views

ADD COMMENT • link updated 9.0 years ago by Simon Anders ★ 3.7k • written 9.0 years ago by bharata1803 ▴ 60

0

Simon Anders ★ 3.7k

@simon-anders-3855

Last seen 3.7 years ago

Zentrum für Molekularbiologie, Universi…

Is "calculating fold changes" really what you want?

If so, just use

nc <- counts( dds, normalized=TRUE )

to obtain a matrix of normalized counts. Now, if you write

nc[,3]/nc[,1]

then you get a vector with fold changes of sample 3 vs sample 1 (for every gene). Of course, DESeq also offers what we call "shrunken log fold changes", which are log fold changes that are biased towards zero for noisy genes. (See vignette and paper for details.) For this, use "rlogTransformation" instead of "counts", extract the matrix from the result with "assay", and use "-" instead of "/" because it is on the log scale.
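To get all 14 comparisons against sample 1 at once, something like the following minimal sketch might work. It uses simulated data from makeExampleDESeqDataSet as a stand-in for the real dds object, and the pseudocount of 1 is my own assumption to avoid dividing by zero for genes with no reads in sample 1. It uses rlog, which is the current name of rlogTransformation.

```r
library(DESeq2)

# Simulated stand-in for the real `dds` (14 samples, as in the question)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 14)
dds <- estimateSizeFactors(dds)

# Normalized counts: genes in rows, samples in columns
nc <- counts(dds, normalized = TRUE)

# log2 fold change of every sample vs sample 1.
# The vector nc[, 1] is recycled down each column, so each column is
# divided gene by gene. The pseudocount (an assumption) avoids x/0 and 0/0.
pc <- 1
log2fc <- log2((nc + pc) / (nc[, 1] + pc))

# Shrunken alternative: rlog values are already on the log2 scale,
# so the comparison is a difference, not a ratio.
rl <- assay(rlog(dds, blind = TRUE))
rlog_diff <- rl - rl[, 1]

head(log2fc[, 1:4])
```

Column 1 of both matrices is zero by construction, since it is sample 1 compared with itself.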

Somehow, though, I feel that you might need more than that for your problem.

And the question you wanted to ask was not "how to calculate fold changes between specific samples?", but "How to perform a standard eQTL analysis on RNA-Seq data in R/Bioconductor?"

I guess one answer to that would be to use DESeq2's rlogTransformation to get a data matrix to feed to the GGtools package, but maybe somebody else here has a more up-to-date answer. (I have little hands-on experience with this.)

ADD COMMENT • link 9.0 years ago Simon Anders ★ 3.7k

0

Yeah, I also feel the analysis is more than that. My professor told me that he wants to know the difference in expression level, and he gave me the example of a paper in which one SNP reduced the expression level by 30%. So I think the fold change is probably what we need. I have also done some reading about eQTL, but I am still trying to understand it. I also tried the rlog transformation, but I only ran the function and made some graphs. The results seem similar from a qualitative point of view, but I still don't know how to get the exact percentage. What confuses me a bit when reading manuals and tutorials is that most of them try to find which genes are differentially expressed, whereas my problem is to find the size of the difference for several specific genes. I will try the GGtools package, as you suggested, to take my analysis further. Thank you for your suggestion. I'm a newbie in this area, and every suggestion is really valuable to me.

ADD REPLY • link 9.0 years ago bharata1803 ▴ 60

score 1 · Accepted Answer · 2015-05-11

1

andrew.j.skelton73 ▴ 370

@andrewjskelton73-7074

Last seen 5 weeks ago

United Kingdom

Using one sample as a baseline is not a good idea. Check the manual on using single samples in dispersion estimation. What you could do, if this is a SNP for a particular gene, is get the normalised counts and look at the entry for that particular gene:

counts(dds,normalized=TRUE)
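As a rough sketch of looking up particular gene entries: the example below again uses simulated data in place of the real dds, and the gene IDs in target_genes are hypothetical placeholders for whatever IDs your count matrix uses as row names.

```r
library(DESeq2)

# Simulated stand-in for the real `dds`; rows are named gene1, gene2, ...
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 14)
dds <- estimateSizeFactors(dds)

# Hypothetical target genes; replace with your own row names
target_genes <- c("gene5", "gene42")

nc <- counts(dds, normalized = TRUE)

# drop = FALSE keeps a matrix even if only one gene is requested
target_nc <- nc[target_genes, , drop = FALSE]

# One row per gene, one column per sample
round(target_nc, 1)
```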

ADD COMMENT • link 9.0 years ago andrew.j.skelton73 ▴ 370

0

Hmm. I am a bit confused here. Basically, for this particular gene, the information about possible SNPs is already there. My task now is to check the SNP in each sample and then try to find a correlation between gene expression and those SNPs. For example, for SNP type 1, the gene expression level is reduced by half compared to SNP type 2. So, do you mean that I can just compare the read counts for those comparisons? I'm still confused about whether read counts can be interpreted as gene expression level.

ADD REPLY • link 9.0 years ago bharata1803 ▴ 60

0

If you already know which samples have the SNP of interest, then theoretically you could just visualise the normalised counts of that gene, if you want to see the effect of the SNP on gene expression. If you're doing SNP detection, then DESeq2 is not going to do that for you; you'll need to follow something like the GATK pipeline for SNP and indel calling.
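To put a number on it (e.g. "expression reduced by 30%"), one rough option is to compare the mean normalised count between genotype groups. This is only a sketch: the data are simulated, the genotype labels are made up (in practice they would come from your SNP calls), and the gene is picked just so the example has non-trivial counts. It is a descriptive summary, not a statistical test.

```r
library(DESeq2)

# Simulated stand-in for the real `dds` (14 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 14)
dds <- estimateSizeFactors(dds)
nc <- counts(dds, normalized = TRUE)

# Hypothetical genotype labels, one per sample, in column order
genotype <- factor(rep(c("ref", "alt"), each = 7), levels = c("ref", "alt"))

# For illustration only: take the most highly expressed gene
gene <- names(which.max(rowMeans(nc)))
expr <- nc[gene, ]

# Mean normalised count per genotype group
group_means <- tapply(expr, genotype, mean)

# Percent change of "alt" relative to "ref"; -30 would mean a 30% reduction
percent_change <- unname(100 * (group_means["alt"] / group_means["ref"] - 1))

group_means
percent_change
```

With real data, a boxplot of expr by genotype is a quick way to see whether the group difference is large compared with the sample-to-sample spread.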

"I'm still confused whether read counts can be interpreted as gene expression level"

Well, that's what DESeq2 uses to quantify gene expression, integer gene level counts. What else could you use as a metric for 'gene expression'?

ADD REPLY • link 9.0 years ago andrew.j.skelton73 ▴ 370

0

For SNP detection, I will use another workflow, because the samples are split between RNA-seq and exome-seq. So, in conclusion, I can just use the normalized counts as the basis for comparison between samples. OK, thank you so much for your explanation.

ADD REPLY • link 9.0 years ago bharata1803 ▴ 60