Question

edgeR

Guest User ★ 13k

@guest-user-4897

Last seen 11.4 years ago

Hi there,

I have a question regarding edgeR, or it might actually be a more general statistical question. In any case, I am using edgeR to analyse my read counts and would really appreciate help.

My experimental setup is:

Two genotypes (B and S)
Two treatments ('trt' vs 'ntrt')
Two time points (0hs and 8hs)
Three biological replicates

Now, I would like to identify transcripts whose response to the treatment over the time points is specific to one of the genotypes.

I expect that I can do pairwise comparisons like 'B_trt_0hs' vs 'B_trt_8hs' and 'B_ntrt_0hs' vs 'B_ntrt_8hs', and then do the same for the S genotype. Subsequently, using a suitable tool, I could filter out the transcripts for, say, B's response to treatment over these two time points that are not found in S. This is, however, a little tedious, so my question is whether this can be modelled and extracted with edgeR's GLM framework.

regards
JD

Output of sessionInfo(): not relevant

edgeR edgeR • 2.0k views

ADD COMMENT • link updated 5.2 years ago by Gordon Smyth 53k • written 11.8 years ago by Guest User ★ 13k

score 0 · Answer 1 · 2014-04-08

Gordon Smyth 53k

@gordon-smyth

Last seen 14 hours ago

WEHI, Melbourne, Australia

If you want to select genes that are DE for one contrast but not another, first test each contrast:

 lrt1 <- glmLRT(fit, contrast=mycontrast1)
 lrt2 <- glmLRT(fit, contrast=mycontrast2)

Then apply significance thresholds:

 dt1 <- decideTestsDGE(lrt1)
 dt2 <- decideTestsDGE(lrt2)

Then select the genes you want:

 selected <- !dt1 & dt2
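
The answer does not show how `fit`, `mycontrast1` and `mycontrast2` were built. The following is a minimal sketch for the design described in the question, assuming one group per genotype/treatment/time combination and a design matrix without an intercept. The sample sheet, the group labels, the helper `make_contrast` and the particular pair of contrasts are illustrative assumptions, not taken from the thread. Only base R is used; the edgeR calls appear as comments because they need a real count table.

```r
# Hypothetical sample sheet: 2 genotypes x 2 treatments x 2 time points x 3 replicates
targets <- expand.grid(rep = 1:3,
                       time = c("0hs", "8hs"),
                       treatment = c("ntrt", "trt"),
                       genotype = c("B", "S"),
                       stringsAsFactors = FALSE)
group <- factor(paste(targets$genotype, targets$treatment, targets$time, sep = "_"))

# One coefficient per group (no intercept), so contrasts are differences of group means
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Illustrative helper (not part of edgeR): numeric contrast vector from named weights
make_contrast <- function(weights, design) {
  con <- setNames(numeric(ncol(design)), colnames(design))
  con[names(weights)] <- weights
  con
}

# Example pair of contrasts: treatment effect at 8hs within each genotype
mycontrast1 <- make_contrast(c(B_trt_8hs = 1, B_ntrt_8hs = -1), design)
mycontrast2 <- make_contrast(c(S_trt_8hs = 1, S_ntrt_8hs = -1), design)

# With edgeR loaded and a DGEList `y` holding the counts (not shown here):
#   y    <- estimateDisp(y, design)
#   fit  <- glmFit(y, design)
#   lrt1 <- glmLRT(fit, contrast = mycontrast1)
#   lrt2 <- glmLRT(fit, contrast = mycontrast2)
```

Building the contrasts as named vectors keeps them tied to the column names of `design`, which guards against silently mismatched coefficient order.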

Note that edgeR is designed to work on gene counts rather than transcript counts. It will give results with transcript counts, but the results will be noisier.

Best wishes
Gordon

ADD COMMENT • link 11.8 years ago • updated 5.2 years ago Gordon Smyth 53k

Great! Thank you. That worked well.

An additional question: within this framework, can you filter the DE transcripts by direction? For example, what if I wanted only those with higher counts in one of the two groups compared in lrt1? I have done that using the logFC (with exact tests), but I cannot see how it can be done here.

jahn

ADD REPLY • link 11.8 years ago Jahn Davik ▴ 110

Yes. If you want transcripts that are up in contrast1 and down in contrast2:

selected <- (dt1>0) & (dt2<0)

Or up in contrast1 and not changing in contrast2:

selected <- (dt1>0) & (dt2==0)
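
To make the selection logic concrete, here is a small base R sketch with made-up results. The vectors `dt1` and `dt2` are toy stand-ins for the output of `decideTestsDGE()`, which codes each gene as -1 (down), 0 (not significant) or 1 (up); the real objects are one-column matrices rather than named vectors, and the gene names and variable names here are hypothetical.

```r
# Toy stand-ins for decideTestsDGE() output: -1 = down, 0 = not significant, 1 = up
genes <- paste0("gene", 1:6)
dt1 <- setNames(c( 1, 1, 0, -1, 0, 1), genes)
dt2 <- setNames(c(-1, 0, 1, -1, 0, 1), genes)

# Up in contrast 1 and down in contrast 2
up1_down2 <- (dt1 > 0) & (dt2 < 0)

# Up in contrast 1 and not changing in contrast 2
up1_flat2 <- (dt1 > 0) & (dt2 == 0)

# Not significant in contrast 1 but significant (either direction) in contrast 2;
# this is the explicit form of the earlier `!dt1 & dt2`
flat1_any2 <- (dt1 == 0) & (dt2 != 0)

names(which(up1_down2))   # "gene1"
names(which(up1_flat2))   # "gene2"
names(which(flat1_any2))  # "gene3"
```

Writing the comparisons out explicitly (`== 0`, `!= 0`) avoids relying on how R coerces -1, 0 and 1 to logical values.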

Gordon

ADD REPLY • link 11.8 years ago • updated 5.2 years ago Gordon Smyth 53k

Smashing! Thanks.

jahn

ADD REPLY • link 11.8 years ago Jahn Davik ▴ 110

I'm surprised that the answer given here was to check whether each of the two fold changes is itself significant, rather than what I tend to think of as a "contrast of contrasts": testing whether the two contrasts' fold changes differ significantly from one another between groups (i.e. testing for an interaction term). A transcript that fails to pass the threshold for dt1 may be non-significant at that particular threshold, yet still be indistinguishable from the dt2 response that does happen to pass the threshold. The absence of significance is not evidence for non-response. The OP seemed to be asking for a statistical test to look for differences in response between the two groups.

Another problem with !dt1 & dt2 is that dt1 and dt2 may both be "true" but of opposite signs (though I see this addressed in a later email).

So in a nutshell: is there a statistical reason to prefer independent FDR threshold tests over direct testing for an interaction term?

Thanks,
-Aaron

ADD REPLY • link 11.8 years ago Aaron Mackey ▴ 200

Aaron,

One reason for your surprise may be that you have read a lot of things into my reply that weren't there.

I simply showed how to use the software to collate results from multiple contrasts. The contrasts could be anything, including interactions.
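
As an illustration of a contrast that is itself an interaction, the sketch below builds a "difference of differences" by hand. It assumes a group-means design with one coefficient per genotype/treatment/time combination; the column names, the toy log2 group means in `mu`, and the choice of which responses to compare are assumptions made for this example only.

```r
# Hypothetical coefficient names, one per genotype_treatment_time group
groups <- c("B_ntrt_0hs", "B_ntrt_8hs", "B_trt_0hs", "B_trt_8hs",
            "S_ntrt_0hs", "S_ntrt_8hs", "S_trt_0hs", "S_trt_8hs")
zero <- setNames(numeric(length(groups)), groups)

# Change over time under treatment, minus the change over time without treatment
resp_B <- zero
resp_B[c("B_trt_8hs", "B_trt_0hs", "B_ntrt_8hs", "B_ntrt_0hs")] <- c(1, -1, -1, 1)
resp_S <- zero
resp_S[c("S_trt_8hs", "S_trt_0hs", "S_ntrt_8hs", "S_ntrt_0hs")] <- c(1, -1, -1, 1)

# Interaction contrast: does that response differ between genotypes?
B_vs_S <- resp_B - resp_S

# Toy log2 group means for one gene: induced by treatment at 8hs, more strongly in B
mu <- zero
mu["B_trt_8hs"] <- 2
mu["S_trt_8hs"] <- 0.5

sum(resp_B * mu)  # response in B: 2
sum(resp_S * mu)  # response in S: 0.5
sum(B_vs_S * mu)  # difference in response between genotypes: 1.5

# With an edgeR fit whose design columns are in this order (not shown here):
#   lrt_int <- glmLRT(fit, contrast = B_vs_S)
```

A test on `B_vs_S` asks directly whether the two genotypes respond differently, whereas intersecting two thresholded lists asks whether each response separately crosses a significance cutoff.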

Gordon

ADD REPLY • link 11.8 years ago • updated 5.2 years ago Gordon Smyth 53k

My apologies. I did not mean to sound confrontational, though I realize I probably did. Email is bad at that.

I guess I was fishing for an expert's opinion on whether one of the two approaches is preferable (i.e. is one approach thought to be more powerful, to control false positives better, to be more meaningful, to be less prone to statistical artifact, to rely on fewer assumptions that could be broken, etc.).

Gordon, thank you (again) for all of your time and energy spent answering our questions!

-Aaron

ADD REPLY • link 11.8 years ago Aaron Mackey ▴ 200