Question

visualise model fit in edgeR


Iain Gallagher ▴ 930

@iain-gallagher-2532


United Kingdom

Dear List

I have been using the glmFit method in edgeR to analyse some RNA-Seq data. I will soon be presenting this data to a more statistically naive audience (and I'm no expert myself), and I was hoping to prepare a figure demonstrating how this particular edgeR analysis approach works.

Basically, what I'd like to do is plot the count data for one (or perhaps a few) of my genes and then draw a couple of lines showing the fit of the null and alternative models used in the glmLRT method of edgeR to assess gene regulation between conditions. I was hoping that this would allow me to illustrate the concept of testing for the likelihood of model fit and hence gene regulation between conditions.

If anyone could help I'd be grateful.

Best

Iain

edgeR edgeR • 855 views

ADD COMMENT • link updated 12.5 years ago by Gordon Smyth 50k • written 12.5 years ago by Iain Gallagher ▴ 930

score 0 · Answer 1 · 2011-10-28


Gordon Smyth 50k

@gordon-smyth


WEHI, Melbourne, Australia

Hi Iain,

You're asking a hard question, as drawing nice tutorial pictures for any statistical method can be lots of work, and the context here is harder than most. I think I'd find it hard to think of a good picture like the one you describe, even if I were just doing an ordinary multiple regression using lm() with univariate normal data.

What covariate or factor are you testing for? Can you describe the picture you would draw if this were just an ordinary multiple regression problem?

Best wishes

Gordon

------------- original message ----------------
[BioC] visualise model fit in edgeR
Iain Gallagher iaingallagher at btopenworld.com
Tue Oct 25 16:06:11 CEST 2011

ADD COMMENT • link 12.5 years ago Gordon Smyth 50k
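
For readers following the thread, the ordinary-regression version of the picture asked about above might look roughly like the sketch below. It is only an illustration under stated assumptions: the data are simulated (not from the experiment), and the names `animal`, `infection`, `fit0`, `fit1` and `cmp` are made up for this example. A null model with animal only and an alternative model with animal plus infection are fitted with lm(), compared with anova(), and both sets of fitted values are drawn over the data.

```r
# Simulated (not real) log-expression values for one gene:
# 6 animals, each measured once as control (C) and once as infected (I).
set.seed(1)
animal    <- factor(rep(paste0("cow", 1:6), each = 2))
infection <- factor(rep(c("C", "I"), times = 6))
animal_effect <- rep(rnorm(6, mean = 5, sd = 1), each = 2)
y <- animal_effect + ifelse(infection == "I", 1.5, 0) + rnorm(12, sd = 0.3)

# Null model: animal only. Alternative model: animal + infection.
fit0 <- lm(y ~ animal)
fit1 <- lm(y ~ animal + infection)

# F-test comparing the two nested models.
cmp <- anova(fit0, fit1)
print(cmp)

# Data points with the fitted values of each model, one line per animal.
x <- as.integer(infection)  # 1 = C, 2 = I
plot(x, y, xaxt = "n", xlim = c(0.5, 2.5), xlab = "",
     ylab = "log-expression (simulated)", pch = 16)
axis(1, at = 1:2, labels = c("C", "I"))
for (a in levels(animal)) {
  idx <- animal == a
  lines(x[idx], fitted(fit1)[idx], col = "black")  # alternative model
  lines(x[idx], fitted(fit0)[idx], col = "red")    # null model
}
```

Under the null model each animal's line is flat (the same fitted value for C and I), whereas the alternative model shifts every animal by the same infection effect, so the gap between the red and black lines is what the test is assessing.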


Dear Gordon

Thanks for your reply. There's nothing like someone else's question to make one focus on what exactly one wants. This was certainly the case here!

I have given this some thought from my statistically naive point of view and I have attached a mock-up picture of the kind of thing I envisaged (although I appreciate the real-life situation is more complicated).

The experimental design is as follows: cells were collected from 6 animals and infected with one of 4 strains of bacteria or left uninfected. RNA was sampled at 2, 6, 24 and 48 hours post infection. There are thus 120 data points across the whole experiment.

I have used edgeR to analyse the infected v control data at each timepoint using the GLM approach, effectively a paired-samples analysis for each timepoint, as per the edgeR manual (section 11). Perhaps there's something more sophisticated I could do here though. If you had any advice that would be great!

    library(edgeR)

    # d is the DGEList of counts; cow and infection are factors, one entry per sample
    design <- model.matrix(~ cow + infection)

    # dispersion estimate
    d <- estimateGLMCommonDisp(d, design)

    # fit the NB GLM for each gene
    fitFiltered <- glmFit(d, design, dispersion = d$common.dispersion)

    # carry out the likelihood ratio test
    # (coef = 7 is the infection term: intercept + 5 cow terms + infection)
    lrtFiltered <- glmLRT(d, fitFiltered, coef = 7)

For my audience I simply wanted to illustrate the fitting of the two models and how likelihood ratio tests are used rather than a t-test approach.

In the attached pdf each black line represents the H1 model (with infection) and each red line represents the null model (cows only) for one gene only. The points are the 'raw data' (but not real data); C = control, I = infected.

I realise this illustration is showing essentially a linear fit but I'm trying to aim for simplicity for the audience (a conceptual rather than entirely accurate approach). I would be happy to get my hands dirty coding something more lifelike as I think that would aid my understanding as well.

I was going to describe this in terms of the 'fit' of each line to the data, i.e. for the regulated gene the black line is the more 'likely' model, whereas for the non-regulated gene there is little to separate the models.

Hope this is somewhat useful.

Best

Iain

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mockModel.pdf
Type: application/pdf
Size: 64641 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20111031/5c2543b2/attachment.pdf>

ADD REPLY • link 12.5 years ago Iain Gallagher ▴ 930
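
A minimal sketch of the kind of 'more lifelike' figure described in the post above, for a single gene at a single timepoint. This is not edgeR's implementation, and several things are assumptions made purely for illustration: the counts are simulated, the dispersion value of 0.1 is invented, library sizes are treated as equal (edgeR would include a library-size offset), and the models are fitted with glm() using the negative.binomial() family from the MASS package rather than glmFit(). The names `dispersion`, `nb_loglik`, `lr_stat` and `p_value` are hypothetical. The idea is the same as in the mock-up: fit the null model (cow only) and the alternative model (cow + infection), compare their log-likelihoods, and draw both fits over the counts.

```r
library(MASS)  # provides the negative.binomial() GLM family

# Simulated (not real) counts for one gene: 6 cows, control (C) and infected (I).
set.seed(42)
cow        <- factor(rep(paste0("cow", 1:6), each = 2))
infection  <- factor(rep(c("C", "I"), times = 6))
dispersion <- 0.1                                   # assumed common dispersion
base_mu    <- rep(c(40, 60, 80, 100, 120, 150), each = 2)
mu         <- base_mu * ifelse(infection == "I", 3, 1)  # 3-fold up when infected
counts     <- rnbinom(12, size = 1 / dispersion, mu = mu)

# NB GLMs with the dispersion held fixed (theta = 1 / dispersion).
fam  <- negative.binomial(theta = 1 / dispersion)
fit0 <- glm(counts ~ cow, family = fam)              # null: cow only
fit1 <- glm(counts ~ cow + infection, family = fam)  # alternative: cow + infection

# NB log-likelihood of the observed counts given a model's fitted means.
nb_loglik <- function(fit) {
  sum(dnbinom(counts, size = 1 / dispersion, mu = fitted(fit), log = TRUE))
}

# Likelihood ratio statistic, compared with chi-square on 1 df
# (the models differ by one coefficient, the infection term).
lr_stat <- 2 * (nb_loglik(fit1) - nb_loglik(fit0))
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
cat("LR statistic:", lr_stat, " p-value:", p_value, "\n")

# Counts on a log scale with the fitted means of each model, one line per cow.
x <- as.integer(infection)  # 1 = C, 2 = I
plot(x, counts, log = "y", xaxt = "n", xlim = c(0.5, 2.5), xlab = "",
     ylab = "count (log scale, simulated)", pch = 16)
axis(1, at = 1:2, labels = c("C", "I"))
for (a in levels(cow)) {
  idx <- cow == a
  lines(x[idx], fitted(fit1)[idx], col = "black")  # alternative model
  lines(x[idx], fitted(fit0)[idx], col = "red")    # null model
}
legend("topleft", bty = "n", lty = 1, col = c("black", "red"),
       legend = c("alternative: cow + infection", "null: cow only"))
```

With 6 cows and 2 infection levels the alternative design has 7 coefficients, and the infection term is the seventh, which corresponds to coef = 7 in the glmLRT call quoted in the post. Setting the simulated fold change to 1 instead of 3 gives the 'non-regulated gene' panel, where the red and black lines nearly coincide.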