Question: LIMMA on RT-PCR data
0
5.9 years ago by
India

Dear list,

I have RT-PCR data on 4 genes and 85 samples. Can I use 'limma' on this small set of genes?

I want to use limma rather than usual paired t-test because I have missing values and I don't want to miss the information available on paired samples..

Thanks,
Sandhya

limma pcr • 1.5k views
modified 16 months ago by Gordon Smyth38k • written 5.9 years ago by Sandhya Pemmasani Kiran40
0
5.9 years ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson7.4k wrote:
Hi, I don't think you want limma. limma is for sharing information about variance across many genes to make up for few samples per gene, and you have the opposite scenario: only have 4 genes but many samples. I think you just want the base R "lm" function. -Ryan On Tue Nov 12 03:16:25 2013, Sandhya Pemmasani Kiran wrote: > > Dear list, > > I have RT-PCR data on 4 genes and 85 samples. > Can I use 'limma' on this small set of genes. > > I want to use limma rather than usual paired t-test because I have missing values and I don't want to miss the information available on paired samples.. > > Please advise. > > > Thanks, > Sandhya > > ________________________________ > This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions that are unlawful. This e-mail may contain viruses. Ocimum Biosolutions has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. > > > The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system. > > OCIMUMBIO SOLUTIONS (P) LTD > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
0
5.9 years ago by
Juan Pedro Steibel130 wrote:
Hello Sandhya This can be dealt with using a linear mixed effects model that can easily accommodate missing values and the pairing structure of control and test genes. When I was doing my PhD I came up with this model to help a collaborator with analysis of unbalanced qPCR data. Unfortunately no bioC tool available. But you can use our program or write your own in R using lmer or any other mixed model engine (regress, etc). Paper: http://www.ncbi.nlm.nih.gov/pubmed/19422910 Tool and other materials: https://www.msu.edu/~steibelj/JP_files/QPCR.html Cheers Juan P. On Nov 12, 2013, at 6:16 AM, Sandhya Pemmasani Kiran <sandhya.p at="" ocimumbio.com=""> wrote: > > Dear list, > > I have RT-PCR data on 4 genes and 85 samples. > Can I use 'limma' on this small set of genes. > > I want to use limma rather than usual paired t-test because I have missing values and I don't want to miss the information available on paired samples.. > > Please advise. > > > Thanks, > Sandhya > > ________________________________ > This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions that are unlawful. This e-mail may contain viruses. Ocimum Biosolutions has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. > > > The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system. > > OCIMUMBIO SOLUTIONS (P) LTD > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
0
5.9 years ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:

Hi Sandhya,

Yes, limma should work fine on this data, although you are on the lower boundary in terms of number of genes.  Theoretically, the minimum number of genes for the empirical Bayes procedure to be beneficial is three. Four genes is probably the minimum from a practical point of view.

You may already know how to use duplicateCorrelation.  If you have a simple before vs after paired comparison with some unmatched samples, then you could proceed like this:

design <- model.matrix(~treatment)
dupcor <- duplicateCorrelation(y, design, block=patient)
fit <- lmFit(y, design, block=patient, correlation=dupcor$consensus) fit <- eBayes(fit) topTable(fit,coef=2) Be sure to check that dupcor$consensus is greater than zero.

We used this strategy to compare tumour vs normal tissue in the presence of unmatched samples in

although that was microarray data rather than PCR.

Best wishes
Gordon

Wow, I wasn't aware that limma could be beneficial for such a small number of genes. That's good to know.

ADD REPLYlink modified 16 months ago by Gordon Smyth38k • written 5.9 years ago by Ryan C. Thompson7.4k

Hi Ryan,

Here's a simulated example, just to illustrate:

Simulate 50 patients with before and after treatment:

> library(limma)
> y <- matrix(rnorm(4*100),4,100)
> patient <- gl(50,2)
> treatment <- gl(2,1,100)

Drop 15 observations as missing:

> keep <- sample(1:100,85)
> y <- y[,keep]
> patient <- factor(patient[keep])
> treatment <- treatment[keep]
> design <- model.matrix(~treatment)
> dupcor <- duplicateCorrelation(y2,design,block=patient)
> dupcor$consensus [1] 0.0166576 Theoretically the correlation should be zero. > fit <- lmFit(y,design,block=patient,correlation=dupcor$consensus)
> fit <- eBayes(fit)
> fit$df.total [1] 115.5603 115.5603 115.5603 115.5603 > fit$df.prior
[1] 32.56032
> fit$df.residual [1] 83 83 83 83 The perfect solution here is to pool the variances which would give df.total = 4*83 = 332. The estimated value is slighty more conservative. limma takes care never to use df.total greater than the pooled value. > topTable(fit,coef=2) logFC AveExpr t P.Value adj.P.Val B 4 0.47806544 0.07233048 1.8941941 0.06069901 0.2427960 -3.954093 1 -0.12812909 0.04446857 -0.5713735 0.56885611 0.6946831 -5.112549 3 -0.09334881 0.08623914 -0.4705573 0.63884385 0.6946831 -5.150553 2 0.09463560 -0.00908128 0.3934894 0.69468313 0.6946831 -5.174667 No DE, which is correct. Cheers Gordon ADD REPLYlink modified 16 months ago • written 5.9 years ago by Gordon Smyth38k Hi Gordon and Ryan, Thank you so much for all your help. Relieved after knowing that I can use limma on this gene set, because for microarray I used limma. This is validation experiment with RT PCR. So I wanted to maintain uniformity (if that makes sense) My experimental design is spread across 85 samples like this- 4 batches- A, B, C and D In each Batch, 3 treatments - T1, T2, T3 Same patients were taken for treatments of a batch. I don't have comparisons across the batches, but I have comparisons within batches SampleName Batch Trt Patient S1 A 1 1 S2 A 2 1 S3 A 3 1 S4 A 1 2 S5 A 2 2 S6 A 3 2 .................... ..................... S7 D 1 30 S8 D 2 30 S9 D 3 30 Observation are missing on some samples. TD = factor(paste(Batch, Trt, sep="_")) Patient <- factor(Patient) design <- model.matrix(~0+TD+Patient) fit <- lmFit(dCT, design) cont.matrix <- makeContrasts(TDA_2 - TDA_1, levels = design) fit1 = contrasts.fit(fit, cont.matrix) fit2 = eBayes(fit1) Is this correct? I hope I don't need to use duplicateCorrelation as I don't want to compare Batch A with Batch B. Please advise. Thanks, Sandhya -----Original Message----- From: Gordon K Smyth [mailto:smyth@wehi.EDU.AU] Sent: Wednesday, November 13, 2013 5:16 AM To: Ryan Cc: Bioconductor mailing list; Sandhya Pemmasani Kiran Subject: Re: [BioC] LIMMA on RT-PCR data Hi Ryan, Here's a simulated example, just to illustrate: Simulate 50 patients with before and after treatment: > library(limma) > y <- matrix(rnorm(4*100),4,100) > patient <- gl(50,2) > treatment <- gl(2,1,100) Drop 15 observations as missing: > keep <- sample(1:100,85) > y <- y[,keep] > patient <- factor(patient[keep]) > treatment <- treatment[keep] > design <- model.matrix(~treatment) > dupcor <- duplicateCorrelation(y2,design,block=patient) > dupcor$consensus [1] 0.0166576 Theoretically the correlation should be zero. > fit <- lmFit(y,design,block=patient,correlation=dupcor$consensus) > fit <- eBayes(fit) > fit$df.total [1] 115.5603 115.5603 115.5603 115.5603 > fit$df.prior [1] 32.56032 > fit$df.residual [1] 83 83 83 83 The perfect solution here is to pool the variances which would give df.total = 4*83 = 332. The estimated value is slighty more conservative. limma takes care never to use df.total greater than the pooled value. > topTable(fit,coef=2) logFC AveExpr t P.Value adj.P.Val B 4 0.47806544 0.07233048 1.8941941 0.06069901 0.2427960 -3.954093 1 -0.12812909 0.04446857 -0.5713735 0.56885611 0.6946831 -5.112549 3 -0.09334881 0.08623914 -0.4705573 0.63884385 0.6946831 -5.150553 2 0.09463560 -0.00908128 0.3934894 0.69468313 0.6946831 -5.174667 No DE, which is correct. Cheers Gordon On Tue, 12 Nov 2013, Ryan wrote: > Wow, I wasn't aware that limma could be beneficial for such a small > number of genes. That's good to know. > > On Tue Nov 12 14:50:21 2013, Gordon K Smyth wrote: >> Hi Sandhya, >> >> Yes, limma should work fine on this data, although you are on the >> lower boundary in terms of number of genes. Theoretically, the >> minimum number of genes for the empirical Bayes procedure to be >> beneficial is three. Four genes is probably the minimum from a >> practical point of view. >> >> You may already know how to use duplicateCorrelation. If you have a >> simple before vs after paired comparison with some unmatched samples, >> then you could proceed like this: >> >> design <- model.matrix(~treatment) >> dupcor <- duplicateCorrelation(y, design, block=patient) fit <- >> lmFit(y, design, block=patient, correlation=dupcor$consensus) fit <- >> eBayes(fit) >> topTable(fit,coef=2) >> >> Be sure to check that dupcor$consensus is greater than zero. >> >> We used this strategy to compare tumour vs normal tissue in the >> presence of unmatched samples in >> >> http://www.ncbi.nlm.nih.gov/pubmed/17236199 >> >> although that was microarray data rather than PCR. >> >> Best wishes >> Gordon >> >> >> ---------- original message ---------- BioC] LIMMA on RT-PCR data >> Sandhya Pemmasani Kiran sandhya.p at ocimumbio.com Tue Nov 12 >> 12:16:25 CET 2013 >> >> Dear list, >> >> I have RT-PCR data on 4 genes and 85 samples. >> Can I use 'limma' on this small set of genes. >> >> I want to use limma rather than usual paired t-test because I have >> missing values and I don't want to miss the information available on >> paired samples.. >> >> Please advise. >> >> >> Thanks, >> Sandhya ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:25}}