Question

Array data vs. Next Gen with log 2 Fold Change


john herbert ▴ 560

@john-herbert-4612

Last seen 10.6 years ago

I have microarray data from two-colour Agilent human arrays with 3 technical replicates. The case is labelled with the green dye and the control with the red dye. I have analysed the data in limma, normalising within arrays and between arrays using Aquantile normalisation.

I also have some next-generation RNA-Seq data that has been mapped to the RefSeq transcriptome, and I have the raw counts. However, there are no replicates; only one case and one control.

I want to plot how the log2 fold change is correlated between the two data sets, as they are looking at similar samples.

The microarray data is easy, as limma reports log2 fold changes, but the RNA-Seq data does not come with them.

What would be the best package or approach for generating log2 fold changes from the RNA-Seq counts?

I am thinking they should be quantile normalised, as the microarray data is?

RNASeq Microarray limma • 1.7k views

Updated 13.7 years ago by Gordon Smyth 52k • written 13.7 years ago by john herbert ▴ 560

score 0 · Answer 1 · 2011-07-11


score 0 · Answer 2 · 2011-07-14


Gordon Smyth 52k

@gordon-smyth

Last seen 14 hours ago

WEHI, Melbourne, Australia

Dear John,

The limma equivalent for RNA-Seq is the edgeR package. However, with no replicates, this won't do you much good. If you only want log2-fold changes from your RNA-Seq data, this is easy, although you have to decide what you'll do with zero counts. I suggest reading your counts into variables y1 and y2, then:

  lib.size1 <- sum(y1)
  lib.size2 <- sum(y2)
  logFC <- log2((y1+0.5)/(lib.size1+0.5)/(y2+0.5)*(lib.size2+0.5))

Best wishes
Gordon
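As a minimal sketch of the suggestion above, the three lines can be wrapped in a small function and tried on toy data. The function name `log2_fold_change`, its `offset` argument, and the four gene counts are hypothetical and only for illustration; the arithmetic is the same library-size scaling with a 0.5 offset given in the reply.

```r
# Hypothetical toy counts for one case library (y1) and one control library (y2).
# Real data would hold one count per RefSeq transcript, in the same order in both.
y1 <- c(geneA = 0, geneB = 10, geneC = 200, geneD = 790)
y2 <- c(geneA = 5, geneB = 10, geneC = 100, geneD = 885)

# Library-size-scaled log2 fold change (case over control).
# The offset keeps zero counts from producing log2(0) or a division by zero.
log2_fold_change <- function(case, control, offset = 0.5) {
  stopifnot(length(case) == length(control))
  lib.case <- sum(case)
  lib.control <- sum(control)
  prop.case <- (case + offset) / (lib.case + offset)
  prop.control <- (control + offset) / (lib.control + offset)
  log2(prop.case / prop.control)
}

logFC <- log2_fold_change(y1, y2)
print(round(logFC, 3))
```

The zero count for geneA still yields a finite value because of the offset. Choosing a different offset changes the fold changes of low-count genes most, so that choice is the "what to do with zero counts" decision mentioned in the reply.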

Posted 13.7 years ago by Gordon Smyth 52k


Dear Gordon,

Thank you for your explanation. From my simplistic point of view, I wonder why the two sides of the division are treated differently. To separate this out:

  logFC <- log2( (y1+0.5) / (lib.size1+0.5) / (y2+0.5)*(lib.size2+0.5) )

On the left side you divide y1 by lib.size1, but on the right side you multiply y2 by lib.size2. To me, the two sides of the division do not look equivalent. Is that definitely right?

I am also assuming that quantile normalisation is not needed? Please explain.

Thanks,
John
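One way to read the expression queried above, assuming R's usual rule that `/` and `*` have equal precedence and are evaluated left to right. Here L_1 and L_2 are shorthand introduced for lib.size1 and lib.size2:

```latex
\[
\log_2\!\left( \frac{y_1+0.5}{L_1+0.5} \cdot \frac{1}{y_2+0.5} \cdot (L_2+0.5) \right)
= \log_2\!\left( \frac{(y_1+0.5)/(L_1+0.5)}{(y_2+0.5)/(L_2+0.5)} \right)
\]
```

Under that reading, dividing by (y2+0.5) and then multiplying by (lib.size2+0.5) is the same as dividing by the control proportion (y2+0.5)/(lib.size2+0.5), so both samples are scaled by their own library size in the same way.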

Reply posted 13.7 years ago by john herbert ▴ 560


Dear Gordon,

After experimenting in Excel, I think it was a typo. But I am still interested in why no quantile normalisation is needed. Is that because there are no replicates?

Thank you.

Kind regards,
John

Reply posted 13.7 years ago by john herbert ▴ 560