How is fold change reported in the samr output?


brian snail ▴ 20

@brian-snail-4257


Hi.

Below is an outline of the code I am using to analyse two colour array data.

Normal = Cy3
Case = Cy5
There are 3 replicates.

Can the fold change be reported back in the siggenes.table? From the documentation, it is not obvious how this is done.

Thank you for any help.

# Code for Two colour data normalised with maNorm. ########

data.normalised <- (two colour array matrix of data);

# Make column names 1, 2 and 3
colnames(data.normalised) <- seq(1:3);

# Make the row names gene names
rownames(data.normalised) <- gene_names;

library(impute);
# Replace nas with numbers
imputed<-impute.knn(data.normalised,k=3);

# Take the raw data from impute object
x<-imputed$data;

# One class, so a list of 1's
y<-c(rep(1,3));

# Prepare the samr data
data.sam <- list(x=x, y=y, genenames=rownames(data.normalised), geneid=gene_names,logged2=T);

library(samr);

# Do the one class SAMR analysis
samr.obj <- samr(data.sam,resp.type="One class",nperms=100);

# Set up a delta value
delta=0.4

# Compute the delta table
delta.table <- samr.compute.delta.table(samr.obj)

# Collect the significant genes into a table
siggenes.table<-samr.compute.siggenes.table(samr.obj,delta, data.sam, delta.table);

siggenes impute • 2.6k views

updated 15.3 years ago by James W. MacDonald 68k • written 15.3 years ago by brian snail ▴ 20


James W. MacDonald 68k

@james-w-macdonald-5106


United States

Hi Brian,

On 9/14/2010 4:30 AM, brian snail wrote:
> Can the fold change be reported back in the siggenes.table? From the
> documentation, it is not obvious how this is done.

I'm not sure what you are asking here. If I run the example for samr.compute.siggenes.table, and then look at the colnames for the two relevant list members I get:

> colnames(siggenes.table$genes.lo)
[1] "Row"               "Gene ID"           "Gene Name"
[4] "Score(d)"          "Numerator(r)"      "Denominator(s+s0)"
[7] "Fold Change"       "q-value(%)"
> colnames(siggenes.table$genes.up)
[1] "Row"               "Gene ID"           "Gene Name"
[4] "Score(d)"          "Numerator(r)"      "Denominator(s+s0)"
[7] "Fold Change"       "q-value(%)"

But maybe I misunderstand your question?

> # Make column names 1, 2 and 3
> colnames(data.normalised)<- seq(1:3);

Ugh. This isn't Perl, or god forbid, SAS. Although the R interpreter is happy to strip off these trailing semicolons, they are not necessary, and IMO really ugly.

Best,

Jim

written 15.3 years ago by James W. MacDonald 68k


Hi James,

thanks for the answer, :-) Perl is not so bad

With my data, I get this....

colnames(siggenes.table$genes.lo)
[1] "Row"               "Gene ID"           "Gene Name"         "Score(d)"
[5] "Numerator(r)"      "Denominator(s+s0)" "q-value(%)"

Did I run SAMR in the wrong manner?

written 15.3 years ago by Johnny H ▴ 80


Hmm. Did Brian Snail morph into Johnny H?

On 9/14/2010 11:01 AM, Johnny H wrote:
> Hi James
> thanks for the answer, :-) Perl is not so bad

No, Perl is fine. I wasn't bashing. Just noting that R doesn't use semicolons as line delimiters, so adding them is IMO ugly.

> With my data, I get this....
>
> colnames(siggenes.table$genes.lo)
> [1] "Row"               "Gene ID"           "Gene Name"         "Score(d)"
> [5] "Numerator(r)"      "Denominator(s+s0)" "q-value(%)"
>
> Did I run SAMR in the wrong manner?

I dunno. I assume since you got output that things are at least semi OK. You don't give the output of sessionInfo(), so I don't know if you are using some old version that doesn't give fold change yet. Anyway, it's irrelevant I suppose, because you don't need fold change to be given explicitly.

Note that the numerator of a t-statistic is simply the difference in the mean between two groups. In your case (if in fact your case is the same as Brian Snail's case), this difference is intrinsic to the ratios you fed into samr. Assuming you took logs, what you fed into samr was log2(R/G), which is the same as log2(R) - log2(G), i.e., the difference between the red and green channels on the log scale.

The numerator of the resulting t-statistic is the mean of these log ratios. So depending on what you mean by fold change, you already have it (I prefer to work in the log space), or you can exponentiate (2^numerator) to get the fold change on the natural scale.

Best,

Jim
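The relationship described above can be checked with a few lines of base R. This is only a sketch: the matrix `M`, the gene names, and all values are made up for illustration, and it assumes the one-class numerator is simply the per-gene mean of the log2(R/G) values, as Jim describes.

```r
# Hypothetical log2(Cy5/Cy3) ratios: 3 genes (rows) x 3 replicate arrays (columns).
# These numbers are invented for illustration; they are not from the thread.
M <- matrix(c( 1.0,  1.2,  0.8,
              -2.1, -1.9, -2.0,
               0.1, -0.1,  0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("geneA", "geneB", "geneC"),
                            c("rep1", "rep2", "rep3")))

# One-class numerator (assumed): the mean log2 ratio per gene.
log2_fc <- rowMeans(M)

# Back-transform to the natural scale. Because the mean was taken on the
# log scale, this is the geometric mean of the Cy5/Cy3 ratios.
fc <- 2^log2_fc

print(cbind(log2_fc, fc))
```

A negative mean log ratio gives a fold change below 1 (lower in Cy5 than in Cy3), which is one reason staying on the log scale is often more convenient.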

written 15.3 years ago by James W. MacDonald 68k


Dear James,

Sorry, yes (the other way around). Brian the snail is from the famous UK TV programme; I wanted a dedicated mail account for Bioconductor (tired of sorting personal emails), so JH is now unsubscribed.

Thanks for the clear answer. Interestingly, I ran it again at home using the One class example in the samr documentation, which gives the same output.

     Row   Gene ID Gene Name Score(d)           Numerator(r)       Denominator(s+s0)   q-value(%)
[1,] "271" "g270"  "270"     "3.95220187465345" "1.28892719353853" "0.326128885724379" "0"
[2,] "961" "g960"  "960"     "3.58479471552993" "2.33829323731544" "0.652280931788244" "0"
[3,] "647" "g646"  "646"     "3.51526436711204" "2.00153278609765" "0.569383288728867" "0"
[4,] "761" "g760"  "760"     "3.43862390232555" "1.53773350792769" "0.447194445105705" "0"
[5,] "338" "g337"  "337"     "3.06940912129119" "1.85532829874772" "0.604457804558568" "0"

But if the numerator is the equivalent of log2(fold change), then it is only a column name difference. Great.

Thanks again,

John
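If a "Fold Change" column is still wanted in the table, it can be derived from "Numerator(r)". The following is a minimal sketch, assuming log2 input as in this thread. The matrix is retyped by hand from the first three rows of the output above (values rounded), and `add_fold_change` is a hypothetical helper written for this example, not a samr function.

```r
# First three rows of the table above, retyped by hand (values rounded).
# The printed output shows quoted values, i.e. a character matrix,
# so the numbers have to be converted before doing arithmetic.
tab <- matrix(c("271", "g270", "270", "3.9522", "1.2889", "0.3261", "0",
                "961", "g960", "960", "3.5848", "2.3383", "0.6523", "0",
                "647", "g646", "646", "3.5153", "2.0015", "0.5694", "0"),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("Row", "Gene ID", "Gene Name", "Score(d)",
                                      "Numerator(r)", "Denominator(s+s0)",
                                      "q-value(%)")))

# Hypothetical helper: append a natural-scale fold change if it is missing.
# Assumes "Numerator(r)" holds a mean log2 ratio.
add_fold_change <- function(tab) {
  if ("Fold Change" %in% colnames(tab)) return(tab)  # nothing to do
  log2_fc <- as.numeric(tab[, "Numerator(r)"])
  cbind(tab, "Fold Change" = format(2^log2_fc, digits = 4))
}

tab_fc <- add_fold_change(tab)
print(tab_fc[, c("Gene ID", "Numerator(r)", "Fold Change")])
```

The helper leaves a table untouched when the column already exists, so it should be safe to apply to both `genes.up` and `genes.lo` regardless of which samr version produced them.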

written 15.3 years ago by brian snail ▴ 20
