LogFC query in Limma
1
0
Entering edit mode
@roopa-subbaiaih-5490
Last seen 10.1 years ago
United States
Hi James, Am I still making any silly mistake over here. I am comparing normal (21) with lesionous patient samples (34). This is what I get in R console. Is there a problem with the script? Any advice would greatly help for further analysis.Thanks, Roopa > getwd() [1] "C:/Documents and Settings/rsubbaiaih/My Documents" > setwd(dir="/CRSP 406-11/DEMOS/GSE14905-a") > ls() character(0) > #-----------------------------------------------# > library(affy) > eset = justRMA() Loading required package: AnnotationDbi > eset ExpressionSet (storageMode: lockedEnvironment) assayData: 54675 features, 55 samples element names: exprs, se.exprs protocolData sampleNames: GSM372286.CEL GSM372287.CEL ... GSM372367.CEL (55 total) varLabels: ScanDate varMetadata: labelDescription phenoData sampleNames: GSM372286.CEL GSM372287.CEL ... GSM372367.CEL (55 total) varLabels: sample varMetadata: labelDescription featureData: none experimentData: use 'experimentData(object)' Annotation: hgu133plus2 > f <- factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, + 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2), labels=c("Healthy", "affected")) > design <- model.matrix(~ 0 + f) > design fHealthy faffected 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 6 1 0 7 1 0 8 1 0 9 1 0 10 1 0 11 1 0 12 1 0 13 1 0 14 1 0 15 1 0 16 1 0 17 1 0 18 1 0 19 1 0 20 1 0 21 1 0 22 0 1 23 0 1 24 0 1 25 0 1 26 0 1 27 0 1 28 0 1 29 0 1 30 0 1 31 0 1 32 0 1 33 0 1 34 0 1 35 0 1 36 0 1 37 0 1 38 0 1 39 0 1 40 0 1 41 0 1 42 0 1 43 0 1 44 0 1 45 0 1 46 0 1 47 0 1 48 0 1 49 0 1 50 0 1 51 0 1 52 0 1 53 0 1 54 0 1 55 0 1 attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$f [1] "contr.treatment" > colnames(design) <-c("Healthy","affected") > library(limma) > fit <- lmFit(eset, design) > fit An object of class "MArrayLM" $coefficients Healthy affected 1007_s_at 10.081309 9.548286 1053_at 6.807501 7.482849 117_at 5.969921 6.147594 121_at 7.403842 7.733666 1255_g_at 2.804475 2.827041 54670 more rows ... $rank [1] 2 $assign [1] 1 1 $qr $qr Healthy affected 1 -4.5825757 0.000000 2 0.2182179 -5.830952 3 0.2182179 0.000000 4 0.2182179 0.000000 5 0.2182179 0.000000 50 more rows ... $qraux [1] 1.218218 1.000000 $pivot [1] 1 2 $tol [1] 1e-07 $rank [1] 2 $df.residual [1] 53 53 53 53 53 54670 more elements ... $sigma 1007_s_at 1053_at 117_at 121_at 1255_g_at 0.2866489 0.4801618 0.3499880 0.2332555 0.1053397 54670 more elements ... $cov.coefficients Healthy affected Healthy 0.04761905 0.00000000 affected 0.00000000 0.02941176 $stdev.unscaled Healthy affected 1007_s_at 0.2182179 0.1714986 1053_at 0.2182179 0.1714986 117_at 0.2182179 0.1714986 121_at 0.2182179 0.1714986 1255_g_at 0.2182179 0.1714986 54670 more rows ... $pivot [1] 1 2 $genes [1] "1007_s_at" "1053_at" "117_at" "121_at" [5] "1255_g_at" 54670 more rows ... $Amean 1007_s_at 1053_at 117_at 121_at 1255_g_at 9.751804 7.224988 6.079756 7.607733 2.818425 54670 more elements ... $method [1] "ls" $design Healthy affected 1 1 0 2 1 0 3 1 0 4 1 0 5 1 0 50 more rows ... > contrast.matrix <-makeContrasts(affected-Healthy,levels = design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > results <- decideTests(fit2, adjust="fdr",lfc=1) > summary(results) affected - Healthy -1 1187 0 52504 1 984 > results TestResults matrix 1007_s_at 1053_at 117_at 121_at 1255_g_at 0 0 0 0 0 54670 more rows ... > On Thu, Jan 31, 2013 at 4:48 PM, James W. MacDonald <jmacdon@uw.edu> wrote: > If you just use the expression values from the original authors, I get > just under 9K probesets for this comparison at an FDR of 0.05 and no fold > change criterion. It drops to just under 900 with a 2-fold difference added > in. > > So yeah, seems like a lot to me as well. > > Best, > > Jim > > > On 1/31/2013 4:02 PM, Steve Lianoglou wrote: > >> ... what Jim said. >> >> But also, this 20k differentially expressed (likely probe sets, not >> genes) is raising a red flag for me, no? Am I alone here? >> >> That's .. what's the word I'm looking for ... "a lot". >> >> -steve >> >> On Thu, Jan 31, 2013 at 3:56 PM, James W. MacDonald<jmacdon@uw.edu> >> wrote: >> >>> Hi Roopa, >>> >>> >>> On 1/31/2013 3:45 PM, Roopa Subbaiaih wrote: >>> >>>> Hi Steve, >>>> >>>> This was the script I used- >>>> getwd() >>>> setwd(dir="/CRSP 406-11/DEMOS/GSE14905-a") >>>> ls() >>>> #-----------------------------**------------------# >>>> library(affy) >>>> eset = justRMA() >>>> f<- factor(c(1,1,1,1,1,1,1,1,1,1,**1,1,1,1,1,1,1,1,1,1,1, >>>> >>>> 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,**2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,** >>>> 2,2,2,2), >>>> labels=c("Healthy", "unaffected")) >>>> design<- model.matrix(~ 0 + f) >>>> design >>>> colnames(design)<-c("Healthy",**"unaffected") >>>> design >>>> library(limma) >>>> fit<- lmFit(eset, design) >>>> library(hgu133plus2.db) >>>> fit$genes$Symbol<- getSYMBOL(fit$genes$ID,"**hgu133plus2.db") >>>> contrast.matrix<-**makeContrasts(affected-**Healthy,levels = design) >>>> fit2<- contrasts.fit(fit, contrast.matrix) >>>> fit2<- eBayes(fit2) >>>> topTable(fit2,coef=1,p=0.05, adjust="fdr") >>>> results<- decideTests(fit2, adjust="fdr", p=0.05) >>>> summary(results) >>>> write.table(results,file="**myresults.txt") >>>> write.fit(). >>>> >>>> I had identified ~54,000 genes of which ~ 20K were differentially >>>> expressed. >>>> >>>> But when I use these genes for pathway analysis the software asks for >>>> fold >>>> change values but not p value so it is easier to analyze the data. >>>> >>>> What I did was - I used the differentially expressed gene table for >>>> further >>>> analysis. That is I converted logFC values to FC(test/control) assuming >>>> that >>>> >>>> FC= FCmean(test)-FCmean(blank) and LogFC is log2 of FC values. >>>> >>>> Once I got test/control values I converted them to fold changes using >>>> "IF" >>>> function in excel sheet to eliminate genes with fold changes between -2 >>>> to >>>> +2. >>>> >>>> Once I did this the number of significant genes drastically reduced to ~ >>>> 2,000. >>>> >>>> Is this the right method? >>>> >>> >>> No. Note that the range of fold changes after 'unlogging' will be 0-INF, >>> and >>> the down-regulated genes will be in the range 0-1 whereas the upregulated >>> genes will be in the range 1-INF. (e.g. two fold up will be 2, whereas 2 >>> fold down will be 1/2 or 0.5). >>> >>> The easiest way to filter is to keep the logFC and filter on -1 and 1. Or >>> you can use the lfc argument to decideTests(). >>> >>> Best, >>> >>> Jim >>> >>> >>> >>> Please advice, thanks, Roopa >>>> >>>> On Thu, Jan 31, 2013 at 3:23 PM, Steve Lianoglou< >>>> mailinglist.honeypot@gmail.com**> wrote: >>>> >>>> Hi, >>>>> >>>>> On Thu, Jan 31, 2013 at 2:54 PM, Roopa Subbaiaih<rss115@case.edu> >>>>> wrote: >>>>> >>>>>> Hi All, >>>>>> >>>>>> Thanks for the reply, I could pull out the the whole information for >>>>>> differentially expressed genes. The criteria used was adjust="fdr", >>>>>> >>>>> p=0.05. >>>>> >>>>>> I came up with ~ 20,000 genes to be differentially expressed. >>>>>> >>>>> Hmm ... surely 20k cannot be correct? >>>>> >>>>> Since I wanted to analyze these genes for deregulated pathways I had to >>>>>> come up with fold change values for further analysis. >>>>>> >>>>>> I assume that for each gene FC= FCmean(test)-FCmean(blank). LogFC is >>>>>> log2 >>>>>> of FC values. >>>>>> >>>>>> When I convert the FC values (test/blank) to foldchanges using IF >>>>>> >>>>> function >>>>> >>>>>> I get lesser number of genes to be deregulated. The criteria was =>2 >>>>>> foldchanges and =<-2 fold changes. >>>>>> >>>>> I'm missing previous context to this email, so -- not sure what the >>>>> "IF function" is, but if you're using limma, the log2fold changes are >>>>> reported for you in the logFC column that is returned from >>>>> `topTable(...)` >>>>> >>>>> -steve >>>>> >>>>> -- >>>>> Steve Lianoglou >>>>> Graduate Student: Computational Systems Biology >>>>> | Memorial Sloan-Kettering Cancer Center >>>>> | Weill Medical College of Cornell University >>>>> Contact Info: http://cbio.mskcc.org/~lianos/**contact<http: cbi="" o.mskcc.org="" ~lianos="" contact=""> >>>>> >>>>> >>>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> >>> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- --------------------------------------- Roopa Shree Subbaiaih Post Doctoral Fellow Department of Dermatology School of Medicine Case Western Reserve University Cleveland, OH-44106 Tel:+1 216 368 0211 [[alternative HTML version deleted]]
Pathways Cancer probe limma convert Pathways Cancer probe limma convert • 1.3k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

Dear Roopa,

Perhaps you have just misread the limma output. This thread has been mainly concerned with your statement that you found ~20,000 genes to be differentially expressed:

  LogFC query in Limma

But the limma output shows 984 up-regulated and 1187 down-regulated genes, in other words about 2k total rather than 20k.

Best wishes
Gordon

ADD COMMENT
0
Entering edit mode

Dear Gordon,

Thanks, I could complete the analysis. Roopa

 

ADD REPLY

Login before adding your answer.

Traffic: 679 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6