Affymetrix fold change [newbie question]
2
0
Entering edit mode
fandresp • 0
@fandresp-9665
Last seen 8.2 years ago

First of all I've searched bioconductor support and other sites without success, if I just didn't find the answer I would love if someone could give me a link.

 

I'm trying to get into bioconductor, I do statistical work in R and I would love to fully migrate to this platform but I've hit a rock, probably because I'm really new to genetic-related-statistics.

 

I've performed analysis using "Transcriptome Analysis Console" an Affymetrix software and I'm unable to reproduce the results. This is a sample of the outcome I'm looking for:

ID              TUMOR                   NORMAL                   Fold Change                ANOVA

                  Bi-weight                 Bi-weight                   (linear)                           p-value

                Avg Signal (log2)    Avg Signal (log2)     (TUMOR vs. NORMAL)    
201890_at       9                              4.8                             18.42                     1.82E-14   

My fold change there is 18.42

 

Now what I'm doing using Bioconductor:

raw.data <- read.affy("covdesc")
x.mas <- call.exprs(raw.data,"rma")

x.filter <- nsFilter(x.mas, remove.dupEntrez = FALSE, var.cutof = 0.5)$eset

x.t_test <- rowttests(x.filter, "Treatment")

tt.data <- as.data.frame(x.t_test)
tt.data$log <- -log10(tt.data$p.value)

The result of that script is:

ID                   statistic             dm             p.value                  log

201890_at   -11.09443     -3.846618       1.81999e-14     13.73993

As you can see the ANOVA p-value is pretty much spot on (and this result is consistent across the dataset), but I fail to reproduce the fold-change. The rest of the numbers are totally off too and I'm not sure if I'm doing something wrong or just asking for different statistics

 

Sorry if my english is bad, it's not my first language. Thanks

affy • 2.6k views
ADD COMMENT
1
Entering edit mode
b.nota ▴ 360
@bnota-7379
Last seen 3.6 years ago
Netherlands

Why do you use anova/t-test when there is limma?

ADD COMMENT
0
Entering edit mode

I'm lost and I would love to learn enough but I don't seem to find a place where to start.

Regarding your comment, I've used genefilter's rowttest because seemed the logical way to replicate Affymetrix results.

If you could help me with how to use limma to find the fold change it would be great.

ADD REPLY
1
Entering edit mode
b.nota ▴ 360
@bnota-7379
Last seen 3.6 years ago
Netherlands

Why don't you start with reading the limma userguide?

ADD COMMENT
0
Entering edit mode

I've been giving it a try, this is my code after the rma normalization:

tratamiento <- factor(x.mas$Treatment, levels=c("TUMOR","NORMAL"))

design<- model.matrix(~0+tratamiento)
colnames(design) <- c("TUMOR", "NORMAL")

The design looks fine

   TUMOR NORMAL
1      1      0
2      1      0
3      1      0
4      1      0
5      1      0
6      1      0
7      1      0
8      1      0
9      1      0
10     1      0
11     1      0
12     1      0
13     1      0
14     1      0
15     1      0
16     1      0
17     1      0
18     1      0
19     1      0
20     1      0
21     1      0
22     1      0
23     1      0
24     1      0
25     1      0
26     1      0
27     1      0
28     1      0
29     1      0
30     1      0
31     1      0
32     1      0
33     1      0
34     1      0
35     1      0
36     1      0
37     1      0
38     1      0
39     1      0
40     1      0
41     0      1
42     0      1
43     0      1
44     0      1
45     0      1
46     0      1
47     0      1

But I'm at a lost after that, I'm trying the following:

fit <- lmFit(x.mas, design)
fit2 <- eBayes(fit)

fold <- as.data.frame(topTable(fit2, n=40, adjust="BH"))
fold$ID <- rownames(fold)

But the results are weird... all p-values are below e-80 and the averages are all around 13 (all)

I'm pretty sure that I have a basic concept problem, don't know if I'm in the right path.

ADD REPLY
0
Entering edit mode

You'll have to use the adjusted p-values. The (log2) Fold change are in the first column: fold$logFC

 

ADD REPLY
1
Entering edit mode

That's not the problem. When you fit a cell means model you also have to specify a contrast. Right now the OP is doing an F-test to see if the expression of either group is greater than zero, which obviously. So you need to add

contrast <- matrix(c(1,-1), nc = 1)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)
topTable(fit2, 1, 40)
ADD REPLY
0
Entering edit mode

1st thank you (both fo you as a matter of fact) for the help. Now I understand what I was doing and why is not meant to show a Fold Change.

Now... about the logFC. It's way off what Affymetrix it's telling me:

Affymetrix (linear) TUMOR vs NORMAL --> -7.36

limma (log2?) --> -2.811211

 

 

ADD REPLY
1
Entering edit mode

To un-log and get fold change in R:

> 1/(2^-2.811211)
[1] 7.018735

Or if you prefer to have the minus with it:

> -1*1/(2^-2.811211)
[1] -7.018735

 

 

ADD REPLY
0
Entering edit mode

First thank you both so much for the help, the FC seems ok and I've checked a few genes and there seems to be some error but I think it has to be rounding errors on Affymetrix side of things.

 

Now I would like to take a chance and ask you again. I've seen other gene expression files that present data in a different way. I'm interested in Agilents because seems to be the next step. Let me show you an expression example.

Composite Element REF    log2 lowess normalized (cy5/cy3) 

                                            collapsed by gene symbol
ELMO2                               -0.30325
CREB3L1                            0.95

 

I've been reading limma manual and some post of the author of limma and to my understanding the fold change is calculated as the difference of the log2 expression of the genes. In that file I seem to already have that expression. Could I calculate the fold change with mean(Group1) - mean(Group2) since I already have the lo2 expression?

I understand that to do this I would have to merge several files together and differentiate between tumor samples and non-tumor samples, but would that difference of means give me the Fold Change?

Thank you both again for the help

ADD REPLY
0
Entering edit mode

I don't seem to have a logFC column:

> names(fold)
[1] "TUMOR"     "NORMAL"    "AveExpr"   "F"         "P.Value"   "adj.P.Val" "ID"  

 

ADD REPLY
1
Entering edit mode

Yeah you'll have to make a contrast first, like james commented above.

ADD REPLY

Login before adding your answer.

Traffic: 486 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6