Question: Log2FC values very small with SCAN
Vani (United States) wrote 4.4 years ago:

Hi,

I am using the SCAN method to normalize several GEO datasets. The resulting log2FC values from the normalized eset are very small (between -0.5 and 0.5). Is this normal? I am not sure why the values are so small.

Answer: Log2FC values very small with SCAN
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote 4.4 years ago:

I downloaded the series data matrix for GSE21610 (which is already normalized using MAS5) and did a quick analysis using limma. There are plenty of large fold changes, some with log2FC > 3.

Even using SCAN normalization, there are a number of large fold changes, so your apparent claim that the log2FC are all between -0.5 and 0.5 is not actually true.

Whatever normalization method you use, the data analysis seems to me to require more attention than this. This study has three possible values for the disease status: "none", "dilated cardiomyopathy" and "ischemic cardiomyopathy". There are other variables that should be adjusted for in the limma linear model (particularly gender and age). There are other important analysis steps that should be done to address data quality, especially filtering out unexpressed probes. I would rather see you giving attention to these fundamental analysis issues instead of worrying so much about the size of the fold changes.
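As a rough illustration of the points above (a three-level disease factor, adjustment for gender and age, and filtering of unexpressed probes), here is a minimal sketch on simulated data. The column names `disease`, `gender` and `age`, the factor levels, and the filtering cutoff are all assumptions made for the example. They are not taken from GSE21610, and the real names would have to be read from `pData()` of the actual ExpressionSet.

```r
library(limma)

set.seed(1)
n_samples <- 30
n_probes <- 200

# Hypothetical phenotype table (assumed column names and levels).
pheno <- data.frame(
  disease = factor(rep(c("none", "dilated", "ischemic"), each = 10),
                   levels = c("none", "dilated", "ischemic")),
  gender = factor(rep(c("F", "M"), length.out = n_samples)),
  age = round(runif(n_samples, 30, 70))
)

# Simulated log2 expression matrix (probes in rows, samples in columns).
expr <- matrix(rnorm(n_probes * n_samples, mean = 7, sd = 1),
               nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)), NULL))
# Shift the first 50 probes down to mimic unexpressed probes.
expr[1:50, ] <- expr[1:50, ] - 4

# Filtering: keep probes above an arbitrary cutoff (5) in at least
# 10 samples, the size of the smallest group in this toy design.
keep <- rowSums(expr > 5) >= 10
expr_filtered <- expr[keep, , drop = FALSE]

# Linear model with disease status plus gender and age as covariates.
design <- model.matrix(~ disease + gender + age, data = pheno)
fit <- eBayes(lmFit(expr_filtered, design))

# log2FC for dilated cardiomyopathy versus "none", adjusted for covariates.
top_dilated <- topTable(fit, coef = "diseasedilated",
                        number = Inf, sort.by = "logFC")
head(top_dilated)
```

The cutoff and the minimum number of samples are placeholders. A sensible choice depends on the intensity distribution of the real data.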

Answer: Log2FC values very small with SCAN
Stephen Piccolo (United States) wrote 4.4 years ago:

Hi Vani,

It's hard to know what could cause this without knowing more about the data set and analysis you are doing. Can you provide a few more details (array type, sample size, method of calculating Log2FC, etc.)? Also, have you tried it with any other normalization methods?

Thanks,

-Steve

I am getting small values for the Affymetrix Human Genome U133 Plus 2.0 Array. The sample size is around 68. I am using limma's lmFit and topTable to generate the log2FC. I also tried fRMA, and the values ranged from -1.3 to 1.3.

Here is my code:

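library(inSilicoDb)  # provides getDataset()
library(Biobase)     # provides pData()
library(limma)       # provides lmFit(), eBayes(), topTable()

# Load SCAN-normalized, gene-level data using inSilicoDb
eset21610 <- getDataset("GSE21610", "GPL570", format = "CURESET", norm = "SCAN", features = "GENE")

# Design matrix with Heart_Failure as the only explanatory variable
design1 <- model.matrix(~ Heart_Failure, pData(eset21610))

afterLimma <- lmFit(eset21610, design = design1)

e4 <- eBayes(afterLimma)

# Table of all genes, sorted by absolute log2 fold change
impdata <- topTable(e4, number = 19528, sort.by = "logFC")

# Volcano plot
plot(impdata$logFC, -log10(impdata$P.Value),
     xlim = c(-.6, .6), ylim = c(-1, 10),
     xlab = "log2 fold change", ylab = "-log10 p-value")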
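For a design like the one above, it may help to see what the reported logFC is. With a two-level factor and the default treatment coding, the fitted coefficient is the difference in mean log2 expression between the two groups. The sketch below uses base R on one simulated probe. The group labels, sample sizes and the factor of 3 are assumptions for illustration only, and the rescaling step is not a claim about how SCAN or fRMA behave. It only shows that shrinking the scale of the data shrinks the logFC by the same factor while leaving the t-statistic unchanged.

```r
set.seed(1)
n_per_group <- 34

# Hypothetical two-level factor (the real Heart_Failure levels may differ).
group <- factor(rep(c("no", "yes"), each = n_per_group))

# Simulated log2 expression for a single probe, with a true shift of 0.4.
y <- rnorm(2 * n_per_group, mean = 8, sd = 0.5) + 0.4 * (group == "yes")

# The coefficient from the linear fit is the difference in group means.
fit <- lm(y ~ group)
logfc_from_fit <- unname(coef(fit)["groupyes"])
logfc_from_means <- mean(y[group == "yes"]) - mean(y[group == "no"])
all.equal(logfc_from_fit, logfc_from_means)

# The same data on a scale compressed by an arbitrary factor of 3.
y_small <- y / 3
fit_small <- lm(y_small ~ group)
logfc_small <- unname(coef(fit_small)["groupyes"])

# The logFC shrinks by the same factor; the ordinary t-statistic does not change.
t_orig <- summary(fit)$coefficients["groupyes", "t value"]
t_small <- summary(fit_small)$coefficients["groupyes", "t value"]
c(logfc = logfc_from_fit, logfc_small = logfc_small)
c(t = t_orig, t_small = t_small)
```

This uses ordinary `lm()` rather than limma, so the t-statistics here are not moderated. The coefficient itself is computed the same way in both cases.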

Vani,

Sorry for the late reply. I looked at the data and did some simple simulations to make sure I understand what is going on. It appears this is because the variance is larger for the fRMA data than for the SCAN data. I don't know enough about how limma works to know how this affects the logFC values. Perhaps the authors of that tool could shed some light on this...