Question: How to get a list of differentialy expressed genes with data from Illumina array?
0
3.7 years ago by
kanacska10
Hungary
kanacska10 wrote:

Hi,

I want to do a list for differencialy expressed genes with txt data from this experiment: GSE38376

> idata <- read.ilmn("GSE38376_non-normalized.txt",probeid = "PROBE_ID",expr="SKBR")

> data_nobg <- backgroundCorrect(idata,method="normexp")

> data_norm <- normalizeBetweenArrays(data_nobg,method="quantile")

data_ norm is in Elist class

Now i would like to select the differentially expressed genes with creating a model.matrix and then do the lmFit but do i have to do the step ExpressionSetillumina? if i do how do i do do it , with the class Elist? If id ont how do i continue??

Thanks,

Anna

modified 3.7 years ago by James W. MacDonald49k • written 3.7 years ago by kanacska10
Answer: How to get a list of differentialy expressed genes with data from Illumina array
2
3.7 years ago by
United States
James W. MacDonald49k wrote:

You can just define a design matrix and fit a model using the limma package. The EList class is part of limma, so will be handled appropriately.

Answer: How to get a list of differentialy expressed genes with data from Illumina array
1
3.7 years ago by
United States
James W. MacDonald49k wrote:

You can answer this question yourself by simply reading the help page for topTable(). That's the first thing you should do when you get a result you don't understand. Here are your hints:

Arguments:

fit: list containing a linear model fit produced by lmFit,
lm.series, gls.series or mrlm.  For topTable, it
should be an object of class MArrayLM as produced by
lmFit and eBayes.

coef: column number or column name specifying which coefficient or
contrast of the linear model is of interest. For topTable,
can also be a vector of column subscripts, in which case the
gene ranking is by F-statistic for that set of contrasts.

Question: Did you specify a coefficient?

Further, under the Details section:

     topTableF ranks genes on the basis of moderated F-statistics for
one or more coefficients. If topTable is called and coef has
two or more elements, then the specified columns will be extracted
from fit and topTableF called on the result. topTable with
coef=NULL is the same as topTableF, unless the fitted model
fit has only one column.

Thanks: so is this right then (to get what i want a list of differentialy expressed genes:

# select differentially expressed genes
design_matrix <- as.data.frame(unlist(lapply(colnames(data_norm), function(x) substr(x,1,(nchar(x)-2)))))
colnames(design_matrix) <- "cellline"
rownames(design_matrix) <- colnames(data_norm)
design <- model.matrix(~design_matrix$cellline) fit <- lmFit(data_norm, design) fit2 <- eBayes(fit, trend=TRUE) # plot & write out results results <- decideTests(fit2) colnames(fit2) [1] "(Intercept)" "design_matrix$celllineKBR3-R 1uM Lapatinib"
[3] "design_matrix$celllineKBR3-R Control" "design_matrix$celllineKBR3 0.1uM Lapatinib"
[5] "design_matrix$celllineKBR3 1uM Lapatinib" "design_matrix$celllineKBR3 Control"

#So there is six columns, six samples (x 3 biological repeats) the "(Intercept)"= KBR3-R 0,1 uM Lapatinib and because of six samples coef=6 (?)

top <- topTable(fit2, number=length(data_norm[,1]), sort.by="t", coef= 6)

Or is it coef=5 cause when it said error: "Removing intercept from test coefficients..." cause it takes out the intercepot column

The way you have specified your design matrix sets (I'm guessing here) cellineKBR3-R 0.1 µM Lapatinib as the baseline (intercept term), and all the other coefficients are estimating the log fold change between the given sample type and the baseline. In other words, the second coefficient is estimating the difference between 1 µM Lapatinib and 0.1 µM Lapatinib in KBR3-R cells. That is probably not what you want, especially given the two different cell lines. You should probably instead fit a model without an intercept.

design <- model.matrix(~ 0 + celline, design_matrix)

And then use a contrasts matrix to make whatever comparisons you care about. This is all covered in the limma User's Guide, which if you plan on doing analyses yourself you should read quite thoroughly.

Answer: How to get a list of differentialy expressed genes with data from Illumina array
0
3.7 years ago by
kanacska10
Hungary
kanacska10 wrote:

Thanks i did it:

design_matrix <- as.data.frame(unlist(lapply(colnames(data_norm), function(x) substr(x,1,(nchar(x)-2)))))
colnames(design_matrix) <- "cellline"
rownames(design_matrix) <- colnames(data_norm)
design <- model.matrix(~design_matrix\$cellline)
fit <- lmFit(data_norm, design)
fit2 <- eBayes(fit, trend=TRUE)

Now i want to do a Table of Top Genes from Linear Model Fit:

> top <- topTable(fit2, number=length(data_norm[,1]), sort.by="t")

But i get an error message
Removing intercept from test coefficients
Error in match.argsort.by, c("F", "none")) :
'arg' should be one of “F”, “none”

why doesnt it work?