Search
Question: Adapting old Workshop to recent changes of the Limma package
0
2.5 years ago by
giroudpaul30
France
giroudpaul30 wrote:

Hi,

I am a Student currently trying to learn and to understand how to analyse microarray data with R and Bioconductor.

I am following this course : https://www.bits.vib.be/training/111-bits/training/previous-trainings/177-microarray-bioconductor

I am currently in the second part, but I already noticed some problems when doing the following :

tab = topTable(data.fit.eb,coef=2,number=2000,adjust.method="BH")
topups = tab[tab[, "logFC"] > 1, ]
colnames(topups)
IDsup = topups$ID The last line does not work since ID column has been remplaced by the ID in the rownames, so I worked this out alone. But now I have trouble with this part : data.fit.con = contrasts.fit(data.fit,contrast.matrix) data.fit.con.eb = eBayes(data.fit.con) results = decideTests(data.fit.con.eb,method='global',adjust.method="BH",p.value=0.05,lfc=1) res.IDsExvsCup = data.fit.con.eb$genes[results[,1]==1,1]
res.IDsExvsCdown = data.fit.con.eb$genes[results[,1]==-1,1] Because I get this : > res.IDsExvsCup NULL > res.IDsExvsCdown NULL mostly because "genes" doesn't exist in the MArrayLM data.fit.con.eb : > names(data.fit.con.eb) [1] "coefficients" "rank" "assign" "qr" [5] "df.residual" "sigma" "cov.coefficients" "stdev.unscaled" [9] "Amean" "method" "design" "contrasts" [13] "df.prior" "s2.prior" "var.prior" "proportion" [17] "s2.post" "t" "df.total" "p.value" [21] "lods" "F" "F.p.value"  I don't see what they want to refer to here ? Is it again to be replaced by rownames() ? but how ? Please help :) Thanks ADD COMMENTlink modified 2.5 years ago by Aaron Lun21k • written 2.5 years ago by giroudpaul30 1 2.5 years ago by Aaron Lun21k Cambridge, United Kingdom Aaron Lun21k wrote: It's probably best if you contact the person who manages the course, as it's their code and they'll be better placed to explain what they were trying to do. Nonetheless, here's some general comments: • We don't recommend filtering on the log-fold change directly. If you want to find genes that are above a particular log-fold change, we would suggest using treat. • The genes element is designed to hold a data frame containing information for each gene (e.g., gene symbols, descriptions) in each row. If you want downstream functions to make use of this, you should assign an appropriate data frame to data.fit.con.eb$genes.
• That said, if all you want is the identity of the DE genes, you can just do:
res.IDsExvsCup <- rownames(data.fit.con.eb)[results[,1]==1]

The treat method was presented in the "Compare two groups of sample" tutorial (as alternative to eBayes if I understood well). However, in the "Comparing Multiple groups" part, he just use eBayes. Here, my first example was just to show that the topups$ID did'nt worked. Afterward, he filter first on the adj.P.val and then on the LogFC. Do you have any link or explanation for the gene data frame and how to assign it ? All I want is indeed identify the DE genes, and I tried that solution, however, I get a dim error : > rownames(data.fit.con.eb)[results[,1]==1,1] Error in rownames(data.fit.con.eb)[results[, 1] == 1, 1] : incorrect number of dimensions > dim(data.fit.con.eb) [1] 45101 3 > dim(results) [1] 45101 3 Any idea ? Thank for your help ADD REPLYlink written 2.5 years ago by giroudpaul30 1 Read my code carefully. There is no second index in the subsetting, as the rownames produces a vector, not a matrix. Also, if you want to assign something to genes, just do: data.fit.con.eb$genes <- data.frame(ID=my.ids, Symbol=my.symbols)

... where you supply a vector of gene IDs and (optionally) symbols in my.ids and my.symbols, respectively.