Question

Problems with using Limma for DEG

Nithisha ▴ 10

@nithisha-14272

Hi all,

I downloaded the data for an experiment from GEO. It covers 70 samples, and I have a data frame with the sample names as columns and the gene IDs as rows. I believe the values are signal intensities that have been log2-transformed and normalized using RMA. In that case, do I need to first transform them to CPM?

I also have a second data frame containing the metadata; its row names correspond to the count data's columns.

The problem is that the samples cover 3 different types of treatment at 5 different time points. This information is in the time point column of the metadata data frame. I wish to carry out the limma analysis for one time point at a time, comparing the control to each of the treatments.

Should I alter this line of code to reflect that I want to look into the differentially expressed genes among each treatment group vs the control at different timepoints?

# Create design matrix
design <- model.matrix(~ pData(bottomly.eset)$strain)

I'm really new to this and would appreciate any help I can get.

Thanks.

limma • 465 views

updated 6.5 years ago by Aaron Lun ★ 28k • written 6.5 years ago by Nithisha ▴ 10

score 0 · Answer 1 · 2017-10-26

If I may be frank: you would be better served by finding a bioinformatician or computational biologist in your institution (or somewhere nearby) who can show you the ropes. The BioC support site is intended to help users with specific questions about their code, while it seems that you are asking for general help with an entire analysis. To be sure, the DE analysis in question is unlikely to be particularly difficult, but to (mis)quote a post I once saw: "Giving statistical advice over the internet is the moral equivalent of an electrician helping someone wire their house over the phone." So you had better know what you are doing - and it seems that you don't. For example, RMA is a microarray pre-processing technique, while CPM is a concept that only exists in sequencing data, and it doesn't make sense to mix the two.
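To make the RMA/CPM distinction concrete, here is a minimal sketch rather than a complete analysis. It assumes the GEO values really are log2 RMA intensities. The matrix, the sample names and the two-group layout are simulated placeholders, not the actual data set. The point it illustrates is that such values go into lmFit() as they are, with no CPM step, since CPM belongs to count-based sequencing workflows.

```r
# Simulated stand-in for an RMA-processed matrix: genes in rows, samples in columns.
# All names and values here are hypothetical.
set.seed(1)
expr <- matrix(rnorm(100 * 6, mean = 8, sd = 1), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))
treatment <- factor(c("control", "control", "control",
                      "treated", "treated", "treated"))

# Intercept = mean of control; second column = treated minus control
design <- model.matrix(~ treatment)

# The limma steps only run if the package is installed
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(expr, design)   # log2 intensities are used as-is, no CPM
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "treatmenttreated", number = 5))
}
```

Because the simulated values are pure noise, no gene should stand out in the resulting table; with real data the same three calls give the treated-versus-control ranking.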

If there is no one around to help you, I would suggest grabbing a hot cup of coffee/tea/soup and settling in for a long night of reading the limma user's guide. It's only 150 pages - less than any of the Lord of the Rings books, and probably more exciting, to be honest.
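As a pointer into that guide: one pattern it describes for factorial and time-course experiments is to merge the factors into a single grouping factor and then extract the comparisons of interest as contrasts. The sketch below assumes that approach applies here. The metadata layout, the treatment and time labels (control, trtA, trtB, t1, t2) and the expression values are all hypothetical, so the column names would need to be adapted to the real metadata.

```r
# Hypothetical metadata: control plus two treatments, two time points, two replicates each
meta <- expand.grid(replicate = 1:2,
                    treatment = c("control", "trtA", "trtB"),
                    time = c("t1", "t2"),
                    stringsAsFactors = FALSE)

# One level per treatment/time combination, e.g. "trtA_t1"
group <- factor(paste(meta$treatment, meta$time, sep = "_"))

# Group-means design: one column per combination, no intercept
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# The limma steps only run if the package is installed
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  # Simulated log2 intensities standing in for the real expression matrix
  expr <- matrix(rnorm(50 * nrow(meta), mean = 8), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), NULL))

  # Each treatment versus control within time point t1
  cont <- limma::makeContrasts(
    trtA_vs_ctrl_t1 = trtA_t1 - control_t1,
    trtB_vs_ctrl_t1 = trtB_t1 - control_t1,
    levels = design)

  fit <- limma::lmFit(expr, design)
  fit <- limma::contrasts.fit(fit, cont)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "trtA_vs_ctrl_t1", number = 5))
}
```

Fitting all samples in one model and pulling out per-time-point contrasts is a design choice, not the only option: compared with subsetting to one time point at a time, it lets every sample contribute to the variance estimates.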