Hi all,
I am using the model below to test the association between BMI and CpG array methylation with limma. Could you please suggest how I can check whether this model is adequate, and whether there is a way to check for batch effects in it?
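# model matrix: BMI is the variable of interest, the remaining terms are covariates
var <- model.matrix(~ BMI + as.factor(Gender) + Age + CD8T + CD4T + NK + Bcell + Mono, data = targets2)
fit <- lmFit(mval, var)
fit2 <- eBayes(fit, trend = TRUE, robust = TRUE)
probe <- topTable(fit2, adjust.method = "BH", coef = 2, number = Inf)
# keep probes with BH-adjusted p-value <= 0.05 (must be defined before write.table)
sig.probe <- probe[which(probe$adj.P.Val <= 0.05), ]
write.table(sig.probe, file = "Result.BMI.associated.probe.txt", sep = "\t", quote = TRUE)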
Dear James, thank you so much. I have the sample annotation and the methylation M-values, but I am not sure how to plot the MDS. Is it done with a command like this?
plotMDS(mvals.filt, col = batch, labels = outcome, cex = 0.7,cex.axis = 1, cex.lab = 1.6, main = "MDS batch")
I was assuming that you were using the minfi package, in which case it would be mdsPlot. But I imagine any method that is meant to do an MDS plot should work.
Thanks James, I am using limma. I am not sure how to merge the sample info with the count data.
You don't merge them. You should have as many rows for your sample data as you have columns for your methylation data (It's not counts!), and ideally they are in the same order. Then you can color the MDS plotting symbols using a column of your sample data.
OK, I got it now. I first need to put the samples in the same order in both files (sample info and methylation data), and then I can use the MDS plot command.
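The ordering step described above can be sketched in base R. This is only an illustration on toy data: the object names `mvals` and `sample_info`, the `Sample` and `Batch` columns, and all values are assumptions, not taken from the original poster's files.

```r
# Toy stand-ins for the real objects (hypothetical names and values)
set.seed(1)
mvals <- matrix(rnorm(20), nrow = 5,
                dimnames = list(paste0("cg", 1:5), c("S1", "S2", "S3", "S4")))
sample_info <- data.frame(Sample = c("S3", "S1", "S4", "S2"),
                          Batch  = c("B2", "B1", "B2", "B1"),
                          stringsAsFactors = FALSE)

# Reorder the annotation rows so they follow the columns of the M-value matrix
idx <- match(colnames(mvals), sample_info$Sample)
stopifnot(!anyNA(idx))  # every matrix column needs an annotation row
sample_info <- sample_info[idx, ]

# Sanity check before using any annotation column for colours or labels
stopifnot(identical(sample_info$Sample, colnames(mvals)))
```

Nothing is merged here: the matrix stays as it is, and only the annotation rows are reordered so that row i of the sample data describes column i of the methylation data.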
Dear James, I am trying to plot the M-values on an MDS plot, but R reports a memory issue.
You don't need to use all of the CpGs to do an MDS. For example, mdsPlot in minfi, which you could use rather than doing it by hand, by default only uses 1000 CpG sites.
Dear James, I am not using minfi. Is there any other way to find the 1000 most variable CpG sites and use those for the MDS plot? Thanks for all your help and time.
You could look at the code in minfi that does it.
Dear James, thanks. minfi uses rowVars to select the top 1000 rows, but I am not sure how I should modify that for my data.
Will the code below work?
But then why did they use the code below? My MDS tutorial does not use it (https://www.statmethods.net/advstats/mds.html).
You have a dataset that likely has hundreds of thousands of CpG measurements, and you don't have the memory to run cmdscale on that much data, which is why you are filtering to a subset. You shouldn't expect a random example on the internet to need to filter, because conventionally one doesn't have that much data per subject.
Thanks James for your kind time and help. I am not sure now how to use the col argument to color the points based on the Batch column of the sample info.
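For readers following along, here is a minimal base-R sketch of the two steps discussed above: keep the most variable CpGs, run cmdscale on the reduced matrix, and colour the points by batch. The data are simulated, and the names `mvals`, `batch`, and `mvals_top` are assumptions for illustration. It is not the minfi code itself, only the same idea written by hand.

```r
# Simulated M-values: 5000 CpGs x 12 samples (hypothetical sizes and names)
set.seed(42)
n_cpg <- 5000
n_samp <- 12
mvals <- matrix(rnorm(n_cpg * n_samp), nrow = n_cpg,
                dimnames = list(paste0("cg", seq_len(n_cpg)),
                                paste0("S", seq_len(n_samp))))
batch <- factor(rep(c("B1", "B2", "B3"), each = 4))  # same order as the columns

# Per-CpG variance, then keep the 1000 most variable rows
v <- apply(mvals, 1, var)
top <- order(v, decreasing = TRUE)[seq_len(min(1000, nrow(mvals)))]
mvals_top <- mvals[top, ]

# Classical MDS on Euclidean distances between samples (samples are columns)
mds <- cmdscale(dist(t(mvals_top)), k = 2)

# One colour per batch level, indexed by each sample's batch
pal <- rainbow(nlevels(batch))
cols <- pal[as.integer(batch)]
plot(mds[, 1], mds[, 2], col = cols, pch = 19,
     xlab = "MDS 1", ylab = "MDS 2", main = "MDS by batch")
legend("topright", legend = levels(batch), col = pal, pch = 19)
```

On a real array with hundreds of thousands of rows, `apply(mvals, 1, var)` works but is slow; `rowVars` from the matrixStats package does the same job faster. The same `cols` vector can be passed as `col` to limma's `plotMDS`, which also has a `top` argument that limits how many rows are used.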
Thank you so much James for all your help and time. I will try to use ggplot to improve the figure quality. Could you also suggest good resources to read about linear regression? For example, in this code, how can I be sure that coef 2 belongs to BMI?
You are now asking questions that you could just as easily answer yourself using Google searches, or by reading the limma User's Guide. If you expect to get anywhere using R, you will need to learn how to answer simple questions by using Google and reading the relevant vignettes.
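As a quick illustration of the coefficient question, the columns of the design matrix can be inspected directly. The sketch below uses a made-up `targets2` with only three of the covariates from the original model; the values are hypothetical.

```r
# Hypothetical toy annotation with a subset of the covariates in the question
targets2 <- data.frame(BMI    = c(22.1, 30.5, 27.3, 24.8, 35.0, 28.2),
                       Gender = c("F", "M", "F", "M", "F", "M"),
                       Age    = c(34, 51, 45, 29, 60, 38))
var <- model.matrix(~ BMI + as.factor(Gender) + Age, data = targets2)

# The column order of the design matrix is the coefficient order
print(colnames(var))

# Look the BMI column up by name rather than trusting a position
bmi_col <- which(colnames(var) == "BMI")
print(bmi_col)
```

The first column is the intercept, so with BMI as the first term in the formula (and BMI numeric) it lands in column 2. topTable also accepts a column name for `coef`, which avoids counting columns by hand.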
Thank you so much James, I understand it now from this post: https://stackoverflow.com/questions/48564316/meaning-of-coef-in-limma. Could you please now help me check the linear model by plotting residuals, to see whether the model is good or bad?
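One possible way to look at residuals, sketched here for a single CpG with base R. lmFit fits an ordinary least-squares model per row when no weights are used, so refitting one probe with lm() on the same covariates should give the same coefficients and residuals for that probe. The data, effect sizes, and names (`m_cpg`, `fit_one`) below are simulated assumptions, not results from the original analysis.

```r
# Simulated covariates and M-values for one CpG (hypothetical effect sizes)
set.seed(7)
n <- 40
targets2 <- data.frame(BMI = rnorm(n, mean = 27, sd = 4),
                       Age = round(runif(n, 20, 70)))
m_cpg <- 0.5 + 0.05 * targets2$BMI + 0.01 * targets2$Age + rnorm(n, sd = 0.3)

# Ordinary least-squares fit for this one CpG
fit_one <- lm(m_cpg ~ BMI + Age, data = targets2)
res <- residuals(fit_one)

# Residuals vs fitted: look for curvature or a funnel shape
# Normal Q-Q plot: look for strong departures from the line
op <- par(mfrow = c(1, 2))
plot(fitted(fit_one), res, xlab = "Fitted M-value", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
par(op)
```

In practice this would be repeated for a handful of top-ranked probes rather than all of them. limma also appears to provide a residuals method for fitted model objects that takes the data matrix as a second argument; its help page is the place to confirm the exact usage.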