The error when using beadarry and limma packages for analysing illumina chip
2
0
Entering edit mode
261092095 ▴ 10
@261092095-16052
Last seen 6.3 years ago

When I used beadarry and limma packages to perform differential gene expression analysis on the GSE49454 data set, I stuck to this group name and tried a lot of methods.

url<-"ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE49nnn/GSE49454/matrix/"

filenm<-"GSE49454_series_matrix.txt.gz"
if(!file.exists("GSE49454_series_matrix.txt.gz")) download.file(paste(url, filenm, sep=""), destfile=filenm)
gse <- getGEO(filename=filenm)

summaryData <- as(gse, "ExpressionSetIllumina")
rna <- factor(pData(summaryData)[,"characteristics_ch1"])
fData(summaryData)$Status <-
ifelse(fData(summaryData)$PROBEQUALITY=="No match","negative","regular" )
Detection(summaryData) <- calculateDetection(summaryData,
status=fData(summaryData)$Status)
#normalization
summaryData.norm <- normaliseIllumina(summaryData,method="neqc",
status=fData(summaryData)$Status)
group <- pData(summaryData.norm)[ ,"characteristics_ch1"]

limmaRes <- limmaDE(summaryData, SampleGroup="characteristics_ch1")

design <- model.matrix(~0+rna)

colnames(design) <- levels(rna)
aw <- arrayWeights(exprs(summaryData.norm), design)
aw
fit <- lmFit(exprs(summaryData.norm), design, weights=aw)
contrasts <- makeContrasts(group: SLE-group: Healthy control of SLE, levels=design)
contr.fit <- eBayes(contrasts.fit(fit, contrasts))
topTable(contr.fit, coef=1)

Error in makeContrasts(SLE - control, levels = design) : 
  The levels must by syntactically valid names in R, see help(make.names).  Non-valid names: rnagroup: Healthy control of SLE,rnagroup: SLE

limma beadarray illumina • 1.5k views
ADD COMMENT
2
Entering edit mode
@james-w-macdonald-5106
Last seen 23 minutes ago
United States

One of the things you have to learn if you want to use R successfully is to pay attention to error messages. This one seems particularly clear to me:

Error in makeContrasts(SLE - control, levels = design) : 
  The levels must by syntactically valid names in R, see help(make.names).  Non-valid names: rnagroup: Healthy control of SLE,rnagroup: SLE

Which says that the levels for your groups have to be syntactically valid names, and says to look at the help for make.names. It even tells you which of the names aren't syntactically valid. Another way to put it would be to say that you have names that R can't use, and to use make.names to fix them so R can use them.

Do note that make.names is just going to take out all the spaces and replace with a '.', so it's often easier to just do it yourself, or to change to names that are syntactically valid. You can always see if a name is syntactically valid using make.names.

> make.names("is this syntactically valid?")
[1] "is.this.syntactically.valid."
> make.names("I_bet_this_one_is")
[1] "I_bet_this_one_is"
ADD COMMENT
0
Entering edit mode

Sorry, I am a newbie. I would also like to ask how I use the make.names function to change "group: SLE", "group: Healthy control of SLE" to "SLE" and "control" in the entire summaryData.norm dataset.

ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia

There are lots and lots of problems with your code. Many of the code steps that you give would give errors. For example,

as(gse, "ExpressionSetIllumina")

will fail because this is no method for coercing an ExpressionSet into an ExpressionSetIllumina (nor is there any need to convert the object anyway). Later on,

normaliseIllumina(summaryData,method="neqc"

will fail because "neqc" is not a normalization method provided by the normaliseIllumina() function (as the help page for that function will tell you!).

I don't know where you are getting code from, but you are making the whole analysis process unnecessarily complicated. The expression data provided by getGEO() is already background corrected and normalized, so there is no need for the beadarray package at all. beadarray is designed for low-level processing of raw Illumina data, but such processing has already been done. It is not correct to reprocess normalized data as if it was raw data.

The analysis you are trying to do is actually reasonably easy. You can delete all the code you have now and replace with:

> gse <- getGEO(filename=filenm)
> group <- pData(gse)$characteristics_ch1
> levels(group) <- c("Control","SLE")
> design <- model.matrix(~group)
> fit <- lmFit(gse,design)
> fit <- eBayes(fit,robust=TRUE)
> topTable(fit,coef=2)

James MacDonald will be cross with me because I have given you the solution instead of encouraging you to solve it for yourself, but you are so far down the wrong road that I feel I need to force you back in one big step.

ADD COMMENT
0
Entering edit mode

Thank you very much, my question has been solved according to your suggestion. Your code is very concise to use. I hope I can be like you one day.

ADD REPLY

Login before adding your answer.

Traffic: 1616 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6