Question

The error when using beadarry and limma packages for analysing illumina chip

0

Entering edit mode

261092095 ▴ 10

@261092095-16052

Last seen 5.6 years ago

When I used beadarry and limma packages to perform differential gene expression analysis on the GSE49454 data set, I stuck to this group name and tried a lot of methods.

url<-"ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE49nnn/GSE49454/matrix/"

filenm<-"GSE49454_series_matrix.txt.gz"
if(!file.exists("GSE49454_series_matrix.txt.gz")) download.file(paste(url, filenm, sep=""), destfile=filenm)
gse <- getGEO(filename=filenm)

summaryData <- as(gse, "ExpressionSetIllumina")
rna <- factor(pData(summaryData)[,"characteristics_ch1"])
fData(summaryData)$Status <-
ifelse(fData(summaryData)$PROBEQUALITY=="No match","negative","regular" )
Detection(summaryData) <- calculateDetection(summaryData,
status=fData(summaryData)$Status)
#normalization
summaryData.norm <- normaliseIllumina(summaryData,method="neqc",
status=fData(summaryData)$Status)
group <- pData(summaryData.norm)[ ,"characteristics_ch1"]

limmaRes <- limmaDE(summaryData, SampleGroup="characteristics_ch1")

design <- model.matrix(~0+rna)

colnames(design) <- levels(rna)
aw <- arrayWeights(exprs(summaryData.norm), design)
aw
fit <- lmFit(exprs(summaryData.norm), design, weights=aw)
contrasts <- makeContrasts(group: SLE-group: Healthy control of SLE, levels=design)
contr.fit <- eBayes(contrasts.fit(fit, contrasts))
topTable(contr.fit, coef=1)

Error in makeContrasts(SLE - control, levels = design) :
The levels must by syntactically valid names in R, see help(make.names). Non-valid names: rnagroup: Healthy control of SLE,rnagroup: SLE

limma beadarray illumina • 1.3k views

ADD COMMENT • link updated 5.9 years ago by Gordon Smyth 50k • written 5.9 years ago by 261092095 ▴ 10

Gordon Smyth · Answer 1 · 2018-06-07

One of the things you have to learn if you want to use R successfully is to pay attention to error messages. This one seems particularly clear to me:

Error in makeContrasts(SLE - control, levels = design) : 
  The levels must by syntactically valid names in R, see help(make.names).  Non-valid names: rnagroup: Healthy control of SLE,rnagroup: SLE

Which says that the levels for your groups have to be syntactically valid names, and says to look at the help for make.names. It even tells you which of the names aren't syntactically valid. Another way to put it would be to say that you have names that R can't use, and to use make.names to fix them so R can use them.

Do note that make.names is just going to take out all the spaces and replace with a '.', so it's often easier to just do it yourself, or to change to names that are syntactically valid. You can always see if a name is syntactically valid using make.names.

> make.names("is this syntactically valid?")
[1] "is.this.syntactically.valid."
> make.names("I_bet_this_one_is")
[1] "I_bet_this_one_is"

score 0 · Answer 2 · 2018-06-08

There are lots and lots of problems with your code. Many of the code steps that you give would give errors. For example,

as(gse, "ExpressionSetIllumina")

will fail because this is no method for coercing an ExpressionSet into an ExpressionSetIllumina (nor is there any need to convert the object anyway). Later on,

normaliseIllumina(summaryData,method="neqc"

will fail because "neqc" is not a normalization method provided by the normaliseIllumina() function (as the help page for that function will tell you!).

I don't know where you are getting code from, but you are making the whole analysis process unnecessarily complicated. The expression data provided by getGEO() is already background corrected and normalized, so there is no need for the beadarray package at all. beadarray is designed for low-level processing of raw Illumina data, but such processing has already been done. It is not correct to reprocess normalized data as if it was raw data.

The analysis you are trying to do is actually reasonably easy. You can delete all the code you have now and replace with:

> gse <- getGEO(filename=filenm)
> group <- pData(gse)$characteristics_ch1
> levels(group) <- c("Control","SLE")
> design <- model.matrix(~group)
> fit <- lmFit(gse,design)
> fit <- eBayes(fit,robust=TRUE)
> topTable(fit,coef=2)

James MacDonald will be cross with me because I have given you the solution instead of encouraging you to solve it for yourself, but you are so far down the wrong road that I feel I need to force you back in one big step.