Hi there,
I've got an Agilent expression dataset which I'm trying to analyse using limma.
Here are the issues:
1. I cannot read.maimages(agilentdataset) as suggested in the user guide. That prompts the following error:
Error in readGenericHeader(fullname, columns = columns, sep = sep) : Specified column headings not found in file In addition: Warning message: In readLines(con, n = 1) : incomplete final line found on 'GSE85426_raw_data_original2.txt'
2. I seem to be able to only add the annotation from a separate annotation file by merging the data.frames. However, then I'm stuck at n.sv and lmfit() which it won't let me do, because the data.frame isn't a suitable format.
n.sv = num.sv(norm_merged,mod,method="leek") Error in diag(dims[2]) - mod %*% solve(t(mod) %*% mod) %*% t(mod) : non-conformable arrays
3. I've successfully turned the data set into an ExpressionSet using
eset <- ExpressionSet(exprs, phenoData=pdata)
However, olgio::rma() won't work on this. If I normalise and bg correct it using limma's features, I get data frames which once again won't work with lmfit()
4. I've also tried library(coRNAi) to turn it into a working file for lmfit.
norm_merged_d2 <- df2lmFit(norm_merged) Error in df2lmFit(norm_merged) : no residuals, need to estimate main effects and update dataframe
All in all, I cannot seem to find a suitable workaround to normalize and subsequently combat and lmfit this dataset, let alone successfully add the annotation file. The annotation is however the least of my worries right now, since merging the data frames by row.names does work and hence could be done after analysis to find out the names of the top 1000 DE genes.
Does anyone have a suitable workaround of this? I apologise for the long description. I've been trying stuff on this for 3 days now, so appreciate any and all hints/help/links.
Hi Axel, thanks for the info above.
Let's start with just reading the data in. This is what I do and get:
GSE85426 appears to be Cy3/green only. Have you tried
x <- read.maimages("GSE85426_raw_data.txt",source="agilent", green.only = TRUE)
?
Yes, same error code.
Ok, I have downloaded that file and taken a look at it and it is not Agilent format at all :-( but
rather a tab-delimited textfile with one column <slide_id>_<array_position>.txt:gProcessedSignal
per sample.
You *could* try to read it using read.delim() but I do not see any advantage over using
library("GEOquery")
x <- getGEO("GSE85426")
and then x[[1]] is a nice and useful ExpressionSet (thanks to Sean Davis).
Hi Axel, first of all thanks for taking the time to check this out.
The above did work for me, but downloaded just the SDRF file. There's actually no expression data in that expression set. When I try to fetch the actual data set it returns:
Just FYI, this seemed to work:
eset <- ExpressionSet(assayData=eset, phenotData=pheno)
It then just leaves the question of how to annotate given that there is not annotation package for this Agilent chip and all I have is a nice txt file with the IDs and matching Gene names. All package builders require some sort of db package which I'm not sure exists...