I am trying to sub-select a set of CEL files while creating a manifest file (which also serves as the phenotype data). This is how I am going about it:
The gds download gives about 536 files, but I wish to read only, say, 119 of them, as listed in "sub-set.tsv".
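library(GEOquery)  # provides getGEO()
library(affy)      # provides list.celfiles() and ReadAffy()
gds <- getGEO('GSE86952', destdir = ".")
# Manifest / phenotype table listing the CEL files I want to keep
tab <- read.delim("sub-set.tsv", check.names = FALSE, as.is = TRUE)
rownames(tab) <- tab$filenames
fns <- list.celfiles()  # all CEL files in the working directory
fns %in% tab[, 1]       # check which CEL files are listed in the table
rawdata <- ReadAffy(phenoData = tab)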
I am unable to sub-select just the data I need. Any suggestions are much appreciated.
As an example, say I want to use only the following CEL files and not all 536. I tried saving the list given below as files.tsv instead of sub-set.tsv and loading that, but it didn't work.
The error is:
> rawdata<- ReadAffy(phenoData = tab)
Error in `sampleNames<-`(`*tmp*`, value = c("1", "2", "3", "4", "5", "6", :
number of new names (118) should equal number of rows in AnnotatedDataFrame (536)
In addition: Warning messages:
1: Mismatched phenoData and celfile names!
Please note that the row.names of your phenoData object should be identical to what you get from list.celfiles()!
Otherwise you are responsible for ensuring that the ordering of your phenoData object conforms to the ordering of the celfiles as they are read into the AffyBatch!
If not, errors may result from using the phenoData for subsetting or creating linear models, etc.
2: In read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
Incompatible phenoData object. Created a new one.
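From the message it looks as if ReadAffy() still picks up all 536 CEL files in the directory, because the call above never tells it which files to read. One thing that might work is to pass the wanted file names explicitly and reorder the manifest to match. Below is a minimal sketch of that idea, not a tested solution. It assumes the table has a column called filenames whose entries match what list.celfiles() returns exactly. The file names and the group column are made up for illustration, and the final ReadAffy() call is left commented out because it needs the real CEL files.

```r
# Toy stand-ins (hypothetical names) for list.celfiles() and sub-set.tsv
fns <- c("GSM1.CEL", "GSM2.CEL", "GSM3.CEL", "GSM4.CEL")
tab <- data.frame(filenames = c("GSM3.CEL", "GSM1.CEL"),
                  group = c("case", "control"),
                  stringsAsFactors = FALSE)

# Stop early if the manifest names a file that is not on disk
missing <- setdiff(tab$filenames, fns)
if (length(missing) > 0) {
  stop("Listed in the manifest but not found: ", paste(missing, collapse = ", "))
}

# Keep only the wanted files, in the order list.celfiles() returned them
keep <- fns[fns %in% tab$filenames]

# Reorder the manifest rows so that row i describes file i
pheno <- tab[match(keep, tab$filenames), , drop = FALSE]
rownames(pheno) <- pheno$filenames

# With affy loaded and the real files in place (assumption, untested here):
# rawdata <- ReadAffy(filenames = keep, phenoData = AnnotatedDataFrame(pheno))
```

With the toy data, keep ends up holding GSM1.CEL and GSM3.CEL, and pheno has one row per kept file in the same order, so the number of sample names and the number of phenoData rows agree.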