Question: Iteration through a list failing
maxglycine wrote, 17 months ago:

All:

This is probably a newbie question, but I am trying to iterate through a list (matrix?) and it fails with:

"Error in gsedf[i, 1] : object of type 'S4' is not subsettable"

This is strange because I have another script that works perfectly with almost the same code. I am using GEOquery to extract the list of GSM files that make up a GSE set. gsedf is the list of GSE accessions to download and process. This works in my other script but fails with the error message above in the script below. I don't know whether the failure is really due to iterating through the list or to processing each GSM file from the GSE set to extract its data table. The script does process the first GSE accession in the list; it apparently fails when it starts on the second. I have reordered the list, and it still processes whichever GSE comes first and then fails with the same error before going on to the second. Thanks for any help.

Code follows:

#!/usr/bin/Rscript
library("GEOquery")
library("Biobase")
#options(warn = 1)
# get the filename of GSE numbers as argument
args = commandArgs(trailingOnly = TRUE)
# assign filename to f
f = args[1]
f
# read in the table of filenames
gselist <- list(read.table(f, header=FALSE, sep="", quote=""));
gselist
# make a dataframe from gselist
gsedf <- data.frame(gselist)
gsedf
y=nrow(gsedf)
print(paste("GSElist Length", y))

#this works perfectly to see if it was an iteration problem
for(i in 1:y){
  gsename <- gsedf[i,]
  print(paste("gsename=",gsename,sep=""))
}

# This is the part that fails
# iterate through each GSE
for (i in 1:y){
        gsename <- gsedf[i,1]
        print(paste("GSEno=",i,"Name=",gsename))
        # get the gds object
        gsedf<-getGEO(gsename, GSEMatrix=FALSE, destdir=".")
        gsmnamesdf<-data.frame(names(GSMList(gsedf))) #get list of sample names
                                                      #and make a data frame
        gsefile=paste(gsename, "-samples.txt", sep="")#make output file name for sample
                                                      #list
        print("Writing Samplename file")
        write.table(gsmnamesdf, file=gsefile, sep="\t") #write list of samples
         z=nrow(gsmnamesdf) #get number of sample names
         print(paste("Number of Samples=",z))
        for (j in 1:z){ #iterate through each sample name
          print(paste("SampleNum=",j,sep=""))
          gsm = gsmnamesdf[j,] #sample name of j sample
          print(paste("Sample Name=",gsm))
          gsmdf <- getGEO(gsm, destdir=".") #get sample
          outfile=paste(gsm,"table.txt",sep="-")#make sample output file name
          tabledf <- Table(gsmdf)#extract data table from the GSM accession
          print(paste("Printing",outfile))
          write.table(tabledf, file=outfile, sep="\t")#write the table to file
          # clean up
          softfile=paste(gsm,".soft", sep="")# the softfile name
          gzfile=paste(gsename, ".soft.gz", sep="")# the zipped softfile name
          print(paste("Deleting",softfile, sep=" "))
          file.remove(softfile) #delete the soft file
          file.remove(gzfile) #delete the zipped file
        }

}
print("Ended processing")
q()

 

James W. MacDonald (United States) wrote, 17 months ago:

The obvious answer is that you are ending up with an S4 object that doesn't have a '[' function specified, where in fact you are expecting something else. This is really a programming problem, not a Bioconductor support site issue. In other words, it's your script that is failing, and it's not because GEOquery has a bug - your script has the bug.

But there are some weird things here. For instance

gselist <- list(read.table(f, header=FALSE, sep="", quote=""));
gselist
# make a dataframe from gselist
gsedf <- data.frame(gselist)

You read in a file (into a data.frame), convert it immediately to a list, and then convert back into a data.frame. All this coercion may well be doing something you aren't expecting, and appears unnecessary.
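As a minimal sketch of what skipping that round trip could look like: read.table() already returns a data.frame, so its result can be used directly. The temporary file and the accession names below are hypothetical placeholders, not the poster's actual input.

```r
# Hypothetical input: a temporary file with one GSE accession per line.
f <- tempfile(fileext = ".txt")
writeLines(c("GSE0001", "GSE0002"), f)

# read.table() returns a data.frame, so no list()/data.frame() round trip is needed.
gsedf <- read.table(f, header = FALSE, sep = "", quote = "",
                    stringsAsFactors = FALSE)
str(gsedf)

# Pull the first column out as a plain character vector for iteration.
gsenames <- gsedf[[1]]
print(gsenames)
```

Iterating over a character vector such as gsenames also avoids any question about what row indexing of a one-column data.frame returns.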

Then you 'test' your loop doing

#this works perfectly to see if it was an iteration problem
for(i in 1:y){
  gsename <- gsedf[i,]
  print(paste("gsename=",gsename,sep=""))
}

Which is close, but not exactly the same as

for (i in 1:y){
        gsename <- gsedf[i,1]

and the error you get is when you do gsedf[i,1], but not when you do gsedf[i,]. So the test loop doesn't actually exercise the expression that fails. The easy play is to just build gsedf, look at what you have in each row, and check whether you are dealing with unsubsettable S4 objects (class() and isS4() will tell you).
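One reading of the posted script that would be consistent with the symptoms (first accession works, second fails): inside the failing loop, the line gsedf<-getGEO(gsename, GSEMatrix=FALSE, destdir=".") assigns the getGEO() result to gsedf, the same name that holds the accession table. The sketch below reproduces that pattern without contacting GEO. The FakeGSE class and the accession names are hypothetical stand-ins, not GEOquery objects.

```r
library(methods)

# Stand-in for the accession table (hypothetical accession names).
gsedf <- data.frame(V1 = c("GSE0001", "GSE0002"), stringsAsFactors = FALSE)

# Inspect what each form of indexing returns before the loop runs.
print(class(gsedf[1, ]))
print(class(gsedf[1, 1]))

# Hypothetical S4 class standing in for the object returned by getGEO().
setClass("FakeGSE", representation(name = "character"))

# Reusing the name gsedf replaces the data.frame with an S4 object.
gsedf <- new("FakeGSE", name = "GSE0001")
print(isS4(gsedf))

# The next iteration's gsedf[i, 1] now fails; capture the message instead of stopping.
res <- tryCatch(gsedf[2, 1], error = function(e) conditionMessage(e))
print(res)
```

Printing class(gsedf) at the top of each iteration of the real loop would show whether this is what is happening.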

Anyway, given that this is your code, you shouldn't expect any support here. If you are going to write scripts, you have to learn to debug them yourself.
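For what it's worth, a minimal sketch of a loop shape that makes this kind of bug easier to catch: keep the loop input under its own name, give the downloaded object a different name, and assert the input's type on every pass. fetch_gse() is a hypothetical stand-in for getGEO() so the example runs offline, and the accession and sample names are made up.

```r
# Hypothetical stand-in for getGEO(); returns a plain list instead of contacting GEO.
fetch_gse <- function(accession) {
  list(accession = accession,
       samples = paste0(accession, "_GSM", 1:2))
}

gsenames <- c("GSE0001", "GSE0002")  # hypothetical accessions

processed <- character(0)
for (gsename in gsenames) {
  # Guard: stop with a clear message if the loop input was overwritten.
  stopifnot(is.character(gsenames))

  # Separate name for the fetched object; gsenames is never reassigned.
  gse <- fetch_gse(gsename)

  # seq_along() is safe for empty sample lists, unlike 1:z when z is 0.
  for (k in seq_along(gse$samples)) {
    processed <- c(processed, gse$samples[k])
  }
}
print(length(processed))
```

In the real script, the inner body would call getGEO() and Table() where this sketch only collects names.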
