Using function select
@ekl-22229

I am using affy to annotate feature data in R, after installing the affy package. When I run the following, R reports that the select function cannot be found. Thank you.

anno<-select(tomatocdf.db, 
+ keys= (featureNames(eset)),
+ columns= c("SYMBOL", "GENENAME"),
+ keytype="PROBEID")
Error in select(tomatocdf.db, keys = (featureNames(eset)), columns = c("SYMBOL",  : 
  could not find function "select"

@james-w-macdonald-5106

Any time you get an error that says

could not find function "select"

it means the package that contains the function has not been loaded yet. In addition, you appear to be trying to use a non-existent package called 'tomatocdf.db', which is a concatenation of an actual package (tomatocdf) and .db; no such package exists. There isn't an annotation package for that array, so you will have to decide how interested you are in getting annotations.
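As a rough illustration of the first point, here is one way to check which attached package, if any, currently provides a function. The helper name where_is is made up for this sketch; utils::find and requireNamespace are base R. The last part assumes AnnotationDbi may or may not be installed and only attaches it if it is.

```r
# Hypothetical helper: report which attached packages provide a function.
where_is <- function(fname) {
  hits <- utils::find(fname, mode = "function")
  if (length(hits) == 0) "not on the search path" else hits
}

where_is("median")   # median() lives in stats, which is attached by default
where_is("select")   # reports the search-path entry, or that none provides it

# AnnotationDbi is the Bioconductor package that provides select();
# attach it only if it is installed, then look again.
if (requireNamespace("AnnotationDbi", quietly = TRUE)) {
  library(AnnotationDbi)
  print(where_is("select"))
}
```

If the function is reported as not on the search path, loading the package that contains it with library() is the fix for the "could not find function" error.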

The cheap and easy way to do it is to leverage somebody else's work:

> library(GEOquery)
> z <- getGEO("GSE125476")[[1]]
Found 1 file(s)
GSE125476_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE125nnn/GSE125476/matrix/GSE125476_series_matrix.txt.gz'
Content type 'application/x-gzip' length 1247686 bytes (1.2 MB)
downloaded 1.2 MB

> z
ExpressionSet (storageMode: lockedEnvironment)
assayData: 10209 features, 30 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM3574770 GSM3574771 ... GSM3574799 (30 total)
  varLabels: title geo_accession ... tissue:ch1 (35 total)
  varMetadata: labelDescription
featureData
  featureNames: AFFX-BioB-3_at AFFX-BioB-5_at ...
    RPTR-Les-XXU09476-1_at (10209 total)
  fvarLabels: ID GB_LIST ... SPOT_ID (17 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL4741 

> fData(z)[5000:5002,c(1,2,9,10,11,12)]
                               ID    GB_LIST Representative Public ID
Les.5430.1.S1_at Les.5430.1.S1_at BT013901.1               BT013901.1
Les.5431.1.S1_at Les.5431.1.S1_at BT013903.1               BT013903.1

                                                                                                                                                                      Gene Title
Les.5430.1.S1_at                                                                                                                                    Clone 132898F, mRNA sequence
Les.5431.1.S1_at                                                                                                                                    Clone 132900F, mRNA sequence

                 Gene Symbol ENTREZ_GENE_ID
Les.5430.1.S1_at                           
Les.5431.1.S1_at                           

So you could pop the fData slot out of that GEO dataset and put it in your ExpressionSet, making sure that you have the same row order (so your annotation lines up with the existing data).
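A minimal sketch of the row-alignment step, using toy data in base R. The function name align_annotation and the toy probe IDs are invented for illustration; the idea is just to reorder the annotation table with match() so that row i of the annotation describes feature i of your data, with NA rows for probes that have no annotation.

```r
# Hypothetical helper: reorder an annotation table to follow a vector of IDs.
align_annotation <- function(ids, anno) {
  idx <- match(ids, rownames(anno))   # position of each ID in the annotation
  out <- anno[idx, , drop = FALSE]    # unmatched IDs become all-NA rows
  rownames(out) <- ids                # row names must equal the feature names
  out
}

# Toy annotation table, keyed by probe ID in the row names.
anno <- data.frame(
  ID     = c("a_at", "b_at", "c_at"),
  SYMBOL = c("A", "B", "C"),
  row.names = c("a_at", "b_at", "c_at"),
  stringsAsFactors = FALSE
)

# Toy feature names in a different order, one of them missing from anno.
ids <- c("c_at", "a_at", "x_at")

aligned <- align_annotation(ids, anno)
print(aligned)
```

With real objects, and assuming Biobase is loaded, the same helper could be used along the lines of fData(eset) <- align_annotation(featureNames(eset), fData(z)), where eset is your ExpressionSet and z is the GEO object from above.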

But those annotations are from 2006, so there might be better data out there. You could hypothetically get the annotation CSV from Affy's website and use that as well:

> library(AffyCompatible)
## You need a username and pwd for Affy to download stuff
> rsrc <- NetAffxResource("jmacdon@med.umich.edu", password)
> affxDescription(rsrc[["Tomato"]])
[1] "Annotations, CSV format"         "CDF Library File"               
[3] "CIF Library File"                "PSI Library File"               
[5] "Probe Sequences, FASTA format"   "Probe Sequences, tabular format"
[7] "TAC 4.x Configuration file"      "tac_qcc file" 

> df <- readAnnotation(rsrc, annotation = rsrc[["Tomato", "Annotations, CSV format"]], comment.char = "#")
> names(df)
 [1] "Probe.Set.ID"                     "GeneChip.Array"                  
 [3] "Species.Scientific.Name"          "Annotation.Date"                 
 [5] "Sequence.Type"                    "Sequence.Source"                 
 [7] "Transcript.ID.Array.Design."      "Target.Description"              
 [9] "Representative.Public.ID"         "Archival.UniGene.Cluster"        
[11] "UniGene.ID"                       "Genome.Version"                  
[13] "Alignments"                       "Gene.Title"                      
[15] "Gene.Symbol"                      "Chromosomal.Location"            
[17] "Unigene.Cluster.Type"             "Ensembl"                         
[19] "Entrez.Gene"                      "SwissProt"                       
[21] "EC"                               "OMIM"                            
[23] "RefSeq.Protein.ID"                "RefSeq.Transcript.ID"            
[25] "FlyBase"                          "AGI"                             
[27] "WormBase"                         "MGI.Name"                        
[29] "RGD.Name"                         "SGD.accession.number"            
<snip>
> df[5010:5012,c(1,9,19)]
         Probe.Set.ID Representative.Public.ID Entrez.Gene
5010  Les.544.1.A1_at                 BG630730   101249417
5011 Les.5440.1.S1_at               BT013926.1   101267119


Thank you so much!


What is the package that contains the function select?


select comes from the AnnotationDbi package, which will be loaded whenever you load any annotation package. So for example, let's say there was an annotation package for the tomato array. It would be called tomato.db, and if you were to load that package, AnnotationDbi would automatically get loaded as well, and then select would be available.
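To show what that looks like with an annotation package that does exist, here is a hedged sketch using hgu95av2.db. That package is for a human array and is an assumption chosen purely for illustration; it has nothing to do with the tomato array, and the code skips the lookup if the package is not installed.

```r
# Assumption: hgu95av2.db is used only as an example of an existing
# Bioconductor annotation package; it does not cover the tomato array.
have_pkg <- requireNamespace("hgu95av2.db", quietly = TRUE)

if (have_pkg) {
  # Attaching the annotation package also attaches AnnotationDbi,
  # which is what makes select() and keys() available.
  library(hgu95av2.db)

  # Take a few probe IDs straight from the package rather than typing them.
  probes <- head(keys(hgu95av2.db, keytype = "PROBEID"), 3)

  anno <- select(hgu95av2.db,
                 keys    = probes,
                 columns = c("SYMBOL", "GENENAME"),
                 keytype = "PROBEID")
  print(anno)
} else {
  message("hgu95av2.db is not installed; skipping the select() example.")
}
```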
