How to annotate mir4.1 arrays?

Dear list.

I am analyzing an Affymetrix mir 4.1 dataset using the pd.mirna.4.1 file obtained by the instructions in the following post:

Affymetrix miRNA4.1 / oligo package / pd.mirna.4.1


I am getting only probeset IDs, but not miR names or Entrez Gene IDs. Here is my session:

> library(oligo)
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, aperm, append, basename, cbind, colnames, dirname,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
    mapply, match, mget, order, paste, pmax, pmin, Position, rank, rbind, Reduce,
    rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which.max, which.min

Loading required package: oligoClasses
Welcome to oligoClasses version 1.60.0
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: Biostrings
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    expand.grid, I, unname

Loading required package: IRanges
Loading required package: XVector
Loading required package: GenomeInfoDb

Attaching package: ‘Biostrings’

The following object is masked from ‘package:base’:


Welcome to oligo version 1.62.2
> library(affycoretools)
Registered S3 method overwritten by 'GGally':
  method from   ggplot2

> library(limma)

Attaching package: 'limma'

The following object is masked from 'package:oligo':


The following object is masked from 'package:BiocGenerics':

> library(pd.mirna.4.1)
Loading required package: RSQLite
Loading required package: DBI
> celfiles  <-  list.celfiles("data",full.names=TRUE)
> raw<-  read.celfiles(celfiles,pkgname="pd.mirna.4.1")
Platform design info loaded.
Reading in : data/a1.ctr.exo.fadu.CEL

Reading in : data/e3.tgfb.exo.fadu.CEL
> probeset.eset<-annotateEset(probeset.eset, pd.mirna.4.1, columns = c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"))
Error: There is no annotation object provided with the pd.mirna.4.1 package.

> sessionInfo( )
R version 4.2.3 (2023-03-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pd.mirna.4.1_0.1     DBI_1.1.3            RSQLite_2.3.1        limma_3.54.2         affycoretools_1.70.0
 [6] oligo_1.62.2         Biostrings_2.66.0    GenomeInfoDb_1.34.9  XVector_0.38.0       IRanges_2.32.0      
[11] S4Vectors_0.36.2     Biobase_2.58.0       oligoClasses_1.60.0  BiocGenerics_0.44.0 

loaded via a namespace (and not attached):
  [1] backports_1.4.1             GOstats_2.64.0              Hmisc_5.0-1                
  [4] BiocFileCache_2.6.1         plyr_1.8.8                  lazyeval_0.2.2             
  [7] GSEABase_1.60.0             splines_4.2.3               BiocParallel_1.32.6        
 [10] ggplot2_3.4.2               digest_0.6.31               foreach_1.5.2              
 [13] ensembldb_2.22.0            htmltools_0.5.5             GO.db_3.16.0               
 [16] fansi_1.0.4                 magrittr_2.0.3              checkmate_2.2.0            
 [19] memoise_2.0.1               BSgenome_1.66.3             cluster_2.1.4              
 [22] gcrma_2.70.0                annotate_1.76.0             matrixStats_0.63.0         
 [25] R.utils_2.12.2              ggbio_1.46.0                prettyunits_1.1.1          
 [28] colorspace_2.1-0            blob_1.2.4                  rappdirs_0.3.3             
 [31] xfun_0.39                   dplyr_1.1.2                 crayon_1.5.2               
 [34] RCurl_1.98-1.12             jsonlite_1.8.4              graph_1.76.0               
 [37] genefilter_1.80.3           survival_3.5-5              VariantAnnotation_1.44.1   
 [40] iterators_1.0.14            glue_1.6.2                  gtable_0.3.3               
 [43] zlibbioc_1.44.0             DelayedArray_0.24.0         Rgraphviz_2.42.0           
 [46] scales_1.2.1                GGally_2.1.2                edgeR_3.40.2               
 [49] Rcpp_1.0.10                 xtable_1.8-4                progress_1.2.2             
 [52] htmlTable_2.4.1             foreign_0.8-84              bit_4.0.5                  
 [55] OrganismDbi_1.40.0          preprocessCore_1.60.2       Formula_1.2-5              
 [58] AnnotationForge_1.40.2      htmlwidgets_1.6.2           httr_1.4.5                 
 [61] gplots_3.1.3                RColorBrewer_1.1-3          ff_4.0.9                   
 [64] R.methodsS3_1.8.2           pkgconfig_2.0.3             reshape_0.8.9              
 [67] XML_3.99-0.14               nnet_7.3-19                 dbplyr_2.3.2               
 [70] locfit_1.5-9.7              utf8_1.2.3                  tidyselect_1.2.0           
 [73] rlang_1.1.1                 reshape2_1.4.4              AnnotationDbi_1.60.2       
 [76] munsell_0.5.0               tools_4.2.3                 cachem_1.0.8               
 [79] cli_3.6.1                   generics_0.1.3              evaluate_0.20              
 [82] stringr_1.5.0               fastmap_1.1.1               yaml_2.3.7                 
 [85] knitr_1.42                  bit64_4.0.5                 caTools_1.18.2             
 [88] KEGGREST_1.38.0             AnnotationFilter_1.22.0     RBGL_1.74.0                
 [91] R.oo_1.25.0                 xml2_1.3.4                  biomaRt_2.54.1             
 [94] compiler_4.2.3              rstudioapi_0.14             filelock_1.0.2             
 [97] curl_5.0.0                  png_0.1-8                   affyio_1.68.0              
[100] PFAM.db_3.16.0              tibble_3.2.1                geneplotter_1.76.0         
[103] stringi_1.7.12              Glimma_2.8.0                GenomicFeatures_1.50.4     
[106] lattice_0.21-8              ProtGenerics_1.30.0         Matrix_1.5-4               
[109] vctrs_0.6.2                 pillar_1.9.0                lifecycle_1.0.3            
[112] BiocManager_1.30.20         data.table_1.14.8           bitops_1.0-7               
[115] rtracklayer_1.58.0          GenomicRanges_1.50.2        affy_1.76.0                
[118] hwriter_1.3.2.1             R6_2.5.1                    BiocIO_1.8.0               
[121] KernSmooth_2.23-21          gridExtra_2.3               affxparser_1.70.0          
[124] codetools_0.2-19            dichromat_2.0-0.1           gtools_3.9.4               
[127] SummarizedExperiment_1.28.0 DESeq2_1.38.3               Category_2.64.0            
[130] rjson_0.2.21                ReportingTools_2.38.0       GenomicAlignments_1.34.1   
[133] Rsamtools_2.14.0            GenomeInfoDbData_1.2.9      parallel_4.2.3             
[136] hms_1.1.3                   grid_4.2.3                  rpart_4.1.19               
[139] rmarkdown_2.21              MatrixGenerics_1.10.0       biovizBase_1.46.0          
[142] base64enc_0.1-3             restfulr_0.0.15            

How do I get the miR symbols and Entrez Gene IDs corresponding to the probeset IDs?

Thanks and best wishes,

Richard Friedman.

Columbia University Cancer Center

AffymetrixChip miRNA oligo • 436 views

Dear List,

I ended up reading in the annotation csv file from Affy and subsetting it, and merging it with the toptable file from limma.
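
For anyone taking the same route, here is a minimal sketch of that merge in base R. The two data frames are toy stand-ins: `anno` plays the role of the annotation CSV after subsetting, and `tt` plays the role of the limma topTable output with probeset IDs as row names. The column name `Probe.Set.Name` and all values are assumptions for illustration only; check the headers of your own file.

```r
# Toy stand-in for the annotation CSV after read.csv(..., comment.char = "#").
# Column names and values are assumptions for illustration.
anno <- data.frame(
  Probe.Set.Name = c("14q0_st", "14qI-1_st", "14qI-1_x_st"),
  Accession = c("14q0", "14qI-1", "14qI-1"),
  Species.Scientific.Name = rep("Homo sapiens", 3),
  stringsAsFactors = FALSE
)

# Toy stand-in for topTable() output: probeset IDs are the row names.
tt <- data.frame(
  logFC = c(1.2, -0.4),
  P.Value = c(0.01, 0.30),
  row.names = c("14qI-1_st", "14q0_st")
)

# Move the row names into a real column so merge() has a key to join on.
tt$Probe.Set.Name <- rownames(tt)

# Left join: keep every row of the results table, add annotation where found.
annotated <- merge(
  tt,
  anno[, c("Probe.Set.Name", "Accession", "Species.Scientific.Name")],
  by = "Probe.Set.Name",
  all.x = TRUE,
  sort = FALSE
)
print(annotated)
```

Using `all.x = TRUE` keeps probesets that have no match in the annotation file; they show up with `NA` in the annotation columns instead of being dropped.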

Best wishes,



You haven't run rma on your data yet, so you cannot annotate it yet. Once you have run rma, you can annotate using the CSV file from ThermoFisher (you will need a login to download it).

> eset <- rma(raw)
## note that you need to specify comment.char!
> anno <- read.csv("TFS-Assets_LSG_Support-Files_miRNA-4_1-st-v1-annotations-20160922-csv/miRNA-4_1-st-v1.annotations.20160922.csv", comment.char = "#")
## keep only columns 2 to 4 of the annotation file
> anno <- anno[,2:4]
## column 1 of anno holds the probeset IDs; columns 2:3 are the annotation to add
> eset <- annotateEset(eset, anno, 1, 2:3)
## et voila!

As an aside, this is all documented in the help page for annotateEset:


     annotateEset(object, x, ...)

     ## S4 method for signature 'ExpressionSet,ChipDb'
     annotateEset(object, x,
       columns = c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
       multivals = "first")

     ## S4 method for signature 'ExpressionSet,AffyGenePDInfo'
     annotateEset(object, x, type = "core", ...)

     ## S4 method for signature 'ExpressionSet,AffyHTAPDInfo'
     annotateEset(object, x, type = "core", ...)

     ## S4 method for signature 'ExpressionSet,AffyExonPDInfo'
     annotateEset(object, x, type = "core", ...)

     ## S4 method for signature 'ExpressionSet,AffyExpressionPDInfo'
     annotateEset(object, x, type = "core", ...)

     ## S4 method for signature 'ExpressionSet,character'
     annotateEset(object, x, ...)

     ## S4 method for signature 'ExpressionSet,data.frame'
     annotateEset(object, x, probecol = NULL, annocols = NULL, ...) <------------- This part here


  object: An ExpressionSet to which we want to add annotation.

       x: Either a ChipDb package (e.g.,
          hugene10sttranscriptcluster.db), or a pdInfoPackage object

     ...: Allow users to pass in arbitrary arguments. Particularly
          useful for passing in columns, multivals, and type arguments
          for methods.

 columns: For ChipDb method; what annotation data to add. Use the
          'columns' function to see what choices you have. By default
          we get the ENTREZID, SYMBOL and GENENAME.

multivals: For ChipDb method; this is passed to 'mapIds' to control how
          1:many mappings are handled. The default is 'first', which
          takes just the first result. Other valid values are 'list'
          and 'CharacterList', which return all mapped results.

    type: For pdInfoPackages; either 'core' or 'probeset',
          corresponding to the 'target' argument used in the call to
          'rma'.

probecol: Column of the data.frame that contains the probeset IDs. Can <---------------- As well as this entry and the following one
          be either numeric (the column number) or character (the
          column header).

annocols: Column(s) of the data.frame to use for annotating. Can be a
          vector of numbers (which column numbers to use) or a
          character vector (vector of column names).
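
To make the probecol and annocols entries concrete, here is a rough base R sketch of the idea: the annotation rows are lined up with the probeset IDs of the ExpressionSet, and the columns can be picked by number or by name. The helper `align_anno` is hypothetical and is not the affycoretools implementation; the data frame and its column names are made up for illustration.

```r
# Toy annotation table; column names and values are assumptions for illustration.
anno <- data.frame(
  Probe.Set.Name = c("14qI-1_st", "14q0_st", "Z17B_st"),
  Accession = c("14qI-1", "14q0", "Z17B"),
  Species.Scientific.Name = "Homo sapiens",
  stringsAsFactors = FALSE
)

# Probeset IDs in the order they would come back from featureNames(eset).
ids <- c("14q0_st", "14qI-1_st")

# Hypothetical helper: reorder the annotation rows to match `ids`.
# probecol and annocols may be column numbers or column names.
align_anno <- function(ids, anno, probecol, annocols) {
  out <- anno[match(ids, anno[[probecol]]), annocols, drop = FALSE]
  rownames(out) <- ids
  out
}

by_number <- align_anno(ids, anno, 1, 2:3)
by_name <- align_anno(ids, anno, "Probe.Set.Name",
                      c("Accession", "Species.Scientific.Name"))

# Both ways of naming the columns select the same annotation.
print(identical(by_number, by_name))
print(by_number)
```

Column names are usually the safer choice, since a numeric index silently points at the wrong column if the CSV layout changes or the data frame has been subsetted first.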

A useful thing to do is to include the species as well.

> anno <- read.csv("TFS-Assets_LSG_Support-Files_miRNA-4_1-st-v1-annotations-20160922-csv/miRNA-4_1-st-v1.annotations.20160922.csv", comment.char = "#")
> eset <- annotateEset(eset, anno, 2, c(3,4,6))
> head(fData(eset))
            Accession Transcript.ID.Array.Design. Species.Scientific.Name
14q0_st          14q0                        14q0            Homo sapiens
14qI-1_st      14qI-1                      14qI-1            Homo sapiens
14qI-1_x_st    14qI-1                      14qI-1            Homo sapiens
14qI-2_st      14qI-2                      14qI-2            Homo sapiens
14qI-3_x_st    14qI-3                      14qI-3            Homo sapiens
14qI-4_st      14qI-4                      14qI-4            Homo sapiens

## assuming you care only about Homo sapiens
> esetsmall <- eset[fData(eset)[,3] %in% "Homo sapiens",]
> esetsmall
ExpressionSet (storageMode: lockedEnvironment)
assayData: 6631 features, 4 samples 
  element names: exprs 
protocolData
  rowNames: GSM4509143_CAMC_V_Replicate1.CEL.gz
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM4509143_CAMC_V_Replicate1.CEL.gz
  varLabels: index
  varMetadata: labelDescription channel
featureData
  featureNames: 14q0_st 14qI-1_st ... Z17B_st (6631 total)
  fvarLabels: Accession Transcript.ID.Array.Design. Species.Scientific.Name
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: pd.mirna.4.0


I just saw this.

Thanks as always,

