Filtering of lowly expressed probes in HTA 2.0 using new pd.hta.2.0 version 3.12.2
relathman ▴ 20
@relathman-11472
Last seen 6.4 years ago
Germany

Dear Community,

As described in this post (C: Appropriate pre-processing pipeline for Human Transcriptome Array HTA 2.0 with o), I would like to plot the expression distributions of the main, antigenomic and intronic probesets on an HTA 2.0 array in order to choose an expression cutoff that separates expressed from unexpressed probesets.

According to the following type definition of pd.hta.2.0, main probesets are annotated as type 1, antigenomic probesets as type 2 and intronic probesets as type 7:

> dbGetQuery(db(pd.hta.2.0), "select * from type_dict;")

  type                                              type_id
1    1                                                 main
2    2                       Antigenomic background control
3    3                             control->affx->bac_spike
4    4                           control->affx->polya_spike
5    5 ERCC (External RNA Controls Consortium) step control
6    6      Exonic normalization control (Positive Control)
7    7    Intronic normalization control (Negative Control)
8    8                                     Positive Control
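
A minimal sketch of the kind of plot meant here, written in base R so that it runs without the array data. Everything in it is a simulated stand-in: in practice `expr` would be the summarized expression matrix and `annot` the data frame returned by getMainProbes(). The 95th-percentile cutoff on the antigenomic probesets is only an assumed rule for illustration, not something taken from this thread.

```r
# Simulated stand-ins (NOT real HTA 2.0 data).
set.seed(1)
n_main <- 500; n_anti <- 23; n_intron <- 100
annot <- data.frame(
  transcript_cluster_id = paste0("ps", seq_len(n_main + n_anti + n_intron)),
  type = rep(c(1L, 2L, 7L), c(n_main, n_anti, n_intron))
)
# Build each probeset group separately so rows line up with `annot`.
expr <- rbind(
  matrix(rnorm(n_main * 4, mean = 6, sd = 2), ncol = 4),
  matrix(rnorm(n_anti * 4, mean = 3, sd = 0.5), ncol = 4),
  matrix(rnorm(n_intron * 4, mean = 3.5, sd = 0.8), ncol = 4)
)
dimnames(expr) <- list(annot$transcript_cluster_id, paste0("sample", 1:4))

# Match annotation rows to expression rows by probeset ID.
type <- annot$type[match(rownames(expr), annot$transcript_cluster_id)]

# Per-probeset median across samples, split by annotation type.
med <- apply(expr, 1, median)
by_type <- split(med, type)

# Assumed cutoff rule: 95th percentile of the antigenomic probesets (type 2).
cutoff <- unname(quantile(by_type[["2"]], 0.95))

# Overlay the three densities and mark the cutoff.
dens <- lapply(by_type[c("1", "2", "7")], density)
plot(dens[["1"]], main = "Median expression by probeset type",
     xlim = range(med), ylim = c(0, max(sapply(dens, function(d) max(d$y)))))
lines(dens[["2"]], lty = 2)
lines(dens[["7"]], lty = 3)
abline(v = cutoff, col = "red")
legend("topright", c("main (1)", "antigenomic (2)", "intronic (7)"), lty = 1:3)

# Main probesets whose median expression exceeds the cutoff.
keep <- med > cutoff & type == 1
```

Where the curves for types 2 and 7 separate from the bulk of type 1 is what would guide the choice of cutoff; the quantile rule is just one way to turn that into a number.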

However, there seems to be a problem with the current version of the pd.hta.2.0 package (version 3.12.1): when I use affycoretools::getMainProbes(), the only type present is 1, and everything else is annotated as NA, even though the array contains antigenomic probesets (whose transcript cluster IDs start with "AFFX").

> z <- getMainProbes("pd.hta.2.0")
> table(z$type)
    1
67516
> z[z$type %in% 2,]
[1] transcript_cluster_id type                
<0 rows> (or 0-length row.names)
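
Note that table() drops NA by default, so the count above does not show how many probesets lack a type. A small base-R sketch on a toy data frame makes them visible; the IDs and values are invented for illustration, and only the column names match the getMainProbes() output shown above.

```r
# Toy stand-in for the data frame returned by getMainProbes("pd.hta.2.0");
# the IDs and types below are invented.
z <- data.frame(
  transcript_cluster_id = c("TC0001", "TC0002", "AFFX-ctrl-1", "AFFX-ctrl-2"),
  type = c(1L, 1L, NA, NA),
  stringsAsFactors = FALSE
)

# useNA = "ifany" adds an <NA> entry so unannotated probesets are counted.
counts <- table(z$type, useNA = "ifany")
print(counts)

# Untyped probesets whose ID starts with "AFFX" (prefix as described in the post).
affx_untyped <- z[is.na(z$type) & startsWith(z$transcript_cluster_id, "AFFX"), ]
print(affx_untyped)
```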

I read in this post (C: problems filtering antigenomic probes from HTA 2.0 , written 5 months ago) that an updated version of the pd.hta.2.0 package (version 3.12.2) will fix this. When will it be released, and is it possible to get a pre-release version?

I would be very grateful for any help.

Best

Rukeia

pd.hta.2.0 hta2.0 affycoretools • 1.6k views
@james-w-macdonald-5106
Last seen 13 hours ago
United States

I think this fell through the cracks. The updated pdInfo package has been pushed to the devel server, and is being pushed to the release server. It should appear tomorrow.


The corrected package is available now:

> table(getMainProbes("pd.hta.2.0")$type)

    1     2     3     4     5     6     7
67516    23     4     4   155   698   646
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /data/oldR/R-3.4.0/lib64/R/lib/libRblas.so
LAPACK: /data/oldR/R-3.4.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] affycoretools_1.49.4 pd.hta.2.0_3.12.2    DBI_0.7       
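
To confirm that a local installation already has the fix, the installed version can be compared against 3.12.2. A minimal sketch; `needs_update` is a hypothetical helper written for this example and is not part of affycoretools or pd.hta.2.0.

```r
# Hypothetical helper: TRUE if the given version predates the fixed release.
needs_update <- function(installed, fixed = "3.12.2") {
  package_version(installed) < package_version(fixed)
}

needs_update("3.12.1")  # TRUE: the version reported as broken in this thread
needs_update("3.12.2")  # FALSE: the corrected version

# Check the locally installed annotation package, if there is one.
# system.file() returns "" when the package is not installed.
if (nzchar(system.file(package = "pd.hta.2.0"))) {
  v <- as.character(utils::packageVersion("pd.hta.2.0"))
  if (needs_update(v)) {
    message("pd.hta.2.0 ", v, " predates the fix; consider updating.")
  }
}
```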

Great, it works now! Thank you very much for your help.
