Question: Filtering of lowly expressed probes in HTA 2.0 using new pd.hta.2.0 version 3.12.2
relathman (Germany) wrote, 2.3 years ago:

Dear Community,

As described in this post (C: Appropriate pre-processing pipeline for Human Transcriptome Array HTA 2.0 with o), I would like to plot the expression distributions of the main, antigenomic and intronic probesets on an HTA 2.0 array in order to choose an expression cutoff that separates expressed from unexpressed probesets.

According to the following type definition of pd.hta.2.0, main probesets are annotated as type 1, antigenomic probesets as type 2 and intronic probesets as type 7:

> dbGetQuery(db(pd.hta.2.0), "select * from type_dict;")

  type                                              type_id
1    1                                                 main
2    2                       Antigenomic background control
3    3                             control->affx->bac_spike
4    4                           control->affx->polya_spike
5    5 ERCC (External RNA Controls Consortium) step control
6    6      Exonic normalization control (Positive Control)
7    7    Intronic normalization control (Negative Control)
8    8                                     Positive Control

However, there seems to be a problem with the current version of the pd.hta.2.0 package (version 3.12.1). When I use affycoretools::getMainProbes(), the only type that is annotated is type 1, and everything else is NA, even though the array contains antigenomic probesets (whose transcript cluster IDs start with "AFFX").

> z <- getMainProbes("pd.hta.2.0")
> table(z$type)
    1
67516
> z[z$type %in% 2,]
[1] transcript_cluster_id type                
<0 rows> (or 0-length row.names)

I read in this post (C: problems filtering antigenomic probes from HTA 2.0 , written 5 months ago) that there will be an updated version of the pd.hta.2.0 package (version 3.12.2) in which this is fixed. When will it be released, and is it possible to get a pre-release version?

I would be very grateful for any help.

Best

Rukeia

modified 2.3 years ago by James W. MacDonald • written 2.3 years ago by relathman
Answer: Filtering of lowly expressed probes in HTA 2.0 using new pd.hta.2.0 version 3.12
James W. MacDonald (United States) wrote, 2.3 years ago:

I think this fell through the cracks. The updated pdInfo package has been pushed to the devel server, and is being pushed to the release server. It should appear tomorrow.


The corrected package is available now:

> table(getMainProbes("pd.hta.2.0")$type)

    1     2     3     4     5     6     7
67516    23     4     4   155   698   646
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /data/oldR/R-3.4.0/lib64/R/lib/libRblas.so
LAPACK: /data/oldR/R-3.4.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] affycoretools_1.49.4 pd.hta.2.0_3.12.2    DBI_0.7       
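
With the type column populated, the cutoff selection asked about in the question could be sketched roughly as below. This is only a minimal illustration under stated assumptions, not a recommended pipeline: `expr` and `types` are simulated stand-ins (in a real analysis `expr` would be something like the log2 matrix from an RMA-summarised ExpressionSet and `types` the data frame returned by getMainProbes("pd.hta.2.0")), and taking the 95th percentile of the antigenomic probesets as the cutoff is an assumed rule chosen for the example.

```r
# Simulated stand-ins (hypothetical data, base R only).
set.seed(1)
types <- data.frame(
  transcript_cluster_id = paste0("TC", seq_len(1200)),
  type = rep(c(1L, 2L, 7L), times = c(1000, 50, 150)),  # main, antigenomic, intronic
  stringsAsFactors = FALSE
)
n_samples <- 6
# Assumption: background-like probesets sit at a low log2 level,
# main probesets follow a broader, higher distribution.
mu  <- ifelse(types$type == 1L, 6, 3)
sdv <- ifelse(types$type == 1L, 2, 0.5)
expr <- matrix(rnorm(nrow(types) * n_samples, mean = mu, sd = sdv),
               nrow = nrow(types),
               dimnames = list(types$transcript_cluster_id,
                               paste0("S", seq_len(n_samples))))

# Median expression per probeset across samples, split by annotated type.
med <- apply(expr, 1, median)
by_type <- split(med, types$type)

# Assumed rule: cutoff = 95th percentile of the antigenomic (type 2) medians.
cutoff <- unname(quantile(by_type[["2"]], 0.95))

# Overlay the per-type densities and mark the cutoff.
dens <- lapply(by_type, density)
cols <- c("1" = "black", "2" = "red", "7" = "blue")
plot(0, 0, type = "n",
     xlim = range(med),
     ylim = c(0, max(sapply(dens, function(d) max(d$y)))),
     xlab = "log2 expression (median across samples)", ylab = "density")
for (k in names(dens)) lines(dens[[k]], col = cols[[k]])
abline(v = cutoff, lty = 2)
legend("topright",
       legend = c("main (1)", "antigenomic (2)", "intronic (7)"),
       col = cols, lty = 1)

# Keep only main probesets whose median exceeds the cutoff.
keep <- types$type == 1L & med > cutoff
filtered <- expr[keep, , drop = FALSE]
```

Whether the antigenomic or the intronic distribution is the better reference, and which percentile to use, is a judgement call that the density plot is meant to inform.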
written 2.3 years ago by James W. MacDonald

Great, it works now! Thank you very much for your help.

modified 2.0 years ago • written 2.3 years ago by relathman