Question: AnnotationHub: A RPKM data.frame of Epigenomics RoadMap Project seems strange
3.0 years ago by
wcstcyx30 wrote:

Dear All,

I found an RPKM data.frame that looks wrong. It was obtained from AnnotationHub, and its source is the Epigenomics RoadMap Project. The code below reproduces the problem.

library(AnnotationHub)
ah <- AnnotationHub()
epiFiles <- query(ah, "EpigenomeRoadMap")
dfs <- subset(epiFiles, rdataclass == "data.frame")
# View(data.frame(dfs$title, dfs$description, dfs$sourceurl))
rpkm <- dfs[[8]]
# View(rpkm) # the column headers seem wrong, and the last column is all NA
# Download the source file myself
url <- dfs$sourceurl[8]
filename <- basename(url)
download.file(url, destfile=filename)
if (file.exists(filename))
  myrpkm <- read.table(filename, header = TRUE, row.names = 1)
# View(myrpkm) # it seems right
# See
# =========================
# =========================
# in
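
A quick way to confirm the "last column is all NA" symptom without View() is to list the columns that contain no non-missing values. This is only a sketch on a made-up toy data frame (the column names and values are assumptions, not the real hub record); the same line can be run on rpkm.

```r
# Toy data frame standing in for the hub record; values are made up.
df <- data.frame(E000 = c(1.5, 3.2),
                 E003 = c(0, 7.1),
                 E004 = c(NA_real_, NA_real_))

# Names of the columns in which every entry is NA.
all_na_cols <- names(df)[colSums(!is.na(df)) == 0]
all_na_cols
```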

My sessionInfo() is:

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936 
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936   
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C                                                   
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936    

attached base packages:
[1] parallel  stats    
[3] graphics  grDevices
[5] utils     datasets 
[7] methods   base     

other attached packages:
[1] AnnotationHub_2.5.12
[2] BiocGenerics_0.19.2 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7                  
 [2] IRanges_2.7.17               
 [3] digest_0.6.10                
 [4] mime_0.5                     
 [5] R6_2.2.0                     
 [6] xtable_1.8-2                 
 [7] DBI_0.5-1                    
 [8] stats4_3.3.1                 
 [9] RSQLite_1.0.0                
[10] BiocInstaller_1.23.9         
[11] httr_1.2.1                   
[12] curl_2.1                     
[13] S4Vectors_0.11.18            
[14] tools_3.3.1                  
[15] Biobase_2.33.4               
[16] shiny_0.14.1                 
[17] httpuv_1.3.3                 
[18] AnnotationDbi_1.35.4         
[19] htmltools_0.3.5              
[20] interactiveDisplayBase_1.11.3

After I read 



I think the first column should be the gene ID and the first numeric column should be the expression value of sample E000. So I loaded it with read.table(filename, header = TRUE, row.names = 1).
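
To illustrate what row.names = 1 changes, here is a minimal sketch on a tiny in-memory table. The gene IDs, sample names, and values are made up, and the tab separator is an assumption about the file layout.

```r
# Hypothetical miniature of an RPKM matrix; IDs and values are made up.
txt <- "gene_id\tE000\tE003\nENSG01\t1.5\t0.0\nENSG02\t3.2\t7.1\n"

# Without row.names, gene_id stays as an ordinary first column.
plain <- read.table(text = txt, header = TRUE, sep = "\t")

# With row.names = 1, gene_id becomes the row names and the
# remaining columns are the per-sample numeric values.
keyed <- read.table(text = txt, header = TRUE, sep = "\t", row.names = 1)

colnames(plain)
colnames(keyed)
rownames(keyed)
```

With the second form, the first data column lines up with sample E000, which is the layout I expected.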

I found that more than one data.frame has this problem. I hope AnnotationHub can load this kind of data correctly.

Thanks in advance,
Can Wang

Answer: AnnotationHub: A RPKM data.frame of Epigenomics RoadMap Project seems strange
3.0 years ago by
United States
Valerie Obenchain6.7k wrote:

Hi Can,

Thanks for reporting this bug. As you described, the problem was in how the data were read in: the gene_id column was not being used as the row names. This has been fixed in AnnotationHub 2.5.13 (devel) and 2.4.3 (release). Both should be available via biocLite() Thursday Oct. 13 after noon PST or from svn immediately.
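
To check whether an installed version already includes the fix, something like the following could be used. It is only a sketch: has_fix() is a hypothetical helper, and it assumes the two version numbers quoted above (2.5.13 for the devel line, 2.4.3 for the release line).

```r
# Hypothetical helper: TRUE if version string v is at or past the fixed
# version of its line (2.5.13 for 2.5.x and later, 2.4.3 for earlier).
has_fix <- function(v) {
  v <- as.package_version(v)
  if (v >= "2.5.0") v >= "2.5.13" else v >= "2.4.3"
}

has_fix("2.5.12")  # the version in the sessionInfo of the question
has_fix("2.5.13")
has_fix("2.4.3")

# With the package installed, one could pass the real version:
# has_fix(packageVersion("AnnotationHub"))
```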



Thank you! I have checked that. It's OK now.


written 3.0 years ago by wcstcyx