Zero counts for all samples for all genes downloaded from recount3
urwah ▴ 10
@5a24aca2
Last seen 12 weeks ago
Australia

Hey everyone,

I'm having issues with fetching data from recount3.

I downloaded two datasets that were listed on the recount table from here (https://rna.recount.bio), but when I fetch these datasets and load the counts, I get 0 counts for all samples!

All the metadata columns do exist though.

Here's the code I use to retrieve these:

library(recount3)

ramakar = recount3::create_rse_manual(
  project = "SRP073813",
  project_home = "data_sources/sra",
  organism = "human",
  annotation = "gencode_v29",
  type = "gene"
)

ramakar.exp <- assay(ramakar, "raw_counts") 


all(apply(apply(ramakar.exp, 2, function(x) x==0), 2, any))
> TRUE

With this dataset, I've managed to find a different source (refine bio: https://www.refine.bio/experiments/SRP073813/rna-sequencing-of-human-post-mortem-brain-tissues) that has the processed counts so I can confirm that not all samples have 0 counts across all genes.

I've also gotten zero counts for another dataset:

hdbr = recount3::create_rse_manual(
  project = "ERP016243",
  type = "gene"
)

hdbr.exp <- assay(hdbr, "raw_counts") 

all(apply(apply(hdbr.exp, 2, function(x) x==0), 2, any))
> TRUE

I'm also happy to provide an Rmarkdown for this. I just want to know in case I fetched the data wrong, or if there's any issue with the annotation I'm using.

Thank you!

recount • 469 views
ATpoint ★ 4.1k
@atpoint-13662
Last seen 4 hours ago
Germany

Your apply construct checks whether each column (= sample) contains at least one zero. That is TRUE for every sample, and it is expected, since no tissue (to my knowledge) expresses every annotated gene.
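
To make the difference concrete, here is a minimal sketch on a made-up toy matrix (the values and the names toy, has_any_zero and is_all_zero are invented for illustration, not taken from recount3). It contrasts "at least one zero per sample" with "only zeros per sample":

```r
# Toy count matrix: 3 genes (rows) x 3 samples (columns).
# matrix() fills column by column, so sample2 is the only all-zero sample.
toy <- matrix(c(10, 0, 5,
                 0, 0, 0,
                 7, 3, 0),
              nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3)))

# Equivalent to the check in the question:
# does each sample contain at least one zero?
has_any_zero <- apply(toy == 0, 2, any)
all(has_any_zero)

# The intended check: does each sample contain only zeros?
# Counts are non-negative, so a column sum of 0 means every entry is 0
# (assuming there are no NAs in the matrix).
is_all_zero <- colSums(toy) == 0
all(is_all_zero)
```

The first all() returns TRUE even though two of the three samples have counts, while the second returns FALSE.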

The data are fine. There is at least one sample with no counts at all, but the majority have (a lot of) counts:

# Use colSums to count the total number of raw counts per sample
summary(colSums(ramakar.exp))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
0.000e+00 2.136e+09 4.074e+09 4.037e+09 5.711e+09 8.699e+09

An all-zero test that also tolerates NAs (in the rare case they are present) could be all(ramakar.exp %in% c(0, NA)), which returns FALSE here.
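
If you also want to know which samples are empty, something along these lines might help. It is only a sketch on a made-up toy matrix (toy, all_zero_or_na and empty_samples are illustrative names), using base R only:

```r
# Toy count matrix with a few NAs: 3 genes (rows) x 3 samples (columns).
toy <- matrix(c(10,  0, NA,
                 0,  0,  0,
                 0, NA,  0),
              nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3)))

# Whole-matrix test from above: TRUE only if every entry is 0 or NA.
all_zero_or_na <- all(toy %in% c(0, NA))

# Per-sample version: count the non-zero entries in each column,
# ignoring NAs, and keep the names of samples that have none.
n_nonzero <- colSums(toy != 0, na.rm = TRUE)
empty_samples <- names(which(n_nonzero == 0))
empty_samples
```

On the real data, replacing toy with ramakar.exp should list the sample(s) behind the 0 minimum in the summary.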
