Downloading count matrix for full recount3 dataset
Ben • 0
@2b32c5d6

Is there an aggregated file for downloading all the metadata or all gene-level counts for the entire recount3 dataset? Or is the recommended way to do so just to iterate over all the projects in available_projects()?

If the latter, is there a rate limit guideline for requests that should be followed?

recount3 • 1.6k views
@lcolladotor

Hi,

We decided against providing a full gene count matrix object (or exon or exon-exon junction) since it would have to be remade every time the resource is updated, plus the files would be huge and likely few people would use them. For example, many users asked us to split the GTEx and TCGA data in recount2 and provide ways to download the data from just 1 tissue.

So yes, you'll have to iterate over all the projects.
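A minimal sketch of that loop, in case it helps. It assumes the table returned by `available_projects()` has `project` and `file_source` columns and that `create_rse()` accepts one row of it. The helper name `collect_projects()` and the `pause_seconds` argument are hypothetical, not part of recount3.

```r
# Hypothetical helper (not part of recount3): apply `fetch` to each row of
# a projects table, skipping projects that fail instead of aborting the run.
collect_projects <- function(projects, fetch, pause_seconds = 1) {
  results <- vector("list", nrow(projects))
  # Label results as "file_source:project" so each entry is easy to trace.
  names(results) <- paste(projects$file_source, projects$project, sep = ":")
  for (i in seq_len(nrow(projects))) {
    value <- tryCatch(
      fetch(projects[i, , drop = FALSE]),
      error = function(e) {
        message("Skipping ", names(results)[i], ": ", conditionMessage(e))
        NULL
      }
    )
    # Assign via list() so that a NULL result keeps its slot in the list.
    results[i] <- list(value)
    # Arbitrary pause between requests; no official limit is known.
    Sys.sleep(pause_seconds)
  }
  results
}

# Possible usage with recount3 (needs the package and network access):
# library(recount3)
# human_projects <- available_projects(organism = "human")
# rse_list <- collect_projects(
#   human_projects,
#   fetch = function(info) create_rse(info, type = "gene")
# )
```

Keeping every object in memory is unlikely to scale to the whole resource, so `fetch` could instead write each result to disk with `saveRDS()` and return the file path.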

As for downloading metadata, that's what I did myself at https://github.com/LieberInstitute/recount3-docs/blob/master/study-explorer/obtain_abstracts.R. You might want to use https://github.com/LieberInstitute/recount3-docs/blob/master/study-explorer/projects_meta.Rdata.
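One possible way to read that `.Rdata` file straight from GitHub is sketched below. It relies on GitHub serving the file contents when `/blob/` in the URL is replaced by `/raw/`. The object names stored inside the file are not stated here, so the sketch loads them into a separate environment and returns them as a named list. Both helper names are hypothetical.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into its "raw" file URL.
github_raw_url <- function(blob_url) {
  sub("/blob/", "/raw/", blob_url, fixed = TRUE)
}

# Hypothetical helper: download an .Rdata file and return its objects as a
# named list, without touching the global environment.
load_rdata_from_url <- function(url) {
  tmp <- tempfile(fileext = ".Rdata")
  on.exit(unlink(tmp))
  download.file(url, tmp, mode = "wb", quiet = TRUE)
  env <- new.env()
  load(tmp, envir = env)
  as.list(env)
}

# Possible usage (needs network access):
# meta_url <- github_raw_url(paste0(
#   "https://github.com/LieberInstitute/recount3-docs/blob/master/",
#   "study-explorer/projects_meta.Rdata"
# ))
# meta_objects <- load_rdata_from_url(meta_url)
# names(meta_objects)  # inspect which objects the file contains
```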

As for rate limitations, let's try and see what happens. I can forward this request to SciServer (the data host) if you run into issues.
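Since there is no stated limit, a cautious client could retry failed downloads with an increasing wait rather than hammering the host. The sketch below wraps any download function that way. `with_retry()`, `max_attempts`, and `base_wait` are hypothetical names and the default values are arbitrary, not a guideline from SciServer.

```r
# Hypothetical helper: wrap `fetch` so that errors trigger a retry after an
# exponentially growing wait (base_wait, 2 * base_wait, 4 * base_wait, ...).
with_retry <- function(fetch, max_attempts = 3, base_wait = 2) {
  function(...) {
    for (attempt in seq_len(max_attempts)) {
      result <- tryCatch(
        list(value = fetch(...), error = NULL),
        error = function(e) list(value = NULL, error = e)
      )
      if (is.null(result$error)) {
        return(result$value)
      }
      if (attempt < max_attempts) {
        Sys.sleep(base_wait * 2^(attempt - 1))
      }
    }
    # All attempts failed: re-signal the last error to the caller.
    stop(result$error)
  }
}

# Possible usage with recount3 (needs the package and network access):
# library(recount3)
# safe_create_rse <- with_retry(function(info) create_rse(info, type = "gene"))
# rse <- safe_create_rse(available_projects()[1, ])
```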

Best, Leo

Perhaps we should reconsider this. We could provide a single file as a collection. Let’s talk offline.

Hi, did you reach an agreement? Is it still not achievable?

AFAIK it's not.

That's fair, thanks for your help!
