Data format for summarising each NGS read without collapsing into a count
Rick ▴ 10
@43009098
Australia

Hi, I'm developing an R package (exSTRa) and need to work out whether an existing data class fits my data. I'm looking for something similar to SummarizedExperiment (a rectangular feature x sample structure, plus, I think, a metadata table for samples), except that my data consist of:

  • A feature table with one next-generation sequencing read per row and multiple columns, including sample name, locus, and a summary statistic for the read sequence; the read sequence itself is usually dropped
  • A sample table describing the samples, including whether each is a case or a control
  • A loci table holding meta-information for each locus (here, a repeat expansion locus), such as its genetic location and motif

I developed an S3 class containing three data.table objects, with appropriate methods similar to data.table's selection syntax for selecting by locus, by properties from the loci meta table, or by sample or sample property (such as case-control status or sex).
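
To make the structure concrete, here is a minimal sketch of that kind of S3 class. It is not the actual exSTRa code: the class name `read_scores`, the constructor `new_read_scores()`, the column names (`sample`, `locus`, `score`, `group`, `sex`, `chrom`, `motif`) and the toy values are all hypothetical and chosen only for illustration. The `[` method assumes `i` filters on the loci table and `j` filters on the sample table.

```r
library(data.table)

# Hypothetical constructor: bundles the three tables into one S3 object.
new_read_scores <- function(reads, samples, loci) {
  stopifnot(is.data.table(reads), is.data.table(samples), is.data.table(loci))
  # Every read must refer to a known sample and a known locus.
  stopifnot(all(reads$sample %in% samples$sample),
            all(reads$locus %in% loci$locus))
  structure(list(reads = reads, samples = samples, loci = loci),
            class = "read_scores")
}

# Selection method: `i` is evaluated against the loci table and `j`
# against the sample table; the per-read table is then restricted to
# the loci and samples that remain.
`[.read_scores` <- function(x, i, j) {
  loci <- x$loci
  samples <- x$samples
  if (!missing(i)) {
    keep_loci <- eval(substitute(i), loci, parent.frame())
    loci <- loci[keep_loci]
  }
  if (!missing(j)) {
    keep_samples <- eval(substitute(j), samples, parent.frame())
    samples <- samples[keep_samples]
  }
  keep_reads <- x$reads$locus %in% loci$locus &
    x$reads$sample %in% samples$sample
  new_read_scores(x$reads[keep_reads], samples, loci)
}

# Toy data (made-up values): one row per read, with a per-read summary score.
reads <- data.table(
  sample = c("S1", "S1", "S2", "S2", "S2"),
  locus  = c("locusA", "locusB", "locusA", "locusA", "locusB"),
  score  = c(0.91, 0.40, 0.35, 0.38, 0.42)
)
samples <- data.table(
  sample = c("S1", "S2"),
  group  = c("case", "control"),
  sex    = c("F", "M")
)
loci <- data.table(
  locus = c("locusA", "locusB"),
  chrom = c("chr4", "chr9"),
  motif = c("CAG", "GAA")
)

x <- new_read_scores(reads, samples, loci)

# Reads at locusA from case samples only.
a_cases <- x[locus == "locusA", group == "case"]

# Reads at all loci with a GAA motif, across all samples.
gaa <- x[motif == "GAA"]
```

The point of the sketch is that, unlike a feature x sample matrix, the per-read table is in long format, so each sample and locus pair can contribute a different number of rows.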

I don't know whether I need to rewrite my package to fit in with a Bioconductor data class and, if so, which data class suits my needs.

DataRepresentation exSTRa Rsamtools