Data format for summarising each NGS read without collapsing into a count
Rick ▴ 10
Last seen 19 months ago

Hi, I'm developing an R package (exSTRa) and I need to work out whether there is an existing data class that fits my data. I'm looking for something similar to SummarizedExperiment (a rectangular feature x sample matrix and, I think, a metadata table for samples), except that my data consist of:

  • Feature table of one next-generation sequencing read per row, with multiple columns, including sample name and locus, with a summary statistic for each read sequence, and usually dropping the read sequence itself
  • Sample table describing samples, including if they are a case or control
  • Loci table describing meta-information of each locus (here a repeat expansion locus), its genetic location and motif

I developed an S3 class containing three data.table objects, with methods that follow data.table's selection syntax to select by locus, by properties from the loci meta table, or by sample or sample property (such as case-control status or sex).
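To make the structure concrete, here is a minimal sketch of the kind of object I mean. It is not the actual exSTRa code: the class name `read_scores`, the constructor `new_read_scores()`, the column names and the toy values are all made up for illustration, and it uses base `data.frame` instead of data.table so that it runs without extra packages.

```r
# Hypothetical sketch only: NOT the real exSTRa class or API.
# An S3 object holding three linked tables: reads, samples, loci.
new_read_scores <- function(reads, samples, loci) {
  # every read must point at a known sample and a known locus
  stopifnot(all(reads$sample %in% samples$sample),
            all(reads$locus %in% loci$locus))
  structure(list(reads = reads, samples = samples, loci = loci),
            class = "read_scores")
}

# x[i, j]: i selects loci, j selects samples; all three tables stay in sync.
`[.read_scores` <- function(x, i, j) {
  if (!missing(i)) {
    x$loci <- x$loci[x$loci$locus %in% i, , drop = FALSE]
  }
  if (!missing(j)) {
    x$samples <- x$samples[x$samples$sample %in% j, , drop = FALSE]
  }
  keep <- x$reads$locus %in% x$loci$locus &
    x$reads$sample %in% x$samples$sample
  x$reads <- x$reads[keep, , drop = FALSE]
  x
}

# Toy data (invented values): one row per read, with a per-read summary score.
reads <- data.frame(
  sample = c("s1", "s1", "s2", "s2", "s2"),
  locus  = c("L1", "L2", "L1", "L1", "L2"),
  score  = c(0.91, 0.12, 0.30, 0.25, 0.10)
)
samples <- data.frame(sample = c("s1", "s2"),
                      group  = c("case", "control"))
loci <- data.frame(locus = c("L1", "L2"),
                   motif = c("CAG", "GAA"))

rs <- new_read_scores(reads, samples, loci)

# Select one locus across all samples.
l1 <- rs["L1", ]

# Select by a sample property (here case-control status).
cases <- rs[, rs$samples$sample[rs$samples$group == "case"]]
```

The point is that the reads table is long (one row per read) rather than a feature x sample matrix, so each locus and sample pair can have a different number of rows.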

I don't know if I need to rewrite my package to fit in with a Bioconductor data class, and if so I don't know which data class suits my needs.

DataRepresentation exSTRa Rsamtools • 534 views
