small RNAs quantification
ribioinfo ▴ 90

Hi all, I would like to know whether there is a Bioconductor package that could help me with small RNA quantification. In particular, I would like to use BAM files as input and obtain a table with the read counts for each small RNA.



smallrna counts • 627 views

EDIT: It's not clear what you mean by 'small RNAs quantifications'; that could mean several different things. Assuming you mean something like 'I have aligned reads in BAM files, and I want to count the reads that overlap the small RNAs', the conventional way to do that is summarizeOverlaps from the GenomicAlignments package. You need a GRanges or GRangesList that identifies the genomic regions the small RNAs come from, which you can get from an EnsDb package. Using the EnsDb.Hsapiens.v79 package as an example:

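> library(EnsDb.Hsapiens.v79)
> ## newer ensembldb releases (built on AnnotationFilter) spell this filter TxBiotypeFilter
> tx <- transcripts(EnsDb.Hsapiens.v79, filter = list(TxbiotypeFilter(c("miRNA","snRNA","snoRNA"))))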
> tx
GRanges object with 7597 ranges and 5 metadata columns:
                  seqnames               ranges strand |           tx_id
                     <Rle>            <IRanges>  <Rle> |     <character>
  ENST00000619216        1     [ 17369,  17436]      - | ENST00000619216
  ENST00000607096        1     [ 30366,  30503]      + | ENST00000607096
  ENST00000410691        1     [157784, 157887]      - | ENST00000410691
  ENST00000612080        1     [187891, 187958]      - | ENST00000612080
  ENST00000611868        1     [200880, 201017]      + | ENST00000611868
              ...      ...                  ...    ... .             ...
  ENST00000516617        Y [25723342, 25723495]      + | ENST00000516617
  ENST00000516816        Y [25928979, 25929142]      + | ENST00000516816
  ENST00000515987        Y [26247384, 26247521]      + | ENST00000515987
  ENST00000517139        Y [26360989, 26361092]      + | ENST00000517139
  ENST00000620883        Y [26411059, 26411158]      - | ENST00000620883
                   tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id
                  <character>        <numeric>      <numeric>     <character>
  ENST00000619216       miRNA             <NA>           <NA> ENSG00000278267
  ENST00000607096       miRNA             <NA>           <NA> ENSG00000274890
  ENST00000410691       snRNA             <NA>           <NA> ENSG00000222623
  ENST00000612080       miRNA             <NA>           <NA> ENSG00000273874
  ENST00000611868       miRNA             <NA>           <NA> ENSG00000275135
              ...         ...              ...            ...             ...
  ENST00000516617       snRNA             <NA>           <NA> ENSG00000252426
  ENST00000516816       snRNA             <NA>           <NA> ENSG00000252625
  ENST00000515987      snoRNA             <NA>           <NA> ENSG00000251796
  ENST00000517139       snRNA             <NA>           <NA> ENSG00000252948
  ENST00000620883       miRNA             <NA>           <NA> ENSG00000275510
  seqinfo: 193 sequences from GRCh38 genome

You can read the GenomicAlignments vignette and the help page for summarizeOverlaps to figure out the remaining steps.
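
As a rough illustration of those remaining steps, here is a minimal sketch rather than a definitive recipe. The feature coordinates, feature names, reads, and BAM file names below are all made up for the example; with real data you would pass the `tx` object from above and your own BAM files. The counting mode ("Union") and strand handling are assumptions that should be checked against your library preparation.

```r
library(GenomicAlignments)  # also attaches GenomicRanges, Rsamtools, SummarizedExperiment

## Toy stand-in for the small RNA annotation (hypothetical coordinates and names).
## With real data this would be the 'tx' GRanges obtained from the EnsDb package.
features <- GRanges(
    seqnames = "1",
    ranges   = IRanges(start = c(100, 500), end = c(170, 600)),
    strand   = c("+", "-")
)
names(features) <- c("smallRNA_A", "smallRNA_B")

## Toy aligned reads (hypothetical); with real data these come from the BAM files.
reads <- GAlignments(
    names    = c("r1", "r2", "r3"),
    seqnames = Rle(c("1", "1", "1")),
    pos      = c(110L, 120L, 520L),
    cigar    = c("22M", "22M", "22M"),
    strand   = strand(c("+", "+", "-"))
)

## Count reads per feature. mode = "Union" skips reads that overlap more than
## one feature; ignore.strand = FALSE requires read and feature strands to agree.
se <- summarizeOverlaps(features, reads, mode = "Union", ignore.strand = FALSE)

## Count table: one row per small RNA, one column per set of reads.
counts <- assay(se)
print(counts)

## With real BAM files (hypothetical paths), the same call would look like:
## bfl <- BamFileList(c(s1 = "sample1.bam", s2 = "sample2.bam"), yieldSize = 1e6)
## se  <- summarizeOverlaps(tx, bfl, mode = "Union", ignore.strand = FALSE)
## counts <- assay(se)   # rows = small RNAs, columns = BAM files
```

Two things to check before trusting the counts: the BAM files must be aligned to the same genome build as the annotation (GRCh38 for EnsDb.Hsapiens.v79), and the chromosome names must match. EnsDb packages use names like "1", so if your BAM files use "chr1" you may need something like seqlevelsStyle(tx) <- "UCSC" first.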

ribioinfo ▴ 90
Thank you! Riccardo
