Hi all, I would like to know whether there is a Bioconductor package that could help me quantify small RNAs. In particular, I would like to use BAM files as input and obtain a table with the read counts for each small RNA.
Thanks.
Riccardo
EDIT: It's not clear what you mean by 'small RNAs quantifications'; that could mean lots of different things. Assuming you mean something like 'I have some aligned data in BAM files, and I want to count the number of reads that overlap just the small RNAs', then the conventional way to do that is to use summarizeOverlaps from the GenomicAlignments package. You need a GRanges or GRangesList that identifies the genomic regions the small RNAs come from, which you can get from an EnsDb package. Using the EnsDb.Hsapiens.v79 package as an example:
> library(EnsDb.Hsapiens.v79)
> tx <- transcripts(EnsDb.Hsapiens.v79, filter = list(TxbiotypeFilter(c("miRNA","snRNA","snoRNA"))))
> tx
GRanges object with 7597 ranges and 5 metadata columns:
                  seqnames               ranges strand |           tx_id
                     <Rle>            <IRanges>  <Rle> |     <character>
  ENST00000619216        1      [ 17369, 17436]      - | ENST00000619216
  ENST00000607096        1      [ 30366, 30503]      + | ENST00000607096
  ENST00000410691        1     [157784, 157887]      - | ENST00000410691
  ENST00000612080        1     [187891, 187958]      - | ENST00000612080
  ENST00000611868        1     [200880, 201017]      + | ENST00000611868
              ...      ...                  ...    ... .             ...
  ENST00000516617        Y [25723342, 25723495]      + | ENST00000516617
  ENST00000516816        Y [25928979, 25929142]      + | ENST00000516816
  ENST00000515987        Y [26247384, 26247521]      + | ENST00000515987
  ENST00000517139        Y [26360989, 26361092]      + | ENST00000517139
  ENST00000620883        Y [26411059, 26411158]      - | ENST00000620883
                   tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id
                  <character>        <numeric>      <numeric>     <character>
  ENST00000619216       miRNA             <NA>           <NA> ENSG00000278267
  ENST00000607096       miRNA             <NA>           <NA> ENSG00000274890
  ENST00000410691       snRNA             <NA>           <NA> ENSG00000222623
  ENST00000612080       miRNA             <NA>           <NA> ENSG00000273874
  ENST00000611868       miRNA             <NA>           <NA> ENSG00000275135
              ...         ...              ...            ...             ...
  ENST00000516617       snRNA             <NA>           <NA> ENSG00000252426
  ENST00000516816       snRNA             <NA>           <NA> ENSG00000252625
  ENST00000515987      snoRNA             <NA>           <NA> ENSG00000251796
  ENST00000517139       snRNA             <NA>           <NA> ENSG00000252948
  ENST00000620883       miRNA             <NA>           <NA> ENSG00000275510
  -------
  seqinfo: 193 sequences from GRCh38 genome
You can read the GenomicAlignments vignette and the help page for summarizeOverlaps to figure out the remaining steps.
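As a rough illustration of what the remaining steps might look like, here is a minimal sketch of the counting step on made-up data. The feature names, coordinates and reads are all invented so that it runs without any BAM files, and the choices of mode = "Union" and ignore.strand = FALSE are assumptions that may not suit your library prep.

```r
library(GenomicAlignments)  # provides summarizeOverlaps() and GAlignments()

# Toy stand-in for the 'tx' GRanges above: two invented small RNA loci.
features <- GRanges(seqnames = "1",
                    ranges = IRanges(start = c(100, 500), end = c(170, 600)),
                    strand = c("-", "+"))
names(features) <- c("smallRNA_A", "smallRNA_B")

# Toy stand-in for aligned reads: three invented 22-nt alignments.
reads <- GAlignments(seqnames = Rle(factor(c("1", "1", "1"))),
                     pos = c(110L, 120L, 510L),
                     cigar = c("22M", "22M", "22M"),
                     strand = Rle(strand(c("-", "-", "+"))))

# Count reads per feature; a read is counted only if it overlaps
# exactly one feature on the same strand.
se <- summarizeOverlaps(features, reads,
                        mode = "Union", ignore.strand = FALSE)

# The count table: one row per feature, one column per sample.
counts <- assay(se)
counts
```

With real data you would presumably pass tx as the features and an Rsamtools BamFileList of your BAM files as the reads, which should give one column per file. Note that Ensembl names chromosomes "1", "2", ... whereas many aligners use "chr1", "chr2", ..., so check that the seqlevels of tx match those in your BAM headers before counting.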