featureCounts from Rsubread
Emiliano • 0
@426be483
Italy

Hello,

I am analysing RNA-seq data from sequenced human samples, starting from FASTQ files. I did all the quality control, trimming, etc., and aligned my paired-end reads with STAR. I am now trying to obtain a count matrix in a SummarizedExperiment, and I am counting with featureCounts from Rsubread. I get back a list with counts and annotation. The problem is that in my annotation, Chr, Start, End and Strand have multiple values per gene and are coded as character (I attached a picture to show this). For this reason I cannot generate the GRanges() for the rowData, as this requires a unique value for Chr, Start and End. How can I get around this problem? Thank you

Rsubread • 332 views
@gordon-smyth
WEHI, Melbourne, Australia

Convert the featureCounts output to a DGEList:

library(Rsubread)
library(edgeR)
fc <- featureCounts( ... )
y <- featureCounts2DGEList(fc)

Then y$genes will have columns Chr, Start, End, Strand and Length, all with unique values.

Note that some human genes are on both the X and Y chromosomes. If your annotation includes such genes on both chromosomes, then you may need to mask the PAR region of the Y chromosome if you wish to create a unique genomic range for each gene ID. You can type

table(y$genes$Chr)

to check whether this is an issue for your data.
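Once each gene has a single Chr, Start, End and Strand, building the row ranges and the SummarizedExperiment could look roughly like the sketch below. It is only an illustration: the toy `genes` data frame and `counts` matrix are hypothetical stand-ins for `y$genes` and `y$counts`, with made-up values, and it assumes every Strand entry is one of "+", "-" or "*".

```r
library(GenomicRanges)
library(SummarizedExperiment)

# Hypothetical stand-ins for y$genes and y$counts (made-up values).
# With real data you would use: genes <- y$genes; counts <- y$counts
genes <- data.frame(
  Chr    = c("chr1", "chr2", "chrX"),
  Start  = c(11869L, 38814L, 276322L),
  End    = c(14409L, 46588L, 303356L),
  Strand = c("+", "-", "+"),
  Length = c(1735L, 1200L, 2500L),
  row.names = c("geneA", "geneB", "geneC")
)
counts <- matrix(
  c(10L, 0L, 5L, 12L, 3L, 7L),
  nrow = 3,
  dimnames = list(rownames(genes), c("sample1", "sample2"))
)

# One genomic range per gene; Length is kept as a metadata column.
rr <- GRanges(
  seqnames = genes$Chr,
  ranges   = IRanges(start = genes$Start, end = genes$End),
  strand   = genes$Strand,
  Length   = genes$Length
)
names(rr) <- rownames(genes)

# Counts as the assay, gene ranges as rowRanges.
se <- SummarizedExperiment(assays = list(counts = counts), rowRanges = rr)
```

If the GRanges() call fails on real data, the Chr or Strand column probably still holds more than one value for some genes, which is the X/Y situation described above.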
