I work on plant species. I have used STAR to map RNA-seq reads and featureCounts to get expression values. I would like to count the number of reads that map to genes and the number that map outside of genes. Is there any tool or script to get these estimates?
So you have the total number of reads you mapped, and you have the counts generated by featureCounts. You can sum the reads assigned to genes and treat those as in-gene, then subtract that sum from the total and treat the remainder as not-in-a-gene. That is an oversimplification, though: multimapping reads can fall within a gene and still not be counted. What you do with those depends on your application.
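A rough sketch of that bookkeeping, assuming featureCounts wrote its usual tab-separated `.summary` file next to the count table (rows such as `Assigned`, `Unassigned_NoFeatures` and `Unassigned_MultiMapping`, one column per BAM file). The function names and all numbers below are made up for illustration, and the multimapping row counts alignments, so treat the "other" bucket with care:

```python
import csv
import io

def read_summary(handle):
    """Parse a featureCounts .summary table into {sample: {status: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)            # "Status", then one column per BAM file
    samples = header[1:]
    table = {sample: {} for sample in samples}
    for row in reader:
        if not row:                  # skip blank lines
            continue
        status = row[0]
        for sample, value in zip(samples, row[1:]):
            table[sample][status] = int(value)
    return table

def in_and_out_of_genes(statuses):
    """Return (assigned to a feature, no feature overlapped, everything else)."""
    assigned = statuses.get("Assigned", 0)
    no_feature = statuses.get("Unassigned_NoFeatures", 0)
    other = sum(statuses.values()) - assigned - no_feature
    return assigned, no_feature, other

# Made-up example content; replace with open("counts.txt.summary") (hypothetical path).
example = (
    "Status\tsampleA.bam\tsampleB.bam\n"
    "Assigned\t800\t700\n"
    "Unassigned_MultiMapping\t120\t150\n"
    "Unassigned_NoFeatures\t60\t100\n"
    "Unassigned_Ambiguity\t20\t50\n"
)
summary = read_summary(io.StringIO(example))
for sample, statuses in summary.items():
    print(sample, in_and_out_of_genes(statuses))
```

Keeping the multimapping and ambiguous categories in their own bucket avoids silently calling them "outside of genes", which is the oversimplification mentioned above.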
Since featureCounts gives you a matrix of read counts with genomic features as rows and samples as columns, you can get the number of reads mapped to the features defined in your annotation by summing each sample's column of the matrix.
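Something like the following could do the column sums. It is only a sketch and assumes the default featureCounts table layout: a leading comment line starting with `#`, a header row, six annotation columns (`Geneid`, `Chr`, `Start`, `End`, `Strand`, `Length`) and then one integer count column per sample. The `total_mapped` numbers are hypothetical; in practice they would come from your STAR logs.

```python
import csv
import io

def assigned_per_sample(handle, first_count_column=6):
    """Sum each sample column of a featureCounts count table."""
    # Drop the leading "# ..." comment line(s) before parsing.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    samples = header[first_count_column:]
    totals = dict.fromkeys(samples, 0)
    for row in reader:
        for sample, value in zip(samples, row[first_count_column:]):
            totals[sample] += int(value)   # assumes integer counts
    return totals

# Made-up example content; replace with open("counts.txt") (hypothetical path).
example = (
    "# comment line written by featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\tsampleB.bam\n"
    "gene1\tchr1\t100\t900\t+\t801\t500\t400\n"
    "gene2\tchr1\t2000\t2500\t-\t501\t300\t300\n"
)
assigned = assigned_per_sample(io.StringIO(example))

# Hypothetical totals of mapped reads per sample.
total_mapped = {"sampleA.bam": 1000, "sampleB.bam": 1000}
not_in_gene = {s: total_mapped[s] - assigned[s] for s in assigned}
print(assigned)
print(not_in_gene)
```

The per-gene, per-sample numbers are simply the individual cells of the same table, so no extra step is needed for those.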
It's well worth thoroughly reading the documentation for featureCounts, since there are a lot of parameters and how you set them will make a big difference to your end result.
Did I understand correctly that you want two numbers: the number of reads mapping to a gene feature and the number that don't?
Yeah, that's true. I would like to get this estimate for each gene and each of my individuals.
Thanks
So now you want it per gene and per individual? So that's just what featureCounts does?!
Thanks!