Question: Reads mistakenly being assigned to Unassigned_NoFeatures category when using featureCounts
0
gravatar for jma1991
2.4 years ago by
jma199130
jma199130 wrote:

:: I originally posted my question on Biostars, but according to the featureCounts website, I should post on this forum for help ::

I am using featureCounts (version 1.5.2) to count the number of reads within bins along the genome:

    featureCounts -R -F SAF --fracOverlap 1 -Q 1 --primary -T 20 -a Bins.saf -o readCounts.txt Input.bam

A small percentage of my reads are not being counted:

    Status    Input.bam
    Assigned    25677313
    Unassigned_Ambiguity    0
    Unassigned_MultiMapping    0
    Unassigned_NoFeatures    344088
    Unassigned_Unmapped    0
    Unassigned_MappingQuality    0
    Unassigned_FragmentLength    0
    Unassigned_Chimera    0
    Unassigned_Secondary    0
    Unassigned_Nonjunction    0
    Unassigned_Duplicate    0

However, when I remove the --fracOverlap command, 100% of my reads are assigned. When I view the location of the unassigned reads in IGV alongside my bins, they are mapped completely within the bins. Therefore shouldn't they be counted?

Here is the SAM output of some of the unassigned reads:

    HISEQ:132:C87JAANXX:8:2102:21334:46258    1040    chr1    64313502    60    47M3S    *    0    0    GCCTGTAGAGCATTTTCTTAAATAGTGATTGATGGGGGAGAGCCCAGANA    DGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGBGE<<3#3    MD:Z:47    PG:Z:MarkDuplicates    RG:Z:C87JAANXX.8    NM:i:0    AS:i:47    XS:i:32
    HISEQ:287:C8785ANXX:2:2108:10903:32326    16    chr1    64348161    60    46M4S    *    0    0    TAATATATCAATCTTGTTTTTCATAGGTAAACAGGGTGAAAGATTTACTC    GGGEGGGFGGGGGBEGBGGGEEG>GGGGGGGGEGGGGGGGGGGGGCCCCC    MD:Z:46    PG:Z:MarkDuplicates    RG:Z:C8785ANXX.2    NM:i:0    AS:i:46    XS:i:19
    HISEQ:132:C87JAANXX:8:2306:5138:50069    0    chr1    64416635    60    48M2S    *    0    0    AGTCACAAGTGCACAGGTTCCAGAGGTCAGGCCTTGAGCAGTTTGGGGCT    CBBBBFGGGGGFGGGGGGGGGEG1FGGGGGG1=FGGEGGGGGGGGGGFGG    MD:Z:31A16    PG:Z:MarkDuplicates    RG:Z:C87JAANXX.8    NM:i:1    AS:i:43    XS:i:19
    HISEQ:132:C87JAANXX:8:1310:2745:62463    1040    chr1    64441532    60    4S46M    *    0    0    AGTTCTCTCAAAAACAACATCTGTAATGTTGTTAGGACCTGGGTCCTAAC    :11:E::=/0=1E<1=1111>F:E1=0<1E=1@GGGF0>GBGFGGBBBBB    MD:Z:46    PG:Z:MarkDuplicates    RG:Z:C87JAANXX.8    NM:i:0    AS:i:46    XS:i:0
    HISEQ:287:C8785ANXX:1:2301:12294:65428    1024    chr1    64540417    14    2S48M    *    0    0    GTTGTGTGTGTGTGTGTGGGTGTGTGTGTGTGTGTTTGTGTTATATGTGT    BABABGGGGGGGGGGGGG/=CCGGGGGGGFGGG>FGGGGF?==F:FGGGG    MD:Z:16T31    PG:Z:MarkDuplicates    RG:Z:C8785ANXX.1    NM:i:1    AS:i:43    XS:i:39
    HISEQ:287:C8785ANXX:5:2304:7860:85658    0    chr1    64540417    18    2S48M    *    0    0    GTTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTTGTGTTATATGTGT    BCCBBGGGGEGFGGGGGGGGFDGGGGGGGGGGGGGBGGGGGGGD@GGGFG    MD:Z:48    PG:Z:MarkDuplicates    RG:Z:C8785ANXX.5    NM:i:0    AS:i:48    XS:i:43
    HISEQ:287:C8785ANXX:5:2101:11743:10794    16    chr1    64575425    60    3M1I46M    *    0    0    TGATTTTTTTTTTTCACATATGGAGATTGTATGGTGAGGAAAACCCAAAC    CEFEE<BGGGGEEGEGEGGGGGGGGGEGGGGGGGGGGGGGGGGGGCCCCC    MD:Z:49    PG:Z:MarkDuplicates    RG:Z:C8785ANXX.5    NM:i:1    AS:i:46    XS:i:21

And here is the flagstat output from the BAM file:

   26021401 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 secondary
    0 + 0 supplementary
    8931396 + 0 duplicates
    26021401 + 0 mapped (100.00% : N/A)
    0 + 0 paired in sequencing
    0 + 0 read1
    0 + 0 read2
    0 + 0 properly paired (N/A : N/A)
    0 + 0 with itself and mate mapped
    0 + 0 singletons (N/A : N/A)
    0 + 0 with mate mapped to a different chr
    0 + 0 with mate mapped to a different chr (mapQ>=5)


  [1]: https://preview.ibb.co/fOGhT5/igv_snapshot.png

featurecounts subread • 879 views
ADD COMMENTlink modified 2.4 years ago by Wei Shi3.2k • written 2.4 years ago by jma199130
Answer: Reads mistakenly being assigned to Unassigned_NoFeatures category when using fea
3
gravatar for Wei Shi
2.4 years ago by
Wei Shi3.2k
Australia
Wei Shi3.2k wrote:

Your command requires each read to 100% overlap with a feature ('--fracOverlap 1') but the reads you provided do not 100% overlap with any feature. These reads have soft-clipped bases ('S' section in CIGAR) or inserted bases ('I' section in CIGAR), making them have the fraction of overlapping bases being less than 1. This is the reason why they were assigned to the "Unassigned_NoFeatures" category. 

ADD COMMENTlink written 2.4 years ago by Wei Shi3.2k

Ah that makes sense, thank you for replying

ADD REPLYlink written 2.4 years ago by jma199130
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 198 users visited in the last hour