Question

featureCounts Returns 0.0% successfully assigned fragments

0

Entering edit mode

connor.geraghty • 0

@connorgeraghty-16065

Last seen 6.1 years ago

Hello,

I have been working my way through learning Rsubread, and I am stuck on the featureCounts() command. The data maps to the reference genome above 90% for these 5 test samples:

 NumMapped PropMapped
1  20005679   0.936811
2  15387615   0.909452
3  17158660   0.915605
4  16636690   0.955880
5  16891823   0.951170

But, when I use featureCounts:

fc_PE <- featureCounts(bam.files, annot.inbuilt="mm10", isPairedEnd=TRUE, nthreads = 14)

it returns 0.0% successfully assigned fragments.

NCBI RefSeq annotation for mm10 (build 38.1) is used.

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.28.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 5 BAM files                                      ||
||                           P /users/PAS1346/osu9792/testrna/CGL1607_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1608_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1609_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1610_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1612_R1. ... ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 14                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /users/PAS1346/osu9792/R/x86_64-pc-linux-gnu-libr ... ||
||    Features : 222996                                                       ||
||    Meta-features : 27179                                                   ||
||    Chromosomes/contigs : 43                                                ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1607_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 21355089                                              ||
||    Successfully assigned fragments : 6919 (0.0%)                           ||
||    Running time : 0.08 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1608_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 16919664                                              ||
||    Successfully assigned fragments : 6175 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1609_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 18740233                                              ||
||    Successfully assigned fragments : 8017 (0.0%)                           ||
||    Running time : 0.07 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1610_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 17404589                                              ||
||    Successfully assigned fragments : 5861 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1612_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 17758996                                              ||
||    Successfully assigned fragments : 5128 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
||                         Read assignment finished.                          ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

Do you have any ideas as to why the features aren't producing meanigful counts? Could I have done something upstream that would have compromised the featureCounts() ability?

Any help is greatly appreciated!

Connor

rsubread featurecounts • 3.9k views

ADD COMMENT • link 6.5 years ago connor.geraghty • 0

0

Entering edit mode

Could you show the counting summary by issuing the following command? This will tell you the reasons why most reads were not counted.

fc_PE$stat

ADD REPLY • link 6.5 years ago Wei Shi ★ 3.6k

0

Entering edit mode

Here is the output from fc_PE$stat:

Status X.users.PAS1346.osu9792.testrna.CGL1607_R1.fastq.gz.subread.BAM 
1                       Assigned                                                            6919                                                            
2            Unassigned_Unmapped                                                         1349410                                                         
3      Unassigned_MappingQuality                                                               0                                                               
4             Unassigned_Chimera                                                               0                                                              
5      Unassigned_FragmentLength                                                               0                                                               
6           Unassigned_Duplicate                                                               0                                                               
7        Unassigned_MultiMapping                                                               0                                                               
8           Unassigned_Secondary                                                               0                                                               
9         Unassigned_Nonjunction                                                               0                                                               
10         Unassigned_NoFeatures                                                        19998750                                                       
11 Unassigned_Overlapping_Length                                                               0                                                               
12          Unassigned_Ambiguity                                                              10

All samples follow this same pattern with most unassigned due to being either unmapped or no features.

ADD REPLY • link 6.5 years ago connor.geraghty • 0

0

Entering edit mode

Did you map your reads to mm10? Have you checked if the chr names in your reference genome match those in the annotation?

ADD REPLY • link 6.5 years ago Wei Shi ★ 3.6k

0

Entering edit mode

Yes, the reads were mapped to mm10 (GRCm38.p6 to be exact). What is the best way to check if chr names match between reference genome and annotation?

ADD REPLY • link 6.5 years ago connor.geraghty • 0

0

Entering edit mode

Here is the line of code to build the reference index based on the download from NCBI Genome:

buildindex(basename="ref_index",reference="GCF_000001635.26_GRCm38.p6_genomic.fna", memory = 48000).

By default would the chromosome names be different between the built index from the NCBI, and the inbuilt mm10 annotation?

ADD REPLY • link 6.5 years ago connor.geraghty • 0

0

Entering edit mode

The inbuilt annotation was built based on NCBI RefSeq annotation so chr names should match between your mapped reads and the inbuilt annotation. What kind of sequencing data you are analyzing?

ADD REPLY • link 6.5 years ago Wei Shi ★ 3.6k

0

Entering edit mode

These are colon tissue RNA samples from mouse proximal colon. So, some of the transcripts could probably be microbial, but the majority should be mouse.

ADD REPLY • link 6.5 years ago connor.geraghty • 0

0

Entering edit mode

Your analysis seems to be fine. You probably need to dig deeply into your reads to see where your reads are mapped to. Artifact/contamination in your sequencing or sample prep might cause an issue like this.

ADD REPLY • link 6.5 years ago Wei Shi ★ 3.6k