Question: featureCounts Returns 0.0% successfully assigned fragments
18 months ago by
connor.geraghty0 wrote:

Hello,

I have been working my way through learning Rsubread, and I am stuck on the featureCounts() command. For these 5 test samples, more than 90% of reads map to the reference genome:

 NumMapped PropMapped
1  20005679   0.936811
2  15387615   0.909452
3  17158660   0.915605
4  16636690   0.955880
5  16891823   0.951170

But, when I use featureCounts: 

fc_PE <- featureCounts(bam.files, annot.inbuilt="mm10", isPairedEnd=TRUE, nthreads = 14)

it returns 0.0% successfully assigned fragments.

NCBI RefSeq annotation for mm10 (build 38.1) is used.

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.28.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 5 BAM files                                      ||
||                           P /users/PAS1346/osu9792/testrna/CGL1607_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1608_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1609_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1610_R1. ... ||
||                           P /users/PAS1346/osu9792/testrna/CGL1612_R1. ... ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 14                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /users/PAS1346/osu9792/R/x86_64-pc-linux-gnu-libr ... ||
||    Features : 222996                                                       ||
||    Meta-features : 27179                                                   ||
||    Chromosomes/contigs : 43                                                ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1607_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 21355089                                              ||
||    Successfully assigned fragments : 6919 (0.0%)                           ||
||    Running time : 0.08 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1608_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 16919664                                              ||
||    Successfully assigned fragments : 6175 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1609_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 18740233                                              ||
||    Successfully assigned fragments : 8017 (0.0%)                           ||
||    Running time : 0.07 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1610_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 17404589                                              ||
||    Successfully assigned fragments : 5861 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
|| Process BAM file /users/PAS1346/osu9792/testrna/CGL1612_R1.fastq.gz.su ... ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 17758996                                              ||
||    Successfully assigned fragments : 5128 (0.0%)                           ||
||    Running time : 0.06 minutes                                             ||
||                                                                            ||
||                         Read assignment finished.                          ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

 

Do you have any ideas as to why the features aren't producing meaningful counts? Could I have done something upstream that compromised the ability of featureCounts() to assign fragments?

Any help is greatly appreciated!

Connor

rsubread featurecounts • 544 views
ADD COMMENTlink modified 18 months ago • written 18 months ago by connor.geraghty0

Could you show the counting summary by issuing the following command? This will tell you the reasons why most reads were not counted.

fc_PE$stat

ADD REPLYlink written 18 months ago by Wei Shi3.2k

Here is the output from fc_PE$stat: 

                          Status X.users.PAS1346.osu9792.testrna.CGL1607_R1.fastq.gz.subread.BAM
1                       Assigned                                                            6919
2            Unassigned_Unmapped                                                         1349410
3      Unassigned_MappingQuality                                                               0
4             Unassigned_Chimera                                                               0
5      Unassigned_FragmentLength                                                               0
6           Unassigned_Duplicate                                                               0
7        Unassigned_MultiMapping                                                               0
8           Unassigned_Secondary                                                               0
9         Unassigned_Nonjunction                                                               0
10         Unassigned_NoFeatures                                                        19998750
11 Unassigned_Overlapping_Length                                                               0
12          Unassigned_Ambiguity                                                              10

All samples follow the same pattern: most fragments are unassigned, either because they are unmapped or because they overlap no features.

ADD REPLYlink modified 18 months ago • written 18 months ago by connor.geraghty0
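
To make the pattern in the summary easier to read, the counts can be turned into percentages. The following is a minimal base R sketch that re-enters the non-zero rows of the table above by hand (sample CGL1607); the data frame and column names `stat`, `Count`, and `Percent` are illustrative and are not the layout that fc_PE$stat itself uses.

```r
# Non-zero counts copied from the fc_PE$stat output above (sample CGL1607).
stat <- data.frame(
  Status = c("Assigned", "Unassigned_Unmapped",
             "Unassigned_NoFeatures", "Unassigned_Ambiguity"),
  Count  = c(6919, 1349410, 19998750, 10),
  stringsAsFactors = FALSE
)

# Express each category as a percentage of all fragments.
stat$Percent <- round(100 * stat$Count / sum(stat$Count), 2)

# Sort so that the dominant category comes first.
stat <- stat[order(-stat$Count), ]
print(stat)
```

The counts add up to the 21355089 total fragments reported in the log, and the unmapped share (about 6%) agrees with the mapping rate of about 94% for the first sample. In other words, nearly every mapped fragment ends up in Unassigned_NoFeatures rather than being lost to mapping quality, multi-mapping, or ambiguity.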

Did you map your reads to mm10? Have you checked if the chr names in your reference genome match those in the annotation?

ADD REPLYlink written 18 months ago by Wei Shi3.2k

Yes, the reads were mapped to mm10 (GRCm38.p6 to be exact). What is the best way to check if chr names match between reference genome and annotation?

ADD REPLYlink written 18 months ago by connor.geraghty0
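
One possible way to check this is to compare the two sets of sequence names directly. The sketch below is base R only; `compare_chr_names` is a hypothetical helper, and the two example vectors are illustrative assumptions (RefSeq-style accessions on one side, "chr"-prefixed names on the other), not values taken from this thread. With real data, the names could be pulled from the BAM header and from the annotation table, for instance along the lines shown in the comments.

```r
# Report which sequence names are shared by, or unique to, the
# alignments and the annotation.
compare_chr_names <- function(bam_names, annot_names) {
  list(
    shared     = intersect(bam_names, annot_names),
    only_bam   = setdiff(bam_names, annot_names),
    only_annot = setdiff(annot_names, bam_names)
  )
}

# Illustrative values only (assumptions, not taken from the thread).
# With real data the two vectors could come from, for example:
#   names(Rsamtools::scanBamHeader(bam.files[1])[[1]]$targets)
#   unique(Rsubread::getInBuiltAnnotation("mm10")$Chr)
bam_names   <- c("NC_000067.6", "NC_000068.7")
annot_names <- c("chr1", "chr2")

res <- compare_chr_names(bam_names, annot_names)
print(res)

if (length(res$shared) == 0) {
  message("No sequence names in common; fragments cannot overlap any feature.")
}
```

If `shared` is empty or very short while `only_bam` and `only_annot` are both long, a naming mismatch is a likely explanation for a high Unassigned_NoFeatures count.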

Here is the line of code to build the reference index based on the download from NCBI Genome:

buildindex(basename="ref_index",reference="GCF_000001635.26_GRCm38.p6_genomic.fna", memory = 48000)

Would the chromosome names differ by default between the index built from the NCBI download and the inbuilt mm10 annotation?

ADD REPLYlink written 18 months ago by connor.geraghty0
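
If the names do turn out to differ, one option is a two-column alias file that maps annotation names to reference names. The sketch below derives such a table from FASTA header lines in base R. The helper `make_chr_aliases`, the header strings, and the assumption that headers read "... chromosome N, ..." are all illustrative and should be checked against the actual .fna file. The final commented call assumes the installed Rsubread version offers the `chrAliases` argument of featureCounts().

```r
# Turn RefSeq-style FASTA header lines into an alias table:
# column 1 = name used in the annotation, column 2 = name in the reference.
make_chr_aliases <- function(headers) {
  # Match assembled chromosomes only, e.g. "... chromosome 1, GRCm38.p6 ..."
  pattern <- "^>([^ ]+) .*chromosome ([0-9XY]+),"
  hits <- regmatches(headers, regexec(pattern, headers))
  hits <- hits[lengths(hits) == 3]
  data.frame(
    annotation = sprintf("chr%s", vapply(hits, `[`, character(1), 3)),
    reference  = vapply(hits, `[`, character(1), 2),
    stringsAsFactors = FALSE
  )
}

# Hypothetical header lines (assumed format). In practice they could be read with:
#   headers <- grep("^>", readLines("GCF_000001635.26_GRCm38.p6_genomic.fna"),
#                   value = TRUE)
headers <- c(
  ">NC_000067.6 Mus musculus strain C57BL/6J chromosome 1, GRCm38.p6 C57BL/6J",
  ">NT_000000.0 Mus musculus chromosome 1 unlocalized genomic scaffold, example"
)

aliases <- make_chr_aliases(headers)
print(aliases)

# Write a comma-delimited file without a header row.
alias_file <- file.path(tempdir(), "chr_aliases.csv")
write.table(aliases, alias_file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Possible use (not run here; requires Rsubread and the BAM files):
#   fc_PE <- featureCounts(bam.files, annot.inbuilt = "mm10",
#                          isPairedEnd = TRUE, chrAliases = alias_file)
```

Scaffold headers are skipped on purpose because the pattern requires a comma directly after the chromosome number; whether that rule fits every header in a given download is something to verify by eye.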

The inbuilt annotation was built based on the NCBI RefSeq annotation, so chr names should match between your mapped reads and the inbuilt annotation. What kind of sequencing data are you analyzing?
 

ADD REPLYlink written 17 months ago by Wei Shi3.2k

These are RNA samples from mouse proximal colon tissue, so some of the transcripts could be microbial, but the majority should be mouse.

ADD REPLYlink written 17 months ago by connor.geraghty0

Your analysis seems to be fine. You probably need to dig deeper into your reads to see where they are mapped. Artifacts or contamination in your sequencing or sample prep might cause an issue like this.
 

ADD REPLYlink written 17 months ago by Wei Shi3.2k