Question: featureCounts error : SAM_pairer_iterate_tags
10 weeks ago, leo_CD wrote:

Hi,

I'm using featureCounts to count reads in BAM files from RNA-seq. These BAM files contain the header line "@CO This BAM file is processed by rsem-tbam2gam to convert from transcript coordinates into genomic coordinates."

For a few of these BAM files I get an error:

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
          v1.6.3

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           P 0de57a4b-f35d-4a47-b4e5-f8261f98decd.bam       ||
||                                                                            ||
||             Output file : 0de57a4b-f35d-4a47-b4e5-f8261f98decd.fc.txt      ||
||                 Summary : 0de57a4b-f35d-4a47-b4e5-f8261f98decd.fc.txt. ... ||
||              Annotation : Homo_sapiens.GRCh37.75.gtf (GTF)                 ||
||      Dir for temp files : /data/tmp                                        ||
||                                                                            ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file Homo_sapiens.GRCh37.75.gtf ...                        ||
||    Features : 1306656                                                      ||
||    Meta-features : 63677                                                   ||
||    Chromosomes/contigs : 265                                               ||
||                                                                            ||
|| Process BAM file 0de57a4b-f35d-4a47-b4e5-f8261f98decd.bam...               ||
||    Paired-end reads are included.                                          ||
||    Assign alignments (paired-end) to features...                           ||
||                                                                            ||
||    WARNING: reads from the same pair were found not adjacent to each       ||
||             other in the input (due to read sorting by location or         ||
||             reporting of multi-mapping read pairs).                        ||
||                                                                            ||
||    Pairing up the read pairs.                                              ||
||                                                                            ||
UnknownTag=
featureCounts: input-files.c:3352: SAM_pairer_iterate_tags: Assertion `0' failed.
/cm/local/apps/torque/var/spool/mom_priv/jobs/58383.bright70.cm.cluster.SC: line 7: 141652 Aborted

Here is the output for one of the BAM files where everything runs fine:

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
          v1.6.3

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           P 55001aff-7b0a-42c9-8727-78f36c944761.bam       ||
||                                                                            ||
||             Output file : 55001aff-7b0a-42c9-8727-78f36c944761.fc.txt      ||
||                 Summary : 55001aff-7b0a-42c9-8727-78f36c944761.fc.txt. ... ||
||              Annotation : Homo_sapiens.GRCh37.75.gtf (GTF)                 ||
||      Dir for temp files : /data/tmp                                        ||
||                                                                            ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file Homo_sapiens.GRCh37.75.gtf ...                        ||
||    Features : 1306656                                                      ||
||    Meta-features : 63677                                                   ||
||    Chromosomes/contigs : 265                                               ||
||                                                                            ||
|| Process BAM file 55001aff-7b0a-42c9-8727-78f36c944761.bam...               ||
||    Paired-end reads are included.                                          ||
||    Assign alignments (paired-end) to features...                           ||
||                                                                            ||
||    WARNING: reads from the same pair were found not adjacent to each       ||
||             other in the input (due to read sorting by location or         ||
||             reporting of multi-mapping read pairs).                        ||
||                                                                            ||
||    Pairing up the read pairs.                                              ||
||                                                                            ||
||    Total alignments : 141650182                                            ||
||    Successfully assigned alignments : 62790118 (44.3%)                     ||
||    Running time : 41.85 minutes                                            ||
||                                                                            ||
||                                                                            ||
|| Summary of counting results can be found in file "/data/tmp/55001aff-7b0a  ||
|| -42c9-8727-78f36c944761.fc.txt.summary"                                    ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

I'm running featureCounts with the options -p -t exon -g gene_id, and launching the job on our cluster with nodes=1:ppn=8,mem=16G.

All of these BAM files are supposed to have been processed the same way, so I don't understand why some of them fail, and I couldn't find any information about the SAM_pairer_iterate_tags error.
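When only some files in a batch abort, it can help to run the same call over every file and record the exit status of each. The following is a minimal sketch, not an official Subread tool: the helper names `featurecounts_cmd` and `run_and_report` and the file name `sample.bam` are hypothetical, and the flags are the ones quoted in this thread. It assumes `featureCounts` is on the PATH when `run_and_report` is actually called.

```python
import subprocess
from pathlib import Path


def featurecounts_cmd(bam, gtf, out_dir):
    """Build the featureCounts call quoted in this thread for one BAM file."""
    out = Path(out_dir) / (Path(bam).stem + ".fc.txt")
    return ["featureCounts", "-p", "-a", str(gtf),
            "-t", "exon", "-g", "gene_id", "-o", str(out), str(bam)]


def run_and_report(bams, gtf, out_dir):
    """Run featureCounts per file; return {bam: (returncode, last stderr line)}."""
    results = {}
    for bam in bams:
        proc = subprocess.run(featurecounts_cmd(bam, gtf, out_dir),
                              capture_output=True, text=True)
        lines = proc.stderr.strip().splitlines()
        results[str(bam)] = (proc.returncode, lines[-1] if lines else "")
    return results


# Only build and show the command here; nothing is executed.
cmd = featurecounts_cmd("sample.bam", "Homo_sapiens.GRCh37.75.gtf", "/data/tmp")
print(" ".join(cmd))
```

On POSIX systems, `subprocess` reports a process killed by a signal with a negative return code, so an aborted run should be distinguishable from an ordinary non-zero exit.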

modified 10 weeks ago by Gordon Smyth • written 10 weeks ago by leo_CD
  1. Is this a question about the C version of subread on sourceforge.net or about the Bioconductor package Rsubread? It seems to be the former.

  2. Please give the complete subread command that you used.

  3. You say that you have used rsem-tbam2gam but you probably mean rsem-tbam2gbam. Please give the actual command you used.

  4. How were the transcript BAM files created? Were they created by RSEM or by something else?

  5. What is it that you are trying to achieve? If you already have transcript-level expression results, why do you want to run featureCounts? If you do want regular gene-level counts, my suggestion would be to go back to the beginning and align your FASTQ reads directly to the genome with an RNA-seq aligner like Subread or STAR.

modified 10 weeks ago • written 10 weeks ago by Gordon Smyth

1) Yes, it's about the C version; sorry for the confusion.

2) Using Subread 1.6.3, the command I used is: featureCounts -p -a Homo_sapiens.GRCh37.75.gtf -t exon -g gene_id -o /data/tmp/${PBS_JOBNAME}.fc.txt bamfile.bam

3) I said that those BAM files contain in their header:

@CO This BAM file is processed by rsem-tbam2gam to convert from transcript coordinates into genomic coordinates.

I received the BAM files already processed like this.

4) To my knowledge they were created with RSEM, but unfortunately I don't have all the details.

5) I don't have transcript-level expression results. I only have the BAM files and am trying to get simple gene-level counts, because I don't need transcript-level expression. I don't have access to the FASTQ files; if I had them, I would indeed have aligned with STAR and obtained the counts at the same time, or run Salmon.

modified 10 weeks ago • written 10 weeks ago by leo_CD
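When the history of a BAM file is unknown, its header may still record some of it: the SAM format keeps program records in `@PG` lines and free-text comments in `@CO` lines. The sketch below is a minimal illustration that assumes the header has already been dumped to text (for example with `samtools view -H`). The function name `summarize_header` and the `@HD` and `@PG` lines in the example are hypothetical; only the `@CO` text is taken from this thread.

```python
def summarize_header(header_text):
    """Collect @PG program records and @CO comments from SAM header text."""
    programs, comments = [], []
    for line in header_text.splitlines():
        if line.startswith("@PG"):
            # @PG fields are tab-separated TAG:VALUE pairs (ID, PN, VN, CL, ...).
            fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
            programs.append(fields)
        elif line.startswith("@CO"):
            # Everything after the first tab is free text.
            comments.append(line.split("\t", 1)[1] if "\t" in line else "")
    return programs, comments


# Made-up header: the @HD and @PG lines are placeholders, not from the real files.
example = (
    "@HD\tVN:1.4\tSO:unsorted\n"
    "@PG\tID:hypothetical_aligner\tPN:hypothetical_aligner\tVN:0.0\n"
    "@CO\tThis BAM file is processed by rsem-tbam2gam to convert from "
    "transcript coordinates into genomic coordinates.\n"
)
programs, comments = summarize_header(example)
print(programs)
print(comments)
```

Comparing this summary between a failing file and a working one is one way to check whether they really went through the same processing steps.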

From the error message in your post, it is a "general format error" encountered by featureCounts. In other words, featureCounts couldn't correctly parse the current alignment record in the BAM file.

I generated a BAM file myself using rsem-tbam2gbam, but featureCounts ran correctly on it. Can you please share with us the BAM file that caused the error? If you'd like to do so, please send me an email (liao at wehi.edu.au).

written 10 weeks ago by Yang Liao
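Since the failure above points to a record whose optional tags could not be parsed, one independent check is to look for optional fields that do not follow the `TAG:TYPE:VALUE` layout of the SAM specification. The sketch below is only that: it works on text records (such as those printed by `samtools view`), it is not how featureCounts itself reads BAM, and the name `bad_optional_fields` and both example records are made up for illustration.

```python
import re

# Value patterns per optional-field type, following the SAM specification.
_NUM = r"[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?"
TYPE_PATTERNS = {
    "A": re.compile(r"[!-~]"),
    "i": re.compile(r"[-+]?[0-9]+"),
    "f": re.compile(_NUM),
    "Z": re.compile(r"[ !-~]*"),
    "H": re.compile(r"(?:[0-9A-F]{2})*"),
    "B": re.compile(r"[cCsSiIf](?:," + _NUM + r")*"),
}
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]")


def bad_optional_fields(sam_line):
    """Return the optional fields of one SAM record that look malformed."""
    columns = sam_line.rstrip("\n").split("\t")
    bad = []
    for field in columns[11:]:  # the first 11 columns are mandatory
        parts = field.split(":", 2)
        if len(parts) != 3:
            bad.append(field)
            continue
        tag, type_code, value = parts
        pattern = TYPE_PATTERNS.get(type_code)
        if (not TAG_NAME.fullmatch(tag) or pattern is None
                or not pattern.fullmatch(value)):
            bad.append(field)
    return bad


# Two made-up records: the second carries a tag with an undefined type code.
good = "r1\t99\t1\t100\t255\t4M\t=\t200\t104\tACGT\tIIII\tNH:i:1\tZW:f:0.5"
broken = "r2\t99\t1\t100\t255\t4M\t=\t200\t104\tACGT\tIIII\tNH:i:1\tXX:q:7"
print(bad_optional_fields(good))
print(bad_optional_fields(broken))
```

A clean result from a check like this would not rule out a problem in the binary encoding, so it is best treated as a first filter rather than a diagnosis.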

I will contact you by email and send you one of the BAM files that fails. Thanks for your help.

written 10 weeks ago by leo_CD

Thanks, Leo. I've received the BAM file, and featureCounts (v1.6.3) ran smoothly on it and generated the read-count table. I've sent you some further suggestions to try. If none of them work, I hope to build a special version of featureCounts that writes its running details to a log, so we can see what caused the error.

written 9 weeks ago by Yang Liao