Behaviour of featureCounts when quantifying multi-mapping reads
1
0
Entering edit mode
Rachael • 0
@6f00b97d
Last seen 3 days ago
United Kingdom

Hi,

I am having some problems understanding the settings for counting multi-mapping reads using featureCounts

Previously we did the following:

  1. Aligned with HISAT2
  2. Used featureCounts from Subread 1.5.0 (for consistency with another paper)
featureCounts -p -C -B -f -T 5 --primary \
    -a examples.gtf \
    -o output.counts \
    input.bam >> featurecounts.txt 2>&1

This did count multimapping reads once each

Alternatively using

featureCounts -p -C -B -f -T 5 \
    -a examples.gtf \
    -o output.counts \
    input.bam >> featurecounts.txt 2>&1

Did not count multimapping reads

(We know that they were/were not being counted from a small amplicon sequencing dataset with few reads)



Now due to some strange gzip error that suddenly appeared using Subread 1.5.0 when the cluster hardware was changed by our admin (GZIP error 2 with any BAM file) we have upgraded to Subread 1.6.4

So now we do the following:

  1. Align with HISAT2
  2. Use featureCounts from Subread 1.6.4

However if we run

featureCounts -p -C -B -f -T 5 --primary \
    -a examples.gtf \
    -o output.counts \
    input.bam >> featurecounts.txt 2>&1

OR

featureCounts -p -C -B -f -T 5 \
    -a examples.gtf \
    -o output.counts \
    input.bam >> featurecounts.txt 2>&1

We get the same output...

  1. Are the multi-mapping reads being counted when --primary is used in version 1.6.4?
  2. Has there been a change in the usage of --primary in different versions or has something else happened?
  3. How do we specify that we do or do not wish to count multi-mapping reads in 1.6.4? - Some research has lead us to the -M option but the manual says that --primary disregards it
  4. Why are we getting the GZIP error 2 with 1.5.0? - we have tried freshly downloaded TCGA BAMs
Bioconductor featurecounts subread • 159 views
ADD COMMENT
4
Entering edit mode
Yang Liao ▴ 440
@yang-liao-6075
Last seen 1 day ago
Australia

Both v1.5.0 and v1.6.4 versions are very old. I suggest upgrading to v2.0.6 (the latest version).

There was a change regarding the --primary option between v1.5.0 and v1.6.4. And it seems that the manual has some discordant descriptions regarding the usage of --primary. If you read section 6.2.6, it says:

When multi-mapping reads are reported with primary and secondary alignments and both -M and --primary are specified, only primary alignments will be considered in counting and secondary alignments will be ignored. If -M is specified but --primary is not specified, both primary and secondary alignments will be considered in counting.

This is the correct description for the --primary option. You must specify the -M option in v1.6.4 (and versions thereafter) to allow multimapping reads (even if you only want to count their primary alignments). I hope this can answer your first three questions.

For your fourth question, I have no idea about why gzip error 2 suddenly appeared after a hardware change was made. But because v1.6.4 works on the BAM files, it may be caused by a upgraded zlib version in your system with the hardware change.

ADD COMMENT
0
Entering edit mode

Thank you so much for your help!

ADD REPLY

Login before adding your answer.

Traffic: 672 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6