featureCounts running too long
demirkalecy ▴ 10

I have 32 samples in total, each with paired-end reads.

I used STAR for alignment.

The BAM files generated by STAR range in size from 1 GB to 27 GB.

Now I am using featureCounts to count reads to genes.

I ran featureCounts from the shell command line on a machine with 24 threads.

This is the command sequence that I used:

sinteractive --cpus-per-task=24 --mem=120g
module load subread
featureCounts -p -T 24 -t exon -g gene_id -a genes.gtf -o P11_AM_Left_BAL_6h.txt P11_AM_Left_BAL_6h.Aligned.sortedByCoord.out.bam

I am using these reference files (genome and annotation):
ftp://ftp.ensembl.org/pub/release-101/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_35/gencode.v35.annotation.gtf.gz

For 28 samples, featureCounts completed the counting in between 1 and 68 minutes.

But for four samples, it has been running for more than 15 hours and has not finished yet.

I checked the sizes of the BAM files for these four samples: they range from 21 GB to 27 GB.

Among the 28 completed samples, there is one of similar size (25 GB), and featureCounts finished counting for that sample in about 28 minutes.

So the long run time may not be caused by the size of the BAM file.

What could be causing the very long run time for those four samples?
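
Since file size alone does not seem to separate the slow samples from the fast ones, one way to look for other differences might be to compare pairing statistics between a slow BAM and a fast one. The sketch below is only a diagnostic idea, not a confirmed explanation: it assumes samtools is available on the PATH, and the function name bam_pairing_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a few pairing statistics per BAM file so that slow and fast
# samples can be compared on something other than file size.
# Assumption: samtools is installed and on PATH.
# bam_pairing_summary is a hypothetical helper name, not part of any tool.
bam_pairing_summary() {
    local bam
    for bam in "$@"; do
        echo "== ${bam}"
        # samtools flagstat reports total reads, properly paired reads,
        # singletons and mates mapped to a different chromosome, among others.
        samtools flagstat "${bam}" |
            grep -E 'in total|properly paired|singletons|different chr'
    done
}

# Example call (the second file name is a placeholder for a fast sample):
# bam_pairing_summary P11_AM_Left_BAL_6h.Aligned.sortedByCoord.out.bam fast_sample.bam
```

If the slow samples show a clearly different proportion of singletons or of mates on different chromosomes, that would at least narrow down where to look next.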

RNAseq123 • 1.3k views

I am having the same issue here. I don't think it is the .gtf file either, since I am using the same GTF for multiple samples and some of them work just fine. I wonder what the cause might be. Did you fix the issue?
