Question

featureCounts running too long

demirkalecy ▴ 10

I have 32 samples in total, each with paired-end reads.

I used STAR for alignment.

The BAM files generated by STAR range in size from 1 GB to 27 GB.

Now I am using featureCounts to count reads to genes.

I run featureCounts from the shell command line on a machine with 24 threads.

This is the code that I used:

sinteractive --cpus-per-task=24 --mem=120g
module load subread
featureCounts -p -T 24 -t exon -g gene_id -a genes.gtf -o P11_AM_Left_BAL_6h.txt P11_AM_Left_BAL_6h.Aligned.sortedByCoord.out.bam

I am using this genome and annotation:
ftp://ftp.ensembl.org/pub/release-101/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_35/gencode.v35.annotation.gtf.gz

For 28 samples, featureCounts completed the counting in between 1 and 68 minutes.

But for four samples, it has been running for more than 15 hours and has not finished yet.

I checked the sizes of the BAM files for these four samples: they range from 21 GB to 27 GB.

Among the 28 completed samples, one has a similar size (25 GB), and featureCounts finished counting it in about 28 minutes.

So the size of the BAM file may not be the reason for the long run time.
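To put rough numbers on that comparison, here is a minimal sketch using only the figures quoted above (25 GB in about 28 minutes, and 21 GB to 27 GB files still running after 15 hours). It assumes run time would scale linearly with file size, which is only a simplification for this comparison and not something featureCounts guarantees; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Reference point from the post: a 25 GB BAM file finished in about 28 minutes.
ref_size_gb=25
ref_minutes=28

# expected_minutes SIZE_GB
# Run time predicted IF time scaled linearly with file size (an assumption).
expected_minutes() {
    awk -v size="$1" -v ref_s="$ref_size_gb" -v ref_m="$ref_minutes" \
        'BEGIN { printf "%.1f\n", size * ref_m / ref_s }'
}

# slowdown_factor SIZE_GB OBSERVED_MINUTES
# Ratio of the observed run time to the linearly scaled expectation.
slowdown_factor() {
    awk -v size="$1" -v obs="$2" -v ref_s="$ref_size_gb" -v ref_m="$ref_minutes" \
        'BEGIN { printf "%.1f\n", obs / (size * ref_m / ref_s) }'
}

# The stalled samples are 21 GB to 27 GB and have run for more than
# 15 hours (900 minutes), so the factors printed here are lower bounds.
for size in 21 27; do
    echo "${size} GB: expected $(expected_minutes "$size") min," \
         "at least $(slowdown_factor "$size" 900)x slower"
done
```

Under that assumption, the largest (27 GB) file would be expected to finish in roughly half an hour, so a run of more than 15 hours is about 30 times slower than file size alone would predict.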

What could be the reason for the very long run time for those four samples?

RNAseq123 • 1.3k views

written 3.1 years ago by demirkalecy

I am having the same issue. I don't think it is the .gtf file either, since I am using the same GTF for multiple samples and some of them work just fine. I wonder what is happening. Did you fix the issue?

written 11 weeks ago by difraiadomenico