featureCounts chrAliases not working as expected
1
0
Entering edit mode
@kathleenegrogan-23397
Last seen 4.0 years ago

Using the chrAliases flag in Rsubread featureCounts is not working as expected. Single-end reads mapped to genome file ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA000001405.15GRCh38/GCA000001405.15GRCh38_genomic.fna.gz

& annotation file ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA000001405.15GRCh38/seqsforalignmentpipelines.ucscids/GCA000001405.15GRCh38fullanalysisset.refseqannotation.gtf.gz

featureCounts code as follows:

file.output<-featureCounts(files="input.bam", 
+      annot.ext="GCA_000001405.15_GRCh38_full_analysis_set.refseq_annotation.gtf.gz", 
+      isGTFAnnotationFile = TRUE, chrAliases = "chrAliaseskg.csv",
+      useMetaFeatures=TRUE, nthreads=8, allowMultiOverlap=TRUE, strandSpecific=2, verbose=TRUE)

chrAliaseskg.csv file extracted from ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA000001405.15GRCh38/GCA000001405.15GRCh38assemblyreport+ucsc_names.txt

head chrAliaseskg.csv
chr1,CM000663.2
chr2,CM000664.2
chr3,CM000665.2
chr4,CM000666.2
chr5,CM000667.2
chr6,CM000668.2
chr7,CM000669.2
chr8,CM000670.2
chr9,CM000671.2
chr10, CM000672.2

Error file reads:

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           S Hr02_ASO_kiga006_L001_acceptedhitssrt.bam      ||
||                                                                            ||
||              Annotation : GCA_000001405.15_GRCh38_full_analysis_set.re ... ||
||   Chromosome alias file : chrAliaseskg.csv                                 ||
||      Dir for temp files : .                                                ||
||                 Threads : 8                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : counted                                          ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| 455 chromosome name aliases are loaded.                                    ||
|| Load annotation file GCA_000001405.15_GRCh38_full_analysis_set.refseq_ ... ||
||    Features : 1935025                                                      ||
||    Meta-features : 43388                                                   ||
||    Chromosomes/contigs : 303                                               ||
||                                                                            ||
|| Process BAM file Hr02_ASO_kiga006_L001_acceptedhitssrt.bam...              ||
||    Single-end reads are included.                                          ||
||    Strand specific : reversely stranded                                    ||
||    Assign alignments to features...                                        ||
||                                                                            ||
|| Chromosomes/contigs in annotation but not in input file                    ||
||    chr14_KI270847v1_alt                                                    ||
||    chr1_GL383520v2_alt                                                     ||
||    chr19_GL949748v2_alt                                                    ||
||    chr4_KI270785v1_alt                                                     ||
||    chr15_KI270906v1_alt                                                    ||
||    chr19_KI270889v1_alt                                                    ||
.............
|| Chromosomes/contigs in input file but not in annotation                    ||
||    KI270795.1                                                              ||
||    KI270853.1                                                              ||
||    KI270425.1                                                              ||
||    J01415.2                                                                ||
||    KI270731.1                                                              ||
||    KI270303.1                                                              ||
||    GL383530.1                                                              ||
||    KI270849.1                                                              ||
............
||    Total alignments : 3151547                                              ||
||    Successfully assigned alignments : 0 (0.0%)                             ||
||    Running time : 0.01 minutes                                             ||
||                                                                            ||
Rsubread • 1.5k views
ADD COMMENT
0
Entering edit mode

It looks the way you ran featureCounts is correct. What it is the end-of-line character in chrAliaseskg.csv? It is '\n' or '\r\n'?

ADD REPLY
0
Entering edit mode

Looks like it ends with '\r\n', not '\n'. There's nothing in the Rsubread instructions to indicate that it matters. Does it?

ADD REPLY
1
Entering edit mode

The carriage return may affect things - you can try dos2unix chrAliaseskg.csv to see if that fixes it. Or, via sed:

sed 's/\\r//g' chrAliaseskg.csv

Also, I don't know if this entry in your file will affect anything:

chr10, CM000672.2

That extra space should not be there.

ADD REPLY
0
Entering edit mode

Sorry for having this issue... Indeed it wasn't documented, and it causes problems to Windows users. I think a better way is to make it compatible with both "\n" and "\r\n" systems.

ADD REPLY
2
Entering edit mode
Yang Liao ▴ 380
@yang-liao-6075
Last seen 4 weeks ago
Australia

I have fixed the "\r\n" end-of-line character issue in the "chrAliases" file for featureCounts, and the fix is included in the 2.3.1 version of Rsubread (the in-develop version).

ADD COMMENT

Login before adding your answer.

Traffic: 738 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6