Hi,
I am trying to run QoRTs on my STAR aligned data however I have been getting errors I dont understand. I have attempted to re-sort my files with samtools and have increased the memort to no avail.
I have included my full pipeline below:
#STAR alignment
#Indexing
STAR --runThreadN 6 \
--runMode genomeGenerate \
--genomeDir STARindex \
--genomeFastaFiles /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
--sjdbGTFfile /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf \
--sjdbOverhang 149
#cd to location of raw data
#aligning
for f in `ls *.fq.gz | sed 's/_[12].fq.gz//g' | \
sort -u`; do echo $f && \
STAR --genomeDir STARindex \
--runThreadN 6 \
--readFilesIn <(gunzip -c ${f}_1.fq.gz ${f}_2.fq.gz) \
--outFileNamePrefix /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/${f}.sorted \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--outSAMattributes Standard; done
#sorting again to make sure
for f in `ls *.bam | sed 's/.bam//g' | \
sort -u`; do echo $f && \
samtools sort -@ 4 ${f}.bam -o ${f}.samtoolsort.bam;
done
#JunctionSeq
#cd to analysis output location
java -Xmx16G -jar /usr/local/QoRTs.jar QC --stranded --runFunctions writeKnownSplices,writeNovelSplices,writeSpliceExon /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf RawCounts/
#I have done this with both ensembl and gencode alignments with and without samtools sorting with the same errors
QoRTs .log file:
Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Tue Feb 16 17:38:53 GMT 2021)
INPUT_COMMAND(QC)
INPUT_ARG(infile)=/mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam
INPUT_ARG(gtffile)=/mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf
INPUT_ARG(outdir)=RawCounts/
INPUT_ARG(stranded)=true
INPUT_ARG(runFunctions)=List(writeKnownSplices, writeNovelSplices, writeSpliceExon)
Created Log File: RawCounts//QC.jpnPkkWzfKwU.log
Warning: run-in-progress file "RawCounts//QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2021-02-16 17:38:53] [Mem usage: [21MB / 536MB]] [Elapsed Time: 00:00:00.0001]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
No parameter --genomeFA found.
NOTE: Function "writeKnownSplices" requires function "JunctionCalcs". Adding "JunctionCalcs" to the active function list...
Running functions: JunctionCalcs, writeKnownSplices, writeNovelSplices, writeSpliceExon
Checking first 10000 reads. Checking SAM file for formatting errors...
Stats on the first 10000 reads:
Num Reads Primary Map: 3159
Num Reads Paired-ended: 0
Num Reads mapped pair: 0
Num Pair names found: 0
Num Pairs matched: 0
Read Seq length: 150 to 150
Unclipped Read length: 150 to 150
Final maxReadLength: 150
maxPhredScore: 37
minPhredScore: 2
WARNING WARNING WARNING! Running in paired-end mode, but reads appear to be single-end! Errors may follow.
Strongly recommend using the '--singleEnded' option
Note: Data appears to be paired-ended.
Warning: Have not found any matched read-pairs in the first 10000 reads. Is data paired-end? Is data sorted?
Sorting Note: Reads appear to be grouped by read-pair, probably sorted by name
Sorting Note: Reads are sorted by position
Done checking first 10000 reads. WARNINGS FOUND!
SAMRecord Reader Generated. Read length: 150.
[Time: 2021-02-16 17:38:56] [Mem usage: [315MB / 536MB]] [Elapsed Time: 00:00:02.0624]
> Init JunctionCalcs utility
Compiling flat feature annotation, internally in memory...
FlatteningGtf: starting...(2021-02-16 17:38:56)
FlatteningGtf: gtf file read complete.(2021-02-16 17:40:24)
FlatteningGtf: Splice Junction Map read.(2021-02-16 17:40:26)
FlatteningGtf: gene Sets generated.(2021-02-16 17:40:28)
FlatteningGtf: Aggregate Sets built.
FlatteningGtf: Compiling Aggregate Info . . . (2021-02-16 17:40:28)
FlatteningGtf: Finished Compiling Aggregate Info. (2021-02-16 17:40:28)
FlatteningGtf: Iterating through the step-vector...(2021-02-16 17:40:28)
FlatteningGtf: Adding the aggregate genes themselves...(2021-02-16 17:40:30)
FlatteningGtf: Iterating through the splice junctions...(2021-02-16 17:40:31)
FlatteningGtf: Sorting the aggregate genes...(2021-02-16 17:40:33)
FlatteningGtf: Folding the FlatGtfLine iterator...(2021-02-16 17:40:34)
FlatteningGtf: Features Built.(2021-02-16 17:40:34)
Internal flat feature annotation compiled!
length of knownSpliceMap after instantiation: 393388
length of knownCountMap after instantiation: 393388
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2021-02-16 17:40:39] [Mem usage: [1914MB / 9GB]] [Elapsed Time: 00:01:46.0350]
============================FATAL_ERROR============================
QoRTs encountered a FATAL ERROR. For general help, use command:
java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" java.lang.IllegalStateException: Inappropriate call if not paired read
at net.sf.samtools.SAMRecord.requireReadPaired(SAMRecord.java:655)
at net.sf.samtools.SAMRecord.getFirstOfPairFlag(SAMRecord.java:713)
at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:1020)
at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:978)
at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.next(stdUtils.scala:969)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.foreach(stdUtils.scala:963)
at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1300)
at qcUtils.runAllQC$.run(runAllQC.scala:970)
at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:680)
at runner.runner$.main(runner.scala:98)
at runner.runner.main(runner.scala)
If anyone can assist me I would be very gateful
Nathan
This anwer solved my problem! Thank you for the insight and help!