I am working with RNA-Seq data mapped against very long reference chromosomes (T. aestivum). In the mapping stage I used STAR to generate SAM files instead of BAM files, because it seems that BAM files cannot represent reference positions greater than 2^29-1 (512 Mb), and the chromosomes I am using are longer (~700-800 Mb).
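To make the size problem concrete, here is a quick check in R. It is only a sketch: the two chromosome lengths are rounded placeholders taken from the ~700-800 Mb range above, not exact values, and the variable names are illustrative.

```r
# Largest reference position that, as far as I understand, can be handled
max_pos <- 2^29 - 1

# Rounded placeholder lengths for the shortest and longest chromosomes
# (assumed values for illustration, not the exact lengths)
chr_len <- c(shortest = 700e6, longest = 800e6)

# TRUE where a chromosome is longer than the limit
exceeds <- chr_len > max_pos

print(format(max_pos, big.mark = ","))
print(exceeds)
```

Both placeholder lengths are above the limit, which is why I kept the alignments as SAM.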
In the next stage, I want to use DESeq2 to find DE genes, and before that I need to use summarizeOverlaps to calculate the counts. Unfortunately, summarizeOverlaps accepts only BAM files, and when SAM files are passed as the reads argument it fails with the following errors:
    > se <- summarizeOverlaps(features=ebg, reads=bamfiles,
    +                         mode="Union",
    +                         singleEnd=FALSE,
    +                         ignore.strand=TRUE,
    +                         fragments=TRUE )
    Error in value[[3L]](cond) :
      failed to open BamFile: SAM/BAM header missing or empty
      file: 'G:/Data_JIC/IWGSC/against_WGAv1.0/STARWGAv1.0/Max1.sam'
    In addition: Warning messages:
    1: In doTryCatch(return(expr), name, parentenv, handler) :
      [bam_header_read] EOF marker is absent. The input is probably truncated.
    2: In doTryCatch(return(expr), name, parentenv, handler) :
      [bam_header_read] invalid BAM binary header (this is not a BAM file).
How can I overcome this problem? I cannot convert the SAM files to BAM files because I would lose information. Is there any way to input SAM files into summarizeOverlaps?