I have Illumina mRNA-seq samples, a number of which have low RINs (2-4) compared to the others. Seemingly as a result, I am getting widely varying mapping rates (15%-70%) and therefore widely varying counts per sample (e.g. 8,000,000 vs 40,000,000 mapped reads). On top of that, I can't really use RIN or mapping rate as a covariate, because both are heavily confounded with a group of interest.
Is there a preferred way of analyzing this type of data? If I do the usual VST through DESeq2, I get a cluster of samples with irregularly high expression of a lot of genes. These are also the samples with low overall counts, so presumably this is caused by the issue described above. Would quantile normalisation help, or are there any other ideas?
I also quantified the data with Salmon, using the --gcBias and --validateMappings flags. Reads are 150 bp.
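To make the quantile normalisation idea concrete, here is a minimal sketch of what I mean, written in plain NumPy. The function name `quantile_normalize` and the toy matrix are made up for illustration and are not from any RNA-seq package. It assumes a genes x samples matrix of log-scale expression values. Note that this forces every sample onto the same distribution, so it could also remove real global differences if those are tied up with the degraded group.

```python
import numpy as np


def quantile_normalize(log_expr):
    """Quantile-normalise a genes x samples matrix (hypothetical helper).

    Every column (sample) is mapped onto the same reference distribution,
    taken here as the mean of the sorted columns.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    # Sort each sample's values independently (ascending within each column).
    sorted_vals = np.sort(log_expr, axis=0)
    # Reference distribution: mean across samples at each rank.
    reference = sorted_vals.mean(axis=1)
    # 0-based rank of each gene within its own sample; ties are broken by row order.
    order = np.argsort(log_expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Replace each value by the reference value at its rank.
    return reference[ranks]


# Toy data (invented): 200 genes, 4 samples with different location and spread,
# loosely mimicking samples whose overall distributions differ.
rng = np.random.default_rng(0)
log_counts = rng.normal(size=(200, 4)) * np.array([1.0, 1.5, 2.0, 3.0]) + np.array([5.0, 5.5, 4.0, 7.0])

normalized = quantile_normalize(log_counts)

# After normalisation, every sample has the same set of values, only assigned
# to different genes according to each gene's within-sample rank.
print(np.sort(normalized, axis=0)[:3])
```

In this sketch ties are broken by row order rather than averaged, which matters for the many zero or near-zero genes in real data, so an established implementation would be preferable for an actual analysis.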