Hi there,
I'm a DESeq2 user. I recently received a set of RNA-seq data consisting only of raw read counts. The design is quite unusual: the same procedure was applied to all replicates, but they were sequenced on completely different platforms (SOLiD, NextSeq and HiSeq). If the processing from fastq files to raw counts was the same, can I simply feed the counts into DESeq2 to find the differentially expressed genes?
Thanks.
Thanks for the reply.
In that case, the count matrix would have one row for every gene detected in the RNA-seq data, and the column names would be "untreated1", "untreated2", "untreated3", "treated1", "treated2", "treated3". The platforms would be: "untreated1" & "treated1", SOLiD Total RNA-Seq kit on the SOLiD System at 50M reads; "untreated2" & "treated2", RNA libraries prepared with the Ultra Directional RNA Library Prep Kit for Illumina on the Illumina NextSeq System at 50M reads; "untreated3" & "treated3", Ultra Directional RNA Library Prep Kit for Illumina on the Illumina HiSeq 4000 PE100 System at 50M reads. (These 3 untreated/treated pairs are biological replicates, so I think sequencing them on different platforms was a mistake in the research design. I don't know whether merging the data together works in this case.)
Thanks.
Here is something closer to what I was asking for, e.g. a sample table which forms the 'colData' of a Bioconductor object:
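A table of that kind, reconstructed from the description in the question, might look like the sketch below. The variable name `coldata` and the short platform labels ("SOLiD", "NextSeq", "HiSeq4000") are assumptions chosen for illustration; only the sample names and the sample-to-platform pairing come from the thread.

```r
# Sample table (colData) reconstructed from the question.
# 'coldata' and the platform labels are illustrative names, not from the thread.
# Row names should match the column names of the raw count matrix.
coldata <- data.frame(
  condition = factor(rep(c("untreated", "treated"), each = 3)),
  platform  = factor(rep(c("SOLiD", "NextSeq", "HiSeq4000"), times = 2)),
  row.names = c("untreated1", "untreated2", "untreated3",
                "treated1", "treated2", "treated3")
)

# Each platform contributes exactly one untreated and one treated sample.
print(coldata)
print(table(coldata$platform, coldata$condition))
```

Laid out this way, it is easy to see that platform and condition are not confounded: every platform has one sample of each condition.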
You can use a design of
~platform + condition
and then test on the 'condition' variable. Simply calling DESeq() followed by results() will work if 'condition' is the last variable in the design, but you should make 'untreated' the reference level of 'condition' (see the note on factor levels in the vignette). This will control for the differences due to platform in this dataset by fitting a baseline for each platform.
Thanks, Michael.
I proceeded that way and got very good results.
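For anyone landing on this thread later, the steps discussed above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact code that was run: the counts are simulated placeholders (not real data), and the names `counts`, `coldata`, `samples` and the platform labels are illustrative. The DESeq2 part only runs if the package is installed.

```r
# Minimal sketch of the suggested analysis.
# The counts below are SIMULATED placeholders; substitute your own raw counts.
set.seed(1)
samples <- c("untreated1", "untreated2", "untreated3",
             "treated1", "treated2", "treated3")

coldata <- data.frame(
  platform  = factor(rep(c("SOLiD", "NextSeq", "HiSeq4000"), times = 2)),
  condition = factor(rep(c("untreated", "treated"), each = 3)),
  row.names = samples
)
# Make 'untreated' the reference level, so fold changes read treated vs untreated.
coldata$condition <- relevel(coldata$condition, ref = "untreated")

# Placeholder count matrix: 200 hypothetical genes x 6 samples.
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 5), ncol = 6,
                 dimnames = list(paste0("gene", 1:200), samples))

# The design fits a baseline per platform plus one condition effect.
design_matrix <- model.matrix(~ platform + condition, data = coldata)
print(colnames(design_matrix))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ platform + condition)
  dds <- DESeq2::DESeq(dds)
  # With no further arguments, results() tests the last design variable.
  res <- DESeq2::results(dds)
  print(head(res))
}
```

Because 'condition' is the last term in the design, the default results() call reports the treated vs untreated comparison while the platform term absorbs platform-specific differences.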