Context: I am currently analyzing some Illumina RNA-seq data. These data are from treating two different genotypes (WT & HET) with four different treatments. There are THREE biological replicates for each treatment group.
- WT treated with negative control (NC)
- WT treated with a solution of some concentration (Ace)
- WT treated with the same solution but more concentrated (MAce)
- WT treated with positive control (PC)
Same for HET.
What I have done: I used Salmon to quantify the reads and also create 100 bootstrapped samples, and then I used fishpond to perform differential gene analysis, where I was able to compare within a treatment group. For example, I was able to find differential expressed genes between A and B in the first treatment group, using A as a reference.
Question: The analysis I have done so far does not allow me to compare across treatment groups because the estimates of abundance are normalized within a treatment group. In other words, I cannot compare the log2(foldchange) from Treatment 1 to that from Treatment 2. However, I want to be able to compare across treatments. I want to be able to establish a basal level to which I can compare different treatments. But I do not know how to do so.
Attempt: Upon reading more into the fishpond documentation, I realized I could use genotype to match samples across conditions and then use
interaction = True in swish. In an attempt to follow the Interaction designs in the documentation, I am stuck at creating paired samples, specifically,
y2$pair <- as.numeric(factor(y2$line)). Is line in the macrophage dataset equivalent to genotype in my dataset?
Additional Question: The mapping rates from salmon for all samples range from 68 to 81 with the majority being in the 70s. Is this typical?
R Code: Comparing Treatment Group 1 and Treatment Group 2 based on genotypes
se <- tximeta(coldata) gse <- summarizeToGene(se) gy <- gse gy <- gy[,gy$condition %in% c("WT-NC", "WT-Ace", "HET-NC", "HET-Ace")] gy$genotype <- factor(ifelse(grepl("WT", gy$condition), "WT", "HET")) gy$condition <- factor(ifelse(grepl("NC", gy$condition), "NC", "Ace")) with(colData(gy), table(condition, genotype) )
This is my first time analyzing RNA-seq data, so any additional advice is appreciated! Thank you :)