Hello, I'm running an analysis on RNA-seq data from TCGA and need help understanding the coefficients in DESeq2.
Here is my code so far:
library('DESeq2')
cts <- read.csv(file='/Users/Corey/Desktop/DESeq2/Final_DESeq2/GeneName.csv')
nrow(cts)
ncol(cts)
colData <- read.csv(file='colData.csv')
ncol(colData)
nrow(colData)
rownames(cts) <- cts$Geneid
cts$Geneid <- NULL
dds <- DESeqDataSetFromMatrix(countData=cts, colData=colData, design= ~ patient + condition)
keep <- rowSums(counts(dds) >= 10) >= 5
dds <- dds[keep,]
# This is the issue:
dds$condition <- relevel(dds$condition, ref='NT')
dds <- DESeq(dds, fitType='local')
resultsNames(dds)
res <- results(dds)
res
I have a few things that I will add, like lfcShrink and lfcThreshold. But when I use the resultsNames function, I get this:
[1] "Intercept" "patient_TCGA.38.4626_vs_TCGA.38.4625" "patient_TCGA.38.4627_vs_TCGA.38.4625" "patient_TCGA.38.4632_vs_TCGA.38.4625" "patient_TCGA.44.2655_vs_TCGA.38.4625"
[6] "patient_TCGA.44.2657_vs_TCGA.38.4625" "patient_TCGA.44.2661_vs_TCGA.38.4625" "patient_TCGA.44.2662_vs_TCGA.38.4625" "patient_TCGA.44.2665_vs_TCGA.38.4625" "patient_TCGA.44.2668_vs_TCGA.38.4625"
[11] "patient_TCGA.44.3396_vs_TCGA.38.4625" "patient_TCGA.44.3398_vs_TCGA.38.4625" "patient_TCGA.44.5645_vs_TCGA.38.4625" "patient_TCGA.44.6145_vs_TCGA.38.4625" "patient_TCGA.44.6146_vs_TCGA.38.4625"
[16] "patient_TCGA.44.6147_vs_TCGA.38.4625" "patient_TCGA.44.6148_vs_TCGA.38.4625" "patient_TCGA.44.6776_vs_TCGA.38.4625" "patient_TCGA.44.6777_vs_TCGA.38.4625" "patient_TCGA.44.6778_vs_TCGA.38.4625"
[21] "patient_TCGA.49.4490_vs_TCGA.38.4625" "patient_TCGA.49.4512_vs_TCGA.38.4625" "patient_TCGA.49.6742_vs_TCGA.38.4625" "patient_TCGA.49.6743_vs_TCGA.38.4625" "patient_TCGA.49.6744_vs_TCGA.38.4625"
[26] "patient_TCGA.49.6745_vs_TCGA.38.4625" "patient_TCGA.49.6761_vs_TCGA.38.4625" "patient_TCGA.50.5930_vs_TCGA.38.4625" "patient_TCGA.50.5931_vs_TCGA.38.4625" "patient_TCGA.50.5932_vs_TCGA.38.4625"
[31] "patient_TCGA.50.5933_vs_TCGA.38.4625" "patient_TCGA.50.5935_vs_TCGA.38.4625" "patient_TCGA.50.5936_vs_TCGA.38.4625" "patient_TCGA.50.5939_vs_TCGA.38.4625" "patient_TCGA.50.6595_vs_TCGA.38.4625"
[36] "patient_TCGA.55.6968_vs_TCGA.38.4625" "patient_TCGA.55.6970_vs_TCGA.38.4625" "patient_TCGA.55.6971_vs_TCGA.38.4625" "patient_TCGA.55.6972_vs_TCGA.38.4625" "patient_TCGA.55.6975_vs_TCGA.38.4625"
[41] "patient_TCGA.55.6978_vs_TCGA.38.4625" "patient_TCGA.55.6979_vs_TCGA.38.4625" "patient_TCGA.55.6980_vs_TCGA.38.4625" "patient_TCGA.55.6981_vs_TCGA.38.4625" "patient_TCGA.55.6982_vs_TCGA.38.4625"
[46] "patient_TCGA.55.6983_vs_TCGA.38.4625" "patient_TCGA.55.6984_vs_TCGA.38.4625" "patient_TCGA.55.6985_vs_TCGA.38.4625" "patient_TCGA.55.6986_vs_TCGA.38.4625" "patient_TCGA.73.4676_vs_TCGA.38.4625"
[51] "patient_TCGA.91.6828_vs_TCGA.38.4625" "patient_TCGA.91.6829_vs_TCGA.38.4625" "patient_TCGA.91.6831_vs_TCGA.38.4625" "patient_TCGA.91.6835_vs_TCGA.38.4625" "patient_TCGA.91.6836_vs_TCGA.38.4625"
[56] "patient_TCGA.91.6847_vs_TCGA.38.4625" "patient_TCGA.91.6849_vs_TCGA.38.4625" "condition_NT_vs_MPT"
Could someone explain why the comparisons are made between different patients when the samples are matched? If needed, I can also provide my colData object.
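One way to see where those coefficient names come from is to build the design matrix for a toy version of the same design in base R. This is only a sketch: the three patients P1 to P3 are made up, and the level order of condition is set by hand so that MPT is the reference, mimicking the "condition_NT_vs_MPT" name in the output above. It does not run DESeq2 itself.

```r
# Toy paired design: 3 hypothetical patients, each with one NT and one MPT sample.
coldata <- data.frame(
  patient   = factor(rep(c("P1", "P2", "P3"), each = 2)),
  # Level order chosen by hand: the first level (MPT) is the reference.
  condition = factor(rep(c("NT", "MPT"), times = 3), levels = c("MPT", "NT"))
)

# Same formula as in the question: patient acts as a blocking factor.
design <- model.matrix(~ patient + condition, data = coldata)

# One column per non-reference patient, plus a single condition column.
colnames(design)
```

The patient columns only absorb each patient's baseline expression level relative to the first patient, which is what makes the analysis paired. The single condition column is the within-patient NT versus MPT effect, and it is the only coefficient that describes the tumor comparison.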
So what I am getting from that example is that it doesn't matter which level is the reference, because the comparison between samples remains the same?
So when using the results function, I can just use:
res <- results(dds, name="condition_NT_vs_MPT")
to get the differential expression comparison between my NT and MPT tumors?
Yes, you’ve got it.
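To illustrate the point about the reference level, here is a small base R sketch. It assumes made-up log-scale values for a single gene in three hypothetical patients, and it uses ordinary least squares via lm rather than DESeq2's negative binomial model, so it only shows the general behaviour of a paired design, not DESeq2's actual estimates.

```r
# Made-up log-scale expression values for one gene in 3 hypothetical patients.
toy <- data.frame(
  patient   = factor(rep(c("P1", "P2", "P3"), each = 2)),
  condition = factor(rep(c("NT", "MPT"), times = 3)),
  y         = c(5.0, 6.1, 7.2, 8.0, 4.1, 5.3)
)

# Fit the paired model with MPT as the reference level.
toy$condition <- relevel(toy$condition, ref = "MPT")
fit_mpt_ref <- lm(y ~ patient + condition, data = toy)

# Fit the same model with NT as the reference level.
toy$condition <- relevel(toy$condition, ref = "NT")
fit_nt_ref <- lm(y ~ patient + condition, data = toy)

# The condition effect has the same magnitude in both fits; only the sign flips.
nt_vs_mpt <- unname(coef(fit_mpt_ref)["conditionNT"])
mpt_vs_nt <- unname(coef(fit_nt_ref)["conditionMPT"])
all.equal(nt_vs_mpt, -mpt_vs_nt)
```

In DESeq2 the direction can also be chosen without releveling, by passing a contrast such as results(dds, contrast = c("condition", "MPT", "NT")), where the last level named is the denominator of the log2 fold change.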