DESeq2 returns results for group with just one replicate
1
0
Entering edit mode
nhaus ▴ 50
@789c70a6
Last seen 12 days ago
Switzerland

Hi all,

I am running DESeq2 (version 1.36.0) for 7 different experimental groups. 2 of the groups only have 1 replicate. I thought that DESeq2 does not work for groups without replicates, but I did not exclude the groups prior to creating the DESeq object, because I thought that they might still provide some information for the dispersion estimation.

To my surprise, after running DESeq2, I got a DESeqResults object for the group with only one replicate which even included some significant genes. Can anyone provide some pointers why this happend? Is it because my other groups have replicates, so the dispersion could be estimated? Also, would you "trust" the results for the group with only one replicate?

Here is the dput for a dds object to reproduce the behavior:

new("DESeqDataSet", design = ~perturbation, dispersionFunction = function () 
NULL, rowRanges = new("CompressedGRangesList", unlistData = new("GRanges", 
    seqnames = new("Rle", values = structure(integer(0), levels = character(0), class = "factor"), 
        lengths = integer(0), elementMetadata = NULL, metadata = list()), 
    ranges = new("IRanges", start = integer(0), width = integer(0), 
        NAMES = NULL, elementType = "ANY", elementMetadata = NULL, 
        metadata = list()), strand = new("Rle", values = structure(integer(0), levels = c("+", 
    "-", "*"), class = "factor"), lengths = integer(0), elementMetadata = NULL, 
        metadata = list()), seqinfo = new("Seqinfo", seqnames = character(0), 
        seqlengths = integer(0), is_circular = logical(0), genome = character(0)), 
    elementMetadata = new("DFrame", rownames = NULL, nrows = 0L, 
        elementType = "ANY", elementMetadata = NULL, metadata = list(), 
        listData = structure(list(), names = character(0))), 
    elementType = "ANY", metadata = list()), elementMetadata = new("DFrame", 
    rownames = NULL, nrows = 4L, elementType = "ANY", elementMetadata = new("DFrame", 
        rownames = NULL, nrows = 0L, elementType = "ANY", elementMetadata = NULL, 
        metadata = list(), listData = list(type = character(0), 
            description = character(0))), metadata = list(), 
    listData = structure(list(), names = character(0))), elementType = "GRanges", 
    metadata = list(), partitioning = new("PartitioningByEnd", 
        end = c(0L, 0L, 0L, 0L), NAMES = c("EB", "EC", "EE", 
        "ISC"), elementType = "ANY", elementMetadata = NULL, 
        metadata = list())), colData = new("DFrame", rownames = c("Control Replicate 1", 
"Control Replicate 2", "NotchRNAi Replicate 1", "CphUp Replicate 1", 
"CphUp Replicate 2", "CphRNAi Replicate 1", "CphRNAi Replicate 2", 
"NotchCphRNAi Replicate 1", "Control Replicate 1 (3')", "Control Replicate 2 (3')", 
"Notch Replicate 1 (3')", "Notch Replicate 2 (3')"), nrows = 12L, 
    elementType = "ANY", elementMetadata = new("DFrame", rownames = NULL, 
        nrows = 3L, elementType = "ANY", elementMetadata = NULL, 
        metadata = list(), listData = list(type = c("input", 
        "input", "intermediate"), description = c("", "", "a scaling factor for columns"
        ))), metadata = list(), listData = list(orig.ident.unique = structure(c(3L, 
    4L, 9L, 5L, 6L, 7L, 8L, 12L, 1L, 2L, 10L, 11L), levels = c("Control Replicate 1 (3')", 
    "Control Replicate 2 (3')", "Control Replicate 1", "Control Replicate 2", 
    "CphUp Replicate 1", "CphUp Replicate 2", "CphRNAi Replicate 1", 
    "CphRNAi Replicate 2", "NotchRNAi Replicate 1", "Notch Replicate 1 (3')", 
    "Notch Replicate 2 (3')", "NotchCphRNAi Replicate 1"), class = "factor"), 
        perturbation = structure(c(3L, 3L, 7L, 2L, 2L, 1L, 1L, 
        6L, 4L, 4L, 5L, 5L), levels = c("CphRNAi", "CphUp", "ctrl", 
        "ctrl_3prime", "notch_3prime", "NotchCphRNAi", "NotchRNAi"
        ), class = "factor"), sizeFactor = c(`Control Replicate 1` = 1.65060334158416, 
        `Control Replicate 2` = 0.525139232673267, `NotchRNAi Replicate 1` = 1.39441522277228, 
        `CphUp Replicate 1` = 0.46875, `CphUp Replicate 2` = 0.217667079207921, 
        `CphRNAi Replicate 1` = 0.738397277227723, `CphRNAi Replicate 2` = 0.216042698019802, 
        `NotchCphRNAi Replicate 1` = 0.254795792079208, `Control Replicate 1 (3')` = 1.78055383663366, 
        `Control Replicate 2 (3')` = 1.04679764851485, `Notch Replicate 1 (3')` = 2.20266089108911, 
        `Notch Replicate 2 (3')` = 1.50417698019802))), assays = new("SimpleAssays", 
    data = new("SimpleList", listData = list(counts = structure(c(4824L, 
    1495L, 77L, 717L, 1073L, 974L, 32L, 184L, 1271L, 576L, 1320L, 
    2842L, 1019L, 775L, 32L, 194L, 143L, 742L, 14L, 39L, 1434L, 
    1484L, 25L, 239L, 341L, 561L, 5L, 24L, 30L, 774L, 11L, 283L, 
    4361L, 2265L, 356L, 691L, 1781L, 2450L, 142L, 138L, 1464L, 
    885L, 1946L, 5197L, 857L, 2348L, 549L, 2728L), dim = c(4L, 
    12L), dimnames = list(c("EB", "EC", "EE", "ISC"), c("Control Replicate 1", 
    "Control Replicate 2", "NotchRNAi Replicate 1", "CphUp Replicate 1", 
    "CphUp Replicate 2", "CphRNAi Replicate 1", "CphRNAi Replicate 2", 
    "NotchCphRNAi Replicate 1", "Control Replicate 1 (3')", "Control Replicate 2 (3')", 
    "Notch Replicate 1 (3')", "Notch Replicate 2 (3')")))), elementType = "ANY", 
        elementMetadata = NULL, metadata = list())), NAMES = NULL, 
    elementMetadata = new("DFrame", rownames = NULL, nrows = 4L, 
        elementType = "ANY", elementMetadata = NULL, metadata = list(), 
        listData = structure(list(), names = character(0))), 
    metadata = list(version = structure(list(c(1L, 36L, 0L)), class = c("package_version", 
    "numeric_version"))))
result <- DESeq(dds, fitType='local') 
results(result, name="perturbation_NotchRNAi_vs_CphRNAi") # NotchRNAi only has one replicate

Any insights are much appreciated!

DESeq2 DifferentialExpression • 142 views
ADD COMMENT
2
Entering edit mode
@james-w-macdonald-5106
Last seen 6 hours ago
United States

There is no problem having only one replicate per group, so long as you have replicates for other groups. The degrees of freedom are based on the model, not the individual groups. That said, the generalizability of your results may be minimal at best, as you cannot say if the single observation for the one group is representative of the underlying population, which to be clear is true of most high-throughput studies with limited replication.

I often compare the N required for, like, any other study type to a high-throughput study. Imagine trying to sell a study of a new drug where you were going to have five subjects in the treatment and five in the placebo group. You'd be laughed out of the room. But publishing an RNA-Seq study with the same N? No problem at all. But the difference is that the RNA-Seq study is meant to be hypothesis generating (even though some people I know think of them as dispositive), whereas the drug study is meant to provide a believable answer, although still not dispositive. That's what meta-analyses are for.

Login before adding your answer.

Traffic: 801 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6