Significant reporter activity with mpralm function
ATpoint ★ 1.4k
@atpoint-13662
Germany

Good morning,

The mpra package looks promising for differential analysis of massively parallel reporter assays between conditions, be it genotypes/alleles, tissues or treatments. The comparison takes RNA and DNA counts for both conditions into account and tests for significant differences. So far so good, but can this framework also identify significant reporter activity as such? Suppose I have a DNA library of 100k genomic elements/regions that I transfect into cells in several replicates, eventually obtaining RNA and DNA counts. Can I use this framework to test whether RNA counts are significantly enriched or depleted relative to DNA counts, and so distinguish elements with enhancing or silencing reporter activity from inactive elements? If so, could you please recommend a design matrix for this setting?

best wishes,

Alex

mpra mpralm reporter assay linear model • 578 views

Yes, you can test whether RNA counts are significantly enriched or depleted relative to DNA counts in our framework. This amounts to testing whether the mean activity measure (the log ratio of RNA counts to DNA counts) is zero within a comparison group. The typical differential analysis sets up a design matrix with an intercept and a group term. To test for enrichment or depletion of RNA instead, use a design matrix containing only the intercept, with code along the lines of:

design <- model.matrix(~1, data = my_colData)

By comparison, the usual design for differential analysis between conditions is:

design <- model.matrix(~1+group, data = my_colData)

When testing for enriched or depleted RNA, however, make sure to subset the samples so that only one condition is present, and repeat the analysis separately for each comparison group if there are several. Otherwise the intercept becomes an overall mean activity across groups, which isn't appropriate.
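
As a minimal base R sketch of the subsetting step: the sample table below is made up for illustration (the `sample` and `group` columns and the "control"/"treated" labels are assumptions, not part of any real dataset). It shows the two-condition design next to the intercept-only design built from a single condition.

```r
# Hypothetical sample annotation: 3 replicates each of two conditions.
my_colData <- data.frame(
  sample = paste0("s", 1:6),
  group  = factor(rep(c("control", "treated"), each = 3))
)

# Usual differential design: intercept plus a group term.
design_diff <- model.matrix(~ 1 + group, data = my_colData)
colnames(design_diff)

# Activity test: keep only one condition, then use an intercept-only design.
keep            <- my_colData$group == "control"
colData_control <- droplevels(my_colData[keep, , drop = FALSE])
design_control  <- model.matrix(~ 1, data = colData_control)
design_control
```

The same `keep` vector would also have to be applied to the columns of the RNA and DNA count matrices, so that the design rows and the count columns refer to the same samples.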

With this design matrix just containing the intercept, you could proceed as usual as shown in the vignette.
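
A rough sketch of what that could look like on simulated, element-level counts follows. Everything about the data is an assumption made for illustration (500 elements, 4 replicates of one condition, Poisson counts, 25 "enhancing" and 25 "silencing" elements). The argument choices (`aggregate = "none"`, `model_type = "indep_groups"`) are one plausible configuration for element-level counts with independent samples, not a recommendation. The mpra part only runs if the Bioconductor packages mpra and limma are installed.

```r
set.seed(1)
n_elements <- 500                       # hypothetical library size
n_reps     <- 4                         # hypothetical replicates, one condition
eid        <- paste0("elem", seq_len(n_elements))
samples    <- paste0("rep", seq_len(n_reps))

# Simulated fold activity: 25 enhancing, 25 silencing, 450 inactive elements.
fold <- c(rep(4, 25), rep(0.25, 25), rep(1, 450))

dna <- matrix(rpois(n_elements * n_reps, lambda = 300) + 1L,
              nrow = n_elements, dimnames = list(eid, samples))
rna <- matrix(rpois(n_elements * n_reps, lambda = 300 * rep(fold, n_reps)) + 1L,
              nrow = n_elements, dimnames = list(eid, samples))

# Intercept-only design: one row per sample, a single column of ones.
design <- model.matrix(~ 1, data = data.frame(sample = samples))

if (requireNamespace("mpra", quietly = TRUE) &&
    requireNamespace("limma", quietly = TRUE)) {
  mpraset <- mpra::MPRASet(DNA = dna, RNA = rna, eid = eid,
                           eseq = NULL, barcode = NULL)
  fit <- mpra::mpralm(object = mpraset, design = design,
                      aggregate = "none", normalize = TRUE,
                      model_type = "indep_groups", plot = FALSE)
  # coef = 1 is the intercept, i.e. the mean log ratio per element.
  top <- limma::topTable(fit, coef = 1, number = Inf)
  print(head(top))
}
```

In the resulting table, `logFC` is the estimated intercept (mean log ratio of RNA over DNA) for each element, and the p-values test whether it differs from zero.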

Short aside: I've seen some papers tackle this question by performing a count-based differential analysis between RNA and DNA counts (using tools such as edgeR or DESeq2), but I don't think this is exactly what is desired, because it compares mean RNA counts to mean DNA counts rather than looking at the RNA/DNA ratios themselves, which are the quantity of interest.
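
To make the ratio-based view concrete, here is a deliberately simplified base R illustration on simulated counts (all numbers are made up). It computes the per-replicate log2 RNA/DNA ratio for each element and tests whether its mean is zero with an ordinary one-sample t-test. This is not what mpralm does internally (the sketch has no library-size normalization, precision weights, or empirical Bayes moderation); it only shows the quantity that an intercept-only model tests.

```r
set.seed(42)
n_reps   <- 4
elements <- paste0("elem", 1:5)
fold     <- c(4, 1, 1, 0.25, 1)   # simulated true RNA/DNA fold per element

dna <- matrix(rpois(5 * n_reps, lambda = 200) + 1,
              nrow = 5, dimnames = list(elements, paste0("rep", 1:n_reps)))
rna <- matrix(rpois(5 * n_reps, lambda = 200 * rep(fold, n_reps)) + 1,
              nrow = 5, dimnames = dimnames(dna))

# Activity measure: one log ratio per element and replicate.
log_ratio <- log2(rna / dna)

# Per element: is the mean log ratio different from zero?
res <- t(apply(log_ratio, 1, function(x) {
  tt <- t.test(x, mu = 0)
  c(mean_log_ratio = unname(tt$estimate), p_value = tt$p.value)
}))
res
```

A one-sample t-test on `x` is the same test as the intercept in `lm(x ~ 1)`, which is why the intercept-only design matrix corresponds to asking whether an element has non-zero activity.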

Best,
Leslie