**0**wrote:

Hello all,

I am curious if the RUVg normalization method (ERCC spike in normalization) is applicable to transcript-level counts for differential transcript expression, I know RUVseq was developed for gene-level counts but it seems like it would still be OK for transcript level count analysis?

My idea is to use Kallisto + Sleuth for DTE analysis coupled with RUVg spike in normalization since my samples are from different RNA library protocols (Poly A+, Poly A-, total RNA).

` so = sleuth_prep(s2c_tmp, transformation_function = function(x) log2(x+0.5))`

Expr = so$obs_norm %>% dplyr::select(target_id, sample, est_counts)
Emat = tidyr::spread(Expr, key = sample, est_counts)
## need to round est_counts to integer for RUVseq
Emat = round(Emat)
# Filter transcripts
filter <- apply(Emat, 1, function(x) length(x[x > 5]) >= 2) ##remove non-expressed transcripts
filtered <- Emat[filter,]
spikes <- rownames(filtered)[grep("^ERCC", rownames(filtered))]
x <- as.factor(s2c_tmp$condition)
set <- newSeqExpressionSet(as.matrix(filtered),
phenoData = data.frame(x, row.names=colnames(filtered)))
## USE RUVg to normalize samples based on control spikeins
set1 <- RUVg(set, spikes, k=1)
## get normalization factors
wts = pData(set1)[ , 2:ncol(pData(set1))]
so$sample_to_covariates$W1 = wts
## Use W1 (RUVg factors) as batch
so <- sleuth_fit(so, ~ W1 + condition, 'full')
so <- sleuth_fit(so, ~ W1, 'reduced')
## Wald test for Log 2 Fold-change
so = sleuth_wt(so, "conditionX")

Does this approach/ code seem reasonable?

**740**• written 8 months ago by Brian Gudenas •

**0**