Question: What does scaling RNASeq data using the area under coverage information (AUC) mean? (related to the recount package)
13 months ago by elmahy2005 wrote:

I am working with the recount package, which downloads preprocessed RNASeq datasets. In the vignette, I am struggling to understand this:

"Downloaded count data are first scaled to take into account differing coverage between samples."

## Scale counts by taking into account the total coverage per sample
rse1 <- scale_counts(rse_gene1)



The scale_counts function is as follows:

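scale_counts <- function(rse, by = 'auc', targetSize = 4e7, L = 100,
    factor_only = FALSE, round = TRUE) {

    ## (other parts of the function are not quoted here)

    ## Assumed, not in the quoted excerpt: counts is the first assay of rse
    counts <- SummarizedExperiment::assay(rse, 1)

    ## Scale counts
    if(by == 'auc') {
        # L cancels out:
        # have to multiply by L to get the desired library size,
        # but then divide by L to take into account the read length since the
        # raw counts are the sum of base-level coverage.
        scaleFactor <- targetSize / SummarizedExperiment::colData(rse)$auc
    }

    ## One scale factor per sample (column), repeated down every row (gene)
    scaleMat <- matrix(rep(scaleFactor, each = nrow(counts)),
        ncol = ncol(counts))
    scaledCounts <- counts * scaleMat
    if(round) scaledCounts <- round(scaledCounts, 0)
    SummarizedExperiment::assay(rse, 1) <- scaledCounts

    ## Assumed, not in the quoted excerpt: the updated object is returned
    rse
}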

At first I thought that auc was the library depth (the sum of all read counts in each sample), but I get a different number. What is scaling by auc? Is it an alternative to normalization?


Answer: What does scaling RNASeq data using the area under coverage information (AUC) me
12 months ago by Leonardo Collado Torres (United States) wrote:


I don't know why I didn't get an email about this question. In any case, please check the recount workflow, published at F1000 Research. That workflow describes in more detail what the numbers we provide in the RangedSummarizedExperiment objects actually are. The scale_counts() function can be used to go from the numbers we provide to actual read counts.
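
To make the arithmetic in the quoted by = 'auc' branch concrete, here is a minimal base R sketch. It is not the recount implementation: the read length, the two samples and all of the numbers are made up, and it assumes every read lies entirely within one gene, so that each read adds exactly L units of coverage to that gene. Under those assumptions auc comes out as the library depth multiplied by the read length, which may be why it does not match the sum of read counts.

```r
# Toy illustration only: every value here is hypothetical.
L <- 100            # read length in bases
targetSize <- 4e7   # same default as in the quoted function

# Reads per gene for two samples; B has the same proportions at 4x the depth.
reads <- matrix(c(10, 30, 60,
                  40, 120, 240),
                nrow = 3,
                dimnames = list(paste0("gene", 1:3), c("A", "B")))

# Counts stored as summed base-level coverage: one read covers L bases.
coverage_counts <- reads * L

# Area under coverage per sample. In this toy case all coverage is inside
# genes, so it is simply the column sum of the coverage-based counts.
auc <- colSums(coverage_counts)

# Library depth in reads; differs from auc by a factor of L here.
library_depth <- colSums(reads)

# Same steps as the quoted code: one factor per sample, applied to each column.
scaleFactor <- targetSize / auc
scaleMat <- matrix(rep(scaleFactor, each = nrow(coverage_counts)),
                   ncol = ncol(coverage_counts))
scaled <- round(coverage_counts * scaleMat, 0)

print(auc / library_depth)  # equals L for both samples
print(colSums(scaled))      # both samples now sum to targetSize
```

In this sketch the two samples end up with identical scaled counts, because they differed only in depth. With real data, coverage outside annotated genes would also contribute to auc, so the scaled gene counts of a sample would not be expected to sum exactly to targetSize.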


