estimateSizeFactors before or after removing non-coding genes and filtering for low counts
Tash. • 0
@tash-17343

Hi there,

My initial strategy was to retain the protein-coding genes and filter for low counts before creating the dds object.

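        library(dplyr)
        library(DESeq2)
        GD <- read.delim("mart_export (1).txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
        # Keep only genes listed in the biomart export (column 2 of GD is "symbol")
        counts <- counts %>% inner_join(unique(GD[2]), by = c("Gene" = "symbol"))
        count_mat <- as.matrix(counts[, colnames(counts) != "Gene"])
        rownames(count_mat) <- counts$Gene
        counts_protein <- count_mat[rowSums(count_mat) > 10, ]
        dds <- DESeqDataSetFromMatrix(counts_protein, colData, design = ~ sample_type)
        dds <- estimateSizeFactors(dds)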

I'm not sure whether I should estimate the size factors before or after removing non-coding genes and filtering for low counts. Many genes have a rowSums of 0 (most of them non-coding). Should these be included in the estimation, i.e. can they affect it?

Many thanks.

DESeq2 RNASeq • 833 views
@mikelove

You can filter before estimating size factors, but note that rows with a sum of 0 are not included in size factor estimation anyway.
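
The point about zero rows can be checked with a small sketch. The function below is a base-R re-implementation of the median-of-ratios idea, written here for illustration only. It is not DESeq2's own code, and `median_of_ratios` and the toy data are hypothetical names and values. It assumes the default ratio-based estimator, where a gene with a zero count in any sample has no finite geometric mean and so drops out.

```r
# Illustrative base-R sketch of median-of-ratios size factors (not DESeq2's code).
median_of_ratios <- function(counts) {
  # Per-gene mean of log counts; -Inf for any gene with a zero in any sample
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(sample_counts) {
    # Only genes with a finite log geometric mean contribute to the median
    usable <- is.finite(log_geo_means) & sample_counts > 0
    exp(median((log(sample_counts) - log_geo_means)[usable]))
  })
}

set.seed(1)
# Hypothetical toy data: 200 expressed genes plus 300 all-zero genes, 4 samples
expressed  <- matrix(rnbinom(200 * 4, mu = 100, size = 5), ncol = 4)
all_zero   <- matrix(0, nrow = 300, ncol = 4)
toy_counts <- rbind(expressed, all_zero)
colnames(toy_counts) <- paste0("sample", 1:4)

sf_all      <- median_of_ratios(toy_counts)
sf_filtered <- median_of_ratios(toy_counts[rowSums(toy_counts) > 0, ])

# All-zero rows never enter the calculation, so the two results agree
print(all.equal(sf_all, sf_filtered))
```

Under this sketch, removing all-zero rows leaves the size factors unchanged. A stricter filter such as rowSums > 10 can shift them slightly, because a gene with a small but nonzero count in every sample does contribute to the median.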
