controlGenes / housekeeping genes in DESeq2

I have a question about what DESeq2 treats as controlGenes (housekeeping genes). According to the manual, the size factors behind the normalized counts can be estimated with estimateSizeFactorsForMatrix(counts, locfunc = stats::median, geoMeans, controlGenes). While geoMeans is explained, are controlGenes the first 200 genes? Could anyone please explain the criteria for selecting the first 200 genes as controls? Is there a way in DESeq2 to determine which genes are housekeeping genes and plot their expression?

@mikelove

controlGenes is specified by the user and indicates which genes to use for calculating the size factors. By default, all genes are used.
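To make the effect of restricting the calculation to a subset of genes concrete, here is a minimal base R sketch of the median-of-ratios idea. It is an illustration only, not DESeq2's own code: the function name median_ratio_sf, its control argument, and the toy counts are all made up for this example.

```r
# Toy count matrix: 7 genes x 3 samples (made-up numbers).
# Genes 1-3 scale 1:2:4 across samples; genes 4-7 are flat.
counts <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
      5,  10,  20,
     30,  30,  30,
      7,   7,   7,
    250, 250, 250,
     60,  60,  60),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:7), paste0("sample", 1:3))
)

# Hypothetical helper: median-of-ratios size factors, optionally restricted
# to the rows selected by `control` (a numeric or logical index vector).
median_ratio_sf <- function(counts, control = rep(TRUE, nrow(counts))) {
  # Log geometric mean of each gene across samples, then subset to controls
  log_geo_means <- rowMeans(log(counts))[control]
  sub <- counts[control, , drop = FALSE]
  apply(sub, 2, function(cnts) {
    # Skip genes whose geometric mean is zero or whose count is zero
    keep <- is.finite(log_geo_means) & cnts > 0
    exp(median((log(cnts) - log_geo_means)[keep]))
  })
}

sf_all  <- median_ratio_sf(counts)                 # every gene contributes
sf_ctrl <- median_ratio_sf(counts, control = 1:3)  # only genes 1-3 contribute

sf_all
sf_ctrl
```

In this toy matrix the flat genes outnumber the scaled ones, so the two calls give different size factors. That is the situation where the choice of control genes matters.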


Thanks for the reply. Just a follow-up question: will supplying controlGenes automatically change the output of dds <- DESeq(dds), or do they have to be defined in that call as well?


If you run estimateSizeFactors before DESeq, DESeq will use those pre-estimated size factors, and it will print a message saying so.
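As a rough sketch of that workflow, the snippet below uses a simulated dataset from makeExampleDESeqDataSet in place of a real dds. Using rows 1:200 as controls is an arbitrary assumption for the example, not a recommendation.

```r
library(DESeq2)

set.seed(1)
# Simulated dataset shipped with DESeq2; stands in for your own `dds`.
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Assumption for illustration: treat the first 200 rows as control genes.
dds <- estimateSizeFactors(dds, controlGenes = 1:200)
sf_before <- sizeFactors(dds)

# DESeq() finds the size factors already stored in `dds` and reuses them
# instead of re-estimating them from all genes.
dds <- DESeq(dds)
sf_after <- sizeFactors(dds)

# Compare the size factors before and after the DESeq() call
all.equal(sf_before, sf_after)
```

So controlGenes only needs to be given once, in the estimateSizeFactors call. The later DESeq call takes no controlGenes argument of its own.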


So I assigned my genes with controlGenes <- c("BJA_RS02215", "BJA_RS03430", "BJA_RS04155", "BJA_RS05410", "BJA_RS07010"), but this creates a character vector, which, when passed to

dds <- estimateSizeFactors(dds, type = c("ratio","poscounts", "iterate"), locfunc = stats::median, geoMeans,controlGenes)


throws an entirely expected error:

Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :
controlGenes should be either a numeric or logical vector


I have tried to convert it to either type using as.is, without success. Is there another way to convert gene names into a numeric or logical vector?


Whenever trying out something new, you should check the help for the function.

?estimateSizeFactors


will tell you that controlGenes is:

optional, numeric or logical index vector specifying those genes to use for size factor estimation (e.g. housekeeping or spike-in genes)


A logical vector in R is a vector of TRUE or FALSE that can be used to index another vector or matrix-like object. Here you can just do:

isControl <- rownames(dds) %in% <your.gene.names.go.here>
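A slightly fuller sketch of the same idea, run on simulated data from makeExampleDESeqDataSet rather than your own object. The IDs in control_ids are stand-ins (the first five simulated genes with no zero counts), chosen only so the example runs; replace them with your own housekeeping gene names.

```r
library(DESeq2)

set.seed(1)
# Simulated data; row names are gene1, gene2, ...
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Stand-in control IDs (assumption): first five genes without any zero count.
# In practice this would be your own character vector of gene names.
control_ids <- head(rownames(dds)[rowSums(counts(dds) == 0) == 0], 5)

# Guard against typos: every requested ID must be present in the dataset.
missing_ids <- setdiff(control_ids, rownames(dds))
stopifnot(length(missing_ids) == 0)

# Logical index: TRUE for rows whose name is in control_ids.
isControl <- rownames(dds) %in% control_ids

dds <- estimateSizeFactors(dds, controlGenes = isControl)
sizeFactors(dds)

# Normalized counts of the control genes, e.g. as input for plotting.
control_norm <- counts(dds, normalized = TRUE)[isControl, , drop = FALSE]
control_norm
```

If you prefer a numeric index, which(isControl) gives the row positions and can be passed to controlGenes in the same way.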


Thank you so much! That was very very helpful!