I have a total of 8 samples: 4 controls and 4 samples over-expressing the FOXCUT gene. I have a data frame called data, with genes as rows, samples as columns, and counts as values.
The column data for all 8 samples, including replicate and cell-line information, look like this:
    Samples          TYPE                    Replicate  Cell-lines
    Cell1_HA1        Control                 1          1
    Cell1_HA2        Control                 2          1
    Cell1_foxcut11   FOXCUT_OverExpression   1          1
    Cell1_foxcut12   FOXCUT_OverExpression   2          1
    Cell2_HA1        Control                 3          2
    Cell2_HA2        Control                 4          2
    Cell2_foxcut11   FOXCUT_OverExpression   3          2
    Cell2_foxcut12   FOXCUT_OverExpression   4          2
I have count data for all 8 samples after STAR alignment, and I'm using the edgeR package for differential analysis. This is the first time I'm doing differential analysis on cell-line data with replicate information. I don't know how to create the design matrix and contrast.matrix for differential analysis within the same cell line.
I want to run differential analysis for the following comparisons:

    Cell1_foxcut samples vs Cell1_HA samples
    Cell2_foxcut samples vs Cell2_HA samples
I tried the following, but I'm not sure whether it is right:
    # Check that every count column has a row in coldata, then align the row order
    colnames(data) %in% coldata$Samples
    coldata <- coldata[match(colnames(data), coldata$Samples), ]
    table(coldata$TYPE)

    library(edgeR)
    group <- factor(paste0(coldata$TYPE))
    y <- DGEList(data, group = group)
    y$samples

    ## Filtering
    keep <- rowSums(cpm(y) > 0.5) >= 1
    y <- y[keep, , keep.lib.sizes = FALSE]

    ## Normalization
    y <- calcNormFactors(y, method = "TMM")

    ## Create design matrix
    design2 <- model.matrix(~ 0 + group + coldata$Replicate + coldata$`Cell-lines`)
And how should I specify coef in contrast.matrix for the differential analysis between the different samples?
If the above design matrix is not right, could you please help me set this up correctly? I have looked at tutorials and many other questions but couldn't reach a conclusion, because this type of analysis confuses me.
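One alternative I have seen suggested for this kind of layout is to combine cell line and condition into a single grouping factor, so that each within-cell-line comparison becomes a simple contrast between two group means. Below is a minimal sketch of that idea in base R. The sample sheet is rebuilt by hand from the table above, and the column name `CellLine`, the group labels, and the helper `make_contrast` are my own assumptions for illustration, not something taken from the edgeR documentation. I am not sure this is the right approach for my data.

```r
# Hypothetical sample sheet rebuilt from the table above.
# `CellLine` is an assumed column name (the real header is "Cell-lines").
coldata <- data.frame(
  Samples  = c("Cell1_HA1", "Cell1_HA2", "Cell1_foxcut11", "Cell1_foxcut12",
               "Cell2_HA1", "Cell2_HA2", "Cell2_foxcut11", "Cell2_foxcut12"),
  TYPE     = rep(rep(c("Control", "FOXCUT_OverExpression"), each = 2), times = 2),
  CellLine = rep(c("Cell1", "Cell2"), each = 4)
)

# One factor level per cell-line/condition combination, e.g. "Cell1_HA".
short_type <- ifelse(coldata$TYPE == "Control", "HA", "FOXCUT")
group <- factor(paste(coldata$CellLine, short_type, sep = "_"))

# Means model without an intercept: one coefficient per group.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
rownames(design) <- coldata$Samples

# Hypothetical helper: +1 for the numerator group, -1 for the denominator group.
make_contrast <- function(num, den) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[num] <- 1
  v[den] <- -1
  v
}

contrast_cell1 <- make_contrast("Cell1_FOXCUT", "Cell1_HA")
contrast_cell2 <- make_contrast("Cell2_FOXCUT", "Cell2_HA")

print(design)
print(contrast_cell1)
print(contrast_cell2)
```

If this is a sensible setup, my understanding is that each vector would be passed as the `contrast` argument of an edgeR test such as `glmQLFTest(fit, contrast = contrast_cell1)`, after fitting with `estimateDisp(y, design)` and `glmQLFit(y, design)`.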
Thanks a lot.