Hi everyone I am have RNA seq data that I am trying to analyze using EdgeR. Here is the codes I used to analyze the data I just want to make sure that the codes are right and that I can rely on the output for the analysis. The summary of the experiments: I have 4 replicates of infected cells with Cryptosporidium and 4 replicates of control (uninfected)
rawdata <- read.csv("celldata.csv", check.names=FALSE, sep=",", header=T, stringsAsFactors=FALSE) library(edgeR) library(limma) rownames( counts ) <- raw.data[ , 1 ] > y <- DGEList(counts=rawdata[,2:9], genes=rawdata[,1])
filtering
> o <- order(rowSums(y$counts), decreasing=TRUE) > y <- y[o,] > d <- duplicated(y$genes) > y <- y[!d,] > nrow(y) > y$samples$lib.size <- colSums(y$counts) > rownames(y$counts) <- rownames(y$genes) <- y$genes$EntrezGene > y$genes$EntrezGene <- NULL > y <- calcNormFactors(y) > y$samples > plotMDS(y)
Design matrix
cells <- factor(c(1,2,3,4,5,6,7,8)) > status<- factor(c("i","i","i","i","c","c","c","c")) > data.frame(Sample=colnames(y),cells,status) > design <- model.matrix(~cells) > rownames(design) <- colnames(y) > Estimating the dispersion >y d1$common.dispersion > y <- estimateTrendedDisp(y) > y <- estimateTrendedDisp(y) >y <- estimateCommonDisp(y) > y<- estimateCommonDisp(y) > y <- estimateTrendedDisp(y) > y<- estimateTagwiseDisp(y) > plotBCV(y) > fit <- glmFit(y, design) > lrt <- glmLRT(fit) > topTags(lrt)
Really appreciate any feedback
Many thanks for your quick response. I realized there is a problem with filtering since I get the same number before and after filtering. This means filtering did not make any changes or remove any reads. How do you recommend to change this codes? Regarding cells factor I understand the reason why it is wrong but I don't know how to fix it. 1,2,3 nd four belong to the infected group and 5,6,7,8 belong to the control group. Thanks,
Something like the below would be sufficient:
Alternatively, read some of the case studies in the user's guide for an "at least X" approach to filtering.
As for the factor in the design matrix, just replace
cells
withstatus
as I mentioned.