In addition to finding DEGs, I was hoping to use RNA-seq count data for a correlation analysis (Pearson correlation) between gene expression level and a specific phenotype across samples. To do that, I have to extract the counts (as an indicator of gene expression level) for my genes of interest. I used edgeR and, after creating the raw count matrix, followed these steps:
#Filtering
keep <- filterByExpr(y)
table(keep)
y <- y[keep, , keep.lib.sizes=FALSE]
dim(y)
#Apply TMM (trimmed mean of M-values) normalization to normalise gene expression distributions and eliminate the composition biases between libraries
y <- calcNormFactors(y, method = "TMM")
y$counts
Is the count table from y$counts the right one to use for the correlation analysis?
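For reference, here is a minimal sketch of the correlation step I have in mind, run on toy data. The simulated counts, `phenotype`, and `genes_of_interest` are made up for illustration. As far as I understand, calcNormFactors only stores the TMM factors in y$samples$norm.factors and leaves y$counts as the raw counts, so the sketch takes log-CPM values from cpm() instead. Treat that choice as an assumption rather than a confirmed answer.

```r
library(edgeR)

# Toy data standing in for the real experiment (hypothetical values)
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
# Hypothetical phenotype: one numeric value per sample, in column order
phenotype <- c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3)

# Same steps as in the question
y <- DGEList(counts = counts)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y, method = "TMM")

# cpm() on a DGEList uses the library sizes and the stored TMM factors;
# log = TRUE returns log2-CPM with a small prior count added
logcpm <- cpm(y, log = TRUE, prior.count = 2)

# Hypothetical genes of interest: here simply the first three retained genes
genes_of_interest <- rownames(logcpm)[1:3]

# Pearson correlation of each gene's log-CPM with the phenotype across samples
cors <- apply(logcpm[genes_of_interest, , drop = FALSE], 1,
              function(x) cor(x, phenotype, method = "pearson"))
cors
```

With real data, `phenotype` would need to be in the same sample order as the columns of `y`.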