I am new to this kind of analysis. I have a .csv file containing RNA-Seq data from different cell lines (at least 3 replicates each), already normalised to TPM. Unfortunately, I cannot access the raw count files.
    Symbol                 ID     C1     C2    C3    D1    D2    D3    D4
1   TSPAN6 ENSG00000000003.13 133.95 132.07 64.47 54.85 53.65 47.87 56.37
2     TNMD  ENSG00000000005.5  10.39   3.47  1.11  0.58  1.74  0.36  1.68
3     DPM1 ENSG00000000419.11  67.67 124.98 33.02  8.35 12.95 12.31 13.33
4    SCYL3 ENSG00000000457.12   2.59   1.40  2.61  5.03  4.70  2.98  3.71
5 C1orf112 ENSG00000000460.15  12.32  46.18 16.49 19.54 19.20 11.72  8.55
6      FGR ENSG00000000938.11   0.00   0.00  0.04  0.36  0.08  0.00  0.00
So my question is: is there a way to obtain p-values, t-values and adjusted p-values (padj) starting from this .csv file, so that I can perform a differential expression analysis? I have read about DESeq, DESeq2, edgeR and limma, and it looks as if all of these R packages require raw counts. I tried to follow "Differential expression of RNA-seq data using limma and voom()", but it is not working.
Any suggestions about how to start? Any help is much appreciated.
You will need to be clearer about what "not working" means. The recommendations in that link are the way to go.
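For reference, here is a minimal sketch of what a limma-trend run on log-transformed TPM values might look like. It is only an illustration under stated assumptions: the toy matrix is typed in from the six genes shown in the question, the pseudocount of 1 in log2(TPM + 1) is an assumed choice, group C is treated as the reference level, and `run_limma_trend` is a hypothetical helper name, not a limma function. The helper needs the limma package installed when it is called.

```r
# Toy TPM matrix built from the six genes shown in the question.
tpm <- matrix(
  c(133.95, 132.07, 64.47, 54.85, 53.65, 47.87, 56.37,
     10.39,   3.47,  1.11,  0.58,  1.74,  0.36,  1.68,
     67.67, 124.98, 33.02,  8.35, 12.95, 12.31, 13.33,
      2.59,   1.40,  2.61,  5.03,  4.70,  2.98,  3.71,
     12.32,  46.18, 16.49, 19.54, 19.20, 11.72,  8.55,
      0.00,   0.00,  0.04,  0.36,  0.08,  0.00,  0.00),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("TSPAN6", "TNMD", "DPM1", "SCYL3", "C1orf112", "FGR"),
    c("C1", "C2", "C3", "D1", "D2", "D3", "D4")
  )
)

# Log-transform; the pseudocount of 1 is an assumption that avoids log2(0).
logtpm <- log2(tpm + 1)

# One factor entry per sample, in the same order as the matrix columns.
group  <- factor(c("C", "C", "C", "D", "D", "D", "D"), levels = c("C", "D"))
design <- model.matrix(~ group)  # columns: "(Intercept)" and "groupD"

# lmFit() expects one design row per data column (i.e. per sample).
stopifnot(nrow(design) == ncol(logtpm))

# Hypothetical helper wrapping the limma-trend steps; requires limma.
run_limma_trend <- function(logexpr, design) {
  fit <- limma::lmFit(logexpr, design)          # gene-wise linear models
  fit <- limma::eBayes(fit, trend = TRUE)       # moderated t, mean-variance trend
  limma::topTable(fit, coef = "groupD", number = Inf)  # D versus C
}
```

Calling `run_limma_trend(logtpm, design)` on the full table (six genes are too few for a meaningful trend) should return one row per gene with `logFC`, `t`, `P.Value` and `adj.P.Val` columns. Note that the data matrix keeps one column per sample: if the replicates are collapsed into group means or a single log2FC column beforehand, the number of columns no longer equals the number of rows in the design, and there is no replicate variation left to test against.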
In "Differential expression of RNA-seq data using limma and voom()" I read that Gordon Smyth does not recommend using normalised values in DESeq, DESeq2 or edgeR. So I calculated the average of each group (C and D) and then the log2FC. With those log2FC values, I tried to follow the limma-trend pipeline described in the limma documentation, but I always get this error: "row dimension of design doesn't match column dimension of data object". The syntax I am using is the following: