Hi, I'm working on a differential expression (DE) analysis of my RNA-seq data from the green alga Chlamydomonas, and I'm able to generate a normal-looking DE result with DESeq2, like this:
 | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
 | <numeric> | <numeric> | <numeric> | <numeric> | <numeric> | <numeric> |
Cre01.g000450.v5.5 | 256.1055 | -0.2995 | 0.2954 | -1.0140 | 0.3106 | 0.7465 |
Cre01.g000500.v5.5 | 44.3266 | -0.7029 | 0.3880 | -1.8114 | 0.0701 | 0.3764 |
Cre01.g000600.v5.5 | 2.3502 | 1.5752 | 1.8795 | 0.8381 | 0.4020 | 0.8108 |
Cre01.g000650.v5.5 | 5.7842 | 1.3050 | 0.8817 | 1.4802 | 0.1388 | 0.5241 |
Cre01.g000850.v5.5 | 4.7789 | -0.0103 | 0.7810 | -0.0132 | 0.9895 | 0.9999 |
... | ... | ... | ... | ... | ... | ... |
Cre36.g759647.v5.5 | 10.3085 | 0.2125 | 1.1183 | 0.1900 | 0.8493 | 0.9771 |
Cre39.g760097.v5.5 | 2.7385 | 0.8043 | 1.6105 | 0.4994 | 0.6175 | 0.9069 |
Cre43.g760547.v5.5 | 2.9478 | -2.4908 | 1.6740 | -1.4879 | 0.1368 | 0.5233 |
Cre44.g760747.v5.5 | 633.6948 | -0.0325 | 0.2354 | -0.1380 | 0.8902 | 0.9846 |
Cre48.g761197.v5.5 | 5.6491 | -0.3471 | 1.0296 | -0.3371 | 0.7360 | 0.9423 |
I've also downloaded a text file of gene symbol and transcript ID from JGI (https://phytozome.jgi.doe.gov/pz/portal.html):
Cre01.g000050.t1.1 | RWP14 |
Cre01.g000150.t1.2 | ZRT2 |
Cre01.g000650.t1.1 | AMX2 |
Cre01.g000850.t1.2 | CPLD38 |
Cre01.g000900.t1.2 | CPLD20 |
Cre01.g001400.t1.1 | ZMP1 |
Cre01.g001750.t1.2 | TIG1 |
Cre01.g002200.t1.1 | RPB6 |
Cre01.g002500.t1.2 | COP2 |
Cre01.g003050.t1.2 | SEC8 |
Cre01.g004250.t1.2 | TCTEX1 |
Cre01.g004300.t1.2 | ASN1 |
Cre01.g004450.t1.2 | CPLD42 |
Cre01.g004500.t1.2 | LEU1L |
Cre01.g004550.t1.2 | FAP190 |
Cre01.g004600.t1.1 | RWP12 |
Cre01.g005150.t1.1 | SGA1 |
Cre01.g005450.t1.2 | RSP10 |
Cre01.g005550.t1.2 | ARL2 |
… | … |
I'm wondering whether there is a direct way to add a column of gene symbols to my DE result by mapping the transcript IDs to the text file above. I've done some research, but I'm not sure whether the org.Hs.eg.db package can help me do it. Thanks!
Not an expert, but I had the same question recently and I did it through
where IDcol is the column containing the IDs in your data frame and transID is the ID column in your list.
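For reference, here is a minimal sketch of this kind of join in base R. It is not necessarily the package or call used above. It assumes the DE result has been converted to a plain data frame (for example with as.data.frame()) and that the JGI file has been read into a data frame with columns named transID and symbol; those column names, and the helper locus_of, are made up for illustration. Note that the two tables use different ID suffixes (.v5.5 versus .t1.1), so the sketch first trims both down to the shared locus name before matching.

```r
# Toy subset of the DE result from the question (row names are gene IDs).
res_df <- data.frame(
  log2FoldChange = c(-0.2995, 1.3050, -0.0103),
  padj = c(0.7465, 0.5241, 0.9999),
  row.names = c("Cre01.g000450.v5.5", "Cre01.g000650.v5.5", "Cre01.g000850.v5.5")
)

# Toy subset of the JGI symbol table; the column names are assumptions.
symbols <- data.frame(
  transID = c("Cre01.g000650.t1.1", "Cre01.g000850.t1.2", "Cre01.g000900.t1.2"),
  symbol = c("AMX2", "CPLD38", "CPLD20"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip a trailing ".v5.5" or ".t1.1" style suffix so
# both ID styles reduce to the shared locus part, e.g. "Cre01.g000650".
locus_of <- function(id) sub("\\.(v|t)[0-9]+\\.[0-9]+$", "", id)

res_df$locus <- locus_of(rownames(res_df))
symbols$locus <- locus_of(symbols$transID)

# match() keeps the row order of res_df; genes without a symbol get NA.
# If a locus has several transcripts, only the first match is used.
res_df$symbol <- symbols$symbol[match(res_df$locus, symbols$locus)]

print(res_df)
```

With this toy input, Cre01.g000650 picks up AMX2 and Cre01.g000850 picks up CPLD38, while Cre01.g000450 has no entry in the symbol table and gets NA.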
Hope that helps!
Hi Rina, thank you for the response, I've successfully joined the two data frames with this package, thanks a lot!