I have an MA-plot that shows the differential expression between 2 conditions. Is there an option or parameter that makes the plot interactive, so that when I put my cursor on a point I get the ID of the gene? The aim is to identify genes/transcripts graphically.
library(SummarizedExperiment)
example(SummarizedExperiment) # creates 'se'; replace with your DESeqDataSet

# Making up MA values, replace with your own calculations.
rowData(se)$M <- rnorm(nrow(se))
rowData(se)$A <- runif(nrow(se))

library(iSEE)
iSEE(se,
    # Show one row data plot (the MA-plot) and one row statistics table.
    initialPanels=DataFrame(Name=c("Row data plot 1", "Row statistics table 1")),
    # A on the x-axis, M on the y-axis; colour the gene selected in the table.
    rowDataArgs=DataFrame(XAxis="Row data", XAxisRowData="A", YAxis="M",
        ColorBy="Feature name", ColorByRowTable="Row statistics table 1"),
    # Subset the table to the points selected on the plot.
    rowStatArgs=DataFrame(SelectByPlot="Row data plot 1")
)
Each point and row represents a gene. You can click-and-drag to select a region on the plot, which will subset the table. Conversely, you can select a row of the table, and the corresponding point on the plot will be highlighted.
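As a rough sketch of what "replace with your own calculations" could look like: if you have a DESeq2-style results table, one common choice is to take M as log2FoldChange and A as the log of baseMean. The small table below is an illustrative stand-in for such a results table, and the +1 pseudocount and log2 scale are my assumptions, not something DESeq2 or iSEE prescribes.

```r
# Illustrative stand-in for as.data.frame(results(dds)); values are examples only.
res_df <- data.frame(
  baseMean = c(58.43, 221.93, 24.27),
  log2FoldChange = c(0.388, -0.216, 2.083),
  row.names = c("AT1G01010", "AT1G01020", "AT1G01030")
)

# M is the log fold change; A is the mean expression on a log scale.
# The +1 pseudocount (an assumption) avoids log2(0) for genes with zero mean.
ma <- data.frame(
  A = log2(res_df$baseMean + 1),
  M = res_df$log2FoldChange,
  row.names = rownames(res_df)
)

# With a real object whose row names match, the values could then be stored as:
# rowData(se)$A <- ma[rownames(se), "A"]
# rowData(se)$M <- ma[rownames(se), "M"]
```

Indexing by rownames(se) keeps the values aligned with the genes even if the results table is in a different order.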
Yes, I'd recommend using one of the pre-built tools like iSEE as Aaron has shown above, or Glimma.
You can go from DESeq2 results object to Glimma with this code:
https://bioconductor.github.io/BiocWorkshops/rna-seq-data-analysis-with-deseq2.html#glimma
Thanks for your answer.
I read the code you pointed me to, but I have a problem running Glimma.
I ran Salmon on two conditions (A & B), and I got this data.frame:
baseMean log2FoldChange lfcSE
<numeric> <numeric> <numeric>
AT1G01010 58.4313040850511 0.387817382909098 0.415319944650195
AT1G01020 221.929555567517 -0.215831601229426 0.262775633128989
AT1G01030 24.2693564808164 2.08259116874253 0.654976179301218
AT1G01040 804.065439938776 -0.510300448552509 0.167267756234671
AT1G01050 1095.19908132151 -0.888760172606495 0.159163410224384
... ... ... ...
ATMG01350 11.6146233340637 -6.4060020185781 1.54287821788877
ATMG01360 312.329206615952 -6.08716476356416 0.427978738216445
ATMG01370 71.8795298175991 -3.19588443444243 0.510936254993291
ATMG01400 2.89085455189266 -4.42665737631082 2.33210219164214
ATMG01410 7.16022108927287 -3.70237557818368 1.40672285534101
stat pvalue padj
<numeric> <numeric> <numeric>
AT1G01010 0.933779819401013 0.350417481416073 0.424919820701641
AT1G01020 -0.821353177459496 0.411445125677856 0.488129457129386
AT1G01030 3.17964413753858 0.00147456014660824 0.0030268044375256
AT1G01040 -3.05079986746863 0.00228232648131296 0.00454822401655851
AT1G01050 -5.58394778896447 2.35119317247797e-08 8.66510848641381e-08
... ... ... ...
ATMG01350 -4.15198163037384 3.29608724558569e-05 8.55851852981895e-05
ATMG01360 -14.223054137997 6.59123424128476e-46 1.83681547598187e-44
ATMG01370 -6.25495725388364 3.97624804959395e-10 1.72744701555765e-09
ATMG01400 -1.89814039546603 0.0576775900576052 0.0873928053299736
ATMG01410 -2.63191542252022 0.00849049964125083 0.0153469288410473
My problem is that I have only 6 columns. In the Glimma code, there is a command line:
"symbol" corresponds to a 7th column of the data.frame in the example, where each symbol is the abbreviation of a gene name.
Can I run Glimma only with my data frame?
I think you can skip symbol, that's just an extra. But if you encounter problems, check the Glimma vignette as well.
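For what it's worth, here is a minimal sketch of preparing the inputs without a symbol column. The table is a made-up stand-in for the results (the last row is hypothetical and only there to show NA handling), the 0.1 cutoff is an assumption, and the commented-out glMDPlot() call is written from memory of the workshop code with a hypothetical dds$condition column, so please check the argument names against the Glimma vignette.

```r
# Made-up stand-in for as.data.frame(res); only the columns used here.
res_df <- data.frame(
  baseMean = c(58.43, 221.93, 24.27, 2.89),
  log2FoldChange = c(0.388, -0.216, 2.083, -4.427),
  padj = c(0.425, 0.488, 0.003, NA),
  row.names = c("AT1G01010", "AT1G01020", "AT1G01030", "hypothetical_gene")
)

# Annotation with gene IDs only; no 'symbol' column is needed.
anno <- data.frame(GeneID = rownames(res_df))

# 1 = significant at padj < 0.1 (assumed cutoff), 0 = not significant.
status <- as.numeric(res_df$padj < 0.1)
# DESeq2 can report NA for padj; treat those genes as not significant.
status[is.na(status)] <- 0

# Sketch of the plotting call, assuming 'res' and 'dds' exist in your session:
# library(Glimma)
# glMDPlot(res, status = status, counts = counts(dds, normalized = TRUE),
#          groups = dds$condition, transform = FALSE,
#          samples = colnames(dds), anno = anno, launch = FALSE)
```

The point is just that anno can carry the row names alone, so the 6 columns you already have should be enough to get started.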