Hi there,
I'm trying to draw a volcano plot using some scripts that I found in a very handy manual:
http://www.nathalievilla.org/doc/pdf/tutorial-rnaseq.pdf
The problem is that I get an error that I can't get past:
> str(resEdgeR)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1405 obs. of 6 variables:
$ : chr "NMA0001" "NMA0002" "NMA0003" "NMA0004" ...
$ logFC : num -0.27 -6.43 0.56 -4.5 -6.54 ...
$ logCPM: num 7.71 8.58 9.37 3.69 5.74 ...
$ LR : num 0.0521 17.0523 0.2248 8.2597 12.7363 ...
$ PValue: num 0.8194151 0.0000364 0.6354217 0.0040534 0.0003586 ...
$ FDR : num 0.916079 0.000896 0.825109 0.023191 0.004031 ...
> tab = data.frame(logFC = resEdgeR$table[, 1], negLogPval = -log10(resEdgeR$table[, 4]))
Error in log10(resEdgeR$table[, 4]) :
non-numeric argument to mathematical function
The error tells me that my argument is non-numeric, but str() reports the column as numeric. Any suggestions on how to solve this problem?
Many thanks in advance!
It looks like the 1st column of your data corresponds to $ : chr "NMA0001" "NMA0002" "NMA0003" "NMA0004", so your numeric columns are off by one. Is it any better if you try tab = data.frame(logFC = resEdgeR$table[, 2], negLogPval = -log10(resEdgeR$table[, 5]))?
That's not the problem, since when I type head(tab) I can see the columns I selected:
> tab = data.frame(logFC = resEdgeR$table[, 1], negLogPval = -log10(resEdgeR$table[, 4]))
Error in log10(resEdgeR$table[, 4]) :
non-numeric argument to mathematical function
> head(tab)
logFC Pval
1 -0.2698253 0.819415130920
2 -6.4312079 0.000036364860
3 0.5601794 0.635421715093
4 -4.5000469 0.004053371323
5 -6.5425931 0.000358625306
6 -7.8980613 0.000006034407
But I still can't calculate the log10 of one of the variables.
> class(tab[,"Pval"])
[1] "numeric"
> log10("Pval")
Error in log10("Pval") :
non-numeric argument to mathematical function
Did you really mean log10("Pval"), rather than log10(tab[,"Pval"]) ?
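To illustrate the difference, here is a minimal sketch on a toy data frame (the values are made up and only stand in for the real `tab`). The bare string "Pval" is a character value, whereas tab[, "Pval"] is the numeric column of that name:

```r
# Toy stand-in for `tab`; values are illustrative only
tab <- data.frame(logFC = c(-0.27, -6.43, 0.56),
                  Pval  = c(0.8194, 0.0000364, 0.6354))

is.numeric("Pval")          # the bare string is character, not numeric
is.numeric(tab[, "Pval"])   # the column itself is numeric

# log10() on the string fails; capture the error message instead of stopping
res <- tryCatch(log10("Pval"), error = function(e) conditionMessage(e))

# log10() on the column works and returns one value per row
negLogPval <- -log10(tab[, "Pval"])
```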
You're right, my command was wrong. With your command the calculation works.
The problem is that I can't manage to create a new data frame with one unchanged variable (logFC) and the newly calculated variable (-log10(Pval)):
> tab = data.frame(logFC = resEdgeR$table[, 1], negLogPval = -log10(tab[,"Pval"]))
Error in data.frame(logFC = resEdgeR$table[, 1], negLogPval = -log10(tab[, :
arguments imply differing number of rows: 0, 1405
It's probably down to the capitalisation of the column names (you had a lower-case v in Pval), and you were also referencing tab in the creation of tab itself. Try:
tab = data.frame(logFC = resEdgeR$table$logFC, negLogPval = -log10(resEdgeR$table$PValue))
Thanks a lot, but I keep getting the same error:
> tab = data.frame(logFC = resEdgeR$table$logFC, negLogPval = -log10(resEdgeR$table$PValue))
Error in log10(resEdgeR$table$PValue) :
non-numeric argument to mathematical function
Hmm, I'm running out of options without seeing the data. Try using 'head', 'summary' and maybe 'plot' on the column resEdgeR$table$PValue to see if there are any clues as to where the non-numeric values are creeping in.
I'll keep trying tomorrow and keep you informed of any progress!
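One possible way to run such checks is sketched below. The helper inspect_column is hypothetical (it is not part of edgeR or base R) and the toy data are made up; the idea is simply to look at what an expression evaluates to before handing it to log10():

```r
# Hypothetical helper: report the class, length and numeric-ness of a value
inspect_column <- function(x) {
  list(class = class(x), length = length(x), numeric = is.numeric(x))
}

# Toy data; values are illustrative only
toy <- data.frame(id     = c("NMA0001", "NMA0002"),
                  PValue = c(0.8194, 0.0000364),
                  stringsAsFactors = FALSE)

ok      <- inspect_column(toy$PValue)   # numeric column
bad     <- inspect_column(toy$id)       # character column: log10() would fail
nothing <- inspect_column(toy$missing)  # no such column: `$` returns NULL
```

Anything that is not reported as numeric, or that has length 0, would explain a failure inside log10() or data.frame().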
Hi Gavin,
Actually, I was making a very basic mistake: I wasn't referring to the variables properly.
I wrote resEdgeR$table[, 1] or resEdgeR$table$logFC
when I should have actually written: resEdgeR[, 1] or resEdgeR$logFC
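For anyone hitting the same thing, here is a minimal sketch of the working version. It assumes, as in my case, that resEdgeR is already a plain data frame with logFC and PValue columns; the values below are a made-up stand-in, not my real data:

```r
# Toy stand-in for resEdgeR; values are illustrative only
resEdgeR <- data.frame(logFC  = c(-0.27, -6.43, 0.56, -4.50),
                       PValue = c(0.8194151, 0.0000364, 0.6354217, 0.0040534))

# resEdgeR is already the table, so it has no `table` element
is.null(resEdgeR$table)

# Refer to the columns directly, by name
tab <- data.frame(logFC      = resEdgeR$logFC,
                  negLogPval = -log10(resEdgeR$PValue))

# Basic volcano plot: fold change on x, significance on y
plot(tab$logFC, tab$negLogPval, pch = 16, cex = 0.6,
     xlab = "log2 fold change", ylab = "-log10(p-value)")
```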
Thanks anyway for your help!