I have a matrix with 6534 rows and 6 columns.
The rows are yeast ORFs.
The columns are the names of the yeast VCF files.
Each cell stores the number of mutations found in that gene in that file (0 means the gene has no mutation in that file).
> head(matrix)
      D784G DAS217 dst1 E1224G F1205H rpa12
Q0010     0      0    0      0      0     4
Q0032     0      0    0      0      0     0
Q0055     1      1    1      1      1     1
Q0075     0      0    0      0      0     0
Q0080     0      0    0      0      0     0
Q0085    15     12   13     10     13    15
I want to make a dot plot:
X-axis = genes, Y-axis = number of mutations, with each of the 6 files drawn in a different color and shape. There should also be a legend for the 6 files.
I am confused. How do I write the command for this?
You almost surely don't want to do this. A dot plot for 6.5K genes will either be super wide (so you can read the horizontal axis labels) or be an unreadable mess.
In addition, there is more to this story than just a matrix of data. Each of these ORFs has a known position in the yeast genome, so the count of mutations per ORF would be more useful if you first plotted the genome (or one chromosome at a time), and then added something showing the counts per ORF.
You are also likely to have many ORFs with either zero SNPs or the same number of SNPs in every file. Are these really interesting? Should they be filtered out?
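If you do decide to filter, a minimal base R sketch might look like the following. It uses a toy matrix copied from the head() output in the question; the names `counts`, `filtered` and `plot_counts` are made up for illustration, and "uninteresting" is assumed here to mean "the same count in every file", which also covers the all-zero rows. Treat it as a starting point, not as the right answer for your data.

```r
# Toy matrix copied from the head() output in the question.
# (Named `counts` rather than `matrix` to avoid masking the base function.)
counts <- matrix(
  c( 0,  0,  0,  0,  0,  4,
     0,  0,  0,  0,  0,  0,
     1,  1,  1,  1,  1,  1,
     0,  0,  0,  0,  0,  0,
     0,  0,  0,  0,  0,  0,
    15, 12, 13, 10, 13, 15),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Q0010", "Q0032", "Q0055", "Q0075", "Q0080", "Q0085"),
    c("D784G", "DAS217", "dst1", "E1224G", "F1205H", "rpa12")
  )
)

# Keep only ORFs whose counts differ between files.
# All-zero rows are a special case of constant rows, so they drop out too.
is_variable <- apply(counts, 1, function(x) length(unique(x)) > 1)
filtered <- counts[is_variable, , drop = FALSE]

# Hypothetical helper: one colour and plotting symbol per file,
# ORFs along the x-axis, with a legend for the files.
plot_counts <- function(m) {
  n_files <- ncol(m)
  matplot(m, type = "p", pch = seq_len(n_files), col = seq_len(n_files),
          xaxt = "n", xlab = "ORF", ylab = "Number of mutations")
  axis(1, at = seq_len(nrow(m)), labels = rownames(m), las = 2)
  legend("topleft", legend = colnames(m),
         pch = seq_len(n_files), col = seq_len(n_files))
}
```

On the toy data only Q0010 and Q0085 survive the filter. Calling `plot_counts(filtered)` then draws the kind of dot plot asked for, but it will only stay readable if the filtered set is small, so check `nrow(filtered)` on the real matrix first.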
Anyway, I would recommend thinking hard about what you are trying to show. You should look at both Gviz and ggbio to see the sorts of things you can do. Also note that people will be much more likely to help if you show that you have already tried some things and are genuinely stuck, rather than just asking somebody to give you code.