Dear all,
I would like to get from a GRanges object, where I have stored genomic coordinates of my differentially expressed genes (using biomart and mm9), the closest gene(s) for every DE gene in my list. Using closest-features in unix I could do it easily:
closest-features total_genes_gr.bed all_mouse_genes_gr.bed
and then merge the dataframes.
But I would like to be consistent and do everything in R. Ideally, I would like to have those closest features / genes as a metadata column. Could you please suggest any ways of doing this?
Best wishes,
Dimitris
Could just call nearest with one argument, all the genes. It will ignore self-hits. After annotating, subset to the DE genes.
Many thanks Michael for your suggestion. I could not make it work,though, but I think i found a way using follow. More specifically:
genes_index <- follow(total_DE_genes_gr,all_mouse_genes_gr,select=c("all")) #This gives me the second "best hit"
nearest_genes <- all_mouse_genes_gr[subjectHits(genes_index)] #This is the list of mouse genes close to my DE genes
When I am trying to create a metadata column, though, having those genes side by side, I get the following error:
total_DE_genes_gr$nearest_gene <- all_mouse_genes_gr[subjectHits(genes_index)]
Error in `[[<-`(`*tmp*`, name, value = <S4 object of class "GRanges">) :
20945 elements in value to replace 20956 elements
Probably because not all of my DE genes have a nearest gene, and therefore NAs are introduced? Any ways to circumvent this?
Dimitris