Question

closest-features equivalent in Bioconductor for finding closest gene(s)

1

Entering edit mode

Dimitris Polychronopoulos ▴ 80

@dimitris-polychronopoulos-9192

Last seen 7.6 years ago

United Kingdom

Dear all,

I would like to get from a GRanges object, where I have stored genomic coordinates of my differentially expressed genes (using biomart and mm9), the closest gene(s) for every DE gene in my list. Using closest-features in unix I could do it easily:

closest-features total_genes_gr.bed all_mouse_genes_gr.bed

and then merge the dataframes.

But I would like to be consistent and do everything in R. Ideally, I would like to have those closest features / genes as a metadata column. Could you please suggest any ways of doing this?

Best wishes,

Dimitris

genomicranges granges iranges annotatepeak • 2.8k views

ADD COMMENT • link updated 8.8 years ago by Michael Lawrence ★ 11k • written 8.8 years ago by Dimitris Polychronopoulos ▴ 80

0

Entering edit mode

Michael Lawrence ★ 11k

@michael-lawrence-3846

Last seen 3.2 years ago

United States

Find nearest gene to each gene:

total_DE_genes_gr$nearest_gene <- all_mouse_genes_gr[nearest(all_mouse_genes_gr)[match(total_DE_genes_gr, all_mouse_genes_gr)]]

ADD COMMENT • link 8.8 years ago Michael Lawrence ★ 11k

0

Entering edit mode

Thanks again, Michael. Maybe I miss something, but when I run this, it returns:

total_DE_genes_gr$nearest_gene <- all_mouse_genes_gr[nearest(all_mouse_genes_gr)[match(total_DE_genes_gr, all_mouse_genes_gr)]]
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
subscript contains NAs or out-of-bounds indices

ADD REPLY • link 8.8 years ago Dimitris Polychronopoulos ▴ 80

1

Entering edit mode

Two possibilities:

1) You have genes in the DE set that are not in the total set.

2) There are genes all by themselves on a chromosome/strand.

Btw, you should pass ignore.strand=TRUE to nearest() unless you only want the nearest gene on the same strand.

There is no way to encode an NA GRanges element, so if there are cases where a gene has no neighbor, you will need to think about what to do.

ADD REPLY • link 8.8 years ago Michael Lawrence ★ 11k

0

Entering edit mode

Thanks Michael for the suggestions, it worked!

ADD REPLY • link 8.8 years ago Dimitris Polychronopoulos ▴ 80

score 2 · Accepted Answer · 2016-04-29

2

Entering edit mode

Michael Lawrence ★ 11k

@michael-lawrence-3846

Last seen 3.2 years ago

United States

Should be straight-forward with ?nearest.

ADD COMMENT • link 8.8 years ago Michael Lawrence ★ 11k

score 1 · Accepted Answer · 2016-04-29

1

Entering edit mode

Dimitris Polychronopoulos ▴ 80

@dimitris-polychronopoulos-9192

Last seen 7.6 years ago

United Kingdom

Nearest will give me the actual differentially expressed gene each time if i put as query the granges object of my diff expressed genes and as subject all mouse genes. Unless there is a special argument that gives you the second closest hit i will end up with the same list of genes

ADD COMMENT • link 8.8 years ago Dimitris Polychronopoulos ▴ 80

0

Entering edit mode

Could just call nearest with one argument, all the genes. It will ignore self-hits. After annotating, subset to the DE genes.

ADD REPLY • link 8.8 years ago Michael Lawrence ★ 11k

0

Entering edit mode

Many thanks Michael for your suggestion. I could not make it work,though, but I think i found a way using follow. More specifically:

genes_index <- follow(total_DE_genes_gr,all_mouse_genes_gr,select=c("all")) #This gives me the second "best hit"

nearest_genes <- all_mouse_genes_gr[subjectHits(genes_index)] #This is the list of mouse genes close to my DE genes

When I am trying to create a metadata column, though, having those genes side by side, I get the following error:

total_DE_genes_gr$nearest_gene <- all_mouse_genes_gr[subjectHits(genes_index)]
Error in `[[<-`(`*tmp*`, name, value = <S4 object of class "GRanges">) :
20945 elements in value to replace 20956 elements

Probably because not all of my DE genes have a nearest gene, and therefore NAs are introduced? Any ways to circumvent this?

Dimitris

ADD REPLY • link 8.8 years ago Dimitris Polychronopoulos ▴ 80