Probe sequences to Entrez IDs
theobroma22 ▴ 10
@theobroma22-11920

Is there a package, or a way of adapting existing package functions, to get Entrez IDs starting from a set of array probe sequences?

Thanks. 

r • 1.1k views

hey theobroma22,

Can you provide the name of the array platform that you are using? If it is Affymetrix, try 'affycoretools'.


It's an apple fruit NimbleGen array with little annotation. I have all of the array probe sequences, and files of the DE gene probe sets from a limma result. The IDs use the contig number followed by a numerical code, such as Contig00001_2_f_100_1_200, which perhaps represents the nucleotide sequence position. To retrieve the Entrez gene ID using BLAST I have to click through a hyperlink, on both Windows and Linux systems. I was thinking of hacking an existing R function to parse out the Entrez ID from a sequence.
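As a small illustration of the ID format described above, here is a minimal Python sketch that splits a probe ID into its contig name and the remaining underscore-separated fields. The function name `split_probe_id` is hypothetical, and the meaning of the trailing fields is not known from this thread, so the sketch only separates them without interpreting them.

```python
def split_probe_id(probe_id):
    """Split a probe ID into (contig, remaining_fields).

    Assumption: the contig name is everything before the first underscore.
    The trailing fields are returned as strings, uninterpreted.
    """
    contig, *fields = probe_id.split("_")
    return contig, fields


contig, fields = split_probe_id("Contig00001_2_f_100_1_200")
print(contig)  # Contig00001
print(fields)  # ['2', 'f', '100', '1', '200']
```

Grouping probes by the contig part is one way to relate individual probes back to the probe sets in the limma result.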

Thanks. 

@gordon-smyth
WEHI, Melbourne, Australia

The natural solution would be to use an aligner to map the probe sequences to the apple genome. I would use a splice-aware aligner, although that might not be essential. You would have to create a FASTQ file containing all the probe sequences. You would also have to download the apple genome from NCBI as well as gene annotation for the same species. Then align the probe sequences to the genome and assign each one, if possible, to a gene. The alignment and assignment can be done efficiently using the Rsubread package, for example (using Unix rather than Windows).
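The FASTQ step could be sketched roughly as follows, in Python rather than R. This is only an illustration under stated assumptions: the probe sequences here are made up, the second probe ID is hypothetical, and the helper names `to_fastq` and `probes.fastq` are not from the thread. Because probe sequences carry no base qualities, the sketch fills in a constant placeholder quality character.

```python
def to_fastq(probes, quality_char="I"):
    """Format (probe_id, sequence) pairs as FASTQ text.

    Each record has four lines: @id, sequence, '+', and a quality string
    of the same length as the sequence (a constant placeholder here).
    """
    records = []
    for probe_id, seq in probes:
        seq = seq.strip().upper()
        records.append(f"@{probe_id}\n{seq}\n+\n{quality_char * len(seq)}\n")
    return "".join(records)


# Made-up example data; real input would be the full probe sequence table.
probes = [
    ("Contig00001_2_f_100_1_200", "ACGTACGTACGTACGTACGTACGTA"),
    ("Contig00002_1_f_50_1_150", "ttgaccgtaggctaacgttagccat"),
]

fastq_text = to_fastq(probes)
with open("probes.fastq", "w") as handle:
    handle.write(fastq_text)
```

The resulting file would then be the read input for the aligner, with the probe ID preserved as the read name so that each alignment can be traced back to its probe.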
