Is there a package, or a way of adapting existing package functions, to get Entrez IDs starting from a set of array probe sequences?
Thanks.
The natural solution would be to use an aligner to map the probe sequences to the apple genome. I would use a splice-aware aligner, although that might not be essential. You would have to create a FASTQ file containing all the probe sequences. You would also have to download the apple genome from NCBI, as well as the gene annotation for the same species. Then align the probe sequences to the genome and assign each one, where possible, to a gene. The alignment and assignment can be done efficiently with the Rsubread package, for example (on Unix rather than Windows).
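As a rough illustration of the first step only (building the FASTQ file), here is a minimal Python sketch. It assumes the probes are already available as (ID, sequence) pairs. The function name `write_probe_fastq`, the example sequences, and the second probe ID are made up for illustration. Because array probes carry no base qualities, a constant placeholder quality character is written. Whether your aligner wants FASTQ or would also accept FASTA is something to check in its documentation.

```python
def write_probe_fastq(probes, out_path, quality_char="I"):
    """Write (probe_id, sequence) pairs as 4-line FASTQ records.

    The quality string is a constant placeholder (an assumption:
    probes have no real base qualities). Returns the number of
    records written.
    """
    written = 0
    with open(out_path, "w") as fh:
        for probe_id, seq in probes:
            seq = seq.strip().upper()
            if not seq:
                continue  # skip probes with no sequence
            fh.write(f"@{probe_id}\n")
            fh.write(f"{seq}\n")
            fh.write("+\n")
            fh.write(f"{quality_char * len(seq)}\n")
            written += 1
    return written


# Hypothetical input: IDs follow the pattern described in this thread,
# but the sequences are invented placeholders.
example_probes = [
    ("Contig00001_2_f_100_1_200", "ACGTACGTACGTACGTACGTACGTA"),
    ("Contig00002_1_f_50_1_150", "ttgcatgcatgcatgcatgcatgca"),
]
```

Calling `write_probe_fastq(example_probes, "probes.fastq")` would produce a file that can be passed to the aligner as the read file. The probe ID is kept as the read name so that each alignment can be traced back to its probe set afterwards.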
Hey theobroma22,
Can you provide the name of the array platform that you are using? If it is Affymetrix, try 'affycoretools'.
It's an apple fruit NimbleGen array with little annotation. I have all of the array probe sequences, and files of the DE gene probe sets from a limma result. The IDs use the contig number followed by a numerical code, such as Contig00001_2_f_100_1_200, which perhaps represents the nucleotide sequence position. To retrieve the Entrez gene ID using BLAST, I have to click a hyperlink for each result, on both Windows and Linux systems. I was thinking of hacking an existing R function to parse out the Entrez ID from a sequence.
Thanks.