Question

How to build a non "gene-centered" annotation package using SQLForge?

becker.jeremie • 0

@beckerjeremie-7273

Last seen 9.3 years ago

Switzerland

Hi,

I am trying to build an annotation library using SQLForge for a custom microarray that targets human endogenous retroviruses (HERV). The goal is to access, and if possible visualize, public annotations (polymorphisms, conservation across species, epigenetic modifications, etc.) associated with a list of retroviruses (whose descriptions and chromosomal positions are stored in a local database) from within R. After a quick review of the AnnotationForge vignettes, it appears that custom chip packages can only be built by mapping probeset identifiers to gene accession numbers. Since we are not looking at conventional genes, we cannot map our chip probesets to these gene accession numbers. Is there another way to link our probesets (using chromosomal location) to public annotations?

Thanks !

annotation package custom microarray • 1.2k views


score 1 · Answer 1 · 2015-01-23

Actually, that's a great question. AnnotationForge has tools for making a generic 'org'-style database package from data.frames, which can contain anything that fits in a data.frame. For example, the function makeOrgPackage() will do that.

?makeOrgPackage

Please note that this function does not require that your central GID actually *be* a gene ID. It can be any sort of ID, as long as it is a legitimate ID (in other words, a good unique identifier). This GID also has to be shared by every table you want to include. So for your specific example, you could conceivably turn the chromosomal locations into an ID. But this might end up being a bad choice, since several features could share the same location and therefore the same ID. Chromosomal locations might thus, in some cases, not be a unique ID...

So I would recommend using something else for your GID (other than chromosomal locations).
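As a minimal sketch of the point above (an illustration, not code from this answer): the tables below use the probeset ID as the GID, check that it is unique, and show why a location-derived ID can fail that check. All probeset names, families, coordinates, column names, and the `build_herv_org()` helper are hypothetical. The `makeOrgPackage()` arguments follow its help page as I understand it, so check `?makeOrgPackage` before relying on them.

```r
# Hypothetical input: probeset IDs, families and coordinates are invented.
herv <- data.frame(
  GID         = c("HERV_0001", "HERV_0002", "HERV_0003"),
  HERV_FAMILY = c("HERV-K", "HERV-K", "HERV-W"),
  CHROMOSOME  = c("chr1", "chr1", "chr7"),
  START_POS   = c(155000L, 155000L, 4200000L),
  END_POS     = c(156200L, 156200L, 4201500L),
  stringsAsFactors = FALSE
)

# An ID derived from the location: the first two features share a locus,
# so this ID is duplicated and would not work as a GID.
loc_id <- paste0(herv$CHROMOSOME, ":", herv$START_POS, "-", herv$END_POS)
loc_id_unique <- anyDuplicated(loc_id) == 0

# The probeset ID is unique in this table, so it can serve as the GID.
gid_unique <- anyDuplicated(herv$GID) == 0

# One data.frame per table; each has GID as its first column.
herv_family   <- herv[, c("GID", "HERV_FAMILY")]
herv_location <- herv[, c("GID", "CHROMOSOME", "START_POS", "END_POS")]

# Hypothetical wrapper, defined but not called here (it needs
# AnnotationForge installed). Maintainer and author are placeholders.
build_herv_org <- function(outdir = tempdir()) {
  AnnotationForge::makeOrgPackage(
    family     = herv_family,
    location   = herv_location,
    version    = "0.1",
    maintainer = "Some One <so@someplace.org>",
    author     = "Some One <so@someplace.org>",
    outputDir  = outdir,
    tax_id     = "9606",
    genus      = "Homo",
    species    = "sapiens"
  )
}
```

Splitting the input into one data.frame per kind of annotation keeps each table free of duplicated rows while the shared GID column ties them together.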

And then, once you have your 'org' package made, you could in principle make a chip package to go with it by using makeChipPackage().

?makeChipPackage
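A rough sketch of what that call might look like, assuming a hypothetical design in which several probesets target the same HERV element and the element ID is the org package's GID. The `probes`/`genes` column names, the `hervchip` prefix, the org package name, and the `build_herv_chip()` helper are assumptions for illustration; `?makeChipPackage` is the authority on the expected input.

```r
# GIDs assumed to exist in the custom org package (invented values).
org_gids <- c("HERV_0001", "HERV_0002", "HERV_0003")

# Hypothetical probeset-to-feature map; all IDs are invented.
probe_frame <- data.frame(
  probes = c("ps_001", "ps_002", "ps_003"),
  genes  = c("HERV_0001", "HERV_0001", "HERV_0003"),
  stringsAsFactors = FALSE
)

# Sanity check: every target ID should be a GID known to the org package.
targets_known <- all(probe_frame$genes %in% org_gids)

# Hypothetical wrapper, defined but not called here (it needs
# AnnotationForge and the org package). The org package name is an
# assumption about what makeOrgPackage() generates for Homo sapiens.
build_herv_chip <- function(outdir = tempdir()) {
  AnnotationForge::makeChipPackage(
    prefix     = "hervchip",
    probeFrame = probe_frame,
    orgPkgName = "org.Hsapiens.eg.db",
    version    = "0.1",
    maintainer = "Some One <so@someplace.org>",
    author     = "Some One <so@someplace.org>",
    outputDir  = outdir,
    tax_id     = "9606",
    genus      = "Homo",
    species    = "sapiens"
  )
}
```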

So by using these two things together, I think you can get where you want to go. That is, you should be able to make a custom org package and then an associated custom chip package to go with it. Once you have done this, you should be able to use select() and its associated accessors for free on the objects that those custom packages load.
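For instance, a lookup against such a custom package might look like the sketch below. The `lookup_herv()` helper and the column names are hypothetical (the real columns would be whatever went into the data.frames), and the package name in the usage comment is an assumption. `select()`, `columns()`, and `keytypes()` are the standard AnnotationDbi accessors.

```r
# Hypothetical helper: 'db' is the annotation object that a custom
# package exposes once it is installed and loaded with library().
lookup_herv <- function(db, ids,
                        cols = c("HERV_FAMILY", "CHROMOSOME")) {
  AnnotationDbi::select(db, keys = ids, columns = cols, keytype = "GID")
}

# Usage, assuming the custom org package is installed under the
# (assumed) name org.Hsapiens.eg.db:
#   library(org.Hsapiens.eg.db)
#   columns(org.Hsapiens.eg.db)    # list the available columns
#   keytypes(org.Hsapiens.eg.db)   # list the usable key types
#   lookup_herv(org.Hsapiens.eg.db, c("HERV_0001", "HERV_0003"))
```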

Please let me know if you have further questions,

Marc

score 0 · Answer 2 · 2015-01-27

Many thanks for your prompt and detailed reply. As you suggested, I modified the format of my input data.frame so that the first column contains a unique GID (I simply used the probeset ID), and it seems to work well: the package is created, and I now need to install the latest version of R to get it to work.

Thanks again !

Jeremie