Definition of probes in lumiHumanAll annotation database
@javier-perez-florido-3121
Last seen 4.5 years ago
Dear list, I've checked all the annotation elements provided by lumiHumanAll.db and, as far as I can tell, none of them gives the probe definitions. If I'm not mistaken, the definition can distinguish isoforms and other factors. How can I get this information from the lumiHumanAll annotation package? For example, gene A1CF has three different probes, each corresponding to a different transcript variant, but I cannot obtain that information from the lumiHumanAll database. Thanks, Javier
Annotation Annotation • 1.1k views
Pan Du ▴ 440
@pan-du-4636
Last seen 8.1 years ago
Hi Javier, The lumiHumanAll.db annotation package uses the nuID as its probe ID, which is a direct, unique encoding of the Illumina probe sequence. You can use the "id2seq" function in the lumi package to recover the probe sequence, and the "seq2id" function to convert back. Please see the paper "nuID: A universal naming schema of oligonucleotides for Illumina, Affymetrix, and other microarrays" (PMID 17540033) for more details. The mapping from nuID to RefSeq ID in lumiHumanAll.db was based on the latest annotation provided by Illumina. If you want to check for potential multiple mappings to gene isoforms (I believe the Illumina annotation should already have removed probes with multiple mappings), you can do so from the probe sequence. I hope this helps. Pan
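As a rough sketch of the round trip Pan describes, the snippet below encodes a probe sequence to a nuID with "seq2id" and decodes it again with "id2seq". It assumes the lumi package is installed (the code is skipped otherwise), and the example sequence is simply one of the A1CF probe sequences quoted elsewhere in this thread, not something taken from lumiHumanAll.db.

```r
# Sketch only: round-trip a probe sequence through its nuID.
# Assumption: the Bioconductor package 'lumi' is installed; if not, nothing runs.
probe_seq <- "GGCACATGCCCAGAGCCAGAAGCGAGCATGAGCACAGCAATTCCTGGCCT"

roundtrip_ok <- NA  # stays NA when lumi is unavailable

if (requireNamespace("lumi", quietly = TRUE)) {
  nuid    <- lumi::seq2id(probe_seq)   # sequence -> nuID
  decoded <- lumi::id2seq(nuid)        # nuID -> sequence
  roundtrip_ok <- identical(unname(decoded), probe_seq)
  print(nuid)
  print(roundtrip_ok)
}
```

Once the sequence has been recovered this way, it can be aligned against transcript sequences to see which isoforms the probe could hybridise to.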
Mark Dunning ★ 1.1k
@mark-dunning-3319
Last seen 11 months ago
Sheffield, Uk
Hi Javier,

I'm afraid I'm not clear about what you mean by probe definitions. Rather than being RefSeq-centric, the illuminaHumanv1.db, illuminaHumanv2.db, illuminaHumanv3.db and illuminaHumanv4.db packages provide an interface to individual probe sequences and quality information, if this is helpful?

    > probes <- unlist(mget("A1CF", revmap(illuminaHumanv4SYMBOL)))
    > probes
             A1CF1          A1CF2          A1CF3
    "ILMN_1779670" "ILMN_1806310" "ILMN_2383229"

    > mget(probes, illuminaHumanv4PROBESEQUENCE)
    $ILMN_1779670
    [1] "GGCACATGCCCAGAGCCAGAAGCGAGCATGAGCACAGCAATTCCTGGCCT"

    $ILMN_1806310
    [1] "GAGGTCTACCCAACTTTTGCAGTGACTGCCCGAGGGGATGGATATGGCAC"

    $ILMN_2383229
    [1] "TGCTGTCCCTAATGCAACTGCACCCGTGTCTGCAGCCCAGCTCAAGCAAG"

    > mget(probes, illuminaHumanv4GENOMICLOCATION)
    $ILMN_1779670
    [1] "chr10:52610480:52610529:-"

    $ILMN_1806310
    [1] "chr10:52566496:52566545:-"

    $ILMN_2383229
    [1] "chr10:52566587:52566636:-"

Regards, Mark
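The genomic location strings in Mark's output can be split into coordinates so that the three A1CF probes are easier to compare against transcript models in a genome browser. The following is a minimal base-R sketch, assuming each string has the form chromosome:start:end:strand with a single segment per probe; the data frame and its column names are illustrative and not part of illuminaHumanv4.db, and the genome build is not stated in the thread.

```r
# Location strings copied from the mget() output quoted in this thread.
locs <- c(ILMN_1779670 = "chr10:52610480:52610529:-",
          ILMN_1806310 = "chr10:52566496:52566545:-",
          ILMN_2383229 = "chr10:52566587:52566636:-")

# Assumption: exactly four ":"-separated fields (one segment) per probe.
parts <- do.call(rbind, strsplit(locs, ":", fixed = TRUE))

probe_loc <- data.frame(
  probe  = names(locs),
  chr    = parts[, 1],
  start  = as.integer(parts[, 2]),
  end    = as.integer(parts[, 3]),
  strand = parts[, 4],
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Inclusive width of each probe, then sort by start coordinate.
probe_loc$width <- probe_loc$end - probe_loc$start + 1L
probe_loc <- probe_loc[order(probe_loc$start), ]
print(probe_loc)
```

Sorting by start position puts neighbouring probes next to each other, which makes it easier to see which probes sit close together and which lies further away.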
Thanks, I meant information such as:

Homo sapiens APOBEC1 complementation factor (A1CF), transcript variant 2, mRNA.
Homo sapiens APOBEC1 complementation factor (A1CF), transcript variant 1, mRNA.
Homo sapiens APOBEC1 complementation factor (A1CF), transcript variant 3, mRNA.

The gene A1CF has three different probes, each one related to a given transcript variant. I don't know how to find the above information through either the illuminaHumanv4.db or the lumiHumanAll.db annotation package.

Thanks, Javier
Last seen 8.1 years ago
Javier,

When you say definition of the probes, do you just mean the sequence of the probes and/or which probes make up a "gene"? With Illumina BeadChip arrays, each entry in probe-level data corresponds to a single probe. If you are dealing with gene-summarized data (from the old BeadStudio or GenomeStudio), then you are dealing with the average signal over all the probes that map to that gene, but most genes have only 1 probe.

If you want the probe sequence and you have the nuID (obtained through the lumi package), you can just 'decode' the nuID using this simple webpage: https://prod.bioinformatics.northwestern.edu/nuID/nuIDoptions.cfm

Then you could use browser/mapping tools to determine whether the probe(s) are in a location that would allow you to distinguish isoforms. Based on sheer numbers, I don't think the BeadChip arrays contain nearly enough probes to make this possible for most genes.

If you want to learn the details of the methodology used to build the lumiHumanAll annotations (and the rest of the lumi*All annotations), a wealth of information is at https://prod.bioinformatics.northwestern.edu/nuID/index.cfm

I have used the files and details there to build a custom annotation package. It really isn't that hard, and there are examples on the mailing list. Maybe that is what you need.

Wade