Unique probes on all human affy chips
S Peri ▴ 320
@s-peri-835
Last seen 9.6 years ago
Dear group,

Is there any place where I can get all the unique probe IDs for all the Affy human chips (~13 chips)? I am trying to get the unique probes (no duplicates), and it has turned out to be a very compute-intensive problem. I took all the probes from all 13 chips and wrote a program that writes each ID once. For example, 64474_g_at is on HG-U95C, HGU133, HGU133A2, and 133_plus2; in this case my program writes 64474_g_at once in my output. The C++ code has been running for the last 20 hours, and I have made sure there are no bad loops that would put me in an infinite loop. It would be nice to have all the unique IDs in one place where I can use them directly for my annotation purposes.

Thanks,
Peri.
Annotation hgu133a2 affy • 1.1k views
@lgautieralternorg-747
Last seen 9.6 years ago
Peri,

I am not certain what you want: the probe sets or the individual probes? If you want probe sets, I think you could do it in R. The execution time would be a few seconds (on a reasonably recent computer...).

Example:

    mycdfs <- c("hgu133acdf", "hgu95av2cdf")  ## put all the cdfenv packs you want here

    allmyids <- c()

    for (i in seq(along=mycdfs)) {
      library(mycdfs[i], character.only=TRUE)
      allmyids <- unique(c(ls(get(mycdfs[i])), allmyids))

      ## memory saving...
      tmp <- paste("package", mycdfs[i], sep=":")
      detach(pos=match(tmp, search()))
    }

    ## unique probe set ids are in 'allmyids'...

If you want the probes, a variation of the code above should do it.

Hoping it helps,

L.
Hi Laurent,

I am interested in getting individual probes. I want to export them into my Postgres table and link them to sequences.

Thanks,
Peri.
    mypacks <- c("hgu133aprobe", "hgu95av2probe")  ## put all the probe packs you want here

    allmyprobes <- c()

    for (i in seq(along=mypacks)) {
      library(mypacks[i], character.only=TRUE)
      allmyprobes <- unique(c(get(mypacks[i])$sequence, allmyprobes))

      ## memory saving...
      tmp <- paste("package", mypacks[i], sep=":")
      detach(pos=match(tmp, search()))
    }

    ## unique probe sequences are in 'allmyprobes'...
    ## ... and shouldn't take 20 hours for 13 chips...

If you feel like leaving C++ aside:
- To map probes to reference sequences, the package 'matchprobes' can be useful.
- To manipulate mappings, the package 'altcdfenvs' might have functions of interest as well.

L.
@adaikalavan-ramasamy-675
Last seen 9.6 years ago
You have not told us what type of data you are looking for. Do you want the probe sequences, GenBank identifiers, or merely Affymetrix IDs? Do you know what a Chip Description File (CDF) is? For a brief explanation, see http://www.bioconductor.org/data/cdfenvs/desc.html

I do not know how to do this in BioConductor. But if I need the annotation information, I get it directly from the Affymetrix website.

1) Select the human chip you want from http://www.affymetrix.com/support/technical/byproduct.affx?cat=arrays&Human
2) Find the section called "NetAffx Annotation Files" or "Sequence Files" and select the format/file you want.
3) At this stage you will be asked to log in. Registration is free.

Suppose you have downloaded the information of interest for all human arrays; you can then remove redundancies by using the Affymetrix ID, which is the unique identifier of probe sets.
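Removing redundancies by Affymetrix ID across several downloaded files could look roughly like the following. This is a minimal sketch in Python, used here only for illustration (the thread itself uses R and C++). It assumes each annotation file is a CSV whose header has a column named "Probe Set ID"; that column name is an assumption about the file layout and should be checked against the actual downloads. The function name `unique_probe_set_ids` and the `example_*` IDs are hypothetical.

```python
import csv
import io
from typing import Iterable, List, TextIO


def unique_probe_set_ids(
    files: Iterable[TextIO], id_column: str = "Probe Set ID"
) -> List[str]:
    """Return the probe set IDs found in all files, each once, in first-seen order.

    `id_column` is an assumed header name; adjust it to the real files.
    """
    seen = set()
    ordered = []
    for handle in files:
        # Skip '#' comment lines that may precede the header row.
        rows = csv.DictReader(line for line in handle if not line.startswith("#"))
        for row in rows:
            probe_set_id = row[id_column]
            if probe_set_id not in seen:  # set membership test, O(1) on average
                seen.add(probe_set_id)
                ordered.append(probe_set_id)
    return ordered


if __name__ == "__main__":
    # Tiny made-up stand-ins for two downloaded annotation files.
    chip_a = io.StringIO("Probe Set ID,Note\n64474_g_at,x\nexample_1_at,y\n")
    chip_b = io.StringIO("Probe Set ID,Note\n64474_g_at,x\nexample_2_at,z\n")
    print(unique_probe_set_ids([chip_a, chip_b]))
```

Because each ID is checked against a set rather than compared with every ID seen so far, the work grows roughly linearly with the total number of rows, so 13 chips should take seconds rather than hours.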
rgentleman ★ 5.5k
@rgentleman-7725
Last seen 9.0 years ago
United States
On Sun, Sep 19, 2004 at 05:49:08PM +0100, Adaikalavan Ramasamy wrote:
> I do not know how to do this in BioConductor. But if I need the
> annotation information, I get it directly from affymetrix website.

Yes, but I think that the question was not about a single chip, but rather about all chips, and I don't think that NetAffx helps you with that; you need to do some computation. I believe that the question is about 25-mers. In that case, dumping the CDF files (either from BioC or NetAffx) and loading them into a database is one step; from there I would rely on the merge capabilities of the database.

Robert

--
Robert Gentleman, Associate Professor
Department of Biostatistics, Harvard School of Public Health
phone: (617) 632-5250, fax: (617) 632-2444, office: M1B20
email: rgentlem@jimmy.harvard.edu
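The database route suggested above could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres, the table layout `probe(chip, probe_set_id, sequence)` is hypothetical, and the 25-mer sequences and probe set IDs in the demo are made up. Loading the per-chip probe tables is one step; deduplication and the sequence-to-chip mapping are then left to SQL.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple

ProbeRow = Tuple[str, str, str]  # (chip, probe_set_id, sequence)


def load_probes(rows: Iterable[ProbeRow]) -> sqlite3.Connection:
    """Load probe rows from all chips into one in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE probe ("
        "chip TEXT NOT NULL, probe_set_id TEXT NOT NULL, sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO probe VALUES (?, ?, ?)", rows)
    # An index on the sequence column keeps DISTINCT and joins on it cheap.
    conn.execute("CREATE INDEX probe_sequence_idx ON probe (sequence)")
    return conn


def unique_sequences(conn: sqlite3.Connection) -> List[str]:
    """Let the database remove duplicate sequences."""
    query = "SELECT DISTINCT sequence FROM probe ORDER BY sequence"
    return [sequence for (sequence,) in conn.execute(query)]


def chips_per_sequence(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Map each sequence to the sorted list of chips that carry it."""
    result: Dict[str, List[str]] = {}
    query = "SELECT DISTINCT sequence, chip FROM probe ORDER BY sequence, chip"
    for sequence, chip in conn.execute(query):
        result.setdefault(sequence, []).append(chip)
    return result


if __name__ == "__main__":
    demo_rows = [  # made-up sequences and IDs, for illustration only
        ("hgu133a", "example_1_at", "ACGTACGTACGTACGTACGTACGTA"),
        ("hgu95av2", "example_2_at", "ACGTACGTACGTACGTACGTACGTA"),
        ("hgu95av2", "example_3_at", "TTTTTGGGGGCCCCCAAAAATTTTT"),
    ]
    connection = load_probes(demo_rows)
    print(unique_sequences(connection))
    print(chips_per_sequence(connection))
```

The same `SELECT DISTINCT` query should carry over to Postgres largely unchanged; only the connection and bulk-loading code would differ.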