Getting probes id for particular probeset id
@marek-piatek-bi-3927
Last seen 10.2 years ago
Hi all,

I'm trying to get the probes for a particular probeset id from my MoGene arrays. From the experiment description file (dabg.summary.txt) I can see that there are around 241,500 probeset ids for my 12 arrays. When loading the .CEL files into Bioconductor I see 1,102,500 values for my 12 arrays. Thus I think there should be around 4 probes per probeset on average.

However, when I load the experiment description file into an AnnotatedDataFrame object:

    Affy.Expt <- read.AnnotatedDataFrame("dabg.summary.txt", header=TRUE, row.names=1, sep="\t")

and try to use it as my phenoData when loading the .CEL files into an AffyBatch object:

    Affy.Data <- ReadAffy(filenames=colnames(pData(Affy.Expt)), phenoData=Affy.Expt, verbose=TRUE)

I get the following warning:

    Warning message:
    In read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
    Incompatible phenoData object. Created a new one.

I understand this as an inconsistency between the number of rows in my experiment description file (241,500 probeset ids) and the number of rows in the .CEL files (1,102,500 probes). When that happens, it resets the probeset ids and numbers the rows from 1 to 1,102,500, thus losing track of the probeset ids.

The point is that I need to know which probes belong to which probeset id and have their values stored. I looked at the CDF file, but it looks strange and I can't get anything useful from it. I thought that looking into the rma algorithm might help, but it calls an external function which I don't understand.

Is there some easy way to get that information?

Thank you in advance,
Mark
cdf cdf • 1.8k views
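A possible reading of the warning above, offered as an assumption rather than a confirmed diagnosis: phenoData is meant to describe samples, so ReadAffy most likely expects one row per CEL file, whereas dabg.summary.txt has probesets in rows and arrays in columns. The base-R sketch below uses made-up file names and a made-up `group` column purely to show the two layouts side by side.

```r
# Hypothetical file names standing in for the 12 CEL files.
cel_files <- sprintf("sample%02d.CEL", 1:12)

# Sample-level table: one row per CEL file, columns describe each sample.
# The 'group' column is invented for illustration.
pheno <- data.frame(
  group = rep(c("control", "treated"), each = 6),
  row.names = cel_files
)

# A probeset-level summary is laid out the other way round:
# probesets in rows, arrays in columns (tiny made-up example).
summary_tab <- matrix(
  runif(5 * 12), nrow = 5,
  dimnames = list(paste0("probeset", 1:5), cel_files)
)

nrow(pheno) == length(cel_files)        # TRUE: one row per sample
nrow(summary_tab) == length(cel_files)  # FALSE: rows are probesets, not samples
```

If that reading is right, a table shaped like `pheno` (wrapped with Biobase's `AnnotatedDataFrame()`) would be the kind of object to pass as phenoData, while the probeset-level summary would be kept separately.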
@james-w-macdonald-5106
Last seen 2 minutes ago
United States
Hi Marek,

> Is there some easy way to get that information?

Yes, use the functions in the affy package that were designed to do this sort of thing. Let's say you want the probe values from a few probesets:

    probesets <- c("10338001","10338003","10338004")
    probelist <- pm(Affy.Data, probesets, TRUE)

This will give you a list of length 3, containing the probe values for these probesets.

Best,
Jim
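To make the shape of that result concrete, here is a minimal base-R sketch on made-up data. It does not call affy: the lookup table `probe_map`, the `intensities` matrix, and the helper `probes_for` are all hypothetical stand-ins, meant only to mimic a per-probeset list with one matrix per probeset (probes in rows, arrays in columns).

```r
# Toy data: 3 probesets, 10 probes in total, 2 arrays (all values made up).
probe_map <- data.frame(
  probe_index = 1:10,
  probeset_id = c(rep("10338001", 4), rep("10338003", 3), rep("10338004", 3)),
  stringsAsFactors = FALSE
)

set.seed(1)
intensities <- matrix(
  round(runif(20, min = 50, max = 5000)),
  nrow = 10,
  dimnames = list(NULL, c("array1.CEL", "array2.CEL"))
)

# Hypothetical helper: return one intensity matrix per requested probeset.
probes_for <- function(ids, map, values) {
  out <- lapply(ids, function(id) {
    rows <- map$probe_index[map$probeset_id == id]
    values[rows, , drop = FALSE]  # keep matrix shape even for a single probe
  })
  names(out) <- ids
  out
}

probesets <- c("10338001", "10338003", "10338004")
probelist <- probes_for(probesets, probe_map, intensities)

length(probelist)        # number of probesets requested
sapply(probelist, nrow)  # number of probes in each probeset
```

Keeping the row indices alongside the values, as `probe_map` does here, is what preserves the link between each probe and its probeset id.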
Hi Jim and all,

Thanks for your help! I also found yesterday that this task can be completed in an alternative way. You just have to load the "oligo" package and then do something like:

    idx <- oligo:::getFidProbeset(myData)  # where myData stores the .CEL file data

and then:

    myProbes <- exprs(myData[idx[,1],])

That should do the trick. Thanks for your help once again.

Mark