How to generate an annotation library without CDF file?
@shu-wen-huang-5230
Last seen 9.7 years ago
How can I generate a new annotation package from non-CDF files in R/Bioconductor? I received a library folder from Affymetrix. It includes pgf, cif, clf, bgp, txt and other files, but no CDF file. How do I generate a library from them? Thanks! Sw
Annotation cdf • 1.3k views
@benilton-carvalho-1375
Last seen 4.1 years ago
Brazil/Campinas/UNICAMP
PGFs are provided for Gene/Exon ST arrays, and chances are that the package you need is already on Bioconductor. (By the way, Affymetrix themselves do not recommend using a CDF for such array designs.)

Check Sections 1 and 4 of the document below:

http://bioconductor.org/packages/release/bioc/vignettes/oligo/inst/doc/primer.pdf

benilton
Hi benilton,

Our group generated a particular list of probes, so the array is not available in Bioconductor. Do you mean I should try to generate a library from the PGF file? Thanks!

Sw
To generate an annotation package, you should use the PGF file, and one option for this is the pdInfoBuilder package. Without further details, though, it's hard to say more.

benilton
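Before building anything, it can help to confirm which design files are actually in the library folder. As comes up later in this thread, the Gene ST seed in pdInfoBuilder is given a PGF, a CLF and a probeset.csv file. The sketch below uses only base R; the function name `check_design_files` is hypothetical, not part of any package. Note the escaped, anchored patterns: an unescaped pattern such as ".pgf" would also match names that merely contain "pgf" after any character.

```r
# Hypothetical helper: report which pdInfoBuilder input files exist in a folder.
check_design_files <- function(base_dir) {
  # One anchored regular expression per required file type.
  patterns <- c(pgf = "\\.pgf$",
                clf = "\\.clf$",
                probeset_csv = "\\.probeset\\.csv$")
  # lapply() keeps the names, so the result is a named list of file paths.
  found <- lapply(patterns, function(p) {
    list.files(base_dir, pattern = p, full.names = TRUE)
  })
  # Names of the file types for which nothing was found.
  missing <- names(found)[lengths(found) == 0]
  list(found = found, missing = missing)
}
```

Calling `check_design_files()` on the library folder returns the matching paths in `$found` and the names of any absent file types in `$missing`.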
Below is my code. It seems I need to somehow generate a Sorgh-WTa520972F library in order to do the normalization. However, I don't have a CDF file, only files in many other formats.

    library(affy)
    library(limma)
    library(gcrma)
    library(genefilter)

    ## read the Targets.txt file ##
    setwd("all")
    targets = readTargets()

    ## create a phenodata object and attach it to the data ##
    myCovs = data.frame(targets)
    rownames(myCovs) = myCovs[,1]
    nlev = as.numeric(apply(myCovs, 2, function(x) nlevels(as.factor(x))))
    metadata = data.frame(labelDescription = paste(colnames(myCovs), ": ", nlev, " level", ifelse(nlev==1,"","s"), sep=""),
                          row.names=colnames(myCovs))
    phenoData = new("AnnotatedDataFrame", data=myCovs, varMetadata=metadata)

    ## read the data, attach the phenodata and normalize it using gcRMA ##
    dat = ReadAffy(sampleNames = myCovs$Name, filenames = myCovs$Celfile, phenoData = phenoData, celfile.path = "celfiles")
    eset = gcrma(dat, verbose = FALSE)

    ############ error messages received ############
    Error in getCdfInfo(object) :
      Could not obtain CDF environment, problems encountered:
    Specified environment does not contain Sorgh-WTa520972F
    Library - package sorghwta520972fcdf not installed
    Bioconductor - sorghwta520972fcdf not available
With the files you currently have, you could generate the appropriate annotation package and do the preprocessing with oligo, as shown in the sections of the document I suggested initially. However, I'm not sure gcrma() would work with oligo objects; in the meantime, you could use rma(). Maybe Jean can provide further insight.

b
I tried to use rma() as shown below. However, it seems I can't get around the need for sorghwta520972fcdf. Or did I misunderstand what you suggested?

    eset = rma(dat)

    Error in getCdfInfo(object) :
      Could not obtain CDF environment, problems encountered:
    Specified environment does not contain Sorgh-WTa520972F
    Library - package sorghwta520972fcdf not installed
    Bioconductor - sorghwta520972fcdf not available

Sw
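A note on why the error is the same for gcrma() and rma(): both calls operate on an AffyBatch created by affy's ReadAffy(), and affy looks for a CDF package whose name is derived from the chip type in the CEL header. The sketch below approximates that name mapping in base R. The real logic lives in affy's `cleancdfname()`; `cdf_package_name` is a hypothetical stand-in that only covers the generic case (no special-cased chip names).

```r
# Hypothetical stand-in for the generic case of affy's chip-type-to-package
# name mapping: lowercase, drop hyphens/underscores/spaces, append "cdf".
cdf_package_name <- function(chip_type) {
  paste0(gsub("[-_ ]", "", tolower(chip_type)), "cdf")
}

cdf_package_name("Sorgh-WTa520972F")
# "sorghwta520972fcdf"
```

This is the package named in the error message, so switching normalization functions within affy will not help as long as the data are held in an AffyBatch.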
You did misunderstand.

1) Get all your files
2) Install the pdInfoBuilder package
3) Use the example in Section 8 of the pdInfoBuilder vignette ( http://bioconductor.org/packages/release/bioc/vignettes/pdInfoBuilder/inst/doc/BuildingPDInfoPkgs.pdf )
4) Install the resulting annotation package
5) Install oligo
6) Follow Sections 1 and 4 of the document I suggested in my first message.

b
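The steps above can be sketched roughly as follows. This is a minimal outline under stated assumptions, not the vignette's code: the function name `build_and_normalize` and its arguments are hypothetical, the seed arguments are copied from the call used elsewhere in this thread, and the way the new package directory is located (whatever directory appears in `dest_dir` after the build) is a heuristic. It assumes pdInfoBuilder and oligo are already installed from Bioconductor and that a probeset.csv file is available.

```r
# Hypothetical wrapper around steps 3-6; nothing runs until it is called.
build_and_normalize <- function(pgf, clf, prob, cel_dir, dest_dir = ".") {
  library(pdInfoBuilder)  # errors here if the package is not installed
  library(oligo)

  # Step 3: describe the array design and build the annotation package.
  seed <- new("AffyGenePDInfoPkgSeed",
              pgfFile = pgf, clfFile = clf, probeFile = prob,
              biocViews = "AnnotationData",
              organism = "Sorghum", species = "Bicolor")
  before <- list.dirs(dest_dir, recursive = FALSE)
  makePdInfoPackage(seed, destDir = dest_dir)
  # Assumption: the build creates exactly one new directory in dest_dir.
  pkg_dir <- setdiff(list.dirs(dest_dir, recursive = FALSE), before)

  # Step 4: install the freshly built source package.
  install.packages(pkg_dir, repos = NULL, type = "source")

  # Steps 5-6: read the CEL files with oligo (not affy) and run RMA.
  cel_files <- list.celfiles(cel_dir, full.names = TRUE)
  raw <- read.celfiles(filenames = cel_files)
  oligo::rma(raw)
}
```

The key difference from the earlier code is that the CEL files are read with oligo's `read.celfiles()` rather than affy's `ReadAffy()`, so no CDF package is looked up.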
@benilton-carvalho-1375
Last seen 4.1 years ago
Brazil/Campinas/UNICAMP
Hi Shu-wen,

I'm moving this back to the mailing list, so everyone can benefit from this discussion and even provide you with alternatives.

Regarding the probeset.csv file, I'd expect Affymetrix to give you this file. You should contact them about it.

benilton

On 15 April 2012 04:13, Shu-wen Huang wrote:
> makePdInfoPackage requires 3 files: PGF, CLF, and probeset.csv. However, among the files I was given, there is no .probeset.csv. Can any of the other files that came with the CEL files, such as bgp, cif, grc, mps, gcc or smd, replace it?
>
> I tried to reformat the .bgp file as a .probeset.csv. After the commands below, I received the failure message at the bottom.
>
> library(pdInfoBuilder)
> baseDir <- "/home/shuang/Analysis/R/dataset_20120413"
> (pgf <- list.files(baseDir, pattern = ".pgf", full.names = TRUE))
> (clf <- list.files(baseDir, pattern = ".clf", full.names = TRUE))
> (prob <- list.files(baseDir, pattern = ".probeset.csv", full.names = TRUE))
> seed <- new("AffyGenePDInfoPkgSeed", pgfFile = pgf, clfFile = clf, probeFile = prob, biocViews = "AnnotationData", organism = "Sorghum", species = "Bicolor")
> makePdInfoPackage(seed, destDir = ".")
>
> Parsing file: Sorgh-WTa520972F.pgf... OK
> Parsing file: Sorgh-WTa520972F.clf... OK
> Creating initial table for probes... OK
> Creating dictionaries... OK
> Parsing file: Sorgh-WTa520972F.probeset.csv... OK
> Error in `[.data.frame`(probesets, , cols) : undefined columns selected
> In addition: Warning messages:
> 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
> 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
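The "undefined columns selected" error suggests that the reformatted file was parsed but does not contain the column names that makePdInfoPackage selects from it. One way to see the gap is to compare the file's header with the header of a genuine Affymetrix probeset.csv for another Gene ST design. The sketch below is base R only; `missing_probeset_columns` is a hypothetical helper, and the `expected` vector must be supplied by the caller because the exact column list is not given in this thread. It assumes the file follows the usual Affymetrix CSV layout, where metadata lines start with "#".

```r
# Hypothetical helper: which expected column names are absent from a CSV header?
missing_probeset_columns <- function(csv_file, expected) {
  # Skip leading '#' metadata lines; read the header plus at most one data row.
  header <- read.csv(csv_file, comment.char = "#", nrows = 1,
                     check.names = FALSE, stringsAsFactors = FALSE)
  setdiff(expected, names(header))
}
```

An empty result means every expected column is present; anything returned is a column the reformatted file would still need.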