Hi Shu-wen,
I'm moving this back to the mailing list, so everyone can benefit from
this discussion and even provide you with alternatives.
Regarding the probeset.csv file, I'd expect Affymetrix to give you
this file. You should contact them about it.
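
As for the "undefined columns selected" error quoted below: it means
the file passed as probeFile does not contain every column that
pdInfoBuilder tries to select, which is plausible for a reformatted
.bgp file. A minimal base-R sketch for checking which columns a
candidate file lacks is shown here. The helper name missing_columns
and the column names in the demo are illustrative assumptions, not
the list pdInfoBuilder actually uses; take the real names from a
genuine Affymetrix probeset.csv or from the pdInfoBuilder source.

```r
# Hypothetical helper: report which expected column names are absent from
# the header of a CSV file. Lines starting with '#' are skipped as comments.
missing_columns <- function(csv_file, expected) {
  header <- read.csv(csv_file, comment.char = "#", nrows = 1,
                     check.names = FALSE, stringsAsFactors = FALSE)
  setdiff(expected, names(header))
}

# Self-contained demo on a temporary file (column names are made up
# for illustration only).
demo_file <- tempfile(fileext = ".csv")
writeLines(c("#%note=demo header line",
             "probeset_id,seqname,strand",
             "1,chr1,+"),
           demo_file)
print(missing_columns(demo_file, c("probeset_id", "start")))
```

An empty result means every expected column is present; anything
else names the columns that would trigger the error above.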
benilton
On 15 April 2012 04:13, Shu-wen Huang <shuang at chromatininc.com> wrote:
> makePdInfoPackage requires 3 files: PGF, CLF, and probeset.csv.
> However, among the files I was given, I don't have any
> .probeset.csv. Can any of the files below replace it?
>
> Here are all the files that came with the CEL files.
> Can any other file, such as bgp, cif, grc, mps, gcc, or smd,
> replace it?
>
>
> I tried to reformat .bgp to .probeset.csv. After running the
> commands below, I received the error message at the bottom.
>
>>library(pdInfoBuilder)
>>baseDir <- "/home/shuang/Analysis/R/dataset_20120413"
>>(pgf <- list.files(baseDir, pattern = "\\.pgf$", full.names = TRUE))
>>(clf <- list.files(baseDir, pattern = "\\.clf$", full.names = TRUE))
>>(prob <- list.files(baseDir, pattern = "\\.probeset\\.csv$",
>     full.names = TRUE))
>>seed <- new("AffyGenePDInfoPkgSeed", pgfFile = pgf, clfFile = clf,
>     probeFile = prob, biocViews = "AnnotationData",
>     organism = "Sorghum", species = "Bicolor")
>>makePdInfoPackage(seed, destDir = ".")
>
>
> Parsing file: Sorgh-WTa520972F.pgf... OK
> Parsing file: Sorgh-WTa520972F.clf... OK
> Creating initial table for probes... OK
> Creating dictionaries... OK
> Parsing file: Sorgh-WTa520972F.probeset.csv... OK
> Error in `[.data.frame`(probesets, , cols) : undefined columns selected
> In addition: Warning messages:
> 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
> 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
>
>
>
> -----Original Message-----
> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
> Sent: Saturday, April 14, 2012 8:22 PM
> To: Shu-wen Huang
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>
> You did misunderstand.
>
> 1) Get all your files
> 2) Install the pdInfoBuilder package
> 3) Use the example in Section 8 of the pdInfoBuilder vignette
> (http://bioconductor.org/packages/release/bioc/vignettes/pdInfoBuilder/inst/doc/BuildingPDInfoPkgs.pdf)
> 4) Install the resulting annotation package
> 5) Install oligo
> 6) Use Sections 1 and 4 of the document I suggested in my first
> message.
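
Steps 4 to 6 could look roughly like the sketch below. It is not
taken from either vignette. The helper name normalize_with_oligo is
hypothetical, and the directory name pd.sorgh.wta520972f in the usage
line is only an assumption about what makePdInfoPackage writes for
this chip; check the directory it actually created. "celfiles" is the
CEL directory used in the code further down the thread.

```r
# Hypothetical helper: install a locally built pd.* annotation package from
# its source directory, read the CEL files with oligo, and run RMA.
normalize_with_oligo <- function(cel_dir, pd_pkg_dir) {
  if (!dir.exists(cel_dir)) {
    stop("CEL directory not found: ", cel_dir)
  }
  if (!dir.exists(pd_pkg_dir)) {
    stop("annotation package directory not found: ", pd_pkg_dir)
  }
  if (!requireNamespace("oligo", quietly = TRUE)) {
    stop("please install the oligo package from Bioconductor first")
  }

  # Step 4: install the annotation package built by makePdInfoPackage().
  install.packages(pd_pkg_dir, repos = NULL, type = "source")

  # Steps 5 and 6: read the CEL files and preprocess them with oligo's rma().
  cel_files <- list.files(cel_dir, pattern = "\\.cel$",
                          ignore.case = TRUE, full.names = TRUE)
  raw <- oligo::read.celfiles(filenames = cel_files)
  oligo::rma(raw)
}

# Usage (not run here; both paths are assumptions):
# eset <- normalize_with_oligo("celfiles", "./pd.sorgh.wta520972f")
```

The checks at the top stop early with a readable message if a
directory or the oligo package is missing, before anything is
installed.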
>
> b
>
> On 15 April 2012 02:16, Shu-wen Huang <shuang at chromatininc.com> wrote:
>> I tried to use rma() as shown below. However, it seems I can't get
>> around the need for sorghwta520972fcdf. Or did I misunderstand what
>> you suggested?
>>
>>>eset = rma(dat)
>>
>> Error in getCdfInfo(object) :
>>   Could not obtain CDF environment, problems encountered:
>> Specified environment does not contain Sorgh-WTa520972F
>> Library - package sorghwta520972fcdf not installed
>> Bioconductor - sorghwta520972fcdf not available
>>
>>
>> Sw
>>
>>
>> -----Original Message-----
>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>> Sent: Saturday, April 14, 2012 8:08 PM
>> To: Shu-wen Huang
>> Cc: bioconductor at r-project.org
>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>
>> With the files you currently have, you could generate the
>> appropriate annotation package and work through the preprocessing
>> steps with oligo, as shown in the sections of the document I
>> suggested initially.
>> However, I'm not sure gcrma() would work with oligo objects - in
>> the meantime, you could use rma(). Maybe Jean can provide further
>> insight...
>>
>> b
>>
>> On 15 April 2012 01:55, Shu-wen Huang <shuang at chromatininc.com> wrote:
>>> Below is my code. It seems I need to somehow generate a
>>> Sorgh-WTa520972F library in order to do normalization. However, I
>>> don't have a CDF file, only files in many other formats.
>>>
>>>
>>>>library(affy)
>>>>library(limma)
>>>>library(gcrma)
>>>>library(genefilter)
>>>
>>> ## read the Targets.txt file ##
>>>>setwd("all")
>>>>targets = readTargets()
>>>
>>> ## create a phenodata object and attach it to the data ##
>>>>myCovs = data.frame(targets)
>>>>rownames(myCovs) = myCovs[,1]
>>>>nlev = as.numeric(apply(myCovs, 2, function(x) nlevels(as.factor(x))))
>>>>metadata = data.frame(labelDescription = paste(colnames(myCovs), ": ",
>>>     nlev, " level", ifelse(nlev==1,"","s"), sep=""),
>>>     row.names=colnames(myCovs))
>>>>phenoData = new("AnnotatedDataFrame", data=myCovs, varMetadata=metadata)
>>>
>>> ## read the data, attach the phenodata and normalize it using gcRMA ##
>>>>dat = ReadAffy(sampleNames = myCovs$Name, filenames = myCovs$Celfile,
>>>     phenoData = phenoData, celfile.path = "celfiles")
>>>>eset = gcrma(dat, verbose = FALSE)
>>>
>>>
>>>
>>> ############ error messages received ############
>>> Error in getCdfInfo(object) :
>>>   Could not obtain CDF environment, problems encountered:
>>> Specified environment does not contain Sorgh-WTa520972F
>>> Library - package sorghwta520972fcdf not installed
>>> Bioconductor - sorghwta520972fcdf not available
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>>> Sent: Saturday, April 14, 2012 7:48 PM
>>> To: Shu-wen Huang
>>> Cc: bioconductor at r-project.org
>>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>>
>>> To generate an annotation package, you should use the PGF file...
>>> and one option for this is the pdInfoBuilder package... but
>>> without further details, it's hard to go on...
>>>
>>> benilton
>>>
>>> On 15 April 2012 01:40, Shu-wen Huang <shuang at chromatininc.com> wrote:
>>>> Hi benilton,
>>>>
>>>> Our group generated a particular list of probes. It's not
>>>> available in Bioconductor. Do you mean I should try to generate a
>>>> library from the PGF file? Thanks!
>>>>
>>>>
>>>> Sw
>>>>
>>>> -----Original Message-----
>>>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>>>> Sent: Saturday, April 14, 2012 6:29 PM
>>>> To: Shu-wen Huang
>>>> Cc: bioconductor at r-project.org
>>>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>>>
>>>> PGFs are given for Gene/Exon ST arrays... and chances are that
>>>> the package you need is already on Bioconductor. (btw, a CDF for
>>>> such an array design is not recommended by Affymetrix themselves)
>>>>
>>>> Check Sections 1 and 4 of the document below:
>>>>
>>>>
>>>> http://bioconductor.org/packages/release/bioc/vignettes/oligo/inst/doc/primer.pdf
>>>>
>>>> benilton