On Monday 20 November 2006 10:55, João Fadista wrote:
> Hi everyone,
>
> As I read the "snapCGH: Segmentation, Normalization and Processing of aCGH
> Data User's Guide" I became really excited by all the features it offers
> for analysing CGH data, and because it is designed to be used in
> conjunction with the limma package, which I have already been using. I
> have done the practicals and browsed the main functions using the data
> given in the package.
>
> After this stage I wanted to deal with a real data set, so I downloaded a
> CGH experiment from GEO (Gene Expression Omnibus) and loaded it into my R
> workspace using the GEOquery package. After that I converted the GEO
> DataSet into an MAList to be able to use the data with the snapCGH package.
>
> Despite this, when I used the function processCGH it gave me an error:
>
> > MA2 <- processCGH(MA, method.of.averaging=mean, ID="ID")
>
> Error in processCGH(MA, method.of.averaging = mean, ID = "ID") : $design
> component is null
>
> So I then created the design component, but I got a different error:
>
> > MA$design <- rep(1,10)
> >
> > MA2 <- processCGH(MA, method.of.averaging=mean, ID="ID")
>
> Error in order(na.last, decreasing, ...) : argument 1 is not a vector
>
> If MA is an object of class MAList, this function should work, so I do
> not see what is wrong. Isn't the snapCGH package compatible with GEO
> datasets?
>
> There is also another thing. In the examples folder of the package, the
> clones.info file has the columns Chromosome and Position, but in the
> dataset from GEO there is only the Entrez.GeneID identifier. Do you know
> of any way I could convert one into the other?

Hi, Joao.

All the CGH methods that are available via Bioconductor require a
chromosome and basepair position. They cannot work without these. There
are a number of ways to get chromosome location, but perhaps the simplest
is to use the biomaRt package to go from gene_id to chromosome and
position. I don't think that you will be able to proceed without having
the chromosome locations included in the MA$genes data frame and, although
I am not sure, I would guess that the error is because of not having
these. Perhaps others on the list will confirm this.

Sean

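For readers following along, here is a minimal sketch of the step Sean describes: attaching Chromosome and Position columns to the genes data frame once a lookup table is available. It uses base R only. The `genes` and `lookup` data frames and every value in them are invented for illustration; in practice the lookup table might come from a biomaRt query (for example via `getBM()`), whose exact attribute names depend on the organism and Ensembl release and are not shown here.

```r
# Toy stand-in for MA$genes from a GEO-derived MAList (IDs are invented).
genes <- data.frame(Entrez.GeneID = c("101", "102", "103", "104"),
                    stringsAsFactors = FALSE)

# Hypothetical lookup table (invented values), e.g. the result of an
# annotation query mapping gene IDs to chromosome and start position.
lookup <- data.frame(Entrez.GeneID = c("103", "101", "102"),
                     Chromosome    = c("2", "1", "1"),
                     Position      = c(5200000, 10400000, 33900000),
                     stringsAsFactors = FALSE)

# match() preserves the original row order of genes, which has to stay
# aligned with the rows of the intensity matrices (e.g. MA$M).
idx <- match(genes$Entrez.GeneID, lookup$Entrez.GeneID)
genes$Chromosome <- lookup$Chromosome[idx]
genes$Position   <- lookup$Position[idx]

# Rows with no known location come back as NA and would need to be
# dropped (from both genes and the data matrices) before segmentation.
has.location <- !is.na(genes$Chromosome)
print(genes)
```

Using `match()` rather than `merge()` is a deliberate choice in this sketch: `merge()` may reorder or drop rows, which would break the row correspondence between the annotation and the data.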
Hi,

As Sean has said, the methods available within snapCGH won't work if the
Position and Chromosome elements aren't present in the $genes data frame.

However, I think the error you are currently seeing isn't related to that.
The processCGH function averages replicates of the clones, and the ID
argument specifies which column in $genes contains an identifier for each
clone. If you don't have such an identifier, then the easiest thing to do
is add a column with the name "ID" to $genes containing the numbers from 1
to the number of rows of the genes data frame.

Hopefully the processCGH function will then work.

Mike Smith

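Mike's suggestion can be sketched as follows. To keep the example runnable without limma or snapCGH, `MA` here is a plain list with the same component names as an MAList, filled with invented values; with a real MAList the two assignments should look the same. Sizing `$design` from the number of arrays is an assumption of this sketch, not something stated in the thread.

```r
# Toy stand-in for an MAList: a plain list with the same component names.
set.seed(1)
MA <- list(
  M = matrix(rnorm(12), nrow = 4, ncol = 3),   # 4 clones x 3 arrays
  A = matrix(rnorm(12), nrow = 4, ncol = 3),
  genes = data.frame(Chromosome = c(1, 1, 2, 2),
                     Position   = c(1.5, 3.0, 0.7, 2.2))
)

# One identifier per row of $genes, named "ID" so that ID = "ID" finds it.
MA$genes$ID <- seq_len(nrow(MA$genes))

# Assumption: $design has one entry per array, so take its length from
# the data instead of hard-coding a number such as 10.
MA$design <- rep(1, ncol(MA$M))

str(MA$genes)
```

With the ID column in place, the original call `processCGH(MA, method.of.averaging=mean, ID="ID")` can be retried on the real object.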
Hi everybody,

I am trying to annotate my dataset (home-spotted array, two colours,
mouse) using AnnBuilder.
Every time I run the program, the connection to the KEGG FTP site fails,
so I am able to build the annotation package, but without the KEGG
pathways. Does anybody know how to fix this problem, or has anybody found
a way to bypass it (such as downloading a list of accession numbers and
corresponding pathways)?

Here is my script:
**********************************************************************
library(AnnBuilder)
#Loading required package: Biobase
#Loading required package: tools
#Welcome to Bioconductor
# Vignettes contain introductory material. To view,
# simply type: openVignette()
# For details on reading vignettes, see
# the openVignette help page.
#Loading required package: annotate
library(GO)
sessionInfo()
#Version 2.3.1 (2006-06-01)
#i386-pc-linux-gnu
#
#attached base packages:
#[1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
#[7] "utils"     "datasets"  "base"
#
#other attached packages:
# globaltest        vsn      limma   multtest
#    "4.2.0"   "1.10.0"    "2.7.3"   "1.10.2"
#   survival   affydata       affy     affyio
#     "2.20"    "1.8.0"   "1.10.0"    "1.0.0"
#       KEGG         GO AnnBuilder    RSQLite
#   "1.12.0"   "1.12.0"   "1.10.0"    "0.4-1"
#        DBI   annotate        XML    Biobase
#   "0.1-10"   "1.10.0"   "0.99-7"   "1.10.0"
# Organism name spelled as the species binomial ("Mus musculus").
mySrcUrls <- getSrcUrl("all", organism = "Mus musculus")
# File with the probe IDs and their GenBank/RefSeq accessions.
base <- file.path(.path.package("AnnBuilder"), "data", "lgtc.ids.1.txt")
myBaseType <- "gbNRef"
ABPkgBuilder(baseName = base,
             srcUrls = mySrcUrls,
             baseMapType = myBaseType,
             pkgName = "lgtc201106",
             pkgPath = ".",
             organism = "Mus musculus",
             version = "1.1.0",
             author = list(author = "Paola Pedotti",
                           maintainer = "Paola Pedotti <p.pedotti at lumc.nl>")
)
#Failed to get data from URL: ftp://ftp.genome.ad.jp/pub/kegg/pathways//07214.gene
#Failed to get data from URL: ftp://ftp.genome.ad.jp/pub/kegg/pathways//07215.gene
#Failed to get data from URL: ftp://ftp.genome.ad.jp/pub/kegg/pathways//07216.gene
#Failed to get data from URL: ftp://ftp.genome.ad.jp/pub/kegg/pathways//07217.gene
#Failed to get data from URL: ftp://ftp.genome.ad.jp/pub/kegg/pathways//07218.gene
#[1] "0 2 2"
#Warning message:
#cannot open file '/usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd', reason 'No such file or directory'
#The following data sets have been added to the database and will be removed:
# [1] "./lgtc161106/data/lgtc161106ACCNUM.rda"
# [2] "./lgtc161106/data/lgtc161106CHR.rda"
# [3] "./lgtc161106/data/lgtc161106ENZYME.rda"
# [4] "./lgtc161106/data/lgtc161106GENENAME.rda"
# [5] "./lgtc161106/data/lgtc161106GO.1.rda"
# [6] "./lgtc161106/data/lgtc161106GO2ALLPROBES.rda"
# [7] "./lgtc161106/data/lgtc161106GO2PROBE.rda"
# [8] "./lgtc161106/data/lgtc161106GO.rda"
# [9] "./lgtc161106/data/lgtc161106LOCUSID.rda"
#[10] "./lgtc161106/data/lgtc161106MAPCOUNTS.rda"
#[11] "./lgtc161106/data/lgtc161106MAP.rda"
#[12] "./lgtc161106/data/lgtc161106OMIM.rda"
#[13] "./lgtc161106/data/lgtc161106ORGANISM.rda"
#[14] "./lgtc161106/data/lgtc161106PATH.rda"
#[15] "./lgtc161106/data/lgtc161106PMID2PROBE.rda"
#[16] "./lgtc161106/data/lgtc161106PMID.rda"
#[17] "./lgtc161106/data/lgtc161106QCDATA.rda"
#[18] "./lgtc161106/data/lgtc161106QC.rda"
#[19] "./lgtc161106/data/lgtc161106REFSEQ.rda"
#[20] "./lgtc161106/data/lgtc161106SUMFUNC.rda"
#[21] "./lgtc161106/data/lgtc161106SYMBOL.rda"
#[22] "./lgtc161106/data/lgtc161106UNIGENE.rda"
#Warning message:
#Can't copy /usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd in: copyTemplates(repList, pattern, pkgName, pkgPath)
**********************************************************************

Thank you in advance,
Paola
_______________________________________
Center for Human and Clinical Genetics
Leiden University Medical Center
Postzone: S-04-P, Postbus 9600
2300 RC Leiden, The Netherlands
Telephone: +31 71 526 9440
Fax: +31 71 526 8285