Stuck with Yeast Tiling Array
@richard-harrison-2386
Dear All,

I am having problems trying to analyze some genomic tiling arrays that I have done using S. cerevisiae reverse tiling arrays. Can anyone help on some/all of these issues please? I am trying to use the tilingArray and davidTiling packages.

1) I load my arrays in using the following commands:

cels = dir(pattern=".CEL")
richard = readCel2eSet(cels)

This renames my arrays to "1" and "2". How can I rename these to their original names?

> dim(richard)
Features  Samples
 6553600        2
> sampleNames(richard)
[1] "1" "2"

Second, it puts them in one column. How can I make this two columns, or more (see below)?

> library(davidTiling)
> data("davidTiling")
> sampleNames(davidTiling)
[1] "09_11_04_S96_genDNA_16hrs_45C_noDMSO.cel"
[2] "041119_S96genDNA_re-hybe.cel"
[3] "041120_S96genDNA_re-hybe.cel"
[4] "05_04_27_2xpolyA_NAP3.cel"
[5] "05_04_26_2xpolyA_NAP2.cel"
[6] "05_04_20_2xpolyA_NAP_2to1.cel"
[7] "050409_totcDNA_14ug_no52.cel"
[8] "030505_totcDNA_15ug_affy.cel"

Thanks,
Richard
@joern-toedling-1244
Hi Richard,

> cels= dir(pattern=".CEL")
> richard = readCel2eSet(cels)
>
> this renames my arrays to "1" and "2". How can i rename these to
> their original names?

You can set the sampleNames of an ExpressionSet object with sampleNames(richard) <- ..., for example:

sampleNames(richard) <- cels

> > dim(richard)
> Features Samples
> 6553600 2
> > sampleNames (richard)
> [1] "1" "2"
>
> Second it puts them in one column. How can i make this two
> columns....or more (see below)

I am sorry, but I do not understand what you want to do or what the problem may be. Can you please give more details? What exactly is being put in one column? Please also provide the output of "sessionInfo()" to let us know which package versions you are using.

Best regards,
Joern
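As a rough illustration of the renaming step, here is a minimal base-R sketch of how the file-name vector can be built and cleaned before it is assigned. The temporary directory and the file "notes_CEL.txt" are made up for the demonstration, and the final assignment is left as a comment because it needs the real ExpressionSet. One thing worth knowing: the `pattern` argument of dir() is a regular expression, so "." matches any character.

```r
# Hypothetical demo directory with two empty CEL files and one unrelated file
d <- file.path(tempdir(), "cel_demo")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("ucont.CEL", "utest.CEL", "notes_CEL.txt"))))

# "." is a regex wildcard, so the loose pattern also picks up notes_CEL.txt
loose <- dir(d, pattern = ".CEL")

# Escaping the dot and anchoring at the end keeps only real .CEL files
cels <- dir(d, pattern = "\\.CEL$", full.names = TRUE)

# basename() drops any leading directory part such as "./"
nms <- basename(cels)
print(nms)

# With the real object (not run here, needs Biobase and the arrays):
# sampleNames(richard) <- nms
```

The cleaned names can then be assigned exactly as described above; whether to keep the ".CEL" extension in the sample names is a matter of taste.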
Thanks Joern,

The first part works great:

> sampleNames(richard)
[1] "./ucont.CEL" "./utest.CEL"

I now see the above. I was worried because when I use the davidTiling dataset and call sampleNames, each CEL file is preceded by a number, so what I would expect to see is:

[1] "./ucont.CEL"
[2] "./utest.CEL"

What I actually want to do is normalise the arrays using the normalizeByReference function. I now get the following error:

> isDNA = richard$nucleicAcid == "ucont"
> isRNA = richard$nucleicAcid == "utest"
> pm = PMindex(probeAnno)
> bg = BGindex(probeAnno)
> yn = normalizeByReference(richard[,isRNA], reference = richard[,isDNA], pm = pm, background = bg)
Error in normalizeByReference(richard[, isRNA], reference = richard[, :
  There is nothing to normalize in 'x'.

Any suggestions? Here is my sessionInfo:

> sessionInfo()
R version 2.5.1 (2007-06-27)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "splines"   "grid"      "tools"     "stats"     "graphics"  "grDevices"
[7] "utils"     "datasets"  "methods"   "base"

other attached packages:
 davidTiling           GO  tilingArray       pixmap  geneplotter      lattice
     "1.0.4"     "1.16.0"     "1.14.0"      "0.4-7"     "1.14.0"    "0.15-11"
    annotate   genefilter     survival          vsn        limma  strucchange
    "1.14.1"     "1.14.1"       "2.32"      "2.2.0"     "2.10.5"      "1.3-2"
    sandwich          zoo RColorBrewer         affy       affyio      Biobase
     "2.0-2"      "1.3-2"      "0.2-3"     "1.14.2"      "1.4.1"     "1.14.1"

Many thanks,
Richard

On 19 Sep 2007, at 16:58, Joern Toedling wrote:
> [...]
Richard,

> Thanks Joern,
> The first part works great:
>
> > sampleNames(richard)
> [1] "./ucont.CEL" "./utest.CEL"

I am not sure whether that preceding './' in front of every name is what you want. How about

sampleNames(richard) <- dir(pattern=".CEL", full.names=FALSE)

instead?

> I was worried that when i use the davidTiling dataset and i do
> sampleNames, each Cel file is preceeded by a number, so what i would
> expect to see is:
>
> [1] "./ucont.CEL"
> [2] "./utest.CEL"

This is just how a character vector is displayed in your console. With the ExpressionSet in davidTiling, the single entries in the sampleNames character vector are so long that only one can be displayed per line. The number preceding each line simply indicates which element of the vector that line starts with. Your sampleNames entries are relatively short, so all of them fit on one line.

> What I actually want to do is normalise the arrays, by using the
> normalize by reference function. I now get the following error:
>
> > isDNA = richard$nucleicAcid == "ucont"
> > isRNA = richard$nucleicAcid == "utest"
> > pm = PMindex(probeAnno)
> > bg = BGindex(probeAnno)
> > yn = normalizeByReference(richard[,isRNA], reference = richard[,isDNA], pm = pm, background = bg)
>
> Error in normalizeByReference(richard[, isRNA], reference = richard[, :
>   There is nothing to normalize in 'x'.

Have you checked the contents of isDNA and isRNA? Are "pm" and "bg" reasonable, and is "bg" a subset of "pm"? Please refer to the vignettes of davidTiling for more details on these. And does pData(richard) have a column "nucleicAcid"? Since you only have two samples, it may be easier to set

isDNA <- 1; isRNA <- 2

This is only useful, though, if the array with the CEL file "ucont.CEL" really is a genomic DNA hybridization and "utest.CEL" an RNA hybridization. If not, the whole normalization by a genomic DNA hybridization may not be appropriate.

Best regards,
Joern
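The display behaviour described in this reply can be reproduced in plain R without any of the tiling packages. The sketch below fixes the console width at 80 characters (an assumption made only so the result is predictable) and prints a short and a long character vector; the long names are the first three davidTiling sample names quoted earlier in the thread.

```r
old <- options(width = 80)   # pin the console width for a reproducible display

short <- c("./ucont.CEL", "./utest.CEL")
long  <- c("09_11_04_S96_genDNA_16hrs_45C_noDMSO.cel",
           "041119_S96genDNA_re-hybe.cel",
           "041120_S96genDNA_re-hybe.cel")

print(short)  # both entries fit on one line, so only the [1] index is shown
print(long)   # one entry per line, each line labelled with its starting index

# Count the printed lines instead of eyeballing them
short_lines <- capture.output(print(short))
long_lines  <- capture.output(print(long))

options(old)  # restore the previous width
```

In both cases the object is an ordinary character vector with one element per array; only the line wrapping differs.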
Thanks Joern,

I'm starting to understand now. I am a total beginner at R, so all this is very new. How do I make an equivalent of davidTiling$nucleicAcid? If I type that, I see this:

> davidTiling$nucleicAcid
[1] genomic DNA genomic DNA genomic DNA poly(A) RNA poly(A) RNA poly(A) RNA
[7] total RNA   total RNA
Levels: genomic DNA poly(A) RNA total RNA

I need to create a similar thing for my (richard) data set. Any ideas? My isDNA and isRNA are both empty at the moment, which is why it isn't working!

I have two DNA samples (and I'm just treating one as RNA) so I can work through the programs and see how it all works and where I need to get things modified, before I do a large batch of arrays. (I've been messing with salt conditions, so I need to see that these arrays actually have stuff hybed!)

All I actually want to do is an equivalent to rma, but I think that the tilingArray/davidTiling libraries are the only resources available at the moment... I think!?

Thanks for all your help,
Richard

On 19 Sep 2007, at 18:29, Joern Toedling wrote:
> [...]
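Why the selectors come out empty can be shown with a small base-R sketch. It uses a plain data frame and matrix as stand-ins for the sample annotation and intensity data (these stand-ins are assumptions for illustration, not the real objects): asking for a column that does not exist returns NULL, and comparing NULL with a string gives a zero-length logical vector, which selects no samples at all.

```r
# Stand-in sample table with NO nucleicAcid column
pd <- data.frame(row.names = c("ucont.CEL", "utest.CEL"))

col   <- pd$nucleicAcid      # NULL, because the column does not exist
isRNA <- col == "utest"      # logical(0): comparing NULL yields an empty vector
print(length(isRNA))

# Stand-in intensity matrix: 5 probes x 2 arrays
x <- matrix(1:10, nrow = 5, dimnames = list(NULL, rownames(pd)))

# An empty logical index selects zero columns, hence "nothing to normalize"
n_selected <- ncol(x[, isRNA, drop = FALSE])
print(n_selected)

# Positional indices, as suggested earlier in the thread, do select a column
isDNA_pos <- 1
isRNA_pos <- 2
picked <- colnames(x[, isRNA_pos, drop = FALSE])
print(picked)
```

Note also that even with a nucleicAcid column present, the comparison would have to use the values stored in that column (for davidTiling, labels such as "genomic DNA"), not the file names.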
Hello Richard,

the column nucleicAcid is part of the AnnotatedDataFrame object that describes the samples in the ExpressionSet. There are different ways to create such an object; have a look at the help pages for that class and for 'ExpressionSet'. It is then assigned using the phenoData method.

Richard Harrison wrote:
> I'm starting to understand now. I am a total beginner at R, so all
> this is very new. How do i make an equivalent of the
> davidTiling$nucleicAcid ? If I type that i see this:
>
> > davidTiling$nucleicAcid
> [1] genomic DNA genomic DNA genomic DNA poly(A) RNA poly(A) RNA
> poly(A) RNA
> [7] total RNA total RNA
> Levels: genomic DNA poly(A) RNA total RNA
>
> I need to create a similar thing for my (richard) data set. Any ideas?
> my isDNA and isRNA, are both empty at the moment...which is why it
> isn't working!
>
> I have two DNA samples (and i'm just treating one as RNA) so I can
> work through the programs and see how it all works and where I need to
> get things modified, before I do a large batch of arrays (I've been
> messing with salt conditions so I need to see that these arrays
> actually have stuff hybed!!)
>
> All I actually want to do is an equivalent to rma, but i think that
> the tilingArray/davidTiling libraries are the only resources available
> at the moment.. i think!?

I am not sure whether you would want to use the function normalizeByReference at all. In our case, we had poly(A) RNA samples and direct genomic hybridizations, and we discovered that using the genomic DNA for normalization of the RNA samples worked much better than other probe-based background corrections and normalizations for improving the signal-to-noise ratio. That's what the function is for.

There definitely are lots of other functions in other Bioconductor packages that you can use for quality assessment of your arrays and for background correction of your intensity data. The packages "affy" and "oligo" may be good places to start.

I would suggest you give a description of your samples and arrays, and of what you want to analyze with them, to the list. I am sure people who have done similar analyses will be happy to give some suggestions on other packages. Sorry that I cannot be of more help at the moment.

Regards,
Joern
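For readers wondering what creating the nucleicAcid column might look like, here is a hedged sketch. The labels "genomic DNA" and "total RNA" and the tiny random matrix are assumptions for illustration only; the Biobase part runs only if that package is installed, and it builds a small stand-in ExpressionSet instead of using the real arrays.

```r
# Sample annotation: one row per array, row names equal to the sample names.
# The labels are illustrative; use whatever describes your own hybridizations.
pd <- data.frame(
  nucleicAcid = factor(c("genomic DNA", "total RNA")),
  row.names   = c("ucont.CEL", "utest.CEL")
)

# Logical selectors built from the values actually stored in the column
isDNA <- pd$nucleicAcid == "genomic DNA"
isRNA <- pd$nucleicAcid == "total RNA"
print(isDNA)
print(isRNA)

if (requireNamespace("Biobase", quietly = TRUE)) {
  # Tiny stand-in for the object returned by readCel2eSet(): 5 probes x 2 arrays
  x <- matrix(rnorm(10), nrow = 5, dimnames = list(NULL, rownames(pd)))
  eset <- Biobase::ExpressionSet(
    assayData = x,
    phenoData = Biobase::AnnotatedDataFrame(pd)
  )
  # The column is now reachable the same way as davidTiling$nucleicAcid
  print(eset$nucleicAcid)
}
```

For an object that already exists, Biobase also allows adding the column directly, along the lines of `richard$nucleicAcid <- factor(c(...))`, with one label per array in the same order as sampleNames(richard).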