Question

Create CNA object in DNAcopy package with Agilent data


jhs1jjm@leeds.ac.uk ▴ 230


Hi,

I'm using R 2.5.0 on openSUSE 10.2 x86_64.

I'm struggling to work out what to use for the chrom and maploc arguments of the CNA function. I have data from three Agilent 44k CGH arrays, and I've created the marrayRaw and marrayNorm objects using read.Agilent. The CNA function usage is as follows:

    CNA(genomdat, chrom, maploc, data.type=c("logratio","binary"), sampleid=NULL)

I've read the CNA help page but am still struggling. For genomdat, I've worked out that this is mnorm@maM, the average log ratios, so my data.type will be "logratio".

The column headers in my raw data file include a systematic name in the following format:

    chr3:175483690-175483749

This seems to have been read into my work session by read.Agilent, but how do I use it? It also isn't ordered; is this important? I've looked at the coriell data example, but that is all nicely ordered and its headers are different from those in my data file.

If anyone could point me in the right direction, that would be great.

John


Updated 16.6 years ago by Sean Davis • written 16.6 years ago by jhs1jjm@leeds.ac.uk

score 0 · Answer 1 · 2007-09-13

jhs1jjm at leeds.ac.uk wrote:
> [...]
> The column headers in my raw data file include systematic name in the following
> format:
>
> chr3:175483690-175483749
>
> This seems to have been read into my work session using read.Agilent but how do
> I use this and it isn't ordered, is this important?
> [...]

Yes, this is the information that you will need to use. It contains the chromosome and location information. You will need to manipulate this column to get the chromosome and location into separate columns. You can do this in R or in Excel.

Sean

score 0 · Answer 2 · 2007-09-13

jhs1jjm at leeds.ac.uk wrote:
> Could you possibly tell me what functions/package I need to look at in R in
> order to do this, as I do not have Excel and may well need to handle data that
> exceeds the maximum number of rows in OpenOffice.

    extractAgilentInfo <- function(charvec) {
      tmp <- do.call(rbind, strsplit(charvec, ':'))   # split chrom from locations
      tmp2 <- do.call(rbind, strsplit(tmp[,2], '-'))  # split locations into start and end
      tmp3 <- sub('chr', '', tmp[,1])                 # strip the 'chr' prefix
      # convert to numeric chromosome if wanted
      tmp3[tmp3=='X'] <- 23  # May need to change these numbers to
      tmp3[tmp3=='Y'] <- 24  # match your species
      tmp3 <- as.integer(tmp3)      # control probes become NA (with a warning)
      tmp[is.na(tmp3),1] <- NA      # blank out the chromosome for those probes
      return(data.frame(chromosome=tmp[,1],
                        location=as.integer(tmp2[,1]),
                        NumChrom=tmp3))
    }

Use like so:

    agilentInfo <- extractAgilentInfo(as.character(rawdat$SystematicName))

And you will get back a data.frame of what you need, I think.

Sean
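For anyone who wants to see the parsing on a small input, here is a minimal sketch of a regex-based variant, not Sean's function itself. The name `parseSystematicName` and the toy probe names are made up for illustration, and the 23/24 numbering for X/Y assumes human data. Names that do not match the `chrN:start-end` pattern (such as control probes) come back as NA without coercion warnings.

```r
# Hypothetical helper (not part of DNAcopy or marray): parse Agilent
# systematic names of the form "chr3:175483690-175483749".
parseSystematicName <- function(x) {
  pattern <- "^chr([0-9]+|X|Y):([0-9]+)-([0-9]+)$"
  ok <- grepl(pattern, x)                      # TRUE for genomic probes only
  chrom <- ifelse(ok, sub(pattern, "\\1", x), NA_character_)
  start <- ifelse(ok, sub(pattern, "\\2", x), NA_character_)
  numChrom <- chrom
  numChrom[chrom %in% "X"] <- "23"             # assumption: human numbering
  numChrom[chrom %in% "Y"] <- "24"
  data.frame(chromosome = ifelse(ok, paste0("chr", chrom), NA_character_),
             location   = as.integer(start),   # start coordinate of the probe
             NumChrom   = as.integer(numChrom),
             stringsAsFactors = FALSE)
}

# Toy input: two genomic probes and one control probe name.
probes <- c("chr3:175483690-175483749", "chrX:100-159", "DarkCorner")
info <- parseSystematicName(probes)
print(info)
```

The strict pattern is a design choice: anything unexpected becomes NA rather than being partially parsed, which makes it easy to filter such rows out later.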

score 0 · Answer 3 · 2007-09-13

jhs1jjm at leeds.ac.uk wrote:
> Sean,
>
> Thanks for that. Couldn't get it to work, but not to worry, as I wouldn't want to
> take credit for writing a function like that and my tutor wouldn't expect it.
> Someone has written some Perl code to do it for him, but I want to get to grips
> with R. I've tried to decipher what you've done and daresay I can get there,
> although in a slightly long-winded way. I can bring up the systematic names
> with the following:
>
> x <- manorm@maGnames@maInfo[,3]
>
> I've had a look at the strsplit help:
>
> ch_loc_split <- strsplit(x,":")
>
> I'll have a look at the rest of the code and functions you've used, then get back
> to you. If there are any potential pitfalls for a newbie then by all means let me
> know.

Jim,

What did you try and what didn't work? Error messages and actual commands will help here.

Sean

score 0 · Answer 4 · 2007-09-13

jhs1jjm at leeds.ac.uk wrote:
> Sean,
>
> Awesome, seems to have worked. There were 2 warnings, NAs introduced by
> coercion. Just changed the end (after messing around with importing the raw
> data) to the marray object as follows:
>
> agilentInfo <- extractAgilentInfo(as.character(mnorm@maGnames@maInfo[,3]))
>
> Guessing that would have taken me a while to work out. Is there any reason why
> this wouldn't work for the 244k array, just for future reference? Will try to start
> the DNAcopy analysis now.

John,

Great to hear that it worked for you. The NAs introduced are expected and are associated with the control probes on the array. It should work just fine for all Agilent arrays as long as the systematic name is in the same format. Agilent is actually pretty good about keeping things stable over different arrays and over time.

Sean

P.S. In the future, feel free to reply back to the list. Doing so allows everyone to learn from the interaction and has the added benefit of creating a lasting record of any answers in the archive.
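For the DNAcopy step mentioned above, a minimal sketch of passing the parsed columns to `CNA` might look like the following. The toy `agilentInfo` and `logratios` objects are stand-ins invented for illustration; in the thread they would correspond to the data.frame returned by `extractAgilentInfo` and to `mnorm@maM`. Dropping the NA rows first is one way to deal with the control probes. On the ordering question, my understanding is that recent DNAcopy versions sort by `chrom` and `maploc` inside `CNA`, but please check `?CNA` for the version you have installed.

```r
# Toy stand-ins for the real objects (assumptions for illustration only):
# 'agilentInfo' mimics the data.frame returned by extractAgilentInfo(),
# 'logratios' mimics the matrix of normalised log ratios (mnorm@maM).
agilentInfo <- data.frame(
  chromosome = c("chr3", NA, "chr1", "chrX"),
  location   = c(175483690L, NA, 5000L, 100L),
  NumChrom   = c(3L, NA, 1L, 23L)
)
logratios <- matrix(c(0.1, 0.0, -0.2, 0.4,
                      0.3, 0.1, -0.1, 0.5),
                    ncol = 2,
                    dimnames = list(NULL, c("array1", "array2")))

# Drop control probes, which have no chromosome or position.
keep     <- !is.na(agilentInfo$NumChrom) & !is.na(agilentInfo$location)
chrom    <- agilentInfo$NumChrom[keep]
maploc   <- agilentInfo$location[keep]
genomdat <- logratios[keep, , drop = FALSE]

# Build the CNA object only if DNAcopy is installed.
if (requireNamespace("DNAcopy", quietly = TRUE)) {
  cna <- DNAcopy::CNA(genomdat, chrom, maploc,
                      data.type = "logratio",
                      sampleid  = colnames(genomdat))
  print(class(cna))
}
```

The same logical index is applied to the rows of `genomdat`, `chrom`, and `maploc`, so the three arguments stay aligned probe by probe.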