GWASTools use in creating ncdf files
Sam Rose ▴ 60
Hi Stephanie,

Quick question, do you have any experience with the following error in the ncdfAddData() function?

    Error in `[.data.frame`(dat, , new.names) : undefined columns selected
    > traceback()
    4: stop("undefined columns selected")
    3: `[.data.frame`(dat, , new.names)
    2: dat[, new.names]
    1: ncdfAddData(path = ".", ncdf.filename = geno.nc.file, snp.annotation = snp.annot,
           scan.annotation = scan.annot, sep.type = "\t", skip.num = 1,
           col.total = 16, col.nums = col.nums, scan.name.in.file = 0)

I am using a new data set exactly the same as the last one which worked and can't seem to figure out the problem behind this one.

These are what the tops of my scan and snp annotation files look like:

snp:

    snpID  chromosome  position  rsID
    1      1           72017     rs4477212
    2      1           524110    SNP1-524110
    3      1           555149    SNP1-555149
    4      1           559487    SNP1-559487
    5      1           679049    rs4000335
    6      1           713781    SNP1-713781
    7      1           719495    SNP1-719495
    8      1           742429    rs3094315
    9      1           742584    rs3131972

scan:

    scanID  subjectID  genoRunID          sex  file
    2       PT-JOPP    5491005008_R03C01  M    5491005008_R03C01.gtc.txt.use
    3       PT-JOQ4    5434246116_R01C01  F    5434246116_R01C01.gtc.txt.use
    4       PT-JOQ6    5491005134_R03C01  M    5491005134_R03C01.gtc.txt.use
    5       PT-JOQB    5434078047_R04C01  F    5434078047_R04C01.gtc.txt.use
    6       PT-JOQG    5491005152_R03C01  F    5491005152_R03C01.gtc.txt.use
    7       PT-JOP2    5434246116_R03C01  F    5434246116_R03C01.gtc.txt.use
    8       PT-JOQI    5491005134_R04C01  M    5491005134_R04C01.gtc.txt.use
    9       PT-JOQL    5434246084_R04C01  M    5434246084_R04C01.gtc.txt.use
    11      PT-JOQW    5491005061_R02C01  M    5491005061_R02C01.gtc.txt.use

Any help would be appreciated.

Thanks,
Sam

On Wed, Jun 12, 2013 at 9:45 AM, Sam Rose <srose@broadinstitute.org> wrote:

> Hi Stephanie,
>
> After using your commands I was able to successfully use the package
> without error. I think the error was within the column names, which needed
> to be changed from genoRunID to scanID in order to be recognized by the
> later commands.
>
> Thanks for all of your help it was very much appreciated.
>
> Best,
> Sam
>
>
> On Fri, Jun 7, 2013 at 3:27 PM, Stephanie M. Gogarten <sdmorris@u.washington.edu> wrote:
>
>> Hi Sam,
>>
>> I can't reproduce your error using the data you sent. Either you had an
>> error in creating your netCDF files, or the sample you were working with
>> doesn't have any usable data. Below is the code I used; see if you can
>> reproduce it with your other sample.
>>
>> library(GWASTools)
>>
>> scan.annot <- read.table("8850270138_R01C01.gtc.txt.scan_annotation",
>>                          colClasses=c("integer", rep("character",4)),
>>                          header=TRUE)
>> scanAnnot <- ScanAnnotationDataFrame(scan.annot)
>>
>> snp.annot <- read.table("8850270138_R01C01.gtc.txt.snp_annotation",
>>                         as.is=TRUE, header=TRUE)
>> snpAnnot <- SnpAnnotationDataFrame(snp.annot)
>>
>> geno.nc.file <- "geno.nc"
>> ncdfCreate(ncdf.filename=geno.nc.file, snp.annotation=snp.annot,
>>            variables="genotype", n.samples=1, precision="single")
>> names(scan.annot)[3] <- "scanName"
>> names(snp.annot)[4] <- "snpName"
>> col.nums <- as.integer(c(1,7,8))
>> names(col.nums) <- c("snp","a1","a2")
>> ncdfAddData(path=".", ncdf.filename=geno.nc.file,
>>             snp.annotation=snp.annot,
>>             scan.annotation=scan.annot,
>>             sep.type="\t", skip.num=1,
>>             col.total=16, col.nums=col.nums,
>>             scan.name.in.file=0)
>>
>> bl.nc.file <- "bl.nc"
>> ncdfCreate(ncdf.filename=bl.nc.file, snp.annotation=snp.annot,
>>            variables=c("BAlleleFreq", "LogRRatio"),
>>            n.samples=1, precision="single")
>> col.nums <- as.integer(c(1,15,16))
>> names(col.nums) <- c("snp", "ballelefreq", "logrratio")
>> ncdfAddData(path=".", ncdf.filename=bl.nc.file,
>>             snp.annotation=snp.annot,
>>             scan.annotation=scan.annot,
>>             sep.type="\t", skip.num=1,
>>             col.total=16, col.nums=col.nums,
>>             scan.name.in.file=0)
>>
>> genoData <- GenotypeData(NcdfGenotypeReader(geno.nc.file),
>>                          scanAnnot=scanAnnot, snpAnnot=snpAnnot)
>> blData <- IntensityData(NcdfIntensityReader(bl.nc.file),
>>                         scanAnnot=scanAnnot, snpAnnot=snpAnnot)
>>
>> scan.ids <- read.table("scan.ids")[,1]
>> snp.ids <- read.table("snp.ids")[,1]
>> chrom.ids <- read.table("chrom.ids")[,1]
>> seg <- anomSegmentBAF(blData, genoData, scan.ids=scan.ids,
>>                       chrom.ids=chrom.ids, snp.ids=snp.ids)
>> head(seg)
>>   scanID chromosome left.index right.index num.mark seg.mean
>> 1      1          1          5       59479    17205   0.1582
>> 2      1          2      59498      117436    17360   0.1596
>> 3      1          3     117437      164865    14181   0.1595
>> 4      1          4     164872      205472    12218   0.1640
>> 5      1          5     205474      247715    13313   0.1602
>> 6      1          6     247752      296212    15232   0.1629
>>
>> best wishes,
>> Stephanie
>>
>> On 6/6/13 1:28 PM, Sam Rose wrote:
>>
>>> Hi Stephanie,
>>>
>>> I am providing a CEU control sample data from the same study, NA12878. I
>>> just sent it to you in a dropbox link. Let me know if there are any
>>> questions.
>>>
>>> Best,
>>> Sam
>>>
>>>
>>> On Wed, Jun 5, 2013 at 12:11 PM, Stephanie M. Gogarten
>>> <sdmorris@u.washington.edu> wrote:
>>>
>>> It's not finding any BAF values that meet all the criteria (snpID in
>>> "snp.ids", chromosome in "chrom.ids", genotype is heterozygous or
>>> missing, BAF is non-missing).
>>>
>>> Is it possible for you to send me the data you're using, along with
>>> your values of "scan.ids", "chrom.ids", and "snp.ids"?
>>>
>>> Stephanie
>>>
>>>
>>> On 6/4/13 2:23 PM, Sam Rose wrote:
>>>
>>> Yes this was intentional. I just wanted to get it running for one sample
>>> and then expand to the rest. I was also limiting this to only autosomes
>>> for the time being.
>>>
>>> This is the error message I get now:
>>>
>>> > seg <- anomSegmentBAF(blData, genoData, scan.ids=scan.ids,
>>> +                       chrom.ids=chrom.ids, snp.ids=snp.ids)
>>> Error in anomSegmentBAF(blData, genoData, scan.ids = scan.ids, chrom.ids = chrom.ids, :
>>>   no valid BAF values for chromosomes in chrom.ids
>>>
>>> Hopefully this helps.
>>>
>>> Best,
>>> Sam
>>>
>>>
>>> On Tue, Jun 4, 2013 at 12:28 AM, Stephanie M.
>>> Gogarten <sdmorris@u.washington.edu> wrote:
>>>
>>> You have only one sample in your netCDF files - is this intentional?
>>>
>>> That should not cause your error, however. Can you try running with
>>> the latest GWASTools version and tell me what the new error message is?
>>>
>>> Also, you should include in your scan annotation a character vector
>>> "sex" with values of "M" or "F". The code treats males and females
>>> differently for X chromosome SNPs, and will complain later if this
>>> variable is missing.
>>>
>>> Stephanie
>>>
>>>
>>> On 6/3/13 4:04 PM, Sam Rose wrote:
>>>
>>> After checking again it still isn't quite working.
>>>
>>> I am pasting below the str() results for my intensity and genotype
>>> objects, maybe something in this can point clearly to what I am doing
>>> wrong.
>>>
>>> I am using an integer vector of 1 to the total number of snps for my snp
>>> id since it gave me some trouble before when it wasn't sorted.
>>>
>>> Best,
>>> Sam
>>>
>>> > str(genoData)
>>> Formal class 'GenotypeData' [package "GWASTools"] with 3 slots
>>>   ..@ data     :Formal class 'NcdfGenotypeReader' [package "GWASTools"] with 13 slots
>>>   .. .. ..@ snpDim       : chr "snp"
>>>   .. .. ..@ scanDim      : chr "sample"
>>>   .. .. ..@ snpIDvar     : chr "snp"
>>>   .. .. ..@ chromosomeVar: chr "chromosome"
>>>   .. .. ..@ positionVar  : chr "position"
>>>   .. .. ..@ scanIDvar    : chr "sampleID"
>>>   .. .. ..@ genotypeVar  : chr "genotype"
>>>   .. .. ..@ XchromCode   : int 23
>>>   .. .. ..@ YchromCode   : int 25
>>>   .. .. ..@ XYchromCode  : int 24
>>>   .. .. ..@ MchromCode   : int 26
>>>   .. .. ..@ filename     : chr "tmp.geno.skea.nc"
>>>   .. ..
..@ handler :List of 10 >>> .. .. .. ..$ id : int 524288 >>> .. .. .. ..$ ndims : int 2 >>> .. .. .. ..$ natts : int 2 >>> .. .. .. ..$ unlimdimid : num 1 >>> .. .. .. ..$ filename : chr "tmp.geno.skea.nc >>> <http: tmp.geno.skea.nc=""> >>> <http: tmp.geno.skea.nc=""> >>> <http: tmp.geno.skea.nc="">" >>> >>> .. .. .. ..$ varid2Rindex: num [1:6] 0 1 0 2 3 4 >>> .. .. .. ..$ writable : logi FALSE >>> .. .. .. ..$ dim :List of 2 >>> .. .. .. .. ..$ sample:List of 8 >>> .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. ..- attr(*, "class")= chr "dim.ncdf" >>> .. .. .. .. ..$ snp :List of 8 >>> .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 2 3 4 >>> 5 6 7 8 >>> 9 10 ... >>> .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. ..- attr(*, "class")= chr "dim.ncdf" >>> .. .. .. ..$ nvars : num 4 >>> .. .. .. ..$ var :List of 4 >>> .. .. .. .. ..$ sampleID :List of 16 >>> .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. ..$ name : chr "sampleID" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 1 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 1 >>> .. .. .. .. .. ..$ units : chr "id" >>> .. .. .. .. .. ..$ longname : chr "sampleID" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. .. .. 
..$ unlim : logi TRUE >>> .. .. .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ missval : int 0 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ position :List of 16 >>> .. .. .. .. .. ..$ id : int 4 >>> .. .. .. .. .. ..$ name : chr "position" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 709358 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 2 >>> .. .. .. .. .. ..$ units : chr "bases" >>> .. .. .. .. .. ..$ longname : chr "position" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ missval : int -1 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ chromosome:List of 16 >>> .. .. .. .. .. ..$ id : int 5 >>> .. .. .. .. .. 
..$ name : chr "chromosome" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 709358 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 2 >>> .. .. .. .. .. ..$ units : chr "id" >>> .. .. .. .. .. ..$ longname : chr "chromosome" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ missval : int -1 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ genotype :List of 16 >>> .. .. .. .. .. ..$ id : int 6 >>> .. .. .. .. .. ..$ name : chr "genotype" >>> .. .. .. .. .. ..$ ndims : int 2 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ prec : chr "byte" >>> .. .. .. .. .. ..$ dimids : num [1:2] 2 1 >>> .. .. .. .. .. ..$ units : chr "num_A_alleles" >>> .. .. .. .. .. ..$ longname : chr "genotype" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 2 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. 
..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ missval : int -1 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. ..- attr(*, "class")= chr "ncdf" >>> ..@ snpAnnot :Formal class 'SnpAnnotationDataFrame' >>> [package >>> "GWASTools"] with 11 slots >>> .. .. ..@ idCol : chr "snpID" >>> .. .. ..@ chromosomeCol : chr "chromosome" >>> .. .. ..@ positionCol : chr "position" >>> .. .. ..@ XchromCode : int 23 >>> .. .. ..@ YchromCode : int 25 >>> .. .. ..@ XYchromCode : int 24 >>> .. .. ..@ MchromCode : int 26 >>> .. .. ..@ varMetadata :'data.frame': 4 >>> obs. of 1 >>> variable: >>> .. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA >>> .. .. ..@ data :'data.frame': >>> 709358 obs. of 4 >>> variables: >>> .. .. .. ..$ snpID : int [1:709358] 1 2 3 4 5 6 >>> 7 8 9 10 ... >>> .. .. .. ..$ chromosome: int [1:709358] 1 1 1 1 1 1 >>> 1 1 1 1 ... >>> .. .. .. ..$ position : int [1:709358] 82154 >>> 752566 752721 >>> 768448 >>> 776546 798959 800007 838555 846808 854250 ... >>> .. .. .. 
..$ rsID : Factor w/ 709358 levels >>> "rs1000000","rs1000002",..: 444820 394558 397236 154397 >>> 130894 89309 >>> 528142 485618 444755 595849 ... >>> .. .. ..@ dimLabels : chr [1:2] "snps" >>> "variables" >>> .. .. ..@ .__classVersion__:Formal class 'Versions' >>> [package >>> "Biobase"] with 1 slots >>> .. .. .. .. ..@ .Data:List of 1 >>> .. .. .. .. .. ..$ : int [1:3] 1 1 0 >>> ..@ scanAnnot:Formal class >>> 'ScanAnnotationDataFrame' [package >>> "GWASTools"] with 6 slots >>> .. .. ..@ idCol : chr "scanID" >>> .. .. ..@ sexCol : chr "sex" >>> .. .. ..@ varMetadata :'data.frame': 4 >>> obs. of 1 >>> variable: >>> .. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA >>> .. .. ..@ data :'data.frame': 1 >>> obs. of 4 >>> variables: >>> .. .. .. ..$ scanID : int 1 >>> .. .. .. ..$ subjectID: Factor w/ 1 level >>> "PT-PTWN": 1 >>> .. .. .. ..$ genoRunID: Factor w/ 1 level >>> "8820505004_R01C01": 1 >>> .. .. .. ..$ file : Factor w/ 1 level >>> "8820505004_R01C01.gtc.txt.___**_use": 1 >>> >>> .. .. ..@ dimLabels : chr [1:2] "scans" >>> "variables" >>> .. .. ..@ .__classVersion__:Formal class 'Versions' >>> [package >>> "Biobase"] with 1 slots >>> .. .. .. .. ..@ .Data:List of 1 >>> .. .. .. .. .. ..$ : int [1:3] 1 1 0 >>> >>> > str(blData) >>> Formal class 'IntensityData' [package "GWASTools"] with >>> 3 slots >>> ..@ data :Formal class 'NcdfIntensityReader' >>> [package >>> "GWASTools"] with 17 slots >>> .. .. ..@ snpDim : chr "snp" >>> .. .. ..@ scanDim : chr "sample" >>> .. .. ..@ snpIDvar : chr "snp" >>> .. .. ..@ chromosomeVar: chr "chromosome" >>> .. .. ..@ positionVar : chr "position" >>> .. .. ..@ scanIDvar : chr "sampleID" >>> .. .. ..@ qualityVar : chr "quality" >>> .. .. ..@ xVar : chr "X" >>> .. .. ..@ yVar : chr "Y" >>> .. .. ..@ bafVar : chr "BAlleleFreq" >>> .. .. ..@ lrrVar : chr "LogRRatio" >>> .. .. ..@ XchromCode : int 23 >>> .. .. ..@ YchromCode : int 25 >>> .. .. ..@ XYchromCode : int 24 >>> .. .. ..@ MchromCode : int 26 >>> .. .. 
..@ filename : chr "tmp.baf.skea.nc >>> <http: tmp.baf.skea.nc=""> >>> <http: tmp.baf.skea.nc=""> <http: tmp.baf.skea.nc="">" >>> >>> >>> .. .. ..@ handler :List of 10 >>> .. .. .. ..$ id : int 458752 >>> .. .. .. ..$ ndims : int 2 >>> .. .. .. ..$ natts : int 2 >>> .. .. .. ..$ unlimdimid : num 1 >>> .. .. .. ..$ filename : chr "tmp.baf.skea.nc >>> <http: tmp.baf.skea.nc=""> >>> <http: tmp.baf.skea.nc=""> <http: tmp.baf.skea.nc="">" >>> >>> >>> .. .. .. ..$ varid2Rindex: num [1:7] 0 1 0 2 3 4 5 >>> .. .. .. ..$ writable : logi FALSE >>> .. .. .. ..$ dim :List of 2 >>> .. .. .. .. ..$ sample:List of 8 >>> .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. ..- attr(*, "class")= chr "dim.ncdf" >>> .. .. .. .. ..$ snp :List of 8 >>> .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 2 3 4 >>> 5 6 7 8 >>> 9 10 ... >>> .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. ..- attr(*, "class")= chr "dim.ncdf" >>> .. .. .. ..$ nvars : num 5 >>> .. .. .. ..$ var :List of 5 >>> .. .. .. .. ..$ sampleID :List of 16 >>> .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. ..$ name : chr "sampleID" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 1 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 1 >>> .. .. .. .. .. ..$ units : chr "id" >>> .. .. .. .. .. ..$ longname : chr "sampleID" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. 
.. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ missval : int 0 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ position :List of 16 >>> .. .. .. .. .. ..$ id : int 4 >>> .. .. .. .. .. ..$ name : chr "position" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 709358 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 2 >>> .. .. .. .. .. ..$ units : chr "bases" >>> .. .. .. .. .. ..$ longname : chr "position" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ missval : int -1 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. 
..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ chromosome :List of 16 >>> .. .. .. .. .. ..$ id : int 5 >>> .. .. .. .. .. ..$ name : chr "chromosome" >>> .. .. .. .. .. ..$ ndims : int 1 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int 709358 >>> .. .. .. .. .. ..$ prec : chr "int" >>> .. .. .. .. .. ..$ dimids : num 2 >>> .. .. .. .. .. ..$ units : chr "id" >>> .. .. .. .. .. ..$ longname : chr "chromosome" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 1 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int 709358 >>> .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. ..$ missval : int -1 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ BAlleleFreq:List of 16 >>> .. .. .. .. .. ..$ id : int 6 >>> .. .. .. .. .. ..$ name : chr "BAlleleFreq" >>> .. .. .. .. .. ..$ ndims : int 2 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ prec : chr "float" >>> .. .. .. .. .. ..$ dimids : num [1:2] 2 1 >>> .. .. .. .. .. ..$ units : chr "intensity" >>> .. .. .. .. .. ..$ longname : chr "BAlleleFreq" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 2 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. 
..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ missval : num -9999 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. .. ..$ LogRRatio :List of 16 >>> .. .. .. .. .. ..$ id : int 7 >>> .. .. .. .. .. ..$ name : chr "LogRRatio" >>> .. .. .. .. .. ..$ ndims : int 2 >>> .. .. .. .. .. ..$ natts : int 2 >>> .. .. .. .. .. ..$ size : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ prec : chr "float" >>> .. .. .. .. .. ..$ dimids : num [1:2] 2 1 >>> .. .. .. .. .. ..$ units : chr "intensity" >>> .. .. .. .. .. ..$ longname : chr "LogRRatio" >>> .. .. .. .. .. ..$ dims : list() >>> .. .. .. .. .. ..$ dim :List of 2 >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "snp" >>> .. .. .. .. .. .. .. ..$ len : int 709358 >>> .. .. .. .. .. .. .. ..$ unlim : logi FALSE >>> .. .. .. .. .. .. .. ..$ id : int 2 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 3 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. 
..$ vals : int >>> [1:709358(1d)] 1 >>> 2 3 4 5 >>> 6 7 8 9 10 ... >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. .. ..$ :List of 8 >>> .. .. .. .. .. .. .. ..$ name : chr "sample" >>> .. .. .. .. .. .. .. ..$ len : int 1 >>> .. .. .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. .. .. ..$ id : int 1 >>> .. .. .. .. .. .. .. ..$ dimvarid : num 1 >>> .. .. .. .. .. .. .. ..$ units : chr "count" >>> .. .. .. .. .. .. .. ..$ vals : logi NA >>> .. .. .. .. .. .. .. ..$ create_dimvar: logi TRUE >>> .. .. .. .. .. .. .. ..- attr(*, "class")= chr >>> "dim.ncdf" >>> .. .. .. .. .. ..$ varsize : int [1:2] 709358 1 >>> .. .. .. .. .. ..$ unlim : logi TRUE >>> .. .. .. .. .. ..$ missval : num -9999 >>> .. .. .. .. .. ..$ hasAddOffset: logi FALSE >>> .. .. .. .. .. ..$ hasScaleFact: logi FALSE >>> .. .. .. .. .. ..- attr(*, "class")= chr "var.ncdf" >>> .. .. .. ..- attr(*, "class")= chr "ncdf" >>> ..@ snpAnnot :Formal class 'SnpAnnotationDataFrame' >>> [package >>> "GWASTools"] with 11 slots >>> .. .. ..@ idCol : chr "snpID" >>> .. .. ..@ chromosomeCol : chr "chromosome" >>> .. .. ..@ positionCol : chr "position" >>> .. .. ..@ XchromCode : int 23 >>> .. .. ..@ YchromCode : int 25 >>> .. .. ..@ XYchromCode : int 24 >>> .. .. ..@ MchromCode : int 26 >>> .. .. ..@ varMetadata :'data.frame': 4 >>> obs. of 1 >>> variable: >>> .. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA >>> .. .. ..@ data :'data.frame': >>> 709358 obs. of 4 >>> variables: >>> .. .. .. ..$ snpID : int [1:709358] 1 2 3 4 5 6 >>> 7 8 9 10 ... >>> .. .. .. ..$ chromosome: int [1:709358] 1 1 1 1 1 1 >>> 1 1 1 1 ... >>> .. .. .. ..$ position : int [1:709358] 82154 >>> 752566 752721 >>> 768448 >>> 776546 798959 800007 838555 846808 854250 ... >>> .. .. .. ..$ rsID : Factor w/ 709358 levels >>> "rs1000000","rs1000002",..: 444820 394558 397236 154397 >>> 130894 89309 >>> 528142 485618 444755 595849 ... >>> .. .. 
..@ dimLabels        : chr [1:2] "snps" "variables"
>>>   .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slots
>>>   .. .. .. .. ..@ .Data:List of 1
>>>   .. .. .. .. .. ..$ : int [1:3] 1 1 0
>>>   ..@ scanAnnot:Formal class 'ScanAnnotationDataFrame' [package "GWASTools"] with 6 slots
>>>   .. .. ..@ idCol            : chr "scanID"
>>>   .. .. ..@ sexCol           : chr "sex"
>>>   .. .. ..@ varMetadata      :'data.frame': 4 obs. of 1 variable:
>>>   .. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA
>>>   .. .. ..@ data             :'data.frame': 1 obs. of 4 variables:
>>>   .. .. .. ..$ scanID   : int 1
>>>   .. .. .. ..$ subjectID: Factor w/ 1 level "PT-PTWN": 1
>>>   .. .. .. ..$ genoRunID: Factor w/ 1 level "8820505004_R01C01": 1
>>>   .. .. .. ..$ file     : Factor w/ 1 level "8820505004_R01C01.gtc.txt.use": 1
>>>   .. .. ..@ dimLabels        : chr [1:2] "scans" "variables"
>>>   .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slots
>>>   .. .. .. .. ..@ .Data:List of 1
>>>   .. .. .. .. .. ..$ : int [1:3] 1 1 0
>>>
>>>
>>> On Fri, May 31, 2013 at 2:41 PM, Sam Rose <srose@broadinstitute.org> wrote:
>>>
>>> Looks like there was some problems reading the file in on my end,
>>> some chromosomes didn't make it in probably from a preprocessing
>>> step on my end. I'll let you know if I can't rectify.
>>>
>>> Thanks again for the help,
>>>
>>> Sam
>>>
>>>
>>> On Thu, May 30, 2013 at 4:43 PM, Stephanie M.
>>> Gogarten <sdmorris@u.washington.edu> wrote:
>>>
>>> Hi Sam,
>>>
>>> I need to add a more informative error message - the problem is
>>> that no valid BAF values are reaching the call to CNA (baf.dat
>>> is NULL). This could happen if the values of snp.ids or
>>> chrom.ids are invalid - these should all be integer values
>>> matching the contents of snpID and chromosome in the netCDF
>>> file. What values are you using for these arguments?
>>>
>>> You will need to have LRR in the intensity NetCDF file. A
>>> portion of the code downstream from the error you're getting
>>> uses LRR to filter potential anomalies.
>>>
>>> Stephanie
>>>
>>>
>>> On 5/30/13 12:30 PM, Sam Rose wrote:
>>>
>>> Thank you for your previous help Stephanie.
>>>
>>> I am afraid I have another problem I can't seem to work out.
>>>
>>> I have gotten as far as reading in the BAlleleFreq and Geno files into
>>> their respective ncdf formats. I only have the baf data in the intensity
>>> ncdf file, do I need LRR too? When I run the anomDetectBAF() function it
>>> gives me this error:
>>>
>>> > anom <- anomDetectBAF(blData, genoData, scan.ids=scan.ids,
>>>     chrom.ids=chrom.ids, snp.ids=snp.ids, centromere=centromeres.hg19)
>>> Error in CNA(as.vector(baf.dat), chr, index, data.type = "logratio", sampleid = snum) :
>>>   genomdat must be numeric
>>>
>>> I have checked and the data that I put in to the genotype data file was
>>> numeric and present as well as the baf data. I'm wondering if you have
>>> seen this error before and may potentially know what I can do to rectify?
>>>
>>> Thanks,
>>> Sam
>>>
>>>
>>> On Wed, Apr 24, 2013 at 12:01 AM, Stephanie M. Gogarten
>>> <sdmorris@u.washington.edu> wrote:
>>>
>>> Hi Sam,
>>>
>>> Section 2 of the vignette "GWAS Data Cleaning" contains an example
>>> of how to import raw illumina data of exactly this type into
>>> GWASTools. The example data is contained in the package "GWASdata."
>>> >>> If you have any further questions >>> after >>> reading the >>> vignette, please >>> cc the bioconductor mailing list >>> (bioconductor@r-project.org >>> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >>> > >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >>> >> >>> <mailto:bioconductor@r-____**project.org<biocondu ctor@r-____project.org=""> >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> > >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >>> >>> >>> >>> <mailto:bioconductor@r-______**project.org<bioconductor@r- ______project.org=""> >>> <mailto:bioconductor@r-____**project.org<bioconductor@r-__ __project.org=""> >>> > >>> <mailto:bioconductor@r-____**project.org<biocondu ctor@r-____project.org=""> >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> >> >>> >>> >>> <mailto:bioconductor@r-____**project .org<bioconductor@r-____project.org=""> >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> > >>> <mailto:bioconductor@r-__**project.org<bioconductor@r-__project.org> >>> <mailto:bioconductor@r-**project.org<bioconductor@r-project.org> >>> >>>>). >>> >>> >>> Section 7 may also be of use to you, >>> as it >>> deals with >>> chromosome >>> anomaly detection. >>> >>> best wishes, >>> Stephanie >>> >>> >>> On 4/23/13 7:54 PM, Sam Rose wrote: >>> >>> Hi Stephanie, >>> >>> My name is Sam Rose and I am >>> contacting >>> you the >>> GWASTools package in >>> Bioconductor of which it says you >>> are the >>> maintainer. >>> >>> I am trying to use the package to >>> call >>> mosaic CNVs >>> in my samples and >>> can't seem to get it to work. 
>>> >>> I'm wondering if you have an >>> example of >>> the raw >>> illumina data to >>> put in >>> there, and maybe examples of some >>> of the >>> things >>> required in the >>> 'ncdfAddData' command (i.e. >>> sample column, >>> col.nums). I have >>> created the >>> shell ncdf file, but beyond that >>> the >>> headers and >>> data formats >>> seem to be >>> giving me trouble so I just >>> though I would >>> ask. >>> >>> Our Illumina raw data files look >>> >> > > > -- > ----- > *Sam Rose, Stanley Center Research Associate II > Stanley Center for Psychiatric Research, The Broad Institute > 7 Cambridge Center, Cambridge, MA 02142* > 617.714.7853, srose@broadinstitute.org > > -- ----- *Sam Rose, Stanley Center Research Associate II Stanley Center for Psychiatric Research, The Broad Institute 7 Cambridge Center, Cambridge, MA 02142* 617.714.7853, srose@broadinstitute.org [[alternative HTML version deleted]]
SNP Annotation cdf GWASTools • 1.4k views
@stephanie-m-gogarten-5121
Last seen 28 days ago
University of Washington
Hi Sam,

I haven't seen this error before. The relevant bit of code is this:

    dat$geno <- paste(dat$a1,dat$a2,sep="")
    new.names <- names(dat)[!is.element(names(dat),c("a1","a2"))]
    dat <- dat[,new.names]

"dat" is the data.frame read from your FinalReport file. "a1" and "a2" columns are the two alleles, specified in the col.nums argument. If the data file had columns for "a1" and "a2" but not "geno," the code pastes "a1" and "a2" together to make a "geno" column, then removes the "a1" and "a2" columns. I don't know how "new.names" can contain invalid columns, since it should be a subset of the original column names. Maybe double check your values for col.nums?

Otherwise the only way I can think to debug is to step through the code for ncdfAddData and see what's happening at this part. I could do that if you could send me the relevant file, but I'm guessing you can't do that for security reasons, so maybe you could try it.

Stephanie

On 7/3/13 12:59 PM, Sam Rose wrote:
> Hi Stephanie,
>
> Quick question, do you have any experience with the following error in
> the ncdfAddData() function?
> Error in `[.data.frame`(dat, , new.names) : undefined columns selected
> > traceback()
> 4: stop("undefined columns selected")
> 3: `[.data.frame`(dat, , new.names)
> 2: dat[, new.names]
> 1: ncdfAddData(path = ".", ncdf.filename = geno.nc.file, snp.annotation = snp.annot,
>        scan.annotation = scan.annot, sep.type = "\t", skip.num = 1,
>        col.total = 16, col.nums = col.nums, scan.name.in.file = 0)
>
> I am using a new data set exactly the same as the last one which worked
> and can't seem to figure out the problem behind this one.
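For reference, the column handling described in the reply can be tried on a toy data frame. This is only a minimal sketch in base R: the data values and the `check_names` helper are made up for illustration and are not part of GWASTools. The helper reflects one guess, not confirmed anywhere in this thread, that a name vector taken from `names(dat)` could still fail to index if it contains NA or empty entries.

```r
# Toy stand-in for the data.frame that ncdfAddData() reads from a
# FinalReport file; column names and values are made up for illustration.
dat <- data.frame(snp = c("rs1", "rs2", "rs3"),
                  a1  = c("A", "A", "B"),
                  a2  = c("A", "B", "B"),
                  stringsAsFactors = FALSE)

# The three lines quoted in the reply: build "geno", then drop "a1"/"a2".
dat$geno  <- paste(dat$a1, dat$a2, sep = "")
new.names <- names(dat)[!is.element(names(dat), c("a1", "a2"))]
dat       <- dat[, new.names]
print(dat)

# Hypothetical helper (an assumption, not a GWASTools function): report the
# positions of column names that are NA, empty, or duplicated.
check_names <- function(nms) {
  list(missing    = which(is.na(nms) | nms == ""),
       duplicated = which(duplicated(nms)))
}
str(check_names(names(dat)))
```

To inspect the real data frame at the failing step, base R's `debugonce(ncdfAddData)` lets you step through the function and look at `names(dat)` just before the subsetting line.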
>
> These are what the tops of my scan and snp annotation files look like:
> snp:
> snpID chromosome position rsID
> 1 1 72017 rs4477212
> 2 1 524110 SNP1-524110
> 3 1 555149 SNP1-555149
> 4 1 559487 SNP1-559487
> 5 1 679049 rs4000335
> 6 1 713781 SNP1-713781
> 7 1 719495 SNP1-719495
> 8 1 742429 rs3094315
> 9 1 742584 rs3131972
>
> scan:
> scanID subjectID genoRunID sex file
> 2 PT-JOPP 5491005008_R03C01 M 5491005008_R03C01.gtc.txt.use
> 3 PT-JOQ4 5434246116_R01C01 F 5434246116_R01C01.gtc.txt.use
> 4 PT-JOQ6 5491005134_R03C01 M 5491005134_R03C01.gtc.txt.use
> 5 PT-JOQB 5434078047_R04C01 F 5434078047_R04C01.gtc.txt.use
> 6 PT-JOQG 5491005152_R03C01 F 5491005152_R03C01.gtc.txt.use
> 7 PT-JOP2 5434246116_R03C01 F 5434246116_R03C01.gtc.txt.use
> 8 PT-JOQI 5491005134_R04C01 M 5491005134_R04C01.gtc.txt.use
> 9 PT-JOQL 5434246084_R04C01 M 5434246084_R04C01.gtc.txt.use
> 11 PT-JOQW 5491005061_R02C01 M 5491005061_R02C01.gtc.txt.use
>
> Any help would be appreciated.
>
> Thanks,
> Sam
>
> On Wed, Jun 12, 2013 at 9:45 AM, Sam Rose <srose at broadinstitute.org> wrote:
>
> Hi Stephanie,
>
> After using your commands I was able to successfully use the package
> without error. I think the error was within the column names, which
> needed to be changed from genoRunID to scanID in order to be
> recognized by the later commands.
>
> Thanks for all of your help it was very much appreciated.
>
> Best,
> Sam
>
> On Fri, Jun 7, 2013 at 3:27 PM, Stephanie M. Gogarten
> <sdmorris at u.washington.edu> wrote:
>
> Hi Sam,
>
> I can't reproduce your error using the data you sent. Either
> you had an error in creating your netCDF files, or the sample
> you were working with doesn't have any usable data. Below is
> the code I used; see if you can reproduce it with your other sample.
>
> library(GWASTools)
>
> scan.annot <- read.table("8850270138_R01C01.gtc.txt.scan_annotation",
>                          colClasses=c("integer", rep("character",4)),
>                          header=TRUE)
> scanAnnot <- ScanAnnotationDataFrame(scan.annot)
>
> snp.annot <- read.table("8850270138_R01C01.gtc.txt.snp_annotation",
>                         as.is=TRUE, header=TRUE)
> snpAnnot <- SnpAnnotationDataFrame(snp.annot)
>
> geno.nc.file <- "geno.nc"
> ncdfCreate(ncdf.filename=geno.nc.file, snp.annotation=snp.annot,
>            variables="genotype", n.samples=1, precision="single")
> names(scan.annot)[3] <- "scanName"
> names(snp.annot)[4] <- "snpName"
> col.nums <- as.integer(c(1,7,8))
> names(col.nums) <- c("snp","a1","a2")
> ncdfAddData(path=".", ncdf.filename=geno.nc.file,
>             snp.annotation=snp.annot,
>             scan.annotation=scan.annot,
>             sep.type="\t", skip.num=1,
>             col.total=16, col.nums=col.nums,
>             scan.name.in.file=0)
>
> bl.nc.file <- "bl.nc"
> ncdfCreate(ncdf.filename=bl.nc.file, snp.annotation=snp.annot,
>            variables=c("BAlleleFreq", "LogRRatio"),
>            n.samples=1, precision="single")
> col.nums <- as.integer(c(1,15,16))
> names(col.nums) <- c("snp", "ballelefreq", "logrratio")
> ncdfAddData(path=".", ncdf.filename=bl.nc.file,
>             snp.annotation=snp.annot,
>             scan.annotation=scan.annot,
>             sep.type="\t", skip.num=1,
>             col.total=16, col.nums=col.nums,
>             scan.name.in.file=0)
>
> genoData <- GenotypeData(NcdfGenotypeReader(geno.nc.file),
>                          scanAnnot=scanAnnot, snpAnnot=snpAnnot)
> blData <- IntensityData(NcdfIntensityReader(bl.nc.file),
>                         scanAnnot=scanAnnot, snpAnnot=snpAnnot)
>
> scan.ids <- read.table("scan.ids")[,1]
> snp.ids <- read.table("snp.ids")[,1]
> chrom.ids <- read.table("chrom.ids")[,1]
> seg <- anomSegmentBAF(blData, genoData, scan.ids=scan.ids,
>                       chrom.ids=chrom.ids, snp.ids=snp.ids)
> head(seg)
>   scanID chromosome left.index right.index num.mark seg.mean
> 1      1          1          5       59479    17205   0.1582
> 2      1          2      59498      117436    17360   0.1596
> 3      1          3     117437      164865    14181   0.1595
> 4      1          4     164872      205472    12218   0.1640
> 5      1          5     205474      247715    13313   0.1602
> 6      1          6     247752      296212    15232   0.1629
>
> best wishes,
> Stephanie
>
> On 6/6/13 1:28 PM, Sam Rose wrote:
>
> Hi Stephanie,
>
> I am providing a CEU control sample data from the same study, NA12878. I
> just sent it to you in a dropbox link. Let me know if there are any
> questions.
>
> Best,
> Sam
>
> On Wed, Jun 5, 2013 at 12:11 PM, Stephanie M. Gogarten
> <sdmorris at u.washington.edu> wrote:
>
> It's not finding any BAF values that meet all the criteria (snpID in
> "snp.ids", chromosome in "chrom.ids", genotype is heterozygous or
> missing, BAF is non-missing).
>
> Is it possible for you to send me the data you're using, along with
> your values of "scan.ids", "chrom.ids", and "snp.ids"?
>
> Stephanie
>
> On 6/4/13 2:23 PM, Sam Rose wrote:
>
> Yes this was intentional. I just wanted to get it running for one sample
> and then expand to the rest. I was also limiting this to only autosomes
> for the time being.
>
> This is the error message I get now:
>
> > seg <- anomSegmentBAF(blData, genoData, scan.ids=scan.ids,
> +                       chrom.ids=chrom.ids, snp.ids=snp.ids)
> Error in anomSegmentBAF(blData, genoData, scan.ids = scan.ids, chrom.ids = chrom.ids, :
>   no valid BAF values for chromosomes in chrom.ids
>
> Hopefully this helps.
>
> Best,
> Sam
>
> On Tue, Jun 4, 2013 at 12:28 AM, Stephanie M. Gogarten
> <sdmorris at u.washington.edu> wrote:
>
> You have only one sample in your netCDF files - is this intentional?
>
> That should not cause your error, however. Can you try running with
> the latest GWASTools version and tell me what the new error
> message is?
>
> Also, you should include in your scan annotation a character vector
> "sex" with values of "M" or "F". The code treats males and females
> differently for X chromosome SNPs, and will complain later if this
> variable is missing.
>
> Stephanie
>
> On 6/3/13 4:04 PM, Sam Rose wrote:
>
> After checking again it still isn't quite working.
>
> I am pasting below the str() results for my intensity and genotype
> objects, maybe something in this can point clearly to what I am doing
> wrong.
>
> I am using an integer vector of 1 to the total number of snps for my snp
> id since it gave me some trouble before when it wasn't sorted.
>
> Best,
> Sam
>
> > str(genoData)
> Formal class 'GenotypeData' [package "GWASTools"] with 3 slots
> ..@ data :Formal class 'NcdfGenotypeReader' [package "GWASTools"] with 13 slots
> .. .. ..@ snpDim : chr "snp"
> .. .. ..@ scanDim : chr "sample"
> .. .. ..@ snpIDvar : chr "snp"
> .. .. ..@ chromosomeVar: chr "chromosome"
> .. .. ..@ positionVar : chr "position"
> .. .. ..@ scanIDvar : chr "sampleID"
> .. .. ..@ genotypeVar : chr "genotype"
> .. .. ..@ XchromCode : int 23
> .. .. ..@ YchromCode : int 25
> .. .. ..@ XYchromCode : int 24
> .. .. ..@ MchromCode : int 26
> .. ..
..@ filename : chr > "tmp.geno.skea.nc <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc="">" > > > .. .. ..@ handler :List of 10 > .. .. .. ..$ id : int 524288 > .. .. .. ..$ ndims : int 2 > .. .. .. ..$ natts : int 2 > .. .. .. ..$ unlimdimid : num 1 > .. .. .. ..$ filename : chr > "tmp.geno.skea.nc <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc=""> > <http: tmp.geno.skea.nc="">" > > .. .. .. ..$ varid2Rindex: num [1:6] 0 > 1 0 2 3 4 > .. .. .. ..$ writable : logi FALSE > .. .. .. ..$ dim :List of 2 > .. .. .. .. ..$ sample:List of 8 > .. .. .. .. .. ..$ name : chr > "sample" > .. .. .. .. .. ..$ len : int 1 > .. .. .. .. .. ..$ unlim : logi > TRUE > .. .. .. .. .. ..$ id : int 1 > .. .. .. .. .. ..$ dimvarid : num 1 > .. .. .. .. .. ..$ units : chr > "count" > .. .. .. .. .. ..$ vals : logi NA > .. .. .. .. .. ..$ create_dimvar: logi > TRUE > .. .. .. .. .. ..- attr(*, "class")= > chr "dim.ncdf" > .. .. .. .. ..$ snp :List of 8 > .. .. .. .. .. ..$ name : chr > "snp" > .. .. .. .. .. ..$ len : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ id : int 2 > .. .. .. .. .. ..$ dimvarid : num 3 > .. .. .. .. .. ..$ units : chr > "count" > .. .. .. .. .. ..$ vals : int > [1:709358(1d)] 1 2 3 4 > 5 6 7 8 > 9 10 ... > .. .. .. .. .. ..$ create_dimvar: logi > TRUE > .. .. .. .. .. ..- attr(*, "class")= > chr "dim.ncdf" > .. .. .. ..$ nvars : num 4 > .. .. .. ..$ var :List of 4 > .. .. .. .. ..$ sampleID :List of 16 > .. .. .. .. .. ..$ id : int 2 > .. .. .. .. .. ..$ name : chr > "sampleID" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int 1 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. ..$ dimids : num 1 > .. .. .. .. .. ..$ units : chr "id" > .. .. .. .. .. ..$ longname : chr > "sampleID" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. 
..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "sample" > .. .. .. .. .. .. .. ..$ len > : int 1 > .. .. .. .. .. .. .. ..$ unlim > : logi TRUE > .. .. .. .. .. .. .. ..$ id > : int 1 > .. .. .. .. .. .. .. ..$ dimvarid > : num 1 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : logi NA > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int 1 > .. .. .. .. .. ..$ unlim : logi TRUE > .. .. .. .. .. ..$ missval : int 0 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ position :List of 16 > .. .. .. .. .. ..$ id : int 4 > .. .. .. .. .. ..$ name : chr > "position" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > 709358 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. ..$ dimids : num 2 > .. .. .. .. .. ..$ units : chr > "bases" > .. .. .. .. .. ..$ longname : chr > "position" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ missval : int -1 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. 
..$ chromosome:List of 16 > .. .. .. .. .. ..$ id : int 5 > .. .. .. .. .. ..$ name : chr > "chromosome" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > 709358 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. ..$ dimids : num 2 > .. .. .. .. .. ..$ units : chr "id" > .. .. .. .. .. ..$ longname : chr > "chromosome" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ missval : int -1 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ genotype :List of 16 > .. .. .. .. .. ..$ id : int 6 > .. .. .. .. .. ..$ name : chr > "genotype" > .. .. .. .. .. ..$ ndims : int 2 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > [1:2] 709358 1 > .. .. .. .. .. ..$ prec : chr > "byte" > .. .. .. .. .. ..$ dimids : num > [1:2] 2 1 > .. .. .. .. .. ..$ units : chr > "num_A_alleles" > .. .. .. .. .. ..$ longname : chr > "genotype" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 2 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. 
.. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "sample" > .. .. .. .. .. .. .. ..$ len > : int 1 > .. .. .. .. .. .. .. ..$ unlim > : logi TRUE > .. .. .. .. .. .. .. ..$ id > : int 1 > .. .. .. .. .. .. .. ..$ dimvarid > : num 1 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : logi NA > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > [1:2] 709358 1 > .. .. .. .. .. ..$ unlim : logi TRUE > .. .. .. .. .. ..$ missval : int -1 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. ..- attr(*, "class")= chr "ncdf" > ..@ snpAnnot :Formal class > 'SnpAnnotationDataFrame' > [package > "GWASTools"] with 11 slots > .. .. ..@ idCol : chr "snpID" > .. .. ..@ chromosomeCol : chr > "chromosome" > .. .. ..@ positionCol : chr > "position" > .. .. ..@ XchromCode : int 23 > .. .. ..@ YchromCode : int 25 > .. .. ..@ XYchromCode : int 24 > .. .. ..@ MchromCode : int 26 > .. .. ..@ varMetadata > :'data.frame': 4 > obs. of 1 > variable: > .. .. .. ..$ labelDescription: chr > [1:4] NA NA NA NA > .. .. ..@ data :'data.frame': > 709358 obs. of 4 > variables: > .. .. .. ..$ snpID : int > [1:709358] 1 2 3 4 5 6 > 7 8 9 10 ... > .. .. .. ..$ chromosome: int > [1:709358] 1 1 1 1 1 1 > 1 1 1 1 ... > .. .. .. ..$ position : int > [1:709358] 82154 > 752566 752721 > 768448 > 776546 798959 800007 838555 846808 854250 ... > .. .. .. ..$ rsID : Factor w/ > 709358 levels > "rs1000000","rs1000002",..: 444820 394558 > 397236 154397 > 130894 89309 > 528142 485618 444755 595849 ... > .. 
.. ..@ dimLabels : chr [1:2] > "snps" > "variables" > .. .. ..@ .__classVersion__:Formal > class 'Versions' > [package > "Biobase"] with 1 slots > .. .. .. .. ..@ .Data:List of 1 > .. .. .. .. .. ..$ : int [1:3] 1 1 0 > ..@ scanAnnot:Formal class > 'ScanAnnotationDataFrame' [package > "GWASTools"] with 6 slots > .. .. ..@ idCol : chr "scanID" > .. .. ..@ sexCol : chr "sex" > .. .. ..@ varMetadata > :'data.frame': 4 > obs. of 1 > variable: > .. .. .. ..$ labelDescription: chr > [1:4] NA NA NA NA > .. .. ..@ data > :'data.frame': 1 > obs. of 4 > variables: > .. .. .. ..$ scanID : int 1 > .. .. .. ..$ subjectID: Factor w/ 1 > level "PT-PTWN": 1 > .. .. .. ..$ genoRunID: Factor w/ 1 level > "8820505004_R01C01": 1 > .. .. .. ..$ file : Factor w/ 1 level > "8820505004_R01C01.gtc.txt.______use": 1 > > .. .. ..@ dimLabels : chr [1:2] > "scans" > "variables" > .. .. ..@ .__classVersion__:Formal > class 'Versions' > [package > "Biobase"] with 1 slots > .. .. .. .. ..@ .Data:List of 1 > .. .. .. .. .. ..$ : int [1:3] 1 1 0 > > > str(blData) > Formal class 'IntensityData' [package > "GWASTools"] with > 3 slots > ..@ data :Formal class > 'NcdfIntensityReader' > [package > "GWASTools"] with 17 slots > .. .. ..@ snpDim : chr "snp" > .. .. ..@ scanDim : chr "sample" > .. .. ..@ snpIDvar : chr "snp" > .. .. ..@ chromosomeVar: chr "chromosome" > .. .. ..@ positionVar : chr "position" > .. .. ..@ scanIDvar : chr "sampleID" > .. .. ..@ qualityVar : chr "quality" > .. .. ..@ xVar : chr "X" > .. .. ..@ yVar : chr "Y" > .. .. ..@ bafVar : chr "BAlleleFreq" > .. .. ..@ lrrVar : chr "LogRRatio" > .. .. ..@ XchromCode : int 23 > .. .. ..@ YchromCode : int 25 > .. .. ..@ XYchromCode : int 24 > .. .. ..@ MchromCode : int 26 > .. .. ..@ filename : chr > "tmp.baf.skea.nc <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc="">" > > > .. .. ..@ handler :List of 10 > .. .. .. ..$ id : int 458752 > .. .. .. ..$ ndims : int 2 > .. .. .. 
..$ natts : int 2 > .. .. .. ..$ unlimdimid : num 1 > .. .. .. ..$ filename : chr > "tmp.baf.skea.nc <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc=""> > <http: tmp.baf.skea.nc="">" > > > .. .. .. ..$ varid2Rindex: num [1:7] 0 > 1 0 2 3 4 5 > .. .. .. ..$ writable : logi FALSE > .. .. .. ..$ dim :List of 2 > .. .. .. .. ..$ sample:List of 8 > .. .. .. .. .. ..$ name : chr > "sample" > .. .. .. .. .. ..$ len : int 1 > .. .. .. .. .. ..$ unlim : logi > TRUE > .. .. .. .. .. ..$ id : int 1 > .. .. .. .. .. ..$ dimvarid : num 1 > .. .. .. .. .. ..$ units : chr > "count" > .. .. .. .. .. ..$ vals : logi NA > .. .. .. .. .. ..$ create_dimvar: logi > TRUE > .. .. .. .. .. ..- attr(*, "class")= > chr "dim.ncdf" > .. .. .. .. ..$ snp :List of 8 > .. .. .. .. .. ..$ name : chr > "snp" > .. .. .. .. .. ..$ len : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ id : int 2 > .. .. .. .. .. ..$ dimvarid : num 3 > .. .. .. .. .. ..$ units : chr > "count" > .. .. .. .. .. ..$ vals : int > [1:709358(1d)] 1 2 3 4 > 5 6 7 8 > 9 10 ... > .. .. .. .. .. ..$ create_dimvar: logi > TRUE > .. .. .. .. .. ..- attr(*, "class")= > chr "dim.ncdf" > .. .. .. ..$ nvars : num 5 > .. .. .. ..$ var :List of 5 > .. .. .. .. ..$ sampleID :List of 16 > .. .. .. .. .. ..$ id : int 2 > .. .. .. .. .. ..$ name : chr > "sampleID" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int 1 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. ..$ dimids : num 1 > .. .. .. .. .. ..$ units : chr "id" > .. .. .. .. .. ..$ longname : chr > "sampleID" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "sample" > .. .. .. .. .. .. .. ..$ len > : int 1 > .. .. .. .. .. .. .. ..$ unlim > : logi TRUE > .. .. .. .. .. .. .. ..$ id > : int 1 > .. .. .. .. .. .. .. ..$ dimvarid > : num 1 > .. .. .. .. .. .. .. 
..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : logi NA > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int 1 > .. .. .. .. .. ..$ unlim : logi TRUE > .. .. .. .. .. ..$ missval : int 0 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ position :List of 16 > .. .. .. .. .. ..$ id : int 4 > .. .. .. .. .. ..$ name : chr > "position" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > 709358 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. ..$ dimids : num 2 > .. .. .. .. .. ..$ units : chr > "bases" > .. .. .. .. .. ..$ longname : chr > "position" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ missval : int -1 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ chromosome :List of 16 > .. .. .. .. .. ..$ id : int 5 > .. .. .. .. .. ..$ name : chr > "chromosome" > .. .. .. .. .. ..$ ndims : int 1 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > 709358 > .. .. .. .. .. ..$ prec : chr "int" > .. .. .. .. .. 
..$ dimids : num 2 > .. .. .. .. .. ..$ units : chr "id" > .. .. .. .. .. ..$ longname : chr > "chromosome" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 1 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > 709358 > .. .. .. .. .. ..$ unlim : logi > FALSE > .. .. .. .. .. ..$ missval : int -1 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ BAlleleFreq:List of 16 > .. .. .. .. .. ..$ id : int 6 > .. .. .. .. .. ..$ name : chr > "BAlleleFreq" > .. .. .. .. .. ..$ ndims : int 2 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > [1:2] 709358 1 > .. .. .. .. .. ..$ prec : chr > "float" > .. .. .. .. .. ..$ dimids : num > [1:2] 2 1 > .. .. .. .. .. ..$ units : chr > "intensity" > .. .. .. .. .. ..$ longname : chr > "BAlleleFreq" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 2 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. 
.. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "sample" > .. .. .. .. .. .. .. ..$ len > : int 1 > .. .. .. .. .. .. .. ..$ unlim > : logi TRUE > .. .. .. .. .. .. .. ..$ id > : int 1 > .. .. .. .. .. .. .. ..$ dimvarid > : num 1 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : logi NA > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > [1:2] 709358 1 > .. .. .. .. .. ..$ unlim : logi TRUE > .. .. .. .. .. ..$ missval : num -9999 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. .. ..$ LogRRatio :List of 16 > .. .. .. .. .. ..$ id : int 7 > .. .. .. .. .. ..$ name : chr > "LogRRatio" > .. .. .. .. .. ..$ ndims : int 2 > .. .. .. .. .. ..$ natts : int 2 > .. .. .. .. .. ..$ size : int > [1:2] 709358 1 > .. .. .. .. .. ..$ prec : chr > "float" > .. .. .. .. .. ..$ dimids : num > [1:2] 2 1 > .. .. .. .. .. ..$ units : chr > "intensity" > .. .. .. .. .. ..$ longname : chr > "LogRRatio" > .. .. .. .. .. ..$ dims : list() > .. .. .. .. .. ..$ dim :List of 2 > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "snp" > .. .. .. .. .. .. .. ..$ len > : int 709358 > .. .. .. .. .. .. .. ..$ unlim > : logi FALSE > .. .. .. .. .. .. .. ..$ id > : int 2 > .. .. .. .. .. .. .. ..$ dimvarid > : num 3 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : int > [1:709358(1d)] 1 > 2 3 4 5 > 6 7 8 9 10 ... > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. .. ..$ :List of 8 > .. .. .. .. .. .. .. ..$ name > : chr "sample" > .. .. .. .. .. .. .. ..$ len > : int 1 > .. .. .. .. .. .. .. ..$ unlim > : logi TRUE > .. .. .. .. .. .. .. ..$ id > : int 1 > .. .. .. .. .. .. .. 
..$ dimvarid > : num 1 > .. .. .. .. .. .. .. ..$ units > : chr "count" > .. .. .. .. .. .. .. ..$ vals > : logi NA > .. .. .. .. .. .. .. ..$ > create_dimvar: logi TRUE > .. .. .. .. .. .. .. ..- attr(*, > "class")= chr > "dim.ncdf" > .. .. .. .. .. ..$ varsize : int > [1:2] 709358 1 > .. .. .. .. .. ..$ unlim : logi TRUE > .. .. .. .. .. ..$ missval : num -9999 > .. .. .. .. .. ..$ hasAddOffset: logi > FALSE > .. .. .. .. .. ..$ hasScaleFact: logi > FALSE > .. .. .. .. .. ..- attr(*, "class")= > chr "var.ncdf" > .. .. .. ..- attr(*, "class")= chr "ncdf" > ..@ snpAnnot :Formal class > 'SnpAnnotationDataFrame' > [package > "GWASTools"] with 11 slots > .. .. ..@ idCol : chr "snpID" > .. .. ..@ chromosomeCol : chr > "chromosome" > .. .. ..@ positionCol : chr > "position" > .. .. ..@ XchromCode : int 23 > .. .. ..@ YchromCode : int 25 > .. .. ..@ XYchromCode : int 24 > .. .. ..@ MchromCode : int 26 > .. .. ..@ varMetadata > :'data.frame': 4 > obs. of 1 > variable: > .. .. .. ..$ labelDescription: chr > [1:4] NA NA NA NA > .. .. ..@ data :'data.frame': > 709358 obs. of 4 > variables: > .. .. .. ..$ snpID : int > [1:709358] 1 2 3 4 5 6 > 7 8 9 10 ... > .. .. .. ..$ chromosome: int > [1:709358] 1 1 1 1 1 1 > 1 1 1 1 ... > .. .. .. ..$ position : int > [1:709358] 82154 > 752566 752721 > 768448 > 776546 798959 800007 838555 846808 854250 ... > .. .. .. ..$ rsID : Factor w/ > 709358 levels > "rs1000000","rs1000002",..: 444820 394558 > 397236 154397 > 130894 89309 > 528142 485618 444755 595849 ... > .. .. ..@ dimLabels : chr [1:2] > "snps" > "variables" > .. .. ..@ .__classVersion__:Formal > class 'Versions' > [package > "Biobase"] with 1 slots > .. .. .. .. ..@ .Data:List of 1 > .. .. .. .. .. ..$ : int [1:3] 1 1 0 > ..@ scanAnnot:Formal class > 'ScanAnnotationDataFrame' [package > "GWASTools"] with 6 slots > .. .. ..@ idCol : chr "scanID" > .. .. ..@ sexCol : chr "sex" > .. .. ..@ varMetadata > :'data.frame': 4 > obs. of 1 > variable: > .. .. .. 
..$ labelDescription: chr [1:4] NA NA NA NA
> .. .. ..@ data :'data.frame': 1 obs. of 4 variables:
> .. .. .. ..$ scanID : int 1
> .. .. .. ..$ subjectID: Factor w/ 1 level "PT-PTWN": 1
> .. .. .. ..$ genoRunID: Factor w/ 1 level "8820505004_R01C01": 1
> .. .. .. ..$ file : Factor w/ 1 level "8820505004_R01C01.gtc.txt.use": 1
> .. .. ..@ dimLabels : chr [1:2] "scans" "variables"
> .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slots
> .. .. .. .. ..@ .Data:List of 1
> .. .. .. .. .. ..$ : int [1:3] 1 1 0
>
> On Fri, May 31, 2013 at 2:41 PM, Sam Rose <srose at broadinstitute.org> wrote:
>
> Looks like there were some problems reading the file in on my end,
> some chromosomes didn't make it in probably from a preprocessing
> step on my end. I'll let you know if I can't rectify.
>
> Thanks again for the help,
>
> Sam
>
> On Thu, May 30, 2013 at 4:43 PM, Stephanie M. Gogarten
> <sdmorris at u.washington.edu> wrote:
>
> Hi Sam,
>
> I need to add a more informative error message - the problem is
> that no valid BAF values are reaching the call to CNA (baf.dat
> is NULL). This could happen if the values of snp.ids or
> chrom.ids are invalid - these should all be integer values
> matching the contents of snpID and chromosome in the netCDF
> file. What values are you using for these arguments?
>
> You will need to have LRR in the intensity NetCDF file. A
> portion of the code downstream from the error you're getting
> uses LRR to filter potential anomalies.
>
> Stephanie
>
> On 5/30/13 12:30 PM, Sam Rose wrote:
>
> Thank you for your previous help Stephanie.
>
> I am afraid I have another problem I can't seem to work out.
>
> I have gotten as far as reading in the BAlleleFreq and Geno files into
> their respective ncdf formats. I only have the baf data in the intensity
> ncdf file, do I need LRR too? When I run the anomDetectBAF() function it
> gives me this error:
>
> > anom <- anomDetectBAF(blData, genoData, scan.ids=scan.ids,
>     chrom.ids=chrom.ids, snp.ids=snp.ids, centromere=centromeres.hg19)
> Error in CNA(as.vector(baf.dat), chr, index, data.type = "logratio",
>     sampleid = snum) :
>   genomdat must be numeric
>
> I have checked and the data that I put in to the genotype data file was
> numeric and present as well as the baf data. I'm wondering if you have
> seen this error before and may potentially know what I can do to rectify?
>
> Thanks,
> Sam
>
> On Wed, Apr 24, 2013 at 12:01 AM, Stephanie M. Gogarten
> <sdmorris at u.washington.edu>
> <mailto:sdmorris at="" u.washington=""> <mailto:sdmorris at="" u.washington="">.__>____edu > <mailto:sdmorris at="" u.washington.=""> <mailto:sdmorris at="" u.washington.="">____edu > <mailto:sdmorris at="" u.washington.__edu=""> <mailto:sdmorris at="" u.washington.edu="">>>>>> wrote: > > Hi Sam, > > Section 2 of the > vignette "GWAS Data > Cleaning" > contains > an example > of how to import raw > illumina data of > exactly > this type > into > GWASTools. The example > data is > contained in > the package > "GWASdata." > > If you have any further > questions after > reading the > vignette, please > cc the bioconductor > mailing list > (bioconductor at r-project.org > <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">>> > <mailto:bioconductor at="" r-______project.org=""> <mailto:bioconductor at="" r-____project.org=""> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org="">> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">>>> > > <mailto:bioconductor at="" r-________project.org=""> <mailto:bioconductor at="" r-______project.org=""> > <mailto:bioconductor at="" r-______project.org=""> <mailto:bioconductor at="" r-____project.org="">> > <mailto:bioconductor at="" r-______project.org=""> <mailto:bioconductor at="" r-____project.org=""> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org="">>> > > > > <mailto:bioconductor at="" r-______project.org=""> <mailto:bioconductor at="" r-____project.org=""> > <mailto:bioconductor at="" r-____project.org=""> 
<mailto:bioconductor at="" r-__project.org="">> > <mailto:bioconductor at="" r-____project.org=""> <mailto:bioconductor at="" r-__project.org=""> > <mailto:bioconductor at="" r-__project.org=""> <mailto:bioconductor at="" r-project.org="">>>>>). > > > Section 7 may also be of > use to you, > as it > deals with > chromosome > anomaly detection. > > best wishes, > Stephanie > > > On 4/23/13 7:54 PM, Sam > Rose wrote: > > Hi Stephanie, > > My name is Sam Rose > and I am > contacting > you the > GWASTools package in > Bioconductor of > which it says you > are the > maintainer. > > I am trying to use > the package to > call > mosaic CNVs > in my samples and > can't seem to get it > to work. > > I'm wondering if you > have an > example of > the raw > illumina data to > put in > there, and maybe > examples of some > of the > things > required in the > 'ncdfAddData' > command (i.e. > sample column, > col.nums). I have > created the > shell ncdf file, but > beyond that the > headers and > data formats > seem to be > giving me trouble so > I just > though I would > ask. > > Our Illumina raw > data files look > > > > > -- > ----- > *Sam Rose, Stanley Center Research Associate II > Stanley Center for Psychiatric Research, The Broad Institute > 7 Cambridge Center, Cambridge, MA 02142* > 617.714.7853, srose at broadinstitute.org <mailto:srose at="" broadinstitute.org=""> > > > > > -- > ----- > *Sam Rose, Stanley Center Research Associate II > Stanley Center for Psychiatric Research, The Broad Institute > 7 Cambridge Center, Cambridge, MA 02142* > 617.714.7853, srose at broadinstitute.org <mailto:srose at="" broadinstitute.org=""> >
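
For anyone who hits the same "genomdat must be numeric" error: the May 30 reply quoted above says snp.ids and chrom.ids should all be integer values matching the snpID and chromosome contents of the netCDF file. A quick sanity check along those lines might look like the sketch below. It uses base R only; the small `snp.annot` data frame and the `ids_are_valid()` helper are made up for illustration and are not part of GWASTools.

```r
# Hypothetical stand-in for the snpID and chromosome values stored in the
# netCDF file (in practice these would come from your own SNP annotation).
snp.annot <- data.frame(
  snpID      = 1:6,
  chromosome = c(1L, 1L, 2L, 2L, 23L, 25L)
)

# Hypothetical helper, not a GWASTools function: TRUE only if the requested
# IDs are non-empty whole numbers without NAs that all occur in `known`.
ids_are_valid <- function(ids, known) {
  is.numeric(ids) && length(ids) > 0 && !anyNA(ids) &&
    all(ids == round(ids)) && all(ids %in% known)
}

# Restrict to autosomes that are actually present, then pick their SNPs.
chrom.ids <- intersect(1:22, snp.annot$chromosome)
snp.ids   <- snp.annot$snpID[snp.annot$chromosome %in% chrom.ids]

# Both checks should pass before the IDs are handed to anomaly detection.
stopifnot(ids_are_valid(chrom.ids, snp.annot$chromosome))
stopifnot(ids_are_valid(snp.ids, snp.annot$snpID))
```

If either `stopifnot()` fails, for example because the IDs were read in as character or factor values, that would be consistent with the empty selection described in the reply.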