Question: Creating Intensity GDS file for the R package GWASTools
catbriggsm wrote, 2.1 years ago:

I am attempting to reproduce the mis-annotated sex check from the GWASTools Data Cleaning vignette (https://bioconductor.org/packages/devel/bioc/vignettes/GWASTools/inst/doc/DataCleaning.pdf, pp. 46-48) with my own data. I am having trouble creating the intensity GDS file in R from the files I have available: .IDAT files and PLINK ped/map files.

Below is my code to create the genotype GDS file from the PLINK files:

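library(SNPRelate)  # snpgdsPED2GDS, snpgdsSummary (also loads gdsfmt for openfn.gds)
library(GWASTools)  # GdsGenotypeReader, GenotypeData, annotation classes

ped.fn <- "F:/requested.idatfiles/kids.ped"
map.fn <- "F:/requested.idatfiles/kids.map"

# convert the PLINK ped/map pair to a genotype GDS file
snpgdsPED2GDS(ped.fn, map.fn, "test.gds")

genofile <- openfn.gds("test.gds")
genofile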
File: F:\requested.idatfiles\test.gds (5.3M)
+    [  ] *
|--+ sample.id   { Str8 796 ZIP_ra(21.4%), 1.6K }
|--+ snp.id   { Int32 26528 ZIP_ra(34.6%), 35.9K }
|--+ snp.rs.id   { Str8 26528 ZIP_ra(36.1%), 112.2K }
|--+ snp.position   { Int32 26528 ZIP_ra(86.7%), 89.8K }
|--+ snp.chromosome   { Int32 26528 ZIP_ra(0.13%), 149B } *
|--+ snp.allele   { Str8 26528 ZIP_ra(15.5%), 16.1K }
|--+ genotype   { Bit2 796x26528, 5.0M } *
\--+ sample.annot   [ data.frame ] *
   |--+ family   { Str8 796 ZIP_ra(45.3%), 1.5K }
   |--+ father   { Str8 796 ZIP_ra(2.01%), 39B }
   |--+ mother   { Str8 796 ZIP_ra(2.01%), 39B }
   |--+ sex   { Str8 796 ZIP_ra(13.7%), 225B }
   \--+ phenotype   { Str8 796 ZIP_ra(1.59%), 45B }
snpgdsSummary("test.gds")

# PLINK codes Y as 24 and XY as 25, whereas GWASTools defaults to XY=24, Y=25,
# so the codes are set explicitly on both the reader and the SNP annotation
(gds <- GdsGenotypeReader(genofile, YchromCode=24L, XYchromCode=25L))

# scan (sample) annotation taken from the PLINK pedigree columns
scanID <- getScanID(gds)
family <- getVariable(gds, "sample.annot/family")
father <- getVariable(gds, "sample.annot/father")
mother <- getVariable(gds, "sample.annot/mother")
sex <- getVariable(gds, "sample.annot/sex")
sex[sex == ""] <- NA # sex must be coded as M/F/NA
phenotype <- getVariable(gds, "sample.annot/phenotype")
scanAnnot <- ScanAnnotationDataFrame(data.frame(scanID, father, mother,
                                                sex, phenotype,
                                                stringsAsFactors=FALSE))

# SNP annotation taken from the PLINK map columns
snpID <- getSnpID(gds)
chromosome <- getChromosome(gds)
position <- getPosition(gds)
alleleA <- getAlleleA(gds)
alleleB <- getAlleleB(gds)
rsID <- getVariable(gds, "snp.rs.id")
snpAnnot <- SnpAnnotationDataFrame(data.frame(snpID, chromosome, position,
                                              rsID, alleleA, alleleB,
                                              stringsAsFactors=FALSE),
                                   YchromCode=24L, XYchromCode=25L)
genoData <- GenotypeData(gds, scanAnnot=scanAnnot, snpAnnot=snpAnnot)

 

I am having trouble creating the intensity file from my IDAT files. I read them into R with the crlmm package as follows, but I can't find documentation on how to convert the result to an intensity GDS file. Should I be using a different package or function? Is there documentation online I can follow?

idats<-readIdatFiles(sampleSheet=NULL, arrayNames=NULL, ids=NULL, path=path.all,
              arrayInfoColNames=list(barcode="SentrixBarcode_A",
                                     position="SentrixPosition_A"),
              highDensity=FALSE, sep="_",
              fileExt=list(green="Grn.idat", red="Red.idat"),
              saveDate=FALSE, verbose=TRUE)

 

Answer: Creating Intensity GDS file for the R package GWASTools
James W. MacDonald (United States) wrote, 2.1 years ago:

The IDAT files contain only the raw green and red signals, whereas the GDS file expects SNP calls. You could use the crlmm package to generate the SNP calls and then make a GDS file, but if you already have SNP calls from another source, that seems like extra work that might not be very useful.

Answer: Creating Intensity GDS file for the R package GWASTools
Stephanie M. Gogarten (University of Washington) wrote, 2.1 years ago:

From a quick look at the documentation for crlmm, I think the "R" and "G" intensities output by readIdatFiles are analogous to the raw X and Y intensities output by Illumina's GenomeStudio (which is what we based the GWASTools input on). You will want to normalize the intensities before using them for the sex check. To get the data into GDS, you can either write a text file in the format expected by createDataFile, or create the GDS file yourself using commands from the gdsfmt package.
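
A minimal sketch of the second option, building the file directly with gdsfmt, might look like the following. It uses small simulated matrices in place of real normalized intensities. The node names ("sample.id", "snp.id", "snp.chromosome", "snp.position", "X", "Y") and the SNP-by-sample orientation are assumptions about what GWASTools' GdsIntensityReader expects, so check them against its documentation before relying on this layout.

```r
library(gdsfmt)

# Toy data standing in for normalized X/Y intensities (hypothetical values);
# rows are SNPs, columns are samples
n_snp  <- 5L
n_scan <- 3L
set.seed(1)
x_mat <- matrix(runif(n_snp * n_scan), nrow = n_snp, ncol = n_scan)
y_mat <- matrix(runif(n_snp * n_scan), nrow = n_snp, ncol = n_scan)

gds_path <- tempfile(fileext = ".gds")
gfile <- createfn.gds(gds_path)

# Integer sample and SNP identifiers plus SNP chromosome and position
add.gdsn(gfile, "sample.id", 1:n_scan)
add.gdsn(gfile, "snp.id", 1:n_snp)
add.gdsn(gfile, "snp.chromosome", c(1L, 1L, 23L, 23L, 25L))
add.gdsn(gfile, "snp.position", c(100L, 200L, 100L, 200L, 100L))

# Intensity matrices stored as 32-bit floats
add.gdsn(gfile, "X", x_mat, storage = "float32")
add.gdsn(gfile, "Y", y_mat, storage = "float32")

closefn.gds(gfile)

# Re-open the file to confirm the stored dimensions
gfile <- openfn.gds(gds_path)
dim_x <- objdesp.gdsn(index.gdsn(gfile, "X"))$dim
closefn.gds(gfile)
```

If the layout matches what GWASTools expects, passing gds_path to GdsIntensityReader should open the file; real data would replace the simulated matrices with the normalized intensities, keeping the SNP and sample order aligned with the annotation.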

Similarly, I think you can use crlmm's calculateRBaf function to generate LRR and BAF from your IDAT files (those are used later in the GWASTools vignette).
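
For orientation, the quantities behind LRR and BAF can be sketched in base R. This is not crlmm's implementation: it uses the usual definitions R = X + Y and theta = (2/pi) * atan(Y/X) on toy values, and the expected R per SNP (r_expected) is a hypothetical placeholder for what would normally come from reference genotype clusters. BAF additionally requires per-SNP cluster positions for theta, which is the part a function such as calculateRBaf would handle.

```r
# Toy normalized intensities for three SNPs in one sample (hypothetical values)
x <- c(1, 0, 1)
y <- c(1, 1, 0)

# Total intensity and allelic angle scaled to [0, 1]
r <- x + y
theta <- (2 / pi) * atan2(y, x)  # atan2 avoids dividing by zero when x is 0

# Hypothetical expected total intensity per SNP (normally from reference clusters)
r_expected <- c(2, 2, 0.5)

# Log R ratio: observed versus expected total intensity on a log2 scale
lrr <- log2(r / r_expected)
```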
