problems of GenABEL


Ping-Hsun Hsieh ▴ 30

@ping-hsun-hsieh-3315


Dear R/BioC experts,

I am interested in the R package "GenABEL" and would like to test it on my dataset, but I have not been successful. My toy dataset (9 samples x 1000 SNPs) was successfully converted into the gwaa.data class; however, its coding does not look right to me.

Here is part of the genotype file:

    name           chr  pos      strand  MDSNP_02  MDSNP_04  MDSNP_06  MDSNP_07  MDSNP_08  MDSNP_10  MDSNP_11  MDSNP_12  MDSNP_15
    SNP_A-2131660  1    1145994  +       CT        CT        TT        TT        TT        TT        TT        TT        CT
    SNP_A-1967418  1    2224111  +       GG        GG        GG        GG        AG        AG        GG        GG        GG
    SNP_A-1969580  1    2319424  +       GG        GG        GG        GG        GG        GG        GG        GG        GG
    SNP_A-4263484  1    2543484  +       TT        CT        CT        CC        CC        TT        TT        TT        CT
    SNP_A-1978185  1    2926730  -       CC        CC        CC        CC        CC        CC        CC        CC        CC
    SNP_A-4264431  1    2941694  -       CT        CC        CC        CC        CC        CC        CT        CT        CC
    SNP_A-1980898  1    3084986  -       GG        GG        GG        GG        GG        GG        CG        CG        GG
    SNP_A-1983139  1    3155127  +       AC        AA        AA        AA        AA        AA        AA        AA        AA

The coding of the first sample, for the first 9 SNPs only:

    > as.character(toydf@gtdata)[1,1:9]
    SNP_A-2131660 SNP_A-1967418 SNP_A-1969580 SNP_A-4263484 SNP_A-1978185
            "T/C"         "G/G"         "1/1"         "T/T"         "1/1"
    SNP_A-4264431 SNP_A-1980898 SNP_A-1983139 SNP_A-4265735
            "C/T"         "G/G"         "A/C"         "C/T"

The coding of the third SNP, "SNP_A-1969580", for all 9 samples:

    > as.character(toydf@gtdata)[1:9,3]
    MDSNP_02 MDSNP_04 MDSNP_06 MDSNP_07 MDSNP_08
       "1/1"    "1/1"    "1/1"    "1/1"    "1/1"
    MDSNP_10 MDSNP_11 MDSNP_12 MDSNP_15
       "1/1"    "1/1"    "1/1"    "1/1"

As you can see, SNP_A-1969580, for example, is GG across all 9 samples. Why does the coding of this SNP show "1/1" rather than "G/G"?

My other question, related to the one above, is whether DNA bases are required in the genotype file. Could I use AA, AB, and BB as the coding scheme in GenABEL, rather than the exact bases? Will that cause trouble or errors?

Thanks in advance for your answer/response!

Best Regards,
Mike
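One pattern in the excerpt may be worth checking before contacting anyone: the two SNPs displayed as "1/1" for the first sample (SNP_A-1969580 and SNP_A-1978185) each carry a single genotype across all nine samples. Whether that is what causes the "1/1" display in GenABEL is an assumption, not something confirmed in this thread. The base-R sketch below only identifies which SNPs in the posted excerpt are monomorphic; the object names (geno, n_alleles, monomorphic) are illustrative and no GenABEL functions are used.

```r
# Genotype excerpt copied from the post (rows = SNPs, columns = samples).
geno <- rbind(
  "SNP_A-2131660" = c("CT", "CT", "TT", "TT", "TT", "TT", "TT", "TT", "CT"),
  "SNP_A-1967418" = c("GG", "GG", "GG", "GG", "AG", "AG", "GG", "GG", "GG"),
  "SNP_A-1969580" = rep("GG", 9),
  "SNP_A-4263484" = c("TT", "CT", "CT", "CC", "CC", "TT", "TT", "TT", "CT"),
  "SNP_A-1978185" = rep("CC", 9),
  "SNP_A-4264431" = c("CT", "CC", "CC", "CC", "CC", "CC", "CT", "CT", "CC"),
  "SNP_A-1980898" = c("GG", "GG", "GG", "GG", "GG", "GG", "CG", "CG", "GG"),
  "SNP_A-1983139" = c("AC", "AA", "AA", "AA", "AA", "AA", "AA", "AA", "AA")
)
colnames(geno) <- c("MDSNP_02", "MDSNP_04", "MDSNP_06", "MDSNP_07", "MDSNP_08",
                    "MDSNP_10", "MDSNP_11", "MDSNP_12", "MDSNP_15")

# Count the distinct alleles observed for each SNP by splitting every
# two-letter genotype into its single-letter alleles.
n_alleles <- apply(geno, 1, function(g) {
  length(unique(unlist(strsplit(g, ""))))
})

# SNPs with exactly one observed allele are monomorphic in this sample.
monomorphic <- names(n_alleles)[n_alleles == 1]
print(monomorphic)
```

If the same SNPs turn out to be the only ones shown as "1/1" in the full 1000-SNP dataset, that would be a useful detail to include when asking the package authors.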

SNP SNP • 1.6k views

Updated 15.0 years ago by Vincent J. Carey, Jr. • written 15.0 years ago by Ping-Hsun Hsieh


Vincent J. Carey, Jr. 6.7k

@vincent-j-carey-jr-4


United States

GenABEL is not part of Bioconductor. The authors and their email addresses are noted in the DESCRIPTION file, visible through help(package="GenABEL"). Your best bet is to contact them.

Additional details seem necessary to help you -- what exactly did you use to read the file you describe, and what is your sessionInfo()?

On Mon, May 18, 2009 at 8:06 PM, Ping-Hsun Hsieh <hsiehp@ohsu.edu> wrote:
> [...]
> Why does the coding of this SNP show 1/1, rather than G/G?
> [...]
> Could I use AA, AB, and BB as coding scheme, rather than the exact bases in GenABEL? Will it give troubles/errors?

--
Vincent Carey, PhD
Biostatistics, Channing Lab
617 525 2265
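The details requested above can be collected with standard R utilities. This is a minimal sketch, assuming GenABEL is installed in the default library; the maintainer lookup is guarded so the snippet still runs without it, and the variable names (si, maintainer) are illustrative.

```r
# Session details (R version, platform, attached packages) to paste into a reply.
si <- sessionInfo()
print(si)

# Maintainer field from the installed package's DESCRIPTION file.
# Guarded so the snippet still runs if GenABEL is not installed.
if (requireNamespace("GenABEL", quietly = TRUE)) {
  maintainer <- utils::packageDescription("GenABEL", fields = "Maintainer")
  cat("GenABEL maintainer:", maintainer, "\n")
} else {
  cat("GenABEL is not installed in this library.\n")
}
```

Including the exact call used to read the genotype file alongside this output gives the package authors what they need to reproduce the coding seen above.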

