Question: Bioconductor / plotting SNPs, runs of homozygosity
9.4 years ago, Min-Han Tan wrote:
Good afternoon all,

Sorry to trouble you all. I am quite new to plotting high-throughput SNPs / chromosome locations, and am trying to visualize some SNP data generated by PLINK (Illumina 330K).

I am wondering which package would be most useful (and ideally Illumina friendly). I looked at geneplotter and SNPchip, but am not sure whether there is a quick way to start. For example, it looks like SNPchip is more Affymetrix based.

Essentially, the data are runs of homozygosity. I have start and end coordinates (with corresponding SNPs) for each of these runs (example data below, for one pool of data).

CON refers to the consensus region, i.e. the region common to all, which in this case is actually 0 KB, but this isn't always the case. UNION refers to the entire region spanned by the overlapping segments of homozygosity.

Any advice would be appreciated. Thanks!

Min-Han

POOL    FID    IID  PHE   CHR  SNP1        SNP2        BP1      BP2       KB       NSNP  NSIM  GRP
S34481  0      X1   1     12   rs4883195   rs11053499  8924893  9939332   1014.44  211   1     1
S34481  0      X2   2     12   rs10771151  rs7297150   8920868  10087355  1166.49  271   1     1
S34481  0      X3   2     12   rs7308209   rs12310310  8920732  10375281  1454.55  350   1     1
S34481  0      X4   2     12   rs2707209   rs4883195   6775525  8924893   2149.37  351   3     1*
S34481  0      X5   2     12   rs917589    rs7968375   3412660  12393268  8980.61  2188  0     2*
S34481  0      X6   1     12   rs4310684   rs11053781  8034750  10428536  2393.79  481   0     3*
S34481  0      X7   1     12   rs2241025   rs619563    8136239  9982949   1846.71  330   0     4*
S34481  0      X8   2     12   rs11043394  rs6488666   8153839  14284369  6130.53  1491  0     5*
S34481  0      X9   1     12   rs1894814   rs2537760   8880044  10375636  1495.59  374   0     6*
S34481  CON    9    5:04  12   rs4883195   rs4883195   8924893  8924893   0        1     NA    NA
S34481  UNION  9    5:04  12   rs917589    rs6488666   3412660  14284369  10871.7  2757  NA    NA
Tags: snp, geneplotter, snpchip
Answer: Bioconductor / plotting SNPs, runs of homozygosity
9.4 years ago, Vincent J. Carey, Jr. (United States) wrote:
I am sorry to say that this query does not seem very clear to me. There are plenty of facilities for visualizing genomic data in Bioconductor, and the general graphical facilities of R might be suitable for what you are seeking.

The data have a block structure over genomic coordinates, and if you can get PLINK to emit that data in BED format, for example, the rtracklayer package could import it for further numerical manipulation. Clearly, if you can produce BED format, you can also visualize the data in a genome browser by importing it as a custom track. But the details of what you want to show from the ROH data need to be clarified before further suggestions can be made.

On Mon, Jul 12, 2010 at 3:47 PM, Min-Han Tan wrote:
> [...]
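The BED route suggested above can be sketched in R as follows. This is only an illustration under stated assumptions: the per-individual rows are copied from the question, PHE is taken to follow the usual PLINK coding (1 = control, 2 = case, which matches the 5:4 split in this pool), the `chr` prefix and the output file name are hypothetical choices, and PLINK's 1-based inclusive positions are converted to BED's 0-based start. The rtracklayer step runs only if that package is installed.

```r
# ROH segments from the question (per-individual rows only; CON/UNION omitted).
roh <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
IID PHE CHR BP1     BP2
X1  1   12  8924893 9939332
X2  2   12  8920868 10087355
X3  2   12  8920732 10375281
X4  2   12  6775525 8924893
X5  2   12  3412660 12393268
X6  1   12  8034750 10428536
X7  1   12  8136239 9982949
X8  2   12  8153839 14284369
X9  1   12  8880044 10375636
")

# Build the four leading BED columns: chrom, chromStart, chromEnd, name.
# Assumption: PHE 2 = case, PHE 1 = control (usual PLINK coding).
bed <- data.frame(
  chrom      = paste0("chr", roh$CHR),  # "chr" prefix is an assumption
  chromStart = roh$BP1 - 1L,            # BED starts are 0-based
  chromEnd   = roh$BP2,                 # BED ends are exclusive
  name       = paste(roh$IID, ifelse(roh$PHE == 2, "case", "control"), sep = "_")
)

# Write a tab-separated BED file (hypothetical file name, in a temp directory).
bed_file <- file.path(tempdir(), "roh_chr12.bed")
write.table(bed, bed_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Optional: read the file back as genomic ranges if rtracklayer is available.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  gr <- rtracklayer::import(bed_file, format = "BED")
  print(gr)
}
```

The same file could be uploaded as a custom track to a genome browser, provided the coordinates match the assembly the browser displays.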
I apologize. To clarify: I have runs-of-homozygosity data that vary in terms of case/control ratios (in my example data, it would be 5 cases : 4 controls for this particular pool).

Ideally, I would like to be able to plot a graph for each chromosome, with the y axis being the case/control ratio and the x axis the distance along the chromosome. Each ROH segment would be marked out along the chromosome, based on the input data format.

Sorry if this question seems extremely straightforward, and thanks for the pointer to rtracklayer.

Min-Han

On Mon, Jul 12, 2010 at 3:58 PM, Vincent Carey wrote:
> [...]
Reply written 9.4 years ago by Min-Han Tan
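Marking each ROH segment along the chromosome, as described in the reply above, needs nothing beyond base R graphics. The sketch below is one possible layout, not a packaged solution: it uses the nine per-individual rows from the question, assumes PHE 2 = case and PHE 1 = control, and the colours and title are arbitrary choices.

```r
# ROH segments from the question (per-individual rows only).
roh <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
IID PHE CHR BP1     BP2
X1  1   12  8924893 9939332
X2  2   12  8920868 10087355
X3  2   12  8920732 10375281
X4  2   12  6775525 8924893
X5  2   12  3412660 12393268
X6  1   12  8034750 10428536
X7  1   12  8136239 9982949
X8  2   12  8153839 14284369
X9  1   12  8880044 10375636
")

# Order segments by start position so the picture reads left to right.
roh <- roh[order(roh$BP1), ]
n   <- nrow(roh)

# Assumption: PHE 2 = case, PHE 1 = control (usual PLINK coding).
seg_col <- ifelse(roh$PHE == 2, "firebrick", "steelblue")

# Empty frame: x axis in megabases, one row per individual on the y axis.
plot(NA, xlim = range(c(roh$BP1, roh$BP2)) / 1e6, ylim = c(0.5, n + 0.5),
     xlab = "Position on chromosome 12 (Mb)", ylab = "", yaxt = "n",
     main = "Runs of homozygosity, pool S34481")

# One horizontal bar per ROH segment.
segments(x0 = roh$BP1 / 1e6, y0 = seq_len(n), x1 = roh$BP2 / 1e6,
         col = seg_col, lwd = 4)
axis(2, at = seq_len(n), labels = roh$IID, las = 1)
legend("topright", legend = c("case", "control"),
       col = c("firebrick", "steelblue"), lwd = 4, bty = "n")
```

For several chromosomes, the same code could be run once per value of CHR, for example inside a loop over `split(roh, roh$CHR)`.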
GenomeGraphs is another relevant package. Look at the vignette.

The problem is not completely straightforward to solve, but you can compute the ratios and locations using R alone. Or, if you find the additional facilities in GenomeGraphs for rendering aspects of genomic context useful, then go for it, and perhaps you can write/contribute some utilities for nice renderings of this sort of data.

This particular graphing problem MIGHT have been solved outside of Bioconductor, so going over the packages in the CRAN "Genetics" task view and checking the PLINK-related list may be in order.

On Mon, Jul 12, 2010 at 4:19 PM, Min-Han Tan wrote:
> [...]
Reply written 9.4 years ago by Vincent J. Carey, Jr.
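Computing "the ratios and locations using R alone" could look roughly like the sketch below. It is one interpretation of the request, not an established method: the chromosome is cut at every ROH start and end, and for each resulting interval the number of cases and controls whose ROH spans the whole interval is counted. PHE 2 = case and PHE 1 = control is assumed, and the helper `covers` is a hypothetical name introduced here.

```r
# ROH segments from the question (per-individual rows only).
roh <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
IID PHE CHR BP1     BP2
X1  1   12  8924893 9939332
X2  2   12  8920868 10087355
X3  2   12  8920732 10375281
X4  2   12  6775525 8924893
X5  2   12  3412660 12393268
X6  1   12  8034750 10428536
X7  1   12  8136239 9982949
X8  2   12  8153839 14284369
X9  1   12  8880044 10375636
")

# Cut the chromosome at every segment boundary.
breaks <- sort(unique(c(roh$BP1, roh$BP2)))
iv <- data.frame(start = head(breaks, -1), end = tail(breaks, -1))

# Count individuals of a given phenotype whose ROH spans the interval [a, b].
covers <- function(a, b, phe) {
  sum(roh$BP1 <= a & roh$BP2 >= b & roh$PHE == phe)
}
iv$cases    <- mapply(covers, iv$start, iv$end, MoreArgs = list(phe = 2))
iv$controls <- mapply(covers, iv$start, iv$end, MoreArgs = list(phe = 1))

# Case/control ratio; Inf where only cases overlap, NaN where nobody does.
iv$ratio <- iv$cases / iv$controls
print(iv)

# Plot finite ratios as horizontal steps along the chromosome (Mb).
ok <- is.finite(iv$ratio)
plot(NA, xlim = range(breaks) / 1e6, ylim = c(0, max(iv$ratio[ok])),
     xlab = "Position on chromosome 12 (Mb)", ylab = "Case/control ratio")
segments(x0 = iv$start[ok] / 1e6, y0 = iv$ratio[ok],
         x1 = iv$end[ok] / 1e6, lwd = 3)
```

Two caveats on this design: overlaps at a single base (such as the 0 KB consensus at rs4883195) are not represented by interval counts, and intervals with no controls give an infinite ratio, so a case fraction, cases / (cases + controls), may be easier to plot.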