smlSet and snpMatrix / copy number analysis
@nathan-harmston-2904
Last seen 9.6 years ago
Hi all,

I am currently looking at the GGtools package, specifically the smlSet class. I am trying to integrate SNP data that I am using to study CNV with the related gene expression information, and I believe this may be the class I need. However, because of its design, snp.matrix cannot store the log2 ratios and associated copy numbers. Is there an existing package or class that handles this? I have spent some time searching the Bioconductor list and can't seem to find one. Could I, in theory, alter the smlSet class to store my own version of snp.matrix instead?

Many thanks in advance,
Nathan
SNP GGtools
@vincent-j-carey-jr-4
Last seen 6 weeks ago
United States
In its current form, snp.matrix is tailored to discrete genotype data represented as raw bytes. The raw representation gives us space and speed advantages, but using it requires special code written in C, all of which lives in the snpMatrix package. I am interested in supporting CNV-related information in a similar integrative structure, but at the moment I do not have such data.

I believe smlSet is a reasonable starting point for designing such an integrative structure, but a few points are in order:

1) smlSet stands for "snp.matrix list". It caters for an application with 4 million SNPs per sample distributed over 24 list elements representing chromosomes, because I am dealing with the full HapMap phase II SNP set. For 500k-type assays it would not be necessary to decompose into chromosomes, and some simplifications would follow.

2) A nontrivial component of the smlSet infrastructure deals with managing SNP location data, again for 4 million SNPs factored into chromosomes. This is not done in an optimal way and needs to be redesigned. Managing 4 million locations is not pleasant on standard hardware; currently SQLite is used. Data frames and netCDF were examined and found wanting in various respects for the applications targeted thus far.
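To make the storage issue concrete, here is a minimal sketch in Python (not R, and not the snpMatrix or GGtools code). It assumes a hypothetical one-byte genotype coding (0 for a missing call, 1 to 3 for AA, AB, BB), toy SNP identifiers, and a made-up `snp_loc` table. It only illustrates why a byte-coded matrix cannot hold continuous log2 ratios, how per-chromosome list elements might look, and how SQLite can answer location queries.

```python
import sqlite3
from array import array

# Hypothetical byte codes for discrete genotype calls (0 = missing call).
GENOTYPE_CODES = {"NA": 0, "AA": 1, "AB": 2, "BB": 3}


def encode_genotypes(calls):
    """Pack genotype calls into one byte per call."""
    return bytearray(GENOTYPE_CODES[c] for c in calls)


def build_location_db(locations):
    """Create an in-memory SQLite table of (snp_id, chrom, pos) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE snp_loc (snp_id TEXT PRIMARY KEY, chrom TEXT, pos INTEGER)"
    )
    con.executemany("INSERT INTO snp_loc VALUES (?, ?, ?)", locations)
    # Index on (chrom, pos) so region lookups do not scan every row.
    con.execute("CREATE INDEX idx_chrom_pos ON snp_loc (chrom, pos)")
    return con


def snps_in_region(con, chrom, start, end):
    """Return SNP ids on `chrom` with start <= pos <= end, ordered by position."""
    rows = con.execute(
        "SELECT snp_id FROM snp_loc "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos",
        (chrom, start, end),
    )
    return [r[0] for r in rows]


# Toy data for one sample: discrete genotypes, one list element per chromosome.
genotypes_by_chrom = {
    "1": encode_genotypes(["AA", "AB", "NA"]),
    "2": encode_genotypes(["BB", "AA"]),
}

# log2 ratios are continuous, so they need a floating-point container
# alongside (not inside) the byte-coded genotypes.
log2_by_chrom = {
    "1": array("d", [0.02, -0.98, 0.57]),
    "2": array("d", [0.00, 0.11]),
}

# Toy SNP locations; identifiers and positions are invented for illustration.
con = build_location_db([
    ("rs_a", "1", 1000), ("rs_b", "1", 2500), ("rs_c", "1", 9000),
    ("rs_d", "2", 400), ("rs_e", "2", 800),
])
```

With this layout, `snps_in_region(con, "1", 500, 3000)` selects the SNPs in a window on chromosome 1, and the same identifiers could index both the genotype bytes and the log2 ratio arrays. A CNV-aware container would need the second, floating-point structure, which is the part the byte representation cannot provide.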
Bottom line: I'd be happy to hear more about your requirements, possibly off the list, and we could discuss design steps for the relevant container structure and methods. We could introduce more tools in GGtools in short order. Could you please indicate your affiliation?