SNP Injection in Custom BSGenome
@michaelweber1-11392
Last seen 3.5 years ago

Dear all Bioconductor users,

The VariantAnnotation package is a great tool for finding non-synonymous SNPs in a VCF file. My question is:
How can I create a new genome that includes the SNPs found?

I know this is possible for human data using the pre-generated SNPlocs objects. But how can I achieve the same for a less common microorganism such as Candida albicans?

At the moment I am doing it step by step:

1) Getting the gene IDs and CDS ranges from the TxDb package with cdsBy()
2) Extracting the sequences from the BSgenome object:
extractTranscriptSeqs(BSgenome.CAlbicans,cdsList)

3) Inserting the SNPs from a data.frame, one at a time:
for (i in seq_len(nrow(geneSnpDF))) {
    seq[geneSnpDF[i, "POS"]] <- DNAString(geneSnpDF[i, "ALT"])
}

What I would like to have is a BSgenome.CAlbMutated object, so that I can directly call:
extractTranscriptSeqs(BSgenome.CAlbicans,cdsList)
extractTranscriptSeqs(BSgenome.CAlbMutated,cdsList)

in order to compare the sequences in an alignment.

For Homo sapiens it is as easy as: injectSNPs(BSgenome.Homo, snps)

variantannotation bsgenome • 752 views
@herve-pages-1542
Last seen 8 hours ago
Seattle, WA, United States

Hi Michael,

We don't provide an easy way to inject arbitrary SNPs into an arbitrary BSgenome object at the moment. However, it should not be too hard to forge a BSgenome.CAlbMutated package. First, compute the sequences of the mutated chromosomes (you can use replaceLetterAt() from Biostrings for this). Then write them to a 2bit file (put them in a DNAStringSet object and call rtracklayer::export() on it). Finally, use that 2bit file to forge the BSgenome.CAlbMutated package (see the BSgenomeForge vignette in the BSgenome package for how to do this).
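The first two steps could look roughly like the sketch below. It uses a toy 12-letter sequence in place of a real chromosome; the names ref_chr, snps, "chr1" and the output path are hypothetical placeholders, and the SNP table is assumed to hold 1-based chromosome coordinates (as in a VCF POS column) with single-letter ALT alleles. The export step is only attempted if rtracklayer is installed.

```r
library(Biostrings)

# Toy "chromosome" standing in for one sequence of the reference genome
# (assumption: in practice this would come from the reference BSgenome object).
ref_chr <- DNAString("ACGTACGTACGT")

# Hypothetical SNP table: 1-based chromosome positions and single-letter ALT alleles.
snps <- data.frame(POS = c(2L, 7L), ALT = c("T", "A"), stringsAsFactors = FALSE)

# Substitute all SNPs on this chromosome in a single call instead of a loop.
mut_chr <- replaceLetterAt(ref_chr, at = snps$POS, letter = snps$ALT)

# Collect the mutated chromosomes in a named DNAStringSet
# (one element per chromosome; here only a single toy one).
mutated <- DNAStringSet(mut_chr)
names(mutated) <- "chr1"

# Write the set to a 2bit file, which can then serve as the seed
# sequence file when forging the package.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
    out_file <- file.path(tempdir(), "CAlbMutated.2bit")
    rtracklayer::export(mutated, out_file)
}
```

For a real genome, the replaceLetterAt() call would be repeated for each chromosome, using only the SNPs located on that chromosome. Forging the package itself still follows the BSgenomeForge vignette.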

Hope this helps,

H.