Question: Building a reduced BSgenome.Ggallus.UCSC.galGal4 from Bioconductor
3.0 years ago, pertille0 (Sweden) wrote:

Bioconductor provides the BSgenome.Ggallus.UCSC.galGal4 package, which can be installed and loaded with the following commands and then used in various analyses:

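# biocLite() is no longer supported; current Bioconductor releases install packages with BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("BSgenome.Ggallus.UCSC.galGal4")
library(BSgenome.Ggallus.UCSC.galGal4)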

We want to develop a new analysis approach for reduced-representation sequencing data. For that, we need a "reduced reference genome" in the same format as the BSgenome packages provided by Bioconductor, so that we can use the analysis tools that are already available. Our idea is to build this genome from a merged alignment of 20 animals sequenced with this reduced-representation method.

Would it be possible?


What exactly does 'reduced-reference genome' mean?

It refers to restriction site-associated DNA sequencing (RAD-seq). If you digest the DNA with a restriction enzyme, for example PstI, you can build a sequencing library from only the fragments produced by that enzyme that fall within a specific length range. These fragments represent a small percentage of the genome, but they are consequently sequenced at high depth.
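
To make the idea concrete, here is a minimal in-silico sketch of the digest and size-selection step in base R. It is only an illustration: the toy sequence and the size window are made-up values, and it only tracks the PstI cut (CTGCA^G) on one strand.

```r
# Toy sequence with three PstI recognition sites (CTGCAG); purely illustrative.
genome <- paste0("AAA", "CTGCAG", "TTTTTTTT", "CTGCAG", "CCC", "CTGCAG", "GG")

# PstI cuts within its site as CTGCA^G, so each cut falls after the 5th base.
site_starts <- as.integer(gregexpr("CTGCAG", genome, fixed = TRUE)[[1]])
cuts <- site_starts + 4L  # position of the last base of each left-hand fragment

# Turn cut positions into fragment coordinates.
frag_start <- c(1L, cuts + 1L)
frag_end   <- c(cuts, nchar(genome))
frags <- data.frame(start  = frag_start,
                    end    = frag_end,
                    length = frag_end - frag_start + 1L)

# Hypothetical size-selection window (toy values chosen to fit the toy sequence).
min_len <- 9L
max_len <- 14L
keep <- frags$length >= min_len & frags$length <= max_len

# Fragments that would enter the library, and the share of the genome they cover.
selected <- frags[keep, ]
fraction_retained <- sum(selected$length) / nchar(genome)
print(selected)
print(fraction_retained)
```

Only the fragments inside the window end up in the library, which is why the data cover a small part of the genome at high depth.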

Hi,

You can build a BSgenome data package from any set of DNA sequences, as long as the sequences are uniquely named and available in a FASTA file, a 2bit file, or a collection of FASTA files. See the BSgenomeForge vignette in the BSgenome software package for more information. Hopefully you'll end up with a BSgenome data package that works with the tools already available. Keep in mind, though, that most analyses also need annotations that match your BSgenome object, i.e. that describe and report genomic features with respect to its sequences. Most annotation providers (e.g. NCBI, UCSC, Ensembl) only provide annotations for reference genomes, so depending on your analysis you might also need to find annotations that match your "reduced" BSgenome, or tweak and merge existing ones.
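
As a rough sketch of what preparing the inputs could look like: the code below writes one FASTA file per uniquely named sequence plus a seed file, using base R only. Everything specific here is an assumption for illustration: the package name, contig names, and metadata values are hypothetical, and the seed field names follow the BSgenomeForge vignette as I understand it. Those fields have changed between BSgenome releases, so check the vignette that ships with your installed version before relying on them.

```r
# Sketch only: every name below (package name, contig names, metadata values)
# is a hypothetical placeholder, not something taken from a real data set.
work_dir <- file.path(tempdir(), "reduced_galGal4")
dir.create(work_dir, showWarnings = FALSE, recursive = TRUE)

# 1. One FASTA file per uniquely named sequence. These toy strings stand in
#    for the consensus contigs derived from the merged alignment.
contigs <- c(contig1 = "ACGTACGTACGT", contig2 = "CTGCAGGATTACA")
stopifnot(!anyDuplicated(names(contigs)))
for (nm in names(contigs)) {
  writeLines(c(paste0(">", nm), contigs[[nm]]),
             file.path(work_dir, paste0(nm, ".fa")))
}

# 2. Seed file in DCF ("Field: value") format. Field names are assumed from
#    the BSgenomeForge vignette and may differ in your BSgenome version.
seed <- c(
  Package           = "BSgenome.Ggallus.Local.galGal4reduced",
  Title             = "Reduced-representation contigs for Gallus gallus (toy example)",
  Description       = "Contigs built from a merged reduced-representation alignment.",
  Version           = "0.0.1",
  organism          = "Gallus gallus",
  common_name       = "Chicken",
  genome            = "galGal4reduced",
  provider          = "Local",
  release_date      = "unknown",
  source_url        = "NA",
  organism_biocview = "Gallus_gallus",
  BSgenomeObjname   = "Ggallus",
  circ_seqs         = "character(0)",
  seqnames          = paste(deparse(names(contigs)), collapse = ""),
  seqs_srcdir       = work_dir,
  seqfiles_suffix   = ".fa"
)
seed_file <- file.path(work_dir, "reduced_galGal4-seed")
writeLines(paste0(names(seed), ": ", seed), seed_file)

# 3. Forge the source package. Left off by default: switch it on only once
#    real contigs and metadata are in place. In recent Bioconductor releases
#    this function lives in the BSgenomeForge package rather than BSgenome.
run_forge <- FALSE
if (run_forge && requireNamespace("BSgenome", quietly = TRUE)) {
  BSgenome::forgeBSgenomeDataPkg(seed_file, destdir = work_dir)
}
```

If the forge step succeeds, the resulting source package directory can then be built and installed with R CMD build and R CMD INSTALL, as described in the vignette.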

H.