Dear all,
I was just wondering, would it be possible to create a GmapGenome from one of the genomic FaFiles from AnnotationHub? Might be eventually nice to have that as an option for GmapGenome.
cheers, jo
Dear all,
I was just wondering, would it be possible to create a GmapGenome from one of the genomic FaFiles from AnnotationHub? Might be eventually nice to have that as an option for GmapGenome.
cheers, jo
This is pretty easy already. Just GmapGenome(getSeq(file), ...). I added something to devel for this to be automatic, but I have not tested it. Obviously, this is not very efficient, but it works.
Dear Michael,
yes, indeed, it's straight forward. Actually, I did already implement a method for FaFile... in case you would like to add that...
## Add additional parameter "seqnames" that allows to export only specified
## sequences from the FaFile such that the index is build on them.
setMethod("gmap_build", c("FaFile", "GmapGenome"),
function(x, genome, seqnames=NULL, ...){
if(is.na(index(x))){
message("Creating missing index...", appendLF=FALSE)
indexFa(x)
x <- FaFile(path(x))
message("done")
}
## The question is whether we should restrict to some chromosomes?
faIndex <- scanFaIndex(x)
if(!is.null(seqnames)){
## Restrict to specified seqnames.
seqnames <- as.character(seqnames)
notPresent <- !(seqnames %in% seqnames(faIndex))
if(any(notPresent))
stop(paste0("Provided seqnames ",
paste(seqnames[notPresent], collapse=", "),
" not present in the FaFile!"))
## Subset the GRanges:
faIndex <- faIndex[seqnames(faIndex) %in% seqnames]
}
## Extract the sequences from the FaFile as DNAStringSet and
## export them a temporary file.
dds <- scanFa(x, param=faIndex)
tmpfile <- tempfile()
export(dds, con=tmpfile, format="fasta")
gmap_build(tmpfile, genome, ...)
})
This would require that you import in addition FaFile, scanFaIndex, scanFa, indexFa from Rsamtools.
Would be nice to have that (especially with the option to subset by seqnames... as usually the FaFiles, like "AH49186" from AnnotationHub with the genomic DNA sequence for GRCh38 contain a lot of additional sequences).
Thanks!
jo
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Judging from the help page of GmapGenome, is should be straightforward: