karyoploteR genome coverage plotting using BSgenome.TAIR10
rtt100 • 0
@911b91ef
United States

Hello,

I am trying to plot genome coverage over the Arabidopsis thaliana genome using karyoploteR, and I tried to get the genome from a BSgenome package. The BSgenome package I found first, "BSgenome.Athaliana.TAIR.TAIR9", is for the old TAIR9 assembly. I then found that TAIR10.1 is listed at https://bioconductor.statistik.tu-dortmund.de/packages/3.12/data/annotation/html/BSgenome.Athaliana.TAIR.TAIR10.1.html and tried to install it.

However, I get an error message when I try to install it.



# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("BSgenome.Athaliana.TAIR.TAIR10.1")

Warning message:
package 'BSgenome.Athaliana.TAIR.TAIR10.1' is not available for Bioconductor version '3.12'

A version of this package for your version of R might be available elsewhere


Feedback is greatly appreciated.

Thank you.

@james-w-macdonald-5106
United States

We are on Bioc 3.15, which has TAIR10, so you need to update your R and Bioconductor installation.
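Before reinstalling anything, it can help to check which Bioconductor release is actually in use and whether that release offers the package at all. The following is only a minimal sketch: it assumes BiocManager is installed and that the repositories are reachable, and the search pattern "BSgenome.Athaliana" is just an illustrative choice.

```r
# Sketch: report the Bioconductor version in use and list matching packages.
# `hits` stays empty if BiocManager is missing or the repositories are unreachable.
hits <- character(0)

if (requireNamespace("BiocManager", quietly = TRUE)) {
  hits <- tryCatch({
    message("Bioconductor version: ", as.character(BiocManager::version()))
    # available() lists packages in the configured repositories whose
    # names match the pattern
    BiocManager::available("BSgenome.Athaliana")
  }, error = function(e) character(0))
}

print(hits)
```

If the package name does not appear in the result, `BiocManager::install()` will report it as not available for that Bioconductor version, whatever R version is installed.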


Hello,

Thank you very much for your response. I should have mentioned this in my initial post: I started with R 4.2 and Bioc 3.15, but I was unable to install the package and got the same error. I then tried 3.12 because that is the version given in the link https://bioconductor.statistik.tu-dortmund.de/packages/3.12/data/annotation/html/BSgenome.Athaliana.TAIR.TAIR10.1.html

Thank you.


Oh, right. We did have that package and now it seems to be gone? No idea why that happened. You might try building your own. If you have BSgenome installed, there are two useful things in that package.

> library(BSgenome)
> thedir <- system.file("extdata/GentlemanLab/", package = "BSgenome")
> dir(thedir, "TAIR10")
[1] "BSgenome.Athaliana.TAIR.TAIR10.1-seed"
[2] "BSgenome.Athaliana.TAIR.TAIR10.1-tools"


The first file is a 'seed' file that you can use as a template to build a package for yourself. You would likely need to make some modifications, but it's what Herve used in the past so I assume it should still work. The second is a directory that contains some R code to parse the FASTA file into a format useful for building the package.

Those two things, plus the BSgenome vignette on forging a BSgenome data package, should be sufficient to build one for yourself.

> download.file("https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/735/GCF_000001735.4_TAIR10.1/GCF_000001735.4_TAIR10.1_genomic.fna.gz", "GCF_000001735.4_TAIR10.1_genomic.fna.gz")
trying URL 'https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/735/GCF_000001735.4_TAIR10.1/GCF_000001735.4_TAIR10.1_genomic.fna.gz'
Content type 'application/x-gzip' length 37482339 bytes (35.7 MB)

## this next bit is the code in the tools dir
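> library(Biostrings)
> library(GenomeInfoDb)
## read the downloaded FASTA into a DNAStringSet
> TAIR10.1 <- readDNAStringSet("GCF_000001735.4_TAIR10.1_genomic.fna.gz")
## the first word of each FASTA header is the RefSeq accession
> current_RefSeqAccn <- unlist(heads(strsplit(names(TAIR10.1), " ", fixed=TRUE), n=1L))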
> chrominfo <- getChromInfoFromNCBI("TAIR10.1")
> expected_RefSeqAccn <- chrominfo[ , "RefSeqAccn"]
> stopifnot(identical(expected_RefSeqAccn, current_RefSeqAccn))
> names(TAIR10.1) <- chrominfo[ , "SequenceName"]
>
> for (i in seq_along(TAIR10.1)) {
+     filename <- paste0(names(TAIR10.1)[[i]], ".fa")
+     cat("writing ", filename, "\n", sep="")
+     writeXStringSet(TAIR10.1[i], file=filename, width=50L)
+ }
writing 1.fa
writing 2.fa
writing 3.fa
writing 4.fa
writing 5.fa
writing MT.fa
writing Pltd.fa

## then I copied the seed file from the BSgenome dir to my working dir
## and changed the line that starts with seqs_srcdir: to be the same as the working dir
> library(BSgenome)
> forgeBSgenomeDataPkg("BSgenome.Athaliana.TAIR.TAIR10.1-seed")
Creating package in ./BSgenome.Athaliana.TAIR.TAIR10.1
Saving 'seqlengths' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/seqlengths.rds' ... DONE
Saving '1' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/1.rds' ... DONE
Saving '2' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/2.rds' ... DONE
Saving '3' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/3.rds' ... DONE
Saving '4' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/4.rds' ... DONE
Saving '5' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/5.rds' ... DONE
Saving 'MT' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/MT.rds' ... DONE
Saving 'Pltd' object to compressed data file './BSgenome.Athaliana.TAIR.TAIR10.1/inst/extdata/Pltd.rds' ... DONE

> install.packages("BSgenome.Athaliana.TAIR.TAIR10.1", repos = NULL, type = "source")
Installing package into 'C:/Users/jmacdon/AppData/Local/R/win-library/4.2'
(as 'lib' is unspecified)
* installing *source* package 'BSgenome.Athaliana.TAIR.TAIR10.1' ...
** using staged installation
** R
** inst
Warning message:
package 'GenomeInfoDb' was built under R version 4.2.1
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Warning: package 'GenomeInfoDb' was built under R version 4.2.1
** testing if installed package can be loaded from final location
Warning: package 'GenomeInfoDb' was built under R version 4.2.1
** testing if installed package keeps a record of temporary installation path
* DONE (BSgenome.Athaliana.TAIR.TAIR10.1)

## et voila!
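
The seed-file step described in the comments above (copy the seed file, then point its seqs_srcdir: line at the directory holding the .fa files) could also be scripted. This is only a sketch: `set_seqs_srcdir` is a hypothetical helper, not part of BSgenome, and `demo_seed` is toy content for illustration rather than the real seed file. The last part assumes the seed file still ships under extdata/GentlemanLab in your BSgenome installation and does nothing otherwise.

```r
# Hypothetical helper (not a BSgenome function): rewrite the seqs_srcdir
# field in the lines of a seed file.
set_seqs_srcdir <- function(seed_lines, srcdir) {
  is_srcdir <- grepl("^seqs_srcdir:", seed_lines)
  if (!any(is_srcdir)) stop("no 'seqs_srcdir:' field found in seed file")
  seed_lines[is_srcdir] <- paste("seqs_srcdir:", srcdir)
  seed_lines
}

# Toy seed content, for illustration only
demo_seed <- c("Package: BSgenome.Example",
               "seqs_srcdir: /some/old/path",
               "seqfiles_suffix: .fa")
patched <- set_seqs_srcdir(demo_seed, "/my/work/dir")
print(patched)

# With BSgenome installed, apply the same edit to the real seed file and
# write the patched copy into the working directory.
if (requireNamespace("BSgenome", quietly = TRUE)) {
  seed_name <- "BSgenome.Athaliana.TAIR.TAIR10.1-seed"
  seed_path <- system.file("extdata", "GentlemanLab", seed_name,
                           package = "BSgenome")
  if (nzchar(seed_path)) {
    writeLines(set_seqs_srcdir(readLines(seed_path), getwd()), seed_name)
  }
}
```

The patched copy in the working directory is then the file that gets passed to forgeBSgenomeDataPkg().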

bernatgel ▴ 150
@bernatgel-7226
Spain

Hi

If you only want to use it to plot with karyoploteR, you do not actually need a full BSgenome. You can simply create a GRanges object or a bed-like file with the chromosome names and lengths and give it as the genome to plotKaryotype as if it were a custom genome. It's not ideal, but will work.

Hope this helps
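
A minimal sketch of the custom-genome approach described above, under a few assumptions: the chromosome names ("1" to "5") must match the names used in your coverage data, and the lengths are the TAIR10 nuclear chromosome lengths as I recall them, so please verify them against your own FASTA or BAM header before relying on them. The organelle sequences are left out here.

```r
# ASSUMED lengths for the five TAIR10 nuclear chromosomes; check them
# against your own reference before use.
chrom_lengths <- c("1" = 30427671, "2" = 19698289, "3" = 23459830,
                   "4" = 18585056, "5" = 26975502)

# Bed-like table: one row per chromosome, spanning its full length
genome_df <- data.frame(chr   = names(chrom_lengths),
                        start = 1L,
                        end   = unname(chrom_lengths),
                        stringsAsFactors = FALSE)
print(genome_df)

# Plot only if the Bioconductor packages are installed
if (requireNamespace("karyoploteR", quietly = TRUE) &&
    requireNamespace("GenomicRanges", quietly = TRUE)) {
  custom_genome <- GenomicRanges::makeGRangesFromDataFrame(genome_df)
  # A GRanges with one range per chromosome is accepted as a custom genome
  kp <- karyoploteR::plotKaryotype(genome = custom_genome)
}
```

Coverage can then be drawn on top of `kp` with the usual karyoploteR plotting functions, for example kpPlotBAMCoverage for a BAM file.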