Error with BSgenome, wanting to analyse non-model species with ATACseqQC
1
0
Entering edit mode
ronin • 0
@ea9e82bb
Last seen 8 months ago
Estonia

Hello,

I am wanting to analyze my ATACseq dataset, and have been struggling a bit. I am working with a non-model organism and am trying to work through some ATACseqQC tutorials, specifically this one regarding footprint plotting:

https://bioconductor.org/packages/release/bioc/vignettes/ATACseqQC/inst/doc/ATACseqQC.html#plot_Footprints

Since I am working with a non-model species, I had to make my own genome file. I did that


library(Rsamtools)
indexFa("pfluv-genome.fa")
genome <- FaFile("pfluv-genome.fa")

library(Biostrings)
dna <- readDNAStringSet("pfluv-genome.fa")
library(rtracklayer)
export(dna, "pfluv-genome.2bit")
genome <- TwoBitFile("pfluv-genome.2bit")

seqinfo(genome)

Which gives me the following output:


library(Rsamtools)
> indexFa("pfluv-genome.fa")
[1] "pfluv-genome.fa.fai"
> genome <- FaFile("pfluv-genome.fa")
> 
> library(Biostrings)
> dna <- readDNAStringSet("pfluv-genome.fa")
> library(rtracklayer)
> export(dna, "pfluv-genome.2bit")
> genome <- TwoBitFile("pfluv-genome.2bit")
> 
> seqinfo(genome)
Seqinfo object with 304 sequences from an unspecified genome:
  seqnames       seqlengths isCircular genome
  CM020909.1       48724115       <NA>   <NA>
  CM020910.1       48524041       <NA>   <NA>
  CM020911.1       46923577       <NA>   <NA>
  CM020912.1       44863486       <NA>   <NA>
  CM020913.1       44101041       <NA>   <NA>
  ...                   ...        ...    ...
  VHII01000120.1      19993       <NA>   <NA>
  VHII01000121.1      20003       <NA>   <NA>
  VHII01000122.1      19994       <NA>   <NA>
  VHII01000123.1      20205       <NA>   <NA>
  CM020933.1          16952       <NA>   <NA>

Yet when I try to run this code here:

sigs <- factorFootprints(bamfile4, pfm=CTCF[[1]],
                         genome=genome,
                         min.score="90%", seqlev=seqlev,
                         upstream=100, downstream=100)

I get this error message:


> sigs <- factorFootprints(bamfile4, pfm=CTCF[[1]],
+                          genome=genome,
+                          min.score="90%", seqlev=seqlev,
+                          upstream=100, downstream=100)
Error in factorFootprints(bamfile4, genome = genome, min.score = "90%",  : 
  is(genome, "BSgenome") is not TRUE

So somehow my genome object is not being recognized? Perhaps there is more going on here that I am missing, but any thoughts or assistance would be really appreciated :)

Thanks!

ATACseqQC • 366 views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 7 hours ago
United States

You are creating a TwoBitFile, which is not a BSgenome object. To make a BSgenome object, you will need to use forgeBSgenomeDataPkg from the BSgenome package. More information can be found in the vignette.

Login before adding your answer.

Traffic: 592 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6