Question: Genome database and transcription database have different lengths
4 weeks ago, sarac20 wrote:

While trying to assign transcript IDs to my tags using the CAGEfightR package:

TSSs <- assignTxID(TSSs, txModels = txdb, swap="thick")

I was met with the following error:

Error: seqlengths(object) not identical to seqlengths(txModels)

Unless I am mistaken, this seems to arise because

length(seqlengths(TSSs))

is 455, because TSSs was created by the quantifyCTSSs function using the BSgenome.Hsapiens.UCSC.hg38 genome, while

length(seqlengths(txdb))

is 595, because txdb is the TxDb.Hsapiens.UCSC.hg38.knownGene database. Unfortunately, my R-fu is not advanced enough to solve this. Is there an easy way to rectify this, or am I missing something?

Answer: Genome database and transcription database have different lengths
4 weeks ago, maltethodberg130 (Sweden) wrote:

You are correct: CAGEfightR is complaining that the two genomes (as returned by seqinfo()/seqlengths()) are not identical. I have seen a couple of people run into similar problems, so CAGEfightR is probably a bit too strict in enforcing this at the moment. We will probably replace this error with a warning in future versions of CAGEfightR.
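
To see where the two objects disagree, it can help to compare their sequence levels directly. Below is a minimal sketch using toy Seqinfo objects; the sequence names and lengths are made up for illustration, and in practice you would plug in seqinfo(TSSs) and seqinfo(txdb) instead.

```r
library(GenomeInfoDb)

# Toy stand-ins for seqinfo(TSSs) and seqinfo(txdb).
# Names and lengths are hypothetical, chosen only for illustration.
si_tags <- Seqinfo(seqnames = c("chr1", "chr2"),
                   seqlengths = c(1000L, 800L),
                   genome = "toy")
si_txdb <- Seqinfo(seqnames = c("chr1", "chr2", "chr1_alt"),
                   seqlengths = c(1000L, 800L, 50L),
                   genome = "toy")

# Sequences that are present in only one of the two objects
only_in_txdb <- setdiff(seqlevels(si_txdb), seqlevels(si_tags))
only_in_tags <- setdiff(seqlevels(si_tags), seqlevels(si_txdb))

# The kind of comparison that the error message reports on
same <- identical(seqlengths(si_tags), seqlengths(si_txdb))
```

If only_in_txdb is non-empty while only_in_tags is empty, the TxDb simply carries extra sequences (for example alternative contigs) that the other genome does not list.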

Two ways around this:

1) Simply use the genome from seqinfo(txdb) in quantifyCTSSs.
2) Try and overwrite the seqinfo objects, see for example here: https://support.bioconductor.org/p/118989/#119085.
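
Roughly, the two options could look like the sketch below. The first part is commented out because it needs your own BigWig files: bw_plus, bw_minus and design are placeholder names, and it assumes the genome argument of quantifyCTSSs accepts a Seqinfo object. The second part is a toy, runnable version of overwriting the seqinfo on a GRanges, with made-up sequences and lengths.

```r
library(GenomicRanges)

# Option 1 (sketch, not run): quantify against the TxDb's genome from the start.
# bw_plus, bw_minus and design are placeholders for your own inputs.
# CTSSs <- quantifyCTSSs(plusStrand = bw_plus, minusStrand = bw_minus,
#                        design = design, genome = seqinfo(txdb))

# Option 2 (toy example): overwrite the seqinfo of an existing object.
# si_txdb stands in for seqinfo(txdb); names and lengths are hypothetical.
si_txdb <- Seqinfo(seqnames = c("chr1", "chr2", "chr1_alt"),
                   seqlengths = c(1000L, 800L, 50L),
                   genome = "toy")

# tags stands in for the object built on the smaller genome
tags <- GRanges(seqnames = "chr1",
                ranges = IRanges(start = c(10, 200), width = 1),
                strand = "+",
                seqinfo = Seqinfo(seqnames = c("chr1", "chr2"),
                                  seqlengths = c(1000L, 800L),
                                  genome = "toy"))

# Add the missing levels by name first, then replace the full seqinfo
seqlevels(tags) <- seqlevels(si_txdb)
seqinfo(tags) <- si_txdb

# After this, the seqlengths of both objects agree
ok <- identical(seqlengths(tags), seqlengths(si_txdb))
```

This only covers the case where the TxDb lists a superset of the sequences in the other object. If your object has sequences the TxDb lacks, you would have to decide how to drop them (see the pruning.mode argument of the seqlevels setter), which is not shown here.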

Hope this helps!

4 weeks ago, sarac20 replied:

Thank you! Number 1 worked a treat.