Genome database and transcription database have different lengths
sarac ▴ 20
@sarac-21308

While trying to assign transcript IDs to my tags using the CAGEfightR package:

TSSs <- assignTxID(TSSs, txModels = txdb, swap="thick")

I was met with the following error:

Error: seqlengths(object) not identical to seqlengths(txModels)

Unless I am mistaken, this seems to arise because

length(seqlengths(TSSs))

is 455 because TSSs was created by the quantifyCTSSs function with the BSgenome.Hsapiens.UCSC.hg38 genome, while

length(seqlengths(txdb))

is 595 because txdb is the TxDb.Hsapiens.UCSC.hg38.knownGene database. Unfortunately my R-Fu is not advanced enough to solve this. Is there a way to easily rectify this, or am I missing something?

CAGEfightR CAGE • 1.0k views
maltethodberg ▴ 180
@maltethodberg-9690
Denmark

You are correct, CAGEfightR is complaining that the two genomes (obtained via seqinfo()/seqlengths() ) are not identical. I have seen a couple of people having similar problems, so CAGEfightR is probably currently a bit too strict in enforcing this. We will probably remove this error in future versions of CAGEfightR and replace it with a warning instead.
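To see exactly where the two genomes disagree, the named vectors returned by seqlengths() can be compared directly. The following is a minimal base-R sketch; compareSeqlengths is a hypothetical helper (not part of CAGEfightR), and the toy vectors, including the contig name chrUn_toy, are made up for illustration.

```r
# Hypothetical helper: compare two named seqlengths vectors,
# such as seqlengths(TSSs) and seqlengths(txdb).
compareSeqlengths <- function(a, b) {
  shared <- intersect(names(a), names(b))
  # Shared sequences whose recorded lengths disagree (NA-safe comparison)
  same <- vapply(shared,
                 function(s) isTRUE(all.equal(a[[s]], b[[s]])),
                 logical(1))
  list(
    onlyInA = setdiff(names(a), names(b)),   # sequences present only in a
    onlyInB = setdiff(names(b), names(a)),   # sequences present only in b
    lengthDiffers = shared[!same]            # same name, different length
  )
}

# Toy stand-ins for the real objects (made-up extra contig)
tags   <- c(chr1 = 248956422L, chr2 = 242193529L)
models <- c(chr1 = 248956422L, chr2 = 242193529L, chrUn_toy = 1000L)
res <- compareSeqlengths(tags, models)
print(res)
```

With the real objects, the call would be compareSeqlengths(seqlengths(TSSs), seqlengths(txdb)); in a case like the one in the question, one would expect the extra sequences to show up under onlyInB.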

Two ways around this:

1) Simply use the genome from seqinfo(txdb) in quantifyCTSSs.
2) Try to overwrite the seqinfo objects, see for example here: https://support.bioconductor.org/p/118989/#119085.
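The first option could look roughly like the sketch below. It assumes that quantifyCTSSs accepts a Seqinfo object through its genome argument and that the BigWig files are compatible with the TxDb genome. The wrapper name requantifyWithTxdbGenome and the argument names bwPlus, bwMinus and designMatrix are hypothetical placeholders for your own objects.

```r
# Hypothetical wrapper for option 1: quantify CTSSs on the TxDb's genome
# so that the resulting object carries the same seqinfo as txdb.
# Assumes CAGEfightR and the TxDb package are installed and loaded by the caller.
requantifyWithTxdbGenome <- function(bwPlus, bwMinus, designMatrix, txdb) {
  CAGEfightR::quantifyCTSSs(
    plusStrand  = bwPlus,        # BigWig files, plus strand
    minusStrand = bwMinus,       # BigWig files, minus strand
    design      = designMatrix,  # sample annotation
    genome      = seqinfo(txdb)  # use the TxDb genome instead of the BSgenome one
  )
}
```

The object returned here replaces the original quantifyCTSSs output, so the downstream steps that produced TSSs need to be rerun before calling assignTxID again.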

Hope this helps!


Thank you! Number 1 worked a treat.
