Can't produce a GeneRegionTrack from my TxDb anymore
A.J. • 0
@aj-24333
Last seen 5 weeks ago
United States

Hi,

While this was working in the past, I can't produce a Gene Region Track from my txdb object anymore.

I produce a txdb with:

txdb <- makeTxDbFromGFF("/mnt/Prairie_Vole_Data/Arjen_Folder/Arjen/genomefiles_mo/NCBI/GCF_000317375.1_MicOch1.0_genomic.gff")


which results in some warnings:

Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
some transcripts have no "transcript_id" attribute ==> their name ("tx_name" column in the TxDb object) was set to NA
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
the transcript names ("tx_name" column in the TxDb object) imported from the "transcript_id" attribute are not unique
3: In .find_exon_cds(exons, cds) :
The following transcripts have exons that contain more than one CDS (only the first CDS was kept for each exon): rna-NM_001289870.1,
rna-NM_001290102.1, rna-NM_001290499.1, rna-NM_001291243.1


I don't think those warnings were a problem in the past, but I am unsure about that. In any case, when I try to run

Gene_track <- GeneRegionTrack(txdb, chromosome = "NW_004949099.1", start = 2632436, end = 26371904, options(ucscChromosomeNames=FALSE))


I get this back:

Error in `rownames<-`(`*tmp*`, value = names(x)) :
  missing values not allowed in rownames


I did update Gviz and GenomicFeatures after this problem appeared, but that did not help. Advice would be highly appreciated!

Arjen

gviz GenomicFeatures • 136 views
@james-w-macdonald-5106
Last seen 4 hours ago
United States

It's not Gviz, but the GFF, which apparently has lots of transcripts with no names?

> txdb <- makeTxDbFromGFF("https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/317/375/GCF_000317375.1_MicOch1.0/GCF_000317375.1_MicOch1.0_genomic.gff.gz")
> tx <- transcripts(txdb)
> tx.sub <- subsetByOverlaps(tx, GRanges("NW_004949099.1", IRanges(2632436, 26371904)))
> mcols(tx.sub)$tx_name
  [1] "XM_005364846.2" "XM_005364850.2" "XM_005364849.2" "XM_026788302.1"
  [5] "XM_005364856.3" "XM_026788298.1" "XM_026788299.1" "XM_026788300.1"
  [9] NA               "XM_005364859.3" NA               "XM_013352944.1"
 [13] NA               NA               NA               NA
 [17] "XM_005365305.1" NA               "XM_005364862.1" "XM_005364864.2"
 [21] "XM_005364865.2" "XM_005364863.2" "XM_005364868.1" "XM_005364869.3"
 [25] "XM_026788075.1" "XM_026788076.1" "XM_026788077.1" "XM_005364870.2"
 [29] "XM_013353040.2" "XM_005364871.3" "XM_005364872.3" "XM_005364873.2"
 [33] "XM_005364881.2" "XM_005364882.3" "XM_005364883.3" "XM_005364886.2"
 [37] "XM_005364889.3" "XM_005364890.3" "XM_026788021.1" "XM_026788022.1"
 [41] "XM_026788020.1" "XM_005364892.3" "XM_005364893.3" "XM_005364894.3"
 [45] "XM_013352888.2" "XM_005364897.2" "XM_013352977.2" "XM_026788176.1"
 [49] "XM_005364896.2" "XM_005364898.2" NA               "XR_003378481.1"
 [53] "XR_003378499.1" "XR_003378505.1" "XM_005364915.3" "XM_026788062.1"
 [57] "XM_026788063.1" NA               "XM_005364917.3" NA
 [61] "XM_005364920.3" "XM_026788273.1" "XM_026788274.1" "XM_005364927.1"
 [65] NA               "XM_026788280.1" "XM_026788279.1" NA
 [69] NA               "XM_005364941.3" "XM_005364949.2" "XM_026788286.1"
 [73] "XR_003378507.1" NA               NA               "XM_013352993.2"
 [77] "XM_005364953.2" "XM_013352991.1" "XM_013352992.2" "XM_013352990.2"
 [81] "XM_026788049.1" "XM_005364955.3" "XM_013353087.2" "XM_026788275.1"
 [85] "XM_026788276.1" "XM_026788158.1" "XM_026788152.1" "XM_026788153.1"
 [89] "XM_026788157.1" "XM_026788154.1" "XM_026788155.1" "XM_026788156.1"
 [93] "XM_005364956.2" "XM_013352997.2" "XM_013352998.1" "XM_005364961.3"
 [97] "XM_005364963.2" "XM_026788272.1" NA               "XM_005364964.3"
[101] "XM_026788287.1" "XM_005364970.3" "XM_005364972.3" "XM_005364973.3"
[105] "XM_013352964.2" "XM_013352965.2" "XM_026788263.1" "XM_026788264.1"
[109] "XM_026788265.1" "XM_026788266.1" "XM_005364968.3" "XM_013352962.2"
[113] "XM_005364974.3" "XM_005364975.3" "XM_005364976.3" "XM_026788236.1"
[117] "XM_013353050.2" "XM_026788235.1" "XM_013353051.2" "XM_005364981.2"
[121] "XM_026788052.1" "XM_005364984.2" "XM_026788278.1" "XM_005364845.2"
[125] "XM_005364847.2" "XM_005364848.2" "XM_026788289.1" "XM_026788290.1"
[129] "XM_005364853.3" "XM_026788291.1" "XM_013353088.2" "XM_005364855.3"
[133] NA               "XM_005364857.2" "XR_003378487.1" "XM_005364858.2"
[137] "XM_013352886.1" "XM_005364867.1" "XM_005365306.3" "XR_003378490.1"
[141] "XM_005364877.3" "XM_013352979.2" "XM_026788144.1" "XM_013352978.2"
[145] "XM_026788140.1" "XM_005364874.2" "XM_005364875.2" "XM_026788139.1"
[149] "XM_026788142.1" "XM_026788143.1" "XM_005364880.3" "XM_026788310.1"
[153] "XM_005364884.3" "XM_013352887.1" "XM_013352946.1" "XM_005364885.3"
[157] "XM_026788247.1" "XM_005364887.3" "XM_026788246.1" NA
[161] "XM_013353061.2" "XM_026788171.1" "XM_026788172.1" "XM_026788173.1"
[165] NA               "XM_005364899.2" "XM_005364900.3" "XM_005364901.3"
[169] "XM_026788043.1" "XM_005364903.2" "XM_026788042.1" "XM_005364904.2"
[173] "XM_026788040.1" "XM_026788041.1" "XM_005364905.3" NA
[177] "XM_013352984.2" "XM_026788185.1" "XM_026788186.1" "XM_026788194.1"
[181] "XM_013352986.2" "XM_005364906.3" "XM_026788182.1" "XM_026788184.1"
[185] "XM_026788183.1" "XM_026788190.1" "XM_026788191.1" "XM_026788189.1"
[189] "XM_005364911.3" "XM_026788187.1" "XM_026788188.1" "XM_026788195.1"
[193] "XM_026788192.1" "XM_026788196.1" NA               NA
[197] NA               "XM_005364916.2" "XM_013353011.2" "XM_026788061.1"
[201] "XM_005364918.3" NA               "XR_001229360.2" "XR_003378485.1"
[205] "XM_005364921.3" "XM_005364922.3" "XM_005364923.2" "XR_003378504.1"
[209] "XM_005364924.3" "XM_005364926.3" "XM_005364925.3" "XM_005365308.2"
[213] "XM_005364928.3" "XM_026788213.1" "XM_005364937.2" "XM_026788257.1"
[217] "XM_026788259.1" "XM_026788260.1" "XM_026788253.1" "XM_026788254.1"
[221] "XM_026788255.1" "XM_026788256.1" "XM_005364939.3" "XM_005364943.2"
[225] "XM_005364945.2" NA               "XM_005364948.3" "XM_013353064.2"
[229] "XM_026788305.1" NA               "XM_026788227.1" "XM_026788228.1"
[233] NA               NA               NA               "XM_005365310.3"
[237] NA               NA               "XM_005364957.3" "XM_005364959.3"
[241] "XM_013352996.1" "XM_005364958.2" "XM_005364965.2" NA
[245] NA               NA               "XR_003378498.1" NA
[249] "XM_005364982.2" "XM_005364983.2" "XM_005364985.2"
>

When you use a TxDb as input for GeneRegionTrack, lots of stuff goes on under the hood. You are getting jammed up when the exons are extracted, and then the transcript names are appended as the names to the exons. Since there are NA values for the transcript names, and those aren't allowed as names for the exons, you get an error. You were warned about this when you made the TxDb though:

Warning messages:
1: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
some transcripts have no "transcript_id" attribute ==> their name ("tx_name" column in the TxDb object) was set to NA
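
To see the failure in isolation, here is a minimal sketch on a toy GRanges. The coordinates are made up and are not taken from the GFF; only the sequence name and two transcript names are borrowed from the output above. It counts the unnamed transcripts and then tries to use the names as row names of a DataFrame, which is expected to fail in the same way GeneRegionTrack does. That the error comes from the DataFrame `rownames<-` method is my assumption, based on the wording of the message.

```r
library(GenomicRanges)  # also attaches IRanges and S4Vectors

# Toy stand-in for transcripts(txdb): three transcripts, one without a name.
# The ranges are invented for illustration only.
tx <- GRanges("NW_004949099.1",
              IRanges(start = c(100, 500, 900), width = 200),
              tx_name = c("XM_005364846.2", NA, "XM_005364850.2"))

# How many transcripts lack a name, and which ones are they?
n_missing <- sum(is.na(mcols(tx)$tx_name))
tx[is.na(mcols(tx)$tx_name)]

# Try to use the transcript names as row names and capture the outcome
# instead of stopping the session.
df <- DataFrame(width = width(tx))
msg <- tryCatch({
  rownames(df) <- mcols(tx)$tx_name
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

On the real data, the same two checks, `sum(is.na(mcols(tx.sub)$tx_name))` and `tx.sub[is.na(mcols(tx.sub)$tx_name)]`, show how many transcripts in the plotted region are affected and where they are.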


There might be a way to fix this so you can use the TxDb directly, which is admittedly much better than generating a GRanges and using that (it's not nearly as informative). But maybe you are stuck with the GRanges option...
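
One possible route, sketched below and not tested against this GFF: import the file as a GRanges, fill in the missing `transcript_id` values from the `ID` attribute, and build the TxDb from the patched GRanges. The helper `fill_missing_ids()` is a hypothetical name introduced here, the toy IDs are invented, and the idea that `ID` is an acceptable stand-in for a transcript name is an assumption. It also does nothing about the duplicated transcript names from the second warning.

```r
# Hypothetical helper: replace NA entries in `transcript_id` with the
# corresponding entries of `fallback_id` (e.g. the GFF3 ID attribute).
fill_missing_ids <- function(transcript_id, fallback_id) {
  stopifnot(length(transcript_id) == length(fallback_id))
  missing <- is.na(transcript_id)
  transcript_id[missing] <- fallback_id[missing]
  transcript_id
}

# Toy check with invented IDs: the second and fourth entries have no name.
filled <- fill_missing_ids(
  c("XM_005364846.2", NA, "XR_003378481.1", NA),
  c("rna-toy1", "rna-toy2", "rna-toy3", "rna-toy4")
)
filled

# Untested application to the real file (assumes a local copy of the GFF
# and that the imported GRanges has `transcript_id` and `ID` columns):
# library(rtracklayer)
# library(GenomicFeatures)
# gr <- import("GCF_000317375.1_MicOch1.0_genomic.gff", format = "gff3")
# gr$transcript_id <- fill_missing_ids(gr$transcript_id, gr$ID)
# txdb_fixed <- makeTxDbFromGRanges(gr)
# anyNA(mcols(transcripts(txdb_fixed))$tx_name)  # hoping for FALSE
```

If the last check still reports missing names, the TxDb will presumably hit the same error in GeneRegionTrack, and the GRanges option remains the fallback.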


Hi James,

Thanks for the comment. I noticed this as well, but was unsure if that mattered. But thanks for showing me the exact problem, and I'll see if I can fix it. It is so weird though, because I was able to use this code before and get nice gene region tracks (with UTR and CDS information). I am now using the GRanges option, but it feels like going backwards (it actually is).

Thanks, Arjen


If anybody has a quick solution to this (as it has been working before), I would appreciate that a lot!