extractUpstreamSeqs with custom genome, no gene_id shown
0
0
Entering edit mode
wd ▴ 30
@wd-7410
Last seen 5.6 years ago
Germany

Hi

I'm trying to use extractUpstreamSeqs from the GenomicsFeatures package with a custom genome using the following commands:

txdb <- makeTxDbFromGFF("test.gff3",format="gff3")
genome<-FaFile("test_genome.fasta", index=sprintf("%s.fai", "test_genome.fasta"))
open(genome)
promoter_transcripts<-extractUpstreamSeqs(genome, transcripts(txdb), width=500)

I get an ouput like this (yellow shaded):

A DNAStringSet instance of length 17660
        width seq                                                    names               
    [1]   500 TCTCTTTTTTTTCTCCTTCTACTGAC...AATGAATGGAGAAGAAAAAATTTAA scaffold_1
    [2]   500 CTTTATATTAGTAAATGAATAATAAC...AAAAGAGATCCATACTTTTCAAGAA scaffold_1
    [3]   500 AGTCAAAGGTTATCTTCTTGGGAACT...ATAGGCTTTGTTGGATGCCTTATTA scaffold_1
    [4]   500 AATCTTTTTTTGCAAAGCAGCCAATA...CCTAACCTAACCTCTTTGTAAAACA scaffold_1
    [5]   500 TCTTCAGCTTACAACCATCTTCTAAT...TTGATTTGTTGATTATTTATCCCTT scaffold_1

but without gene/transcript_ids and with only the scaffold IDs being shown (see also mcols output)

mcols(promoter_transcripts)

DataFrame with 17660 rows and 3 columns
      gene_seqnames gene_strand  gene_TSS
              <Rle>       <Rle> <integer>
1        scaffold_1           +      6883
2        scaffold_1           +     23525
3        scaffold_1           +     29225
4        scaffold_1           +     31774
5        scaffold_1           +     38937

I would like to keep the geneIDs linked to the extracted upstream sequences

Any help would be much appreciated!

W

below you can find the txdb information:

# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: test.gff3
# Organism: NA
# Taxonomy ID: NA
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 17660
# exon_nrow: 66022
# cds_nrow: 62514
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2016-04-14 12:03:14 +0200 (Thu, 14 Apr 2016)
# GenomicFeatures version at creation time: 1.22.13
# RSQLite version at creation time: 1.0.0
# DBSCHEMAVERSION: 1.1

 

genomicranges txdb genomicfeatures featureextraction • 1.3k views
ADD COMMENT

Login before adding your answer.

Traffic: 1122 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6