Guest User
@guest-user-4897
I am trying to use derfinder, but it requires my genome features to be
in a TranscriptDb object from the GenomicFeatures R package.
Normally one can use the built-in function to make a TranscriptDb from
the UCSC database; however, my organism, a plant, is not included in the
UCSC database.
Consequently, I am trying to use the makeTranscriptDbFromGFF() function
to make the TranscriptDb from a GTF file I created by running TopHat and
Cuffmerge on some RNA-Seq samples. However, following the examples, I
can't get it to work.
Here is what I've got:
library(GenomicFeatures)

# Chromosome names, lengths and circularity for the eight chromosomes
chrominfo <- data.frame(chrom = c("chr1", "chr2", "chr3", "chr4",
                                  "chr5", "chr6", "chr7", "chr8"),
                        length = c(52991155, 45729672, 55515152, 56582383,
                                   43630510, 35275713, 49172423, 45569985),
                        is_circular = rep(FALSE, 8))

# Build the TranscriptDb from the Cuffmerge GTF
exons <- makeTranscriptDbFromGFF(file = "~/merged.gtf",
                                 format = "gtf",
                                 exonRankAttributeName = "exon_number",
                                 chrominfo = chrominfo)
Unfortunately this is the output:
extracting transcript information
Estimating transcript ranges.
Extracting gene IDs
Processing splicing information for gtf file.
Prepare the 'metadata' data frame ... metadata: OK
Error in .normargTranscripts(transcripts) :
values in 'transcripts$tx_strand' must be "+" or "-"
In addition: Warning message:
In if (is.na(chrominfo)) { :
the condition has length > 1 and only the first element will be used
What am I missing? I also posted this question on Biostars.
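Since the error refers to tx_strand, one thing that might be worth checking is which values actually occur in the strand column of the GTF. The following is only a sketch in base R: count_gtf_strands is a hypothetical helper (it is not part of GenomicFeatures), the three-record demo file is made up, and the code assumes the file is tab-delimited with the strand in column 7, as the GTF format specifies.

```r
# Hypothetical helper, not part of GenomicFeatures: tabulate the strand
# column (column 7) of a tab-delimited GTF file using base R only.
count_gtf_strands <- function(path) {
  gtf <- read.table(path, sep = "\t", quote = "", comment.char = "#",
                    header = FALSE, colClasses = "character")
  table(gtf[[7]])
}

# Made-up three-record file for illustration; in practice the path would
# be the real file, e.g. "~/merged.gtf".
attrs <- 'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1";'
demo_gtf <- tempfile(fileext = ".gtf")
writeLines(c(
  paste("chr1", "Cufflinks", "exon", 6524, 6620, ".", "+", ".", attrs, sep = "\t"),
  paste("chr1", "Cufflinks", "exon", 7098, 7366, ".", "+", ".", attrs, sep = "\t"),
  paste("chr2", "Cufflinks", "exon", 100, 200, ".", ".", ".", attrs, sep = "\t")
), demo_gtf)

# Count how many records carry each strand value
strand_counts <- count_gtf_strands(demo_gtf)
print(strand_counts)
```

Any value other than "+" or "-" in this table (for example ".") would be a candidate explanation for the error, although whether that is the cause here has not been confirmed.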
This is the top of my GTF file:
chr1 Cufflinks exon 6524 6620 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "Medtr1g004940"; oId "Medtr1g004940.1"; nearest_ref "Medtr1g004940.1"; class_code "="; tss_id "TSS1"; p_id "P1";
chr1 Cufflinks exon 7098 7366 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "2"; gene_name "Medtr1g004940"; oId "Medtr1g004940.1"; nearest_ref "Medtr1g004940.1"; class_code "="; tss_id "TSS1"; p_id "P1";
chr1 Cufflinks exon 14514 14556 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; exon_number "1"; gene_name "Medtr1g004950"; oId "Medtr1g004950.1"; nearest_ref "Medtr1g004950.1"; class_code "="; tss_id "TSS2"; p_id "P2";
chr1 Cufflinks exon 15503 15729 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; exon_number "2"; gene_name "Medtr1g004950"; oId "Medtr1g004950.1"; nearest_ref "Medtr1g004950.1"; class_code "="; tss_id "TSS2"; p_id "P2";
chr1 Cufflinks exon 16283 16326 . + . gene_id "XLOC_000003"; transcript_id "TCONS_00000003"; exon_number "1"; gene_name "Medtr1g004960"; oId "Medtr1g004960.1"; nearest_ref "Medtr1g004960.1"; class_code "="; tss_id "TSS3"; p_id "P3";
chr1 Cufflinks exon 17061 17304 . + . gene_id "XLOC_000003"; transcript_id "TCONS_00000003"; exon_number "2"; gene_name "Medtr1g004960"; oId "Medtr1g004960.1"; nearest_ref "Medtr1g004960.1"; class_code "="; tss_id "TSS3"; p_id "P3";
chr1 Cufflinks exon 18242 18382 . + . gene_id "XLOC_000003"; transcript_id "TCONS_00000003"; exon_number "3"; gene_name "Medtr1g004960"; oId "Medtr1g004960.1"; nearest_ref "Medtr1g004960.1"; class_code "="; tss_id "TSS3"; p_id "P3";
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: i686-pc-linux-gnu (32-bit)
locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C               LC_TIME=en_NZ.UTF-8
 [4] LC_COLLATE=en_NZ.UTF-8     LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8
 [7] LC_PAPER=en_NZ.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines   grid      parallel  stats     graphics  grDevices utils     datasets
[9] methods   base
other attached packages:
 [1] rtracklayer_1.22.7     GenomicFeatures_1.14.5 AnnotationDbi_1.24.0
 [4] Biobase_2.22.0         GenomicRanges_1.14.4   XVector_0.2.0
 [7] derfinder_1.0.2        locfdr_1.1-7           HiddenMarkov_1.7-0
[10] limma_3.18.13          Genominator_1.16.0     GenomeGraphs_1.22.0
[13] biomaRt_2.18.0         IRanges_1.20.7         BiocGenerics_0.8.0
[16] RSQLite_0.11.4         DBI_0.2-7
loaded via a namespace (and not attached):
[1] Biostrings_2.30.1 bitops_1.0-6      BSgenome_1.30.0   RCurl_1.95-4.1    Rsamtools_1.14.3
[6] stats4_3.1.0      tools_3.1.0       XML_3.98-1.1      zlibbioc_1.8.0
--
Sent via the guest posting facility at bioconductor.org.