Request to create an organism package
estevemp • 0
@estevemp-19501
Last seen 5.2 years ago

Dear Bioconductor assistance group,

I am doing research on leaf-cutting ants, and I need to build genome-wide organism annotation packages for Atta cephalotes and Acromyrmex echinatior for an RNA-seq analysis. I need these non-model annotation packages to perform an enrichment analysis.

However, since annotation packages for these two organisms are not available in Bioconductor, could you recommend a good tutorial and some specific code to run in RStudio (I am very new to DESeq2 and similar tools), for example using AnnotationForge or AnnotationHub, so that I can make my own annotation packages for these organisms? I also do not know whether I need any permissions to do this myself.

I hope to hear from you soon, and thank you for your kind help.

annotation deseq2 • 561 views

We provide some species information automatically in AnnotationHub and rely on community-contributed resources for any additional data. Here are some recommendations we have given in the past for creating contributed annotation resources:

For an OrgDb object, see makeOrgPackageFromNCBI() and makeOrgPackage() in the AnnotationForge package for help creating the SQLite file.

BSgenome objects should be made according to the steps outlined in the BSgenome vignette.

For TxDb objects there is GenomicFeatures::makeTxDbFromGRanges(). Data should be provided as a GRanges object. See GenomicRanges::makeGRangesFromDataFrame() or rtracklayer::import() for help creating the GRanges.
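
The TxDb route above can be sketched roughly as follows. This is only a minimal sketch: it assumes the annotation is a GFF3 file (local path or URL), and the helper name `txdb_from_gff` and the file names in the usage comments are hypothetical. In recent Bioconductor releases the TxDb constructors have, as far as I know, moved to the txdbmaker package, so the namespace may need adjusting for your installation.

```r
# Hypothetical helper: read a GFF3 file into a GRanges object, then build a TxDb.
# Assumes the rtracklayer and GenomicFeatures packages are installed.
txdb_from_gff <- function(gff_path) {
  # rtracklayer::import() returns the GFF records as a GRanges object
  gr <- rtracklayer::import(gff_path, format = "gff3")
  # makeTxDbFromGRanges() builds the transcript/exon/CDS tables from the GRanges
  GenomicFeatures::makeTxDbFromGRanges(gr)
}

# Usage (file names are placeholders):
# txdb <- txdb_from_gff("my_annotation.gff3.gz")
# AnnotationDbi::saveDb(txdb, "my_annotation.sqlite")  # reload later with loadDb()
```

Saving the object with saveDb() avoids re-parsing the GFF file in every session.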

@james-w-macdonald-5106
Last seen 1 hour ago
United States

You can make the TxDb object directly from the GFF file at NCBI:

> library(GenomicFeatures)
> z <- makeTxDbFromGFF("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/204/515/GCF_000204515.1_Aech_3.9/GCF_000204515.1_Aech_3.9_genomic.gff.gz")
Import genomic features from the file as a GRanges object ... trying URL 'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/204/515/GCF_000204515.1_Aech_3.9/GCF_000204515.1_Aech_3.9_genomic.gff.gz'
downloaded 4.5 MB

OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
> z
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/204/515/GCF_000204515.1_Aech_3.9/GCF_000204515.1_Aech_3.9_genomic.gff.gz
# Organism: NA
# Taxonomy ID: NA
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 22136
# exon_nrow: 174319
# cds_nrow: 154647
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2019-01-17 12:21:00 -0500 (Thu, 17 Jan 2019)
# GenomicFeatures version at creation time: 1.34.1
# RSQLite version at creation time: 2.1.1
# DBSCHEMAVERSION: 1.2
> transcriptsBy(z)
GRangesList object of length 12253:
$LOC105142901 
GRanges object with 1 range and 2 metadata columns:
            seqnames    ranges strand |     tx_id        tx_name
               <Rle> <IRanges>  <Rle> | <integer>    <character>
  [1] NW_011623728.1     5-200      + |         1 XM_011054566.1

$LOC105142902 
GRanges object with 2 ranges and 2 metadata columns:
            seqnames        ranges strand | tx_id        tx_name
  [1] NW_011626966.1 535276-538982      + |  3759 XM_011050787.1
  [2] NW_011626966.1 535762-538982      + |  3760 XM_011050788.1

$LOC105142903 
GRanges object with 2 ranges and 2 metadata columns:
            seqnames        ranges strand | tx_id        tx_name
  [1] NW_011626966.1 532702-534426      - |  3813 XM_011050789.1
  [2] NW_011626966.1 532704-535729      - |  3814 XM_011050790.1

...
<12250 more elements>
-------
seqinfo: 670 sequences from an unspecified genome; no seqlengths

If you are using DESeq2, you should consider using salmon for quantitation and tximport to read into R, in which case you need the transcript FASTA file, which you can get from NCBI as well. See the salmon docs for help.
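
For the salmon and tximport route, something along these lines might work once salmon has been run on each sample. This is a sketch under assumptions, not a tested pipeline: the helper names are hypothetical, `samples` is assumed to be a data.frame with columns `sample`, `condition` and `quant_file` (the path to each salmon quant.sf file), and the transcript names in the FASTA file used for the salmon index are assumed to match the TXNAME values in the TxDb.

```r
# Hypothetical helper: two-column transcript-to-gene table from a TxDb.
# The first column is the transcript name, the second the gene ID,
# which is the layout tximport expects for tx2gene.
make_tx2gene <- function(txdb) {
  tx_names <- AnnotationDbi::keys(txdb, keytype = "TXNAME")
  AnnotationDbi::select(txdb, keys = tx_names,
                        columns = "GENEID", keytype = "TXNAME")
}

# Hypothetical helper: import salmon output and build a DESeqDataSet.
# `samples` column names (sample, condition, quant_file) are assumptions.
dds_from_salmon <- function(samples, txdb) {
  rownames(samples) <- samples$sample
  files <- stats::setNames(samples$quant_file, samples$sample)
  # Summarise transcript-level estimates to gene level
  txi <- tximport::tximport(files, type = "salmon",
                            tx2gene = make_tx2gene(txdb))
  DESeq2::DESeqDataSetFromTximport(txi, colData = samples,
                                   design = ~ condition)
}

# Usage (z is the TxDb built above; `samples` is your own sample table):
# dds <- dds_from_salmon(samples, z)
# dds <- DESeq2::DESeq(dds)
```

If the transcript names differ only by a version suffix, tximport has an `ignoreTxVersion` argument that may help.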

You can (as Lori already mentioned) make an OrgDb package from NCBI, so long as you have the Taxon ID for each species. See the help for makeOrgPackageFromNCBI.
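
A call to makeOrgPackageFromNCBI might look roughly like the sketch below. The wrapper name, version string, and maintainer details are placeholders. To my knowledge the NCBI taxonomy IDs are 103372 for Acromyrmex echinatior and 12957 for Atta cephalotes, but please confirm them at NCBI Taxonomy before running. Note that the function downloads several large files from NCBI, so the first run can take a long time.

```r
# Hypothetical wrapper around AnnotationForge::makeOrgPackageFromNCBI().
# Assumes the AnnotationForge package is installed and NCBI is reachable.
build_orgdb_from_ncbi <- function(tax_id, genus, species,
                                  maintainer, author,
                                  out_dir = getwd(), version = "0.1") {
  AnnotationForge::makeOrgPackageFromNCBI(
    version    = version,      # version of the package you are creating
    maintainer = maintainer,   # "Name <email>" format
    author     = author,
    outputDir  = out_dir,      # where the package source directory is written
    tax_id     = as.character(tax_id),
    genus      = genus,
    species    = species
  )
}

# Usage (tax ID to be verified at NCBI Taxonomy; contact details are placeholders):
# build_orgdb_from_ncbi("103372", "Acromyrmex", "echinatior",
#                       maintainer = "Your Name <you@example.org>",
#                       author = "Your Name")
# Then install the generated package directory from source, e.g.
# install.packages("<generated package dir>", repos = NULL, type = "source")
```

No special permissions are needed to build and install such a package for your own use.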
