Error ambiguity characters in sequences
Bruno • 0
France

Hi all,

I am trying to build a BSgenome data package for the updated genome of the model plant Arabidopsis thaliana (TAIR10). I am using the function forgeBSgenomeDataPkgFromNCBI(), but I am running into an error saying that the sequences contain ambiguity characters. I used Biostrings::replaceAmbiguities() to replace them, but I am not sure how to save the updated sequences or what to do from that point.

forgeBSgenomeDataPkgFromNCBI(assembly_accession="GCF_000001735.4", pkg_maintainer="Bruno Guillotin", organism="Arabidopsis thaliana", destdir=tempdir())
Warning in .extract_NCBI_assembly_info(assembly_accession, chrominfo, organism = organism,  :
  "GCF_000001735.4" is a registered NCBI assembly for organism
  "Arabidopsis thaliana" --> ignoring supplied 'organism' argument
trying URL 'https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/735/GCF_000001735.4_TAIR10.1/GCF_000001735.4_TAIR10.1_genomic.fna.gz'
Content type 'application/x-gzip' length 37482399 bytes (35.7 MB)
==================================================
downloaded 35.7 MB

Error in .local(object, con, format, ...) : 
  One or more strings contain unsupported ambiguity characters.
Strings can contain only A, C, G, T or N.
See Biostrings::replaceAmbiguities().

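#### I did
library(BSgenomeForge)
library(Biostrings)
# Download the genomic FASTA file and load it as a DNAStringSet
filepath <- downloadGenomicSequencesFromNCBI("GCF_000001735.4", destdir=tempdir())
genomic_sequences <- readDNAStringSet(filepath)
genomic_sequences
# Replace every IUPAC ambiguity code with N
genomic_sequences2 <- replaceAmbiguities(genomic_sequences, new="N")
# Then ?....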

I would also like to rename the sequences in the DNAStringSet, as each chromosome has a name such as NC_003070.9 rather than chr1, chr2, etc.
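
To make the question more concrete, here is a minimal sketch of what I imagine the saving and renaming steps could look like, run on a toy DNAStringSet instead of the real download. The toy sequences, the accession-to-chromosome mapping (acc2chr) and the output file name are my own assumptions for illustration; the real mapping would have to be checked against the NCBI assembly report. I also do not know whether a FASTA file saved this way can be passed back to forgeBSgenomeDataPkgFromNCBI(), which is really the part I am stuck on.

```r
library(Biostrings)

# Toy stand-in for the downloaded genome (assumption: the real FASTA headers
# start with the accession, followed by a space and a description)
genomic_sequences <- DNAStringSet(c(
  "NC_003070.9 toy chromosome 1" = "ACGTRYACGT",
  "NC_003071.7 toy chromosome 2" = "GGMCCKTTAA"
))

# Replace every IUPAC ambiguity code (R, Y, M, K, ...) with N
genomic_sequences2 <- replaceAmbiguities(genomic_sequences, new = "N")

# Hypothetical accession -> chromosome name mapping, to be verified
acc2chr <- c("NC_003070.9" = "chr1", "NC_003071.7" = "chr2")

# Keep only the accession (text before the first space), then look it up
acc <- sub(" .*$", "", names(genomic_sequences))
names(genomic_sequences2) <- unname(acc2chr[acc])

# Save the cleaned, renamed sequences as a FASTA file (hypothetical file name)
out_file <- file.path(tempdir(), "TAIR10_noambig.fa")
writeXStringSet(genomic_sequences2, out_file)

# Read the file back to check what was written
reloaded <- readDNAStringSet(out_file)
reloaded
```

With the real data, genomic_sequences would come from readDNAStringSet(filepath) as in my code above, and acc2chr would need one entry per sequence in the assembly, including the organelle genomes.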

Thanks in advance and sorry if it is an obvious question. Bruno

BSgenomeForge • 294 views