How to load fasta file with openPrimeR?
@marongiuluigi-7134
Last seen 3.8 years ago
European Union

Hello, I am trying to load a FASTA file with the package 'openPrimeR'. The manual (https://www.bioconductor.org/packages/release/bioc/vignettes/openPrimeR/inst/doc/openPrimeR_vignette.html) says to use:

fasta.file <- system.file("extdata", "IMGT_data", "templates",
                "Homo_sapiens_IGH_functional_exon.fasta", package =
"openPrimeR")
# Load the template sequences from 'fasta.file'
seq.df.simple <- read_templates(fasta.file)

but if I run the same commands on a local file:

fasta.file <- system.file("extdata", "IMGT_data", "templates",
                "stx.fa", package = "openPrimeR")
fasta.file <- system.file("stx.fa", package = "openPrimeR")

where stx.fa is the file I want to open, which is present in the working directory, I get only an empty object. What am I getting wrong? Thank you

sessionInfo( )

R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("openPrimeR")
There are missing/non-functioning external tools.
To use the full potential of openPrimeR, please make sure
that the required versions of the speciied tools are

                installed and that they are functional:
o MELTING (http://www.ebi.ac.uk/biomodels/tools/melting/)
o ViennaRNA (http://www.tbi.univie.ac.at/RNA/)
o OligoArrayAux (http://unafold.rna.albany.edu/OligoArrayAux.php)
o MAFFT (http://mafft.cbrc.jp/alignment/software/)
o Pandoc (http://pandoc.org)
Warning messages:
1: In fun(libname, pkgname) :
  'Pandoc' is non-functional, since 'pdflatex' is not installed on your system.
2: In parallel_setup(default.nbr.cores) :
  Please install 'doParallel' to use multiple cores.
> fasta.file <- system.file("/home/gigiux/Documents/PCR_Book/stx.fa", package = "openPrimeR")
> fasta.file
[1] ""
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] openPrimeR_1.12.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6              plyr_1.8.6              pillar_1.4.7            compiler_4.0.3
 [5] RColorBrewer_1.1-2      GenomeInfoDb_1.26.2     XVector_0.30.0          bitops_1.0-6
 [9] iterators_1.0.13        tools_4.0.3             zlibbioc_1.36.0         lifecycle_1.0.0
[13] tibble_3.0.6            gtable_0.3.0            pkgconfig_2.0.3         rlang_0.4.10
[17] foreach_1.5.1           rstudioapi_0.13         parallel_4.0.3          GenomeInfoDbData_1.2.4
[21] stringr_1.4.0           dplyr_1.0.4             Biostrings_2.58.0       generics_0.1.0
[25] S4Vectors_0.28.1        vctrs_0.3.6             IRanges_2.24.1          stats4_4.0.3
[29] grid_4.0.3              tidyselect_1.1.0        glue_1.4.2              R6_2.5.0
[33] reshape2_1.4.4          ggplot2_3.3.3           purrr_0.3.4             magrittr_2.0.1
[37] scales_1.1.1            codetools_0.2-18        ellipsis_0.3.1          BiocGenerics_0.36.0
[41] GenomicRanges_1.42.0    colorspace_2.0-0        stringi_1.5.3           lpSolveAPI_5.5.2.0-17.7
[45] RCurl_1.98-1.2          munsell_0.5.0           crayon_1.4.1
@james-w-macdonald-5106
Last seen 1 hour ago
United States

When a package author wants to show how to do something using data shipped with the package, the author has to locate that data on an arbitrary install (like yours) that the author can't know anything about a priori. The system.file function is the way to do that: it asks R for the full path to a file inside a package's installation directory on your system, and it returns an empty string ("") when no such file exists there, which is what you are seeing.

But you as an end user should never need to use system.file to load up your own data! You already know where the data are, and in general they should be in your working directory (e.g., where you started R). So you can skip that first step and go right to the second, doing something like

seq.df.simple <- read_templates("stx.fa")
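
To make the difference concrete, here is a minimal sketch, assuming stx.fa sits in your current working directory. The lookup in the 'stats' package is only there to show what system.file does with a file that a package does not ship; read_templates is the openPrimeR function from the vignette.

```r
# system.file() searches only inside an installed package. 'stats' does not
# ship a file called stx.fa, so the lookup comes back as an empty string.
pkg.path <- system.file("stx.fa", package = "stats")
print(pkg.path)

# For your own data, build an ordinary path and check it before reading.
fasta.file <- file.path(getwd(), "stx.fa")
if (file.exists(fasta.file)) {
  # Only reached when the file is really there; requires openPrimeR.
  seq.df.simple <- openPrimeR::read_templates(fasta.file)
} else {
  message("File not found: ", fasta.file)
}
```

If the message is printed, the problem is the path or the filename rather than openPrimeR itself.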

I tried but:

> seq.df <- read_templates("styx.fa")
Error in read_templates_single(fname, hdr.structure = hdr.structure, delim = delim,  : 
  Unsupported template input file type or error reading data for file: 'styx.fa'
> fasta.file <- "stx.fa"
> seq.df <- read_templates(fasta.file)
> seq.df
                       ID                  Header   Group Identifier Sequence_Length Allowed_Start_fw Allowed_End_fw
1 >MW311073.1 Escheric... >MW311073.1 Escheric... default          1             180                1             30
  Allowed_Start_rev Allowed_End_rev              Allowed_fw             Allowed_rev Allowed_Start_fw_ali
1               151             180 atgaagaagatgtttatggc... cgctggaatctgcaaccgtt...                    1
  Allowed_End_fw_ali Allowed_Start_fw_initial Allowed_End_fw_initial Allowed_Start_fw_initial_ali
1                 30                        1                     30                            1
  Allowed_End_fw_initial_ali Allowed_Start_rev_ali Allowed_End_rev_ali Allowed_Start_rev_initial
1                         30                   151                 180                       151
  Allowed_End_rev_initial Allowed_Start_rev_initial_ali Allowed_End_rev_initial_ali                Sequence
1                     180                           151                         180 atgaagaagatgtttatggc...
            InputSequence Run
1 atgaagaagatgtttatggc... stx

Anyway, with this two-step approach the file is read, so case closed. Thanks


@marongiu.luigi If you check your code carefully, you will notice a typo in the filename: the first time you entered styx.fa, the second time stx.fa. Passing the filename as a variable or as a literal string makes no difference.

@Santiago Edilberto: Please check whether your file exists and is a properly formatted FASTA file.
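
A quick way to catch this kind of filename typo is to ask R what it can actually see. This is a minimal sketch using base R only; the two filenames are the ones from this thread, and the pattern is an assumption about which extensions you use.

```r
# List FASTA-like files in the working directory.
candidates <- list.files(pattern = "\\.(fa|fasta)$", ignore.case = TRUE)
print(candidates)

# Compare the names you intend to pass to read_templates() against that list.
wanted <- c("styx.fa", "stx.fa")
found <- wanted %in% candidates
print(data.frame(file = wanted, found = found))
```

Any name reported as not found will fail in read_templates, whatever the file contains.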


How do I know if my file is a properly formatted FASTA file?


A FASTA file is a plain text file (not MS Word or the like) with a specific structure, along the lines of:

>seq_id_1
ACAGCACA
>seq_id_2
TCGAAGA

More information is available on Wikipedia.
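
If you want a rough automated check, something along these lines may help. It is a sketch in base R, not part of openPrimeR: looks_like_fasta is a hypothetical helper name, and the rule it applies (every header line starting with ">" is followed by at least one line of letters, "*" or "-") is a simplification of the format.

```r
# Hypothetical helper: a rough plausibility check for a FASTA file.
looks_like_fasta <- function(path) {
  if (!file.exists(path)) return(FALSE)
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]              # ignore blank lines
  if (length(lines) < 2 || !startsWith(lines[1], ">")) return(FALSE)
  is.header <- startsWith(lines, ">")
  # Every header needs sequence after it: no trailing header,
  # and no two headers in a row.
  if (is.header[length(lines)]) return(FALSE)
  if (any(is.header[-length(is.header)] & is.header[-1])) return(FALSE)
  # Sequence lines may contain only letters, '*' and '-'.
  all(grepl("^[A-Za-z*-]+$", lines[!is.header]))
}

# Try it on a small file written to a temporary location.
tmp <- tempfile(fileext = ".fa")
writeLines(c(">seq_id_1", "ACAGCACA", ">seq_id_2", "TCGAAGA"), tmp)
ok <- looks_like_fasta(tmp)
print(ok)
```

A file that passes this check can still be rejected by read_templates for other reasons, so treat the result as a first filter only.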


How can I download a FASTA file for BRCA1??


I tried the code you suggested, but now I get the error "object 'quot' not found":

 > library(openPrimeR)}
Error: unexpected '}' in "library(openPrimeR)}"
>  library(openPrimeR)
There are missing/non-functioning external tools.
To use the full potential of openPrimeR, please make sure
that the required versions of the speciied tools are

                installed and that they are functional:
o MELTING (http://www.ebi.ac.uk/biomodels/tools/melting/)
o ViennaRNA (http://www.tbi.univie.ac.at/RNA/)
o OligoArrayAux (http://unafold.rna.albany.edu/OligoArrayAux.php)
The number of cores for was set to '2' by 'parallel_setup()'.
> read_templates(&quot;secuence.fasta&quot;BRCA1.fasta)
Error: unexpected '&' in "read_templates(&"
> read_templates(&quot;/home/Downloads/BRCA1.fasta&quot;)
Error: unexpected '&' in "read_templates(&"
> read_templates(quot;/home/Downloads/BRCA1.fastaquot;)
Error: unexpected ';' in "read_templates(quot;"
> read_templates(quot/home/Downloads/BRCA1.fastaquot)
Error in read_templates(quot/home/Downloads/BRCA1.fastaquot) : 
  object 'quot' not found