Question: DNAString - Long String
2.5 years ago, pgreenbank wrote:

I get an error when storing a large string in a DNAString. Is there a limit on the length of the string that can be stored?

 

library("Biostrings")    
DNA_BASES
DNA_ALPHABET
d <- DNAString("CGTCTCGCGTCTGGGAACGGCCGGGCCCCCAGCGGGCTGTGGTCGCGGGGTGGGGGCCGGAGCGGCGAGGCCCCCCTTACCGGGCTGCGCGGGCCGCCCAGGGCCCCCGGGCTGAGACGGGGCCGGAGCGGCGCCCCGGCCGCCCGCGCGGGGTCTCCCCCATGGTGCAGCGGGGTTCGGGATGTCGAAGACGCTGAAGAAGAAGAAGCACTGGCTCAGCAAGGTGCAGGAGTGCGCCGTGTCCTGGGCCGGGCCCCCGGGCGACTTCGGCGCGGAGATCCGCGGTGGCGCGGAGCGTGGCGAGTTCCCCTACCTGGGGCGGCTCCGCGAGGAGCCCGGCGGGGGCACCTGCTGCGTCGTCTCGGGCAAGGCGCCCAGCCCAGGCGATGTGCTGCTGGAGGTAAACGGGACGCCTGTCAGCGGGCTCACCAACCGGGACACCCTGGCTGTCATCCGCCACTTCCGCGAGCCCATCCGTCTCAAGACTGTGAAACCAGGCAAAGTCATTAATAAAGATTTGCGGCATTACCTAAGTCTTCAGTTTCAAAAAGGATCAATTGACCACAAACTGCAGCAAGTGATCAGAGATAATCTCTACTTGAGAACCATTCCATGCACTACAAGGGCCCCCAGGGATGGAGAAGTACCAGGAGTGGATTATAATTTCATTTCCGTTGAACAGTTCAAAGCACTGGAAGAGAGTGGAGCATTGTTAGAAAGTGGGACATATGATGGAAACTTCTATGGAACTCCCAAGCCTCCAGCAGAACCCAGCCCTTTTCAGCCAGATCCAGTTGATCAAGTCCTCTTTGATAATGAGTTTGATGCAGAATCTCAAAGAAAACGAACGACATCTGTCAGCAAGATGGAAAGAATGGATAGCTCTCTTCCTGAAGAGGAAGAAGATGAGGACAAGGAAGCTATTAATGGCAGTGGAAACGCAGAAAACAGAGAGAGGCATTCTGAGTCATCTGACTGGATGAAGACTGTTCCAAGTTACAACCAAACAAATAGCTCCATGGACTTTAGAAATTATATGATGAGAGATGAGACTCTGGAACCACTGCCCAAAAACTGGGAAATGGCCTACACTGACACAGGGATGATCTACTTCATTGACCACAATACCAAGACAACCACCTGGTTGGATCCTCGTCTTTGTAAGAAAGCCAAAGCCCCTGAAGACTGTGAAGATGGAGAGCTTCCTTATGGCTGGGAGAAAATAGAGGACCCTCAGTATGGGACATACTATGTTGATCACCTTAACCAGAAAACCCAGTTTGAAAATCCAGTGGAGGAAGCCAAAAGGAAAAAGCAGTTAGGACAGGTTGAAATTGGGTCTTCAAAACCAGATATGGAAAAATCACACTTCACAAGAGATCCATCCCAGCTTAAAGGTGTCCTTGTTCGAGCATCACTGAAAAAAAGCACAATGGGATTTGGTTTTACTATTATTGGTGGAGATAGACCTGATGAGTTCCTACAAGTGAAAAATGTGCTGAAAGATGGTCCCGCAGCTCAGGATGGGAAAATTGCACCAGGCGATGTTATTGTAGACATCAATGGCAACTGTGTCCTCGGTCACACTCATGCAGATGTTGTCCAGATGTTTCAATTGGTACCTGTCAATCAGTATGTAAACCTCACTTTATGTCGTGGTTATCCACTTCCTGATGACAGTGAAGATCCTGTTGTGGACATTGTTGCTGCTACCCCTGTCATCAATGGACAGTCATTAACCAAGGGAGAGACTTGCATGAATCCTCAGGATTTTAAGCCAGGAGCAATGGTTCTGGAGCAGAATGGAAAATCGGGACACACTTTGACTGGTGATGGTCTCAATGGACCATCAGATGCAAGTGAGCAGAGAGTATCCATGGCATCGTCAGGCAGCTCCCAGCCTGAACTAGTGACTATCCCTTTGATTAAGGGCCCTAAAGGGTTTGGGTTTGCAATTGCTGACAGCCCTACTGGACAGAAGGTG
AAAATGATACTGGATAGTCAGTGGTGTCAAGGCCTTCAGAAAGGAGATATAATTAAGGAAATATACCATCAAAATGTGCAGAATTTAACACATCTCCAAGTGGTAGAGGTGCTAAAGCAGTTTCCAGTAGGTGCTGATGTACCATTGCTTATCTTAAGAGGAGGTCCTCCTTCACCAACCAAAACTGCCAAAATGAAAACAGATAAAAAGGAAAATGCAGGAAGTTTGGAGGCCATAAATGAGCCTATTCCTCAGCCTATGCCTTTTCCACCGAGCATTATCAGGTCAGGATCCCCAAAATTGGATCCTTCTGAGGTCTACCTGAAATCTAAGACTTTATATGAAGATAAACCACCAAACACCAAAGATTTGGATGTTTTTCTTCGAAAACAAGAGTCAGGGTTTGGCTTCAGGGTGCTAGGAGGAGATGGACCTGACCAGTCTATATATATTGGGGCTATTATTCCCCTGGGAGCAGCTGAGAAAGATGGTCGGCTCCGCGCAGCTGATGAACTAATGTGCATTGATGGAATTCCTGTTAAAGGGAAATCACACAAACAAGTCTTGGACCTCATGACAACTGCTGCTCGAAATGGCCATGTGTTACTAACTGTCAGACGGAAGATCTTCTATGGAGAAAAACAACCCGAGGACGACAGCTCTCAGGCCTTCATTTCAACACAGAATGGATCTCCCCGCCTGAACCGGGCAGAGGTCCCAGCCAGGCCTGCACCCCAGGAGCCCTATGATGTTGTCTTGCAACGAAAAGAAAATGAAGGATTTGGCTTTGTCATCCTCACCTCCAAAAACAAACCACCTCCAGGAGTTATTCCTCATAAAATTGGCCGAGTCATAGAAGGAAGTCCGGCTGACCGCTGTGGAAAACTGAAAGTTGGAGATCATATCTCTGCAGTGAATGGGCAGTCCATTGTTGAACTGTCTCATGATAACATTGTTCAGCTGATCAAAGATGCTGGTGTCACCGTCACACTAACGGTCATTGCTGAAGAAGAGCATCATGGTCCACCATCAGGAACAAACTCAGCCAGGCAAAGCCCAGCCCTGCAGCACAGGCCCATGGGACAGTCACAGGCCAACCACATACCTGGGGACAGAAGTGCCCTAGAAGGTGAAATTGGAAAAGATGTCTCCACTTCTTACAGACATTCTTGGTCAGACCACAAGCACCTTGCACAGCCTGACACCGCAGTAATTTCAGTTGTAGGCAGTCGGCACAATCAGAACCTTGGTTGTTATCCAGTAGAGCTGGAGAGAGGCCCCCGGGGCTTTGGATTCAGCCTCCGAGGGGGGAAGGAGTACAACATGGGGCTGTTCATCCTTCGTCTTGCTGAAGATGGTCCTGCCATCAAAGATGGCAGAATTCATGTTGGTGACCAGATTGTTGAAATCAATGGGGAACCTACACAAGGAATCACACATACTCGAGCAATTGAGCTCATTCAGGCTGGTGGAAATAAAGTTCTTCTTCTTTTGAGGCCAGGAACTGGCTTGATACCTGACCATGGTGATTGGGATATTAATAATCCTTCGTCTTCAAATGTGATTTATGATGAACAGTCACCATTACCCCCATCTTCACATTTTGCTTCCATATTTGAAGAGTCTCACGTGCCAGTAATTGAAGAATCTTTGAGAGTTCAGATATGTGAAAAGGCAGAAGAATTAAAGGACATTGTGCCTGAAAAGAAAAGCACTTTAAATGAAAATCAGCCTGAGATAAAGCATCAGTCTCTTCTCCAGAAAAATGTGAGTAAGAGGGATCCACCCAGCAGTCATGGGCACAGTAACAAGAAAAATCTATTAAAAGTAGAAAATGGTGTTACACGAAGAGGTAGATCGGTTAGTCCCAAAAAGCCAGCCAGTCAACATTCAGAGGAACATTTGGATAAGATTCCTAGTCCTCTAAAAAATAACCCCAAAAGAAGACCCAGAGATCAATCCCTCAGCCCCAGCAAAGGGGAAAATAAAAGTTGTCAGGT
CAGCACCAGGGCAGGCTCTGGACAAGATCAGTGCAGAAAAAGCAGAGGTCGGTCGGCCAGCCCAAAAAAGCAGCAAAAAATTGAAGGAAGCA")
alphabet(d, baseOnly=TRUE) 
alphabetFrequency(d)

 

This works, but adding one extra nucleotide (a trailing T) fails:

library("Biostrings")    
DNA_BASES
DNA_ALPHABET
d <- DNAString("CGTCTCGCGTCTGGGAACGGCCGGGCCCCCAGCGGGCTGTGGTCGCGGGGTGGGGGCCGGAGCGGCGAGGCCCCCCTTACCGGGCTGCGCGGGCCGCCCAGGGCCCCCGGGCTGAGACGGGGCCGGAGCGGCGCCCCGGCCGCCCGCGCGGGGTCTCCCCCATGGTGCAGCGGGGTTCGGGATGTCGAAGACGCTGAAGAAGAAGAAGCACTGGCTCAGCAAGGTGCAGGAGTGCGCCGTGTCCTGGGCCGGGCCCCCGGGCGACTTCGGCGCGGAGATCCGCGGTGGCGCGGAGCGTGGCGAGTTCCCCTACCTGGGGCGGCTCCGCGAGGAGCCCGGCGGGGGCACCTGCTGCGTCGTCTCGGGCAAGGCGCCCAGCCCAGGCGATGTGCTGCTGGAGGTAAACGGGACGCCTGTCAGCGGGCTCACCAACCGGGACACCCTGGCTGTCATCCGCCACTTCCGCGAGCCCATCCGTCTCAAGACTGTGAAACCAGGCAAAGTCATTAATAAAGATTTGCGGCATTACCTAAGTCTTCAGTTTCAAAAAGGATCAATTGACCACAAACTGCAGCAAGTGATCAGAGATAATCTCTACTTGAGAACCATTCCATGCACTACAAGGGCCCCCAGGGATGGAGAAGTACCAGGAGTGGATTATAATTTCATTTCCGTTGAACAGTTCAAAGCACTGGAAGAGAGTGGAGCATTGTTAGAAAGTGGGACATATGATGGAAACTTCTATGGAACTCCCAAGCCTCCAGCAGAACCCAGCCCTTTTCAGCCAGATCCAGTTGATCAAGTCCTCTTTGATAATGAGTTTGATGCAGAATCTCAAAGAAAACGAACGACATCTGTCAGCAAGATGGAAAGAATGGATAGCTCTCTTCCTGAAGAGGAAGAAGATGAGGACAAGGAAGCTATTAATGGCAGTGGAAACGCAGAAAACAGAGAGAGGCATTCTGAGTCATCTGACTGGATGAAGACTGTTCCAAGTTACAACCAAACAAATAGCTCCATGGACTTTAGAAATTATATGATGAGAGATGAGACTCTGGAACCACTGCCCAAAAACTGGGAAATGGCCTACACTGACACAGGGATGATCTACTTCATTGACCACAATACCAAGACAACCACCTGGTTGGATCCTCGTCTTTGTAAGAAAGCCAAAGCCCCTGAAGACTGTGAAGATGGAGAGCTTCCTTATGGCTGGGAGAAAATAGAGGACCCTCAGTATGGGACATACTATGTTGATCACCTTAACCAGAAAACCCAGTTTGAAAATCCAGTGGAGGAAGCCAAAAGGAAAAAGCAGTTAGGACAGGTTGAAATTGGGTCTTCAAAACCAGATATGGAAAAATCACACTTCACAAGAGATCCATCCCAGCTTAAAGGTGTCCTTGTTCGAGCATCACTGAAAAAAAGCACAATGGGATTTGGTTTTACTATTATTGGTGGAGATAGACCTGATGAGTTCCTACAAGTGAAAAATGTGCTGAAAGATGGTCCCGCAGCTCAGGATGGGAAAATTGCACCAGGCGATGTTATTGTAGACATCAATGGCAACTGTGTCCTCGGTCACACTCATGCAGATGTTGTCCAGATGTTTCAATTGGTACCTGTCAATCAGTATGTAAACCTCACTTTATGTCGTGGTTATCCACTTCCTGATGACAGTGAAGATCCTGTTGTGGACATTGTTGCTGCTACCCCTGTCATCAATGGACAGTCATTAACCAAGGGAGAGACTTGCATGAATCCTCAGGATTTTAAGCCAGGAGCAATGGTTCTGGAGCAGAATGGAAAATCGGGACACACTTTGACTGGTGATGGTCTCAATGGACCATCAGATGCAAGTGAGCAGAGAGTATCCATGGCATCGTCAGGCAGCTCCCAGCCTGAACTAGTGACTATCCCTTTGATTAAGGGCCCTAAAGGGTTTGGGTTTGCAATTGCTGACAGCCCTACTGGACAGAAGGTG
AAAATGATACTGGATAGTCAGTGGTGTCAAGGCCTTCAGAAAGGAGATATAATTAAGGAAATATACCATCAAAATGTGCAGAATTTAACACATCTCCAAGTGGTAGAGGTGCTAAAGCAGTTTCCAGTAGGTGCTGATGTACCATTGCTTATCTTAAGAGGAGGTCCTCCTTCACCAACCAAAACTGCCAAAATGAAAACAGATAAAAAGGAAAATGCAGGAAGTTTGGAGGCCATAAATGAGCCTATTCCTCAGCCTATGCCTTTTCCACCGAGCATTATCAGGTCAGGATCCCCAAAATTGGATCCTTCTGAGGTCTACCTGAAATCTAAGACTTTATATGAAGATAAACCACCAAACACCAAAGATTTGGATGTTTTTCTTCGAAAACAAGAGTCAGGGTTTGGCTTCAGGGTGCTAGGAGGAGATGGACCTGACCAGTCTATATATATTGGGGCTATTATTCCCCTGGGAGCAGCTGAGAAAGATGGTCGGCTCCGCGCAGCTGATGAACTAATGTGCATTGATGGAATTCCTGTTAAAGGGAAATCACACAAACAAGTCTTGGACCTCATGACAACTGCTGCTCGAAATGGCCATGTGTTACTAACTGTCAGACGGAAGATCTTCTATGGAGAAAAACAACCCGAGGACGACAGCTCTCAGGCCTTCATTTCAACACAGAATGGATCTCCCCGCCTGAACCGGGCAGAGGTCCCAGCCAGGCCTGCACCCCAGGAGCCCTATGATGTTGTCTTGCAACGAAAAGAAAATGAAGGATTTGGCTTTGTCATCCTCACCTCCAAAAACAAACCACCTCCAGGAGTTATTCCTCATAAAATTGGCCGAGTCATAGAAGGAAGTCCGGCTGACCGCTGTGGAAAACTGAAAGTTGGAGATCATATCTCTGCAGTGAATGGGCAGTCCATTGTTGAACTGTCTCATGATAACATTGTTCAGCTGATCAAAGATGCTGGTGTCACCGTCACACTAACGGTCATTGCTGAAGAAGAGCATCATGGTCCACCATCAGGAACAAACTCAGCCAGGCAAAGCCCAGCCCTGCAGCACAGGCCCATGGGACAGTCACAGGCCAACCACATACCTGGGGACAGAAGTGCCCTAGAAGGTGAAATTGGAAAAGATGTCTCCACTTCTTACAGACATTCTTGGTCAGACCACAAGCACCTTGCACAGCCTGACACCGCAGTAATTTCAGTTGTAGGCAGTCGGCACAATCAGAACCTTGGTTGTTATCCAGTAGAGCTGGAGAGAGGCCCCCGGGGCTTTGGATTCAGCCTCCGAGGGGGGAAGGAGTACAACATGGGGCTGTTCATCCTTCGTCTTGCTGAAGATGGTCCTGCCATCAAAGATGGCAGAATTCATGTTGGTGACCAGATTGTTGAAATCAATGGGGAACCTACACAAGGAATCACACATACTCGAGCAATTGAGCTCATTCAGGCTGGTGGAAATAAAGTTCTTCTTCTTTTGAGGCCAGGAACTGGCTTGATACCTGACCATGGTGATTGGGATATTAATAATCCTTCGTCTTCAAATGTGATTTATGATGAACAGTCACCATTACCCCCATCTTCACATTTTGCTTCCATATTTGAAGAGTCTCACGTGCCAGTAATTGAAGAATCTTTGAGAGTTCAGATATGTGAAAAGGCAGAAGAATTAAAGGACATTGTGCCTGAAAAGAAAAGCACTTTAAATGAAAATCAGCCTGAGATAAAGCATCAGTCTCTTCTCCAGAAAAATGTGAGTAAGAGGGATCCACCCAGCAGTCATGGGCACAGTAACAAGAAAAATCTATTAAAAGTAGAAAATGGTGTTACACGAAGAGGTAGATCGGTTAGTCCCAAAAAGCCAGCCAGTCAACATTCAGAGGAACATTTGGATAAGATTCCTAGTCCTCTAAAAAATAACCCCAAAAGAAGACCCAGAGATCAATCCCTCAGCCCCAGCAAAGGGGAAAATAAAAGTTGTCAGGT
CAGCACCAGGGCAGGCTCTGGACAAGATCAGTGCAGAAAAAGCAGAGGTCGGTCGGCCAGCCCAAAAAAGCAGCAAAAAATTGAAGGAAGCAT")
alphabet(d, baseOnly=TRUE) 
alphabetFrequency(d)

 

Any ideas?

 

 

 

 

Answer: DNAString - Long String
2.5 years ago, James W. MacDonald wrote:

If I rename the second string as d2, I get:

> alphabetFrequency(d)
   A    C    G    T    M    R    W    S    Y    K    V    H    D    B    N    -
1208  936 1038  894    0    0    0    0    0    0    0    0    0    0    0    0
   +    .
   0    0
> alphabetFrequency(d2)
   A    C    G    T    M    R    W    S    Y    K    V    H    D    B    N    -
1208  936 1038  895    0    0    0    0    0    0    0    0    0    0    0    0
   +    .
   0    0
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] Biostrings_2.44.1   XVector_0.16.0      IRanges_2.10.2     
[4] S4Vectors_0.14.3    BiocGenerics_0.22.0

loaded via a namespace (and not attached):
[1] zlibbioc_1.22.0 compiler_3.4.0  tools_3.4.0   
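
One possibility that this thread does not confirm: the R documentation notes that command lines entered at the console are limited to about 4095 bytes. Taking the sequence lengths from the alphabetFrequency() output above, and assuming each assignment was entered as a single line, the working command comes to 4094 characters and the failing one to 4095, which sits right at that limit. The sketch below does that arithmetic and then assembles a sequence from shorter pieces so that no single input line has to hold all of it. The `chunks` values are only the first 60 bases from the question, used as a stand-in for the full sequence.

```r
# Length of each command if typed as one console line.
# 4076 and 4077 are the sums of the alphabetFrequency() counts for d and d2.
prefix <- 'd <- DNAString("'
suffix <- '")'
cmd_len_ok   <- nchar(prefix) + 4076 + nchar(suffix)
cmd_len_fail <- nchar(prefix) + 4077 + nchar(suffix)
c(cmd_len_ok, cmd_len_fail)

# Workaround sketch: build the sequence from shorter pieces.
# These three fragments are illustrative, not the whole sequence.
chunks <- c(
  "CGTCTCGCGTCTGGGAACGG",
  "CCGGGCCCCCAGCGGGCTGT",
  "GGTCGCGGGGTGGGGGCCGG"
)
seq_chr <- paste0(chunks, collapse = "")
nchar(seq_chr)

# Only touch Biostrings if it is installed.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  d <- Biostrings::DNAString(seq_chr)
  print(Biostrings::alphabetFrequency(d, baseOnly = TRUE))
}
```

Reading the sequence from a FASTA file, for example with readDNAStringSet(), would avoid the console altogether.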

 

Answer: DNAString - Long String
2.5 years ago, Hervé Pagès wrote:

The max length of a DNAString object is .Machine$integer.max, that is, 2^31 - 1 on any platform I know of.

H.
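
As a quick check of the numbers involved, the base R sketch below compares that limit with the two sequence lengths implied by the alphabetFrequency() output earlier in the thread. The variable names are made up for illustration.

```r
# Upper bound quoted above: the largest value an R integer can hold.
max_len <- .Machine$integer.max
max_len == 2^31 - 1

# Sequence lengths implied by the alphabetFrequency() counts for d and d2.
len_d  <- sum(c(A = 1208, C = 936, G = 1038, T = 894))
len_d2 <- sum(c(A = 1208, C = 936, G = 1038, T = 895))
c(len_d, len_d2)

# Both are far below the maximum DNAString length.
len_d2 < max_len
```

So the sequences in the question are nowhere near the DNAString length limit.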
