mauede@alice.it
Please find attached the crude script that worked.
Regards,
Maura
-----Original Message-----
From: mauede at alice.it
Sent: Thu 02/07/2009 8:11
To: Michael Watson; Sean Davis; Steve Lianoglou
Subject: Help with symbol names mapping between miRecords and BioMart
I extracted some VALIDATED miRNAs and, *hopefully*, paired each of them
with the 3'UTR sequences of its VALIDATED target genes.
I am NOT sure that my mapping between BioMart and miRecords field
names is correct.
Clearly, the output of my algorithm depends on that name mapping being
right (is it?).
> [miR-130a]
[1] "TAAACTACCTAACATTATTTATTCAGCTTCATTTGTGTCAATGGGCAATGACAGGTAAATTAAGA
CATGCACTATGAGGAATAATTATTTATTTAATAACAATTGTTTGGGGTTGAAAATTCAAAAAGTGTTTAT
TTTTCATATTGTGCCAATATGTATTGTAAACATGTGTTTTAATTCCAATATGATGACTCCCTTAAAATAG
AAATAAGTGGTTATTTCTCAACAAAGCACAGTGTTAAATGAAATTGTAAAACCTGTCAATGATACAGTCC
CTAAAGAAAAAAAATCATTGCTTTGAAGCAGTTGTGTCAGCTACTGCGGAAAAGGAAGGAAACTCCTGAC
AGTCTTGTGCTTTTCCTATTTGTTTTCATGGTGAAAATGTACTGAGATTTTGGTATTACACTGTATTTGT
ATCTCTGAAGCATGTTTCATGTTTTGTGACTATATAGAGATGTTTTTAAAAGTTTCAATGTGATTCTAAT
GTCTTCATTTCATTGTATGATGTGTTGTGATAGCTAACATTTT"
> hsa-let-7c
[1] "ATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTCCATTAC
ATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATTTCAAAA
CACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCTACCCCA
TATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGCGGGGAC
AGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCGTACAGGGATCCACCTTTTTGCAGAAATC
ACAGTGTGGCTATGGTGTGGTTTGATTTCATAAAACAGATGCTT"
[2] "TTGCATTTCCTAGGTTTCTGTGTTTGGGGTGTGTGTGCGTGTCTCTCTCTCTCTCTCTCTCTTTC
TCTTTCTCTCTCTTTTTGAATTTCAAAGAAGAAACAGTCTCAGGGAAATTTCTTTTTTCTTTTTTTTTTT
TAAAGAGAACAAGAAAAGTACAACATTGCTTAAGTCCTACCTCATCTTTATTTTTTTACAGATGAATGTA
CTTATCTTTTCTGCAGGGATTGAGCCTGTGAAGTGATAATTTCTATCTACCTCATAAATCTTTACATTTC
CTTCTGCAACAGGCCCTCTTCCCCTCCTCAGTGGAGTTTGCATTTCCCTCTTCCCCTGCGTGGGGCATGA
TATGCACAAGCCTGGCATCTGTATGGCTGGGAGGGCACTGGATGTGTGTGGTGGGGTGTATTCTGTAGAT
TGAGCCAAGGAAACACAAAAAAAAACTACTAAGT"
[3] "Sequence unavailable"
[4] "GCCACCCACCTTGGCCTCTCAAAGTGCTGGGAATACAGGCGTGAGCCATCGTGCCTGGTCTAAAA
AATGTCTATTAGTGTTCCATCACTAGATCTCTTCTGAGGTATTCATGCCATATGCCCCATCCTGATGTCA
TATCCACAGGACAATCTACTACCAAGAACCAGCTCCAAGAAGAAAACATCTCTGGGAAACAGTACCAAAA
GGAGTCACTGAATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTC
CATTACATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATT
TCAAAACACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCT
ACCCCATATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGC
GGGGACAGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCG"
[5] "GGGGCGCCAACGTTCGATTTCTACCTCAGCAGCAGTTGGATCTTTTGAAGGGAGAAGACACTGCA
GTGACCACTTATTCTGTATTGCCATGGTCTTTCCACTTTCATCTGGGGTGGGGTGGGGTGGGGTGGGGGA
GGGGGGGGTGGGGTGGGGAGAAATCACATAACCTTAAAAAGGACTATATTAATCACCTTCTTTGTAATCC
CTTCACAGTCCCAGGTTTAGTGAAAAACTGCTGTAAACACAGGGGACACAGCTTAACAATGCAACTTTTA
ATTACTGTTTTCTTTTTTCTTAACCTACTAATAGTTTGTTGATCTGATAAGCAAGAGTGGGCGGGTGAGA
AAAACCGAATTGGGTTTAGTCAATCACTGCACTGCATGCAAACAAGAAACGTGTCACACTTGTGACGTCG
GGCATTCATATAGGAAGAACGCGGTGTGTAACACTGTGTACACCTCAAATACCACCCCAACCCACTCCCT
GTAGTGAATCCTCTGTTTAGAACACCAAAGATAAGGACTAGATACTACTTTCTCTTTTTCGTATAATCTT
GTAGACACTTACTTGATGATTTTTAACTTTTTATTTCTAAATGAGACGAAATGCTGATGTATCCTTTCAT
TCAGCTAACAAACTAGAAAAGGTTATGTTCATTTTTCAAAAAGGGAAGTAAGCAAACAAATATTGCCAAC
TCTTCTATTTATGGATATCACACATATCAGCAGGAGTAATAAATTTACTCACAGCACTTGTTTTCAGGAC
AACACTTCATTTTCAGGAAATCTACTTCCTACAGAGCCAAAATGCCATTTAGCAATAAATAACACTTGTC
AGCCTCAGAGCATTTAAGGAAACTAGACAAGTAAAATTATCCTCTTTGTAATTTAATGAAAAGGTACAAC
AGAATAATGCATGATGAACTCACCTAATTATGAGGTGGGAGGAGCGAAATCTAAATTTCTTTTGCTATAG
TTATACATCAATTTAAAAAGCAAAAAAAAAAAAGGGGGGGGCAATCTCTCTCTGTGTCTTTCTCTCTCTC
TCTTCCTCTCCCTCTCTCTTTTCATTGTGTATCAGTTTCCATGAAAGACCTGAATACCACTTACCTCAAA
TTAAGCATATGTGTTACTTCAAGTAATACGTTTTGACATAAGATGGTTGACCAAGGTGCTTTTCTTCGGC
TTGAGTTCACCATCTCTTCATTCAAACTGCACTTTTAGCCAGAGATGCAATATATCCCCACTACTCAATA
CTACCTCTGAATGTTACAACGAATTTACAGTCTAGTACTTATTACATGCTGCTATACACAAGCAATGCAA
GAAAAAAACTTACTGGGTAGGTGATTCTAATCATCTGCAGTTCTTTTTGTACACTTAATTACAGTTAAAG
AAGCAATCTCCTTACTGTGTTTCAGCATGACTATGTATTTTTCTATGTTTTTTTAATTAAAAATTTTTAA
AATACTTGTTTCAGCTTCTCTGCTAGATTTCTACATTAACTTGAAAATTTTTTAACCAAGTCGCTCCTAG
GTTCTTAAGGATAATTTTCCTCAATCACACTACACATCACACAAGATTTGACTGTAATATTTAAATATTA
CCCTCCAAGTCTGTACCTCAAATGAATTCTTTAAGGAGATGGACTAATTGACTTGCAAAGACCTACCTCC
AGACTTCAAAAGGAATGAACTTGTTACTTGCAGCATTCATTTGTTTTTTCAATGTTTGAAATAGTTCAAA
CTGCAGCTAACCCTAGTCAAAACTATTTTTGTAAAAGACATTTGATAGAAAGGAACACGTTTTTACATAC
TTTTGCAAAATAAGTAAATAATAAATAAAATAAAAGCCAACCTTCAAAGAAACTTGAAGCTTTGTAGGTG
AGATGCAACAAGCCCTGCTTTTGCATAATGCAATCAAAAATATGTGTTTTTAAGATTAGTTGAATATAAG
AAAATGCTTGACAAATATTTTCATGTATTTTACACAAATGTGATTTTTGTAATATGTCTCAACCAGATTT
ATTTTAAACGCTTCTTATGTAGAGTTTTTATGCCTTTCTCTCCTAGTGAGTGTGCTGACTTTTTAACATG
GTATTATCAACTGGGCCAGGAGGTAGTTTCTCATGACGGCTTTTGTCAGTATGGCTTTTAGTACTGAAGC
CAAATGAAACTCAAAACCATCTCTCTTCCAGCTGCTTCAGGGAGGTAGTTTCAAAGGCCACATACCTCTC
TGAGACTGGCAGATCGCTCACTGTTGTGAATCACCAAAGGAGCTATGGAGAGAATTAAAACTCAACATTA
CTGTTAACTGTGCGTTAAATAAGCAAATAAACAGTGGCTCATAAAAATAAAAGTCGCATTCCATATCTTT
GGATGGGCCTTTTAGAAACCTCATTGGCCAGCTCATAAAATGGAAGCAATTGCTCATGTTGGCCAAACAT
GGTGCACCGAGTGATTTCCATCTCTGGTAAAGTTACACTTTTATTTCCTGTATGTTGTACAATCAAAACA
CACTACTACCTCTTAAGTCCCAGTATACCTCATTTTTCATACTGAAAAAAAAAGCTTGTGGCCAATGGAA
CAGTAAGAACATCATAAAATTTTTATATATATAGTTTATTTTTGTGGGAGATAAATTTTATAGGACTGTT
CTTTGCTGTTGTTGGTCGCAGCTACATAAGACTGGACATTTAACTTTTCTACCATTTCTGCAAGTTAGGT
ATGTTTGCAGGAGAAAAGTATCAAGACGTTTAACTGCAGTTGACTTTCTCCCTGTTCCTTTGAGTGTCTT
CTAACTTTATTCTTTGTTCTTTATGTAGAATTGCTGTCTATGATTGTACTTTGAATCGCTTGCTTGTTGA
AAATATTTCTCTAGTGTATTATCACTGTCTGTTCTGCACAATAAACATAACAGCCTCTGTGATCCCCATG
TGTTTTGATTCCTGCTCTTTGTTACAGTTCCATTAAATGAGTAATAAAGTTTGGTCAAAAC"
I downloaded the VALIDATED xls file from miRecords and discarded the
records that do not pertain to Homo sapiens. I also
dropped some columns that, as far as I can tell, do not carry data
relevant to my goal.
Then, through BioMart functions, I extracted the following fields:
'hgnc_symbol','ensembl_gene_id','external_gene_id','refseq_dna'.
The filters I used assume that
the BioMart field named "hgnc_automatic_gene_name" is what miRecords
calls "Target.gene_name", and
the BioMart field named "refseq_dna" is what miRecords calls
"Target.gene_Refseq_acc".
Please check my field-name mapping and let me know whether I got it
right or wrong.
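One way to test the mapping empirically is to look up each miRecords accession among the BioMart results and see whether the gene names agree. The following is a minimal Python sketch of that cross-check, not the attached script. It assumes both tables have already been loaded as lists of dictionaries keyed by the field names quoted above. The function name `check_name_mapping` and all the toy records are hypothetical placeholders, not real genes or accessions.

```python
def check_name_mapping(mirecords_rows, biomart_rows):
    """Compare miRecords gene names with BioMart symbols, joined on RefSeq accession."""
    # Index the BioMart rows: accession -> set of gene symbols seen for it.
    symbols_by_refseq = {}
    for row in biomart_rows:
        symbols_by_refseq.setdefault(row["refseq_dna"], set()).add(row["hgnc_symbol"])

    report = {"agree": [], "disagree": [], "missing": []}
    for row in mirecords_rows:
        acc = row["Target.gene_Refseq_acc"]
        name = row["Target.gene_name"]
        symbols = symbols_by_refseq.get(acc)
        if symbols is None:
            # BioMart returned nothing for this accession.
            report["missing"].append(acc)
        elif name.upper() in {s.upper() for s in symbols}:
            # Same accession, same gene name (ignoring case).
            report["agree"].append(acc)
        else:
            # Same accession, different gene name: the mapping is suspect here.
            report["disagree"].append(acc)
    return report


# Toy records (placeholders only, not real data).
mirecords_rows = [
    {"Target.gene_name": "GENE_A", "Target.gene_Refseq_acc": "NM_TOY_1"},
    {"Target.gene_name": "GENE_B", "Target.gene_Refseq_acc": "NM_TOY_2"},
    {"Target.gene_name": "GENE_C", "Target.gene_Refseq_acc": "NM_TOY_3"},
]
biomart_rows = [
    {"hgnc_symbol": "GENE_A", "refseq_dna": "NM_TOY_1"},
    {"hgnc_symbol": "GENE_X", "refseq_dna": "NM_TOY_2"},
]

report = check_name_mapping(mirecords_rows, biomart_rows)
print(report)
```

If most accessions land in "agree", the assumed correspondence between the two naming schemes is probably sound. A large "disagree" or "missing" list would point to a wrong field choice or to outdated accessions.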
There are a few cases that my algorithm does not handle yet.
Specifically, some records in the miRecords xls file contain non-standard
miRNA identifiers that I do not understand. For instance:
"hsa-miR-15a/hsa-miR-16" (what does the "/" mean?)
"[miR-106b]" (what do the square brackets mean?)
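Until the meaning of these notations is confirmed, such identifiers could at least be split into individual names so that no record is silently dropped. The Python sketch below rests on two unverified assumptions: that "/" separates several miRNAs reported in one record, and that the square brackets are annotation around an otherwise ordinary name. The helper names `split_mirna_field` and `lacks_species_prefix` are hypothetical.

```python
def split_mirna_field(raw):
    """Return the individual miRNA names found in one identifier field.

    Assumption: "/" separates several names; square brackets are decoration.
    """
    names = []
    for part in raw.split("/"):
        # Trim whitespace, then any surrounding square brackets.
        name = part.strip().strip("[]").strip()
        if name:
            names.append(name)
    return names


def lacks_species_prefix(name, prefix="hsa-"):
    """Flag names without the species prefix so they can be reviewed by hand."""
    return not name.startswith(prefix)


examples = ["hsa-miR-15a/hsa-miR-16", "[miR-106b]", "hsa-let-7c"]
parsed = {raw: split_mirna_field(raw) for raw in examples}
to_review = [n for names in parsed.values() for n in names if lacks_species_prefix(n)]

print(parsed)
print(to_review)
```

Names without the "hsa-" prefix are only flagged, not rewritten, because adding a species prefix automatically would presume an answer to the question asked above.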
Moreover, there are many redundant lines in my script. It is just a
starting point.
It may be possible to get the 3'UTR sequences of VALIDATED targets by
downloading the combined information from miRecords and miRDB ...
but that would be harder, because the miRecords organization does not
provide any interface library.
I have attached the pruned version of miRecords xls file and my crude
script.
I look forward to your feedback.
Thanks a lot,
Maura
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: My_miRec_Validated_Targets.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090703/b9355dc8/attachment.txt>