Question: I: Help with symbol names mapping between miRecords and BioMart
8.4 years ago, mauede@alice.it wrote:
Please find attached the crude script that worked.
Regards,
Maura

-----Original Message-----
From: mauede at alice.it
Sent: Thu 02/07/2009 8.11
To: Miichael Watson; Sean Davis; Steve Lianoglou
Subject: Help with symbol names mapping between miRecords and BioMart

I extracted some VALIDATED miRNAs and, *hopefully*, paired each of them with the 3'UTR sequences of its VALIDATED target genes. I am NOT sure about my mapping between BioMart and miRecords object names. Clearly, the output of my algorithm depends on this name mapping being correct (is it?).

> [miR-130a]
[1] "TAAACTACCTAACATTATTTATTCAGCTTCATTTGTGTCAATGGGCAATGACAGGTAAATTAAGA CATGCACTATGAGGAATAATTATTTATTTAATAACAATTGTTTGGGGTTGAAAATTCAAAAAGTGTTTAT TTTTCATATTGTGCCAATATGTATTGTAAACATGTGTTTTAATTCCAATATGATGACTCCCTTAAAATAG AAATAAGTGGTTATTTCTCAACAAAGCACAGTGTTAAATGAAATTGTAAAACCTGTCAATGATACAGTCC CTAAAGAAAAAAAATCATTGCTTTGAAGCAGTTGTGTCAGCTACTGCGGAAAAGGAAGGAAACTCCTGAC AGTCTTGTGCTTTTCCTATTTGTTTTCATGGTGAAAATGTACTGAGATTTTGGTATTACACTGTATTTGT ATCTCTGAAGCATGTTTCATGTTTTGTGACTATATAGAGATGTTTTTAAAAGTTTCAATGTGATTCTAAT GTCTTCATTTCATTGTATGATGTGTTGTGATAGCTAACATTTT"
> hsa-let-7c
[1] "ATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTCCATTAC ATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATTTCAAAA CACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCTACCCCA TATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGCGGGGAC AGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCGTACAGGGATCCACCTTTTTGCAGAAATC ACAGTGTGGCTATGGTGTGGTTTGATTTCATAAAACAGATGCTT"
[2] "TTGCATTTCCTAGGTTTCTGTGTTTGGGGTGTGTGTGCGTGTCTCTCTCTCTCTCTCTCTCTTTC TCTTTCTCTCTCTTTTTGAATTTCAAAGAAGAAACAGTCTCAGGGAAATTTCTTTTTTCTTTTTTTTTTT TAAAGAGAACAAGAAAAGTACAACATTGCTTAAGTCCTACCTCATCTTTATTTTTTTACAGATGAATGTA CTTATCTTTTCTGCAGGGATTGAGCCTGTGAAGTGATAATTTCTATCTACCTCATAAATCTTTACATTTC CTTCTGCAACAGGCCCTCTTCCCCTCCTCAGTGGAGTTTGCATTTCCCTCTTCCCCTGCGTGGGGCATGA TATGCACAAGCCTGGCATCTGTATGGCTGGGAGGGCACTGGATGTGTGTGGTGGGGTGTATTCTGTAGAT TGAGCCAAGGAAACACAAAAAAAAACTACTAAGT"
[3] "Sequence 
unavailable" [4] "GCCACCCACCTTGGCCTCTCAAAGTGCTGGGAATACAGGCGTGAGCCATCGTGCCTGGTCTAAAA AATGTCTATTAGTGTTCCATCACTAGATCTCTTCTGAGGTATTCATGCCATATGCCCCATCCTGATGTCA TATCCACAGGACAATCTACTACCAAGAACCAGCTCCAAGAAGAAAACATCTCTGGGAAACAGTACCAAAA GGAGTCACTGAATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTC CATTACATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATT TCAAAACACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCT ACCCCATATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGC GGGGACAGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCG" [5] "GGGGCGCCAACGTTCGATTTCTACCTCAGCAGCAGTTGGATCTTTTGAAGGGAGAAGACACTGCA GTGACCACTTATTCTGTATTGCCATGGTCTTTCCACTTTCATCTGGGGTGGGGTGGGGTGGGGTGGGGGA GGGGGGGGTGGGGTGGGGAGAAATCACATAACCTTAAAAAGGACTATATTAATCACCTTCTTTGTAATCC CTTCACAGTCCCAGGTTTAGTGAAAAACTGCTGTAAACACAGGGGACACAGCTTAACAATGCAACTTTTA ATTACTGTTTTCTTTTTTCTTAACCTACTAATAGTTTGTTGATCTGATAAGCAAGAGTGGGCGGGTGAGA AAAACCGAATTGGGTTTAGTCAATCACTGCACTGCATGCAAACAAGAAACGTGTCACACTTGTGACGTCG GGCATTCATATAGGAAGAACGCGGTGTGTAACACTGTGTACACCTCAAATACCACCCCAACCCACTCCCT GTAGTGAATCCTCTGTTTAGAACACCAAAGATAAGGACTAGATACTACTTTCTCTTTTTCGTATAATCTT GTAGACACTTACTTGATGATTTTTAACTTTTTATTTCTAAATGAGACGAAATGCTGATGTATCCTTTCAT TCAGCTAACAAACTAGAAAAGGTTATGTTCATTTTTCAAAAAGGGAAGTAAGCAAACAAATATTGCCAAC TCTTCTATTTATGGATATCACACATATCAGCAGGAGTAATAAATTTACTCACAGCACTTGTTTTCAGGAC AACACTTCATTTTCAGGAAATCTACTTCCTACAGAGCCAAAATGCCATTTAGCAATAAATAACACTTGTC AGCCTCAGAGCATTTAAGGAAACTAGACAAGTAAAATTATCCTCTTTGTAATTTAATGAAAAGGTACAAC AGAATAATGCATGATGAACTCACCTAATTATGAGGTGGGAGGAGCGAAATCTAAATTTCTTTTGCTATAG TTATACATCAATTTAAAAAGCAAAAAAAAAAAAGGGGGGGGCAATCTCTCTCTGTGTCTTTCTCTCTCTC TCTTCCTCTCCCTCTCTCTTTTCATTGTGTATCAGTTTCCATGAAAGACCTGAATACCACTTACCTCAAA TTAAGCATATGTGTTACTTCAAGTAATACGTTTTGACATAAGATGGTTGACCAAGGTGCTTTTCTTCGGC TTGAGTTCACCATCTCTTCATTCAAACTGCACTTTTAGCCAGAGATGCAATATATCCCCACTACTCAATA CTACCTCTGAATGTTACAACGAATTTACAGTCTAGTACTTATTACATGCTGCTATACACAAGCAATGCAA GAAAAAAACTTACTGGGTAGGTGATTCTAATCATCTGCAGTTCTTTTTGTACACTTAATTACAGTTAAAG 
AAGCAATCTCCTTACTGTGTTTCAGCATGACTATGTATTTTTCTATGTTTTTTTAATTAAAAATTTTTAA AATACTTGTTTCAGCTTCTCTGCTAGATTTCTACATTAACTTGAAAATTTTTTAACCAAGTCGCTCCTAG GTTCTTAAGGATAATTTTCCTCAATCACACTACACATCACACAAGATTTGACTGTAATATTTAAATATTA CCCTCCAAGTCTGTACCTCAAATGAATTCTTTAAGGAGATGGACTAATTGACTTGCAAAGACCTACCTCC AGACTTCAAAAGGAATGAACTTGTTACTTGCAGCATTCATTTGTTTTTTCAATGTTTGAAATAGTTCAAA CTGCAGCTAACCCTAGTCAAAACTATTTTTGTAAAAGACATTTGATAGAAAGGAACACGTTTTTACATAC TTTTGCAAAATAAGTAAATAATAAATAAAATAAAAGCCAACCTTCAAAGAAACTTGAAGCTTTGTAGGTG AGATGCAACAAGCCCTGCTTTTGCATAATGCAATCAAAAATATGTGTTTTTAAGATTAGTTGAATATAAG AAAATGCTTGACAAATATTTTCATGTATTTTACACAAATGTGATTTTTGTAATATGTCTCAACCAGATTT ATTTTAAACGCTTCTTATGTAGAGTTTTTATGCCTTTCTCTCCTAGTGAGTGTGCTGACTTTTTAACATG GTATTATCAACTGGGCCAGGAGGTAGTTTCTCATGACGGCTTTTGTCAGTATGGCTTTTAGTACTGAAGC CAAATGAAACTCAAAACCATCTCTCTTCCAGCTGCTTCAGGGAGGTAGTTTCAAAGGCCACATACCTCTC TGAGACTGGCAGATCGCTCACTGTTGTGAATCACCAAAGGAGCTATGGAGAGAATTAAAACTCAACATTA CTGTTAACTGTGCGTTAAATAAGCAAATAAACAGTGGCTCATAAAAATAAAAGTCGCATTCCATATCTTT GGATGGGCCTTTTAGAAACCTCATTGGCCAGCTCATAAAATGGAAGCAATTGCTCATGTTGGCCAAACAT GGTGCACCGAGTGATTTCCATCTCTGGTAAAGTTACACTTTTATTTCCTGTATGTTGTACAATCAAAACA CACTACTACCTCTTAAGTCCCAGTATACCTCATTTTTCATACTGAAAAAAAAAGCTTGTGGCCAATGGAA CAGTAAGAACATCATAAAATTTTTATATATATAGTTTATTTTTGTGGGAGATAAATTTTATAGGACTGTT CTTTGCTGTTGTTGGTCGCAGCTACATAAGACTGGACATTTAACTTTTCTACCATTTCTGCAAGTTAGGT ATGTTTGCAGGAGAAAAGTATCAAGACGTTTAACTGCAGTTGACTTTCTCCCTGTTCCTTTGAGTGTCTT CTAACTTTATTCTTTGTTCTTTATGTAGAATTGCTGTCTATGATTGTACTTTGAATCGCTTGCTTGTTGA AAATATTTCTCTAGTGTATTATCACTGTCTGTTCTGCACAATAAACATAACAGCCTCTGTGATCCCCATG TGTTTTGATTCCTGCTCTTTGTTACAGTTCCATTAAATGAGTAATAAAGTTTGGTCAAAAC" I downloaded the VALIDATED xls file from miRecords and discarded the records that do not pertain to Homo Sapiens. I also dropped some columns that, as far as I can tell, do not carry data relevant for my goal. Then throug BioMart functions I extracted the following fields: 'hgnc_symbol','ensembl_gene_id','external_gene_id','refseq_dna'. 
The filters I used assume that the BioMart field "hgnc_automatic_gene_name" is what miRecords calls "Target.gene_name", and that the BioMart field "refseq_dna" is what miRecords calls "Target.gene_Refseq_acc". Please check my object name mapping and let me know whether I got it right or wrong.

There are a few cases that my algorithm does not handle yet: some records in the miRecords xls file contain non-standard miRNA identifiers that I do not understand. For instance:
"hsa-miR-15a/hsa-miR-16" (what does the "/" mean?)
"[miR-106b]" (what do the square brackets mean?)

Moreover, there are many redundant lines in my script; it is just a starting point. It may be possible to get the 3'UTR sequences of VALIDATED targets by downloading the combined information from miRecords and miRDB, but that must be harder because miRecords does not provide any interface library.

I have attached the pruned version of the miRecords xls file and my crude script. I look forward to your feedback.

Thank you a lot,
Maura

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: My_miRec_Validated_Targets.txt
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20090703/b9355dc8/attachment.txt
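
One way to check the name mapping described above, and to cope with the odd identifiers, might look like the base R sketch below. It is only an illustration under stated assumptions, not the attached script: it assumes that "/" joins several miRNAs reported in a single record and that square brackets merely wrap a name, neither of which is confirmed by miRecords here. The helper `normalize_mirna_id` is hypothetical, and the two small data frames are made-up stand-ins for the pruned miRecords table and for the result of a BioMart query; only the column names are taken from the post.

```r
# Hypothetical helper: turn one raw miRecords identifier into cleaned names.
# ASSUMPTION: "/" separates several miRNAs and "[...]" only wraps a name.
normalize_mirna_id <- function(raw) {
  parts <- strsplit(raw, "/", fixed = TRUE)[[1]]
  parts <- gsub("[", "", parts, fixed = TRUE)  # drop opening brackets
  parts <- gsub("]", "", parts, fixed = TRUE)  # drop closing brackets
  parts <- trimws(parts)
  parts[nzchar(parts)]                         # discard empty pieces
}

# Toy stand-in for the pruned miRecords table (gene names and accessions are invented).
mirecords <- data.frame(
  miRNA                  = c("hsa-miR-15a/hsa-miR-16", "[miR-106b]", "hsa-let-7c"),
  Target.gene_name       = c("GENE1", "GENE2", "GENE3"),
  Target.gene_Refseq_acc = c("NM_TOY0001", "NM_TOY0002", "NM_TOY0003"),
  stringsAsFactors = FALSE
)

# Toy stand-in for what a BioMart query might return (values are invented).
biomart <- data.frame(
  hgnc_symbol = c("GENE1", "GENE2B"),
  refseq_dna  = c("NM_TOY0001", "NM_TOY0002"),
  stringsAsFactors = FALSE
)

# One output row per cleaned miRNA name: repeat each record once per name.
ids      <- lapply(mirecords$miRNA, normalize_mirna_id)
expanded <- mirecords[rep(seq_len(nrow(mirecords)), lengths(ids)), ]
expanded$miRNA <- unlist(ids)
rownames(expanded) <- NULL

# Join on the RefSeq accession, keeping miRecords rows that have no BioMart match.
joined <- merge(expanded, biomart,
                by.x = "Target.gene_Refseq_acc", by.y = "refseq_dna",
                all.x = TRUE)

# TRUE: the gene symbols agree; FALSE: they differ; NA: no BioMart row was found.
joined$symbol_agrees <- joined$Target.gene_name == joined$hgnc_symbol
print(joined)
```

Joining on the accession and then comparing gene symbols gives a row-by-row check of the assumed mapping: rows where `symbol_agrees` is FALSE or NA are the ones worth inspecting by hand before trusting the extracted 3'UTR sequences.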