How to get the miRNA identifier from the Ensembl database through biomaRt functions?
@mauedealiceit-3511
Thanks for teaching me how to get plenty of target gene information from the human data set in Ensembl using biomaRt functions.

One key piece of information I still have no idea how to get is the miRNA identifier (for example "hsa-miR-647"). Is there an attribute that extracts exactly this from Ensembl?

Actually I need both the miRNA identifier and its sequence, for instance:

>hsa-miR-93 MIMAT0000093 Homo sapiens miR-93
CAAAGUGCUGUUCGUGCAGGUAG

Are there attributes that extract exactly the miRNA identifier (hsa-miR-93) and its corresponding sequence (CAAAGUGCUGUUCGUGCAGGUAG) from Ensembl, or from any other database that can be related to the data extracted from Ensembl?

My goal is to generate a file with the miRNA identifiers (e.g. hsa-miR-93), the corresponding miRNA sequences (e.g. CAAAGUGCUGUUCGUGCAGGUAG), followed by the list of target gene identifiers and their 3'UTR sequences.

Thanks a lot.
Maura

-----Original message-----
From: Sean Davis [mailto:seandavi@gmail.com]
Sent: Wed 24/06/2009 18:28
To: mauede@alice.it
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] how to find the validated pair (miRNA, gene-3'UTR-sequence)

On Wed, Jun 24, 2009 at 11:45 AM, <mauede@alice.it> wrote:
> Sorry for my misuse of biology nomenclature. I am still very confused.
>
> My first task (very trivial for you) is to generate a text file containing
> a list of Homo sapiens validated miRNAs (microRNA identifier, sequence)
> and their 3'UTR regions (gene identifier, 3'UTR sequence).

Hi, Maura. See here:

http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl

If you download the text file for human, it looks like:

Similarity  hsa-miR-647   miRanda  miRNA_target  2  120824263  120824281  +  .  16.3205  3.701400e-06  ENST00000295228  INHBB
Similarity  hsa-miR-130a  miRanda  miRNA_target  2  120825363  120825385  +  .  16.5359  1.687830e-02  ENST00000295228  INHBB

From here, you have the miR name, the chromosome (2 in this case), the chromosome start and end positions, and the strand. You can use this to get the sequence from the genome (the fasta sequence for those locations). The transcript name (ENST....) is from the Ensembl database, so there is plenty of information via biomaRt, if necessary, but the HUGO gene symbol is given in the last column.

Several of the code snippets you give below give similar information. If you are concerned about what a specific data source is giving you, you should probably contact that data source directly via email. Most websites offer a "contact us" link.

If this isn't what you need, then perhaps you can show more specifically how this information is not meeting your needs. Know that you may have to do a little bit of programming to get things into exactly the formats that you like.

Sean

> I realize this is just a matter of retrieving all known information. The
> difficulty for me is where to find the (miRNA, gene-3'UTR) pair matching
> information.
> Below I list a lot of material I downloaded, but I do not know how to put
> the pieces together to fulfill my task.
> I think the 3'UTR sequences can be retrieved through the function "getSequence"
> from package "biomaRt", if only I knew which parameters to pass to that
> function to achieve my goal.
>
> 1) Function "hsSeqs" from package "microRNA" produces 677 miRNA entries,
>    e.g. hsa-let-7a "UGAGGUAGUAGGUUGUAUAGUU"
>    Are such miRNAs validated?
>    If the answer is "yes", then how can I retrieve the corresponding
>    gene-3'UTR regions?
>
> 2) Function "hsSeqs" from package "microRNA" produces a 709015 x 6 matrix
>    containing miRNA identifiers and apparently some data from the paired gene,
>    e.g.
>         name            target             chrom  start        end          strand
>    [1,] "hsa-miR-647"   "ENST00000295228"  "2"    "120824263"  "120824281"  "+"
>    [2,] "hsa-miR-130a"  "ENST00000295228"  "2"    "120825363"  "120825385"  "+"
>
>    Again, how can I retrieve the corresponding gene-3'UTR regions from the
>    above data?

Note my answer above. The gene 3'UTR information is there, but you may need to do some calculations, depending on what you want. Also, note that "genes" do not have 3'UTRs; only transcripts do.

> 3) Function "s3utr" from package "microRNA" produces 112 3'UTR entries,
>    e.g.
>    "CCTGCCCGCCCGCATGGCCAGCCAGTGGCAAGCTGCCGCCCCCACTCTCCGGGCACCGTCTCCTGCCTGTGCGTCCGCCC
>    ACCGCTGCCCTGTCTGTTGCGACACCCTCCCCCCCACATACACACGCAGCGTTTTGATAAATTATTGGTTTTCAACG"
>
>    Where do such 3'UTRs come from? Which (miRNA, gene) pair do they belong to?
>
> 4) I downloaded the file "mature.fa" (FASTA format sequences of all mature
>    miRNA sequences) from http://microrna.sanger.ac.uk/sequences/ftp.shtml
>    The file contains a number of records starting with the miRNA identifier,
>    e.g.
>    hsa-miR-943  miRanda  miRNA_target  9885484  9885504  15.6748  4.721740e-02  +  .  URL "http://www.ensembl.org/homo_sapiens/geneview?gene=ENST00000302092"
>    hsa-miR-944  miRanda  miRNA_target  9885188  9885209  16.602   1.659470e-03  +  .  URL "http://www.ensembl.org/homo_sapiens/geneview?gene=ENST00000302092"
>
>    Where are the 3'UTR regions indicated in the above records?
>
> 5) I downloaded miRNA Validated Targets from
>    http://mirecords.umn.edu/miRecords/download.php.
>    It generated a huge XLS file with a lot of data,
>    e.g.
>    Pubmed_id Target gene_species_scientific Target gene_species_common
>    Target gene_name Target gene_Refseq_acc Target site_number
>    miRNA_species miRNA_mature_ID miRNA_regulation Reporter_target
>    gene/region Reporter link element Test_method_inter Target gene
>    mRNA_level Original description Mutation_target region Post
>    mutation_method Original description_mutation_region Target
>    site_position A Reporter_target site Reporter link element
>    Test_method_inter_site Original description_inter_site Mutation_target site
>    Post mutation_method_site Original description_mutation_site
>    Mutiple site mutation note Additional note
>
>    12808467 Homo sapiens human Hes1 NM_198155.2 1
>    Homo sapiens hsa-miR-23a mutation Western
>    blotting Next, to examine whether expression of the gene for
>    Hes1 is regulated by miR-23, we introduced synthetic miR-23 or mutant miR-23
>    (Fig. 2a) into undifferentiated NT2 cells. When synthetic miR-23 was
>    introduced at 2 mM into undifferentiated NT2 cells, the intracellular level of
>    Hes1 fell significantly (Fig. 2b). By contrast, in the presence of synthetic
>    mutant miR-23, the level of Hes1 in undifferentiated NT2 cells remained
>    unchanged and similar to that in untreated wild-type NT2 cells (Fig. 2b).
>    801 overexpression by mature miRNA transfection
>    luciferase target site (five copies of the target sequence) activity
>    assay Furthermore, the luciferase activity of LucSTS23 in undifferentiated
>    NT2 cells that had been treated with synthetic miR-23 was lower than that in
>    untreated wild-type NT2 cells (Fig. 3c). Yes Luciferase activity
>    assay Furthermore, the luciferase activity of LucSTS23 in
>    undifferentiated NT2 cells that had been treated with synthetic miR-23 was
>    lower than that in untreated wild-type NT2 cells (Fig. 3c).
>
> Thank you in advance for helping me out of my misery.
> Maura
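
For readers following the column layout described in the quoted reply (miR name, chromosome, start, end, strand, Ensembl transcript, gene symbol), here is a minimal sketch of pulling those fields out of the two sample rows. It is written in Python only to keep it self-contained; the name `parse_target_line`, the `TargetSite` record, and the assumption that every row has exactly the 13 whitespace-separated fields seen in the sample are illustrative choices, not something the download page documents. Real files may have rows without a gene symbol, which this sketch rejects rather than guesses at.

```python
from typing import NamedTuple


class TargetSite(NamedTuple):
    """One predicted miRNA target site (hypothetical record type)."""
    mirna: str
    chrom: str
    start: int
    end: int
    strand: str
    score: float
    pvalue: float
    transcript: str
    gene: str


def parse_target_line(line: str) -> TargetSite:
    """Split one row, assuming the 13-field layout of the sample rows."""
    fields = line.split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}: {line!r}")
    (_group, mirna, _method, _feature, chrom, start, end,
     strand, _phase, score, pvalue, transcript, gene) = fields
    return TargetSite(mirna, chrom, int(start), int(end), strand,
                      float(score), float(pvalue), transcript, gene)


# The two sample rows quoted in the thread.
SAMPLE = """\
Similarity hsa-miR-647 miRanda miRNA_target 2 120824263 120824281 + . 16.3205 3.701400e-06 ENST00000295228 INHBB
Similarity hsa-miR-130a miRanda miRNA_target 2 120825363 120825385 + . 16.5359 1.687830e-02 ENST00000295228 INHBB
"""

sites = [parse_target_line(row) for row in SAMPLE.splitlines() if row.strip()]
for site in sites:
    print(site.mirna, site.transcript, site.gene,
          f"{site.chrom}:{site.start}-{site.end}({site.strand})")
```

The transcript identifier in each record is the key that would then be passed to Ensembl (for instance through biomaRt) to look up further information about the target.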
miRNA Homo sapiens biomaRt • 3.5k views
@sean-davis-490
On Sat, Jun 27, 2009 at 4:51 AM, <mauede@alice.it> wrote:
> Thanks for teaching me how to get plenty of target gene information from
> the human data set in Ensembl using biomaRt functions.
> One key piece of information I still have no idea how to get is the miRNA
> identifier (for example "hsa-miR-647").
> Is there an attribute that extracts exactly this from Ensembl?
> Actually I need both the miRNA identifier and its sequence, for instance:
>
> >hsa-miR-93 MIMAT0000093 Homo sapiens miR-93
> CAAAGUGCUGUUCGUGCAGGUAG
>
> Are there attributes that extract exactly the miRNA identifier (hsa-miR-93)
> and its corresponding sequence (CAAAGUGCUGUUCGUGCAGGUAG) from Ensembl, or
> from any other database that can be related to the data extracted from Ensembl?

miRBase contains the fasta sequences for the miRNAs.

http://microrna.sanger.ac.uk/sequences/ftp.shtml

The information you want is available from that link.

> In fact, my goal is to generate a file with the miRNA identifiers (e.g.
> hsa-miR-93), the corresponding miRNA sequences (e.g. CAAAGUGCUGUUCGUGCAGGUAG),
> followed by the list of target gene identifiers and their 3'UTR sequences.

You'll want to apply the information from previous emails to complete this task.

Sean
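
To make the suggestion above a little more concrete, here is a rough sketch of reading miRBase-style FASTA records and lining them up with a list of (miRNA, transcript) pairs. It is in Python purely so that it runs on its own; `read_mature_fasta` and `group_targets` are hypothetical helper names, the FASTA text is the single hsa-miR-93 record quoted in the question, and the `ENST_PLACEHOLDER_*` identifiers are made-up stand-ins, not real targets of hsa-miR-93. The 3'UTR sequences themselves are not handled here and would still have to be fetched separately, for example with biomaRt's getSequence as mentioned in the quoted thread.

```python
from collections import defaultdict


def read_mature_fasta(text):
    """Map miRNA name (first word of each '>' header) to its sequence."""
    seqs = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else None
            if name is not None:
                seqs[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped, so append rather than assign.
            seqs[name] += line
    return seqs


def group_targets(pairs):
    """Collect the distinct target transcripts for each miRNA, in input order."""
    grouped = defaultdict(list)
    for mirna, transcript in pairs:
        if transcript not in grouped[mirna]:
            grouped[mirna].append(transcript)
    return dict(grouped)


# FASTA record quoted in the question.
FASTA = """\
>hsa-miR-93 MIMAT0000093 Homo sapiens miR-93
CAAAGUGCUGUUCGUGCAGGUAG
"""

# The first two pairs are placeholders for illustration only;
# the third pair is taken from the sample target rows in the thread.
PAIRS = [
    ("hsa-miR-93", "ENST_PLACEHOLDER_1"),
    ("hsa-miR-93", "ENST_PLACEHOLDER_2"),
    ("hsa-miR-647", "ENST00000295228"),
]

seqs = read_mature_fasta(FASTA)
targets = group_targets(PAIRS)

# One tab-separated line per miRNA: name, sequence (NA if unknown), targets.
report = []
for mirna in sorted(targets):
    seq = seqs.get(mirna, "NA")
    report.append(f"{mirna}\t{seq}\t{','.join(targets[mirna])}")
print("\n".join(report))
```

Keying both tables on the miRNA name is the design choice that matters: it is the only field shared by the miRBase FASTA headers and the target file, so any miRNA whose name differs between the two releases will show up with "NA" and needs checking by hand.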