Question: identifying drosophila miRNA targets
6.4 years ago by
Fiona70
United Kingdom
Fiona70 wrote:
This is great! This seems to be running fine now and the results are making more sense. I hoped the problem was something to do with not having all my lists of RNAs in the right format, but working out how to code the conversion was beyond me, so thanks very much for taking the time to work that through.

Thanks!

Fiona

Dr Fiona C Ingleby
Postdoctoral Research Fellow
University of Sussex
Email: F.Ingleby@sussex.ac.uk
Website: fionaingleby.weebly.com
Tel: +44(0)1273678559

On 29 Mar 2013, at 22:11, James W. MacDonald <jmacdon@uw.edu> wrote:

> Hi Fiona,
>
> Probably the easiest way to do this is to convert the flybase_cg ids to ensembl IDs.
>
> ## read sanger data in
> ## there is some weird cruft in line 4685, best to just remove the thirteenth column
> dat <- read.table("v5.txt.drosophila_melanogaster", sep = "\t", stringsAsFactors = FALSE)[,-13]
> library(drosophila2.db)
> ## map flybase_cg IDs to ensembl
> x <- select(org.Dm.eg.db, gsub("-[A-Z]+", "", dat[,12]), c("ENSEMBL"), "FLYBASECG")
> ## there are some duplicates here, but I don't think it will matter
> ## merge back together and write back out
> dat$merge <- gsub("-[A-Z]+", "", dat[,12])
> dat2 <- merge(dat, x, by.x = "merge", by.y = 1, all.x = TRUE)
> write.table(dat2, "v5.txt.drosophila_melanogaster2", sep = "\t", col.names = FALSE, row.names = FALSE, quote = FALSE)
>
> ## note that I say the file is not sanger, and then tell mirna2mrna() which columns to use.
> test <- mirna2mrna(miRNA, "v5.txt.drosophila_melanogaster2", mRNA, "org.Dm.eg.db", "drosophila2.db", FALSE, 2, 14)
>
> With the truncated mRNA and miRNA probe IDs you give below, I get no mappings, but I assume you have way more mRNA transcripts.
>
> Let me know if this works for you.
>
> Best,
>
> Jim
>
> On 3/29/2013 8:09 AM, Fiona Ingleby wrote:
>> Hi Jim,
>>
>> Thanks very much for pointing that out - it seems mirna2mrna is exactly what I was after, I don't know how I managed to overlook it.
>>
>> I'm a bit puzzled about the results I'm getting, however, and so if you have a minute to think this through then I'd be really grateful. The help pages are pretty clear, and so I've managed to get the function to run with my data without any problems, but I'm getting 'named list()' as output. Which might simply suggest that there are no correlations between the miRNAs and mRNAs in my data (?). But I'm not convinced and I'm wondering if I've done something wrong somewhere along the way (I'm looking at 39 differentially expressed miRNAs along with 2638 differentially expressed mRNAs, so I'd be surprised if there were none that correlate with each other).
>>
>> I'm wondering if I'm doing something daft like using RNA IDs in the wrong format (which might be one explanation for getting 0 matches returned from the database?). At the moment I'm just taking character vectors directly from the ExpressionSet. So I have 2 ExpressionSets, each representing only the probes which are significantly differentially expressed in each dataset - I've called these sigmRNA (2638 x 12 samples) and sigmiRNA (39 x 12 samples) for mRNA and miRNA respectively.
>>
>> > featureNames(sigmRNA)
>>  [1] "1622906_at"   "1622915_at"   "1622917_a_at" "1622920_at"   "1622926_at"   "1622932_s_at" "1622935_at"   "1622940_at"   "1622946_at"
>> [10] "1622952_at"   "1622956_at"   "1622959_at"   "1622960_at"   "1622965_s_at" "1622974_at"   "1622975_at"   "1622978_at"   "1622992_at"
>> [19] "1623002_at"   "1623004_a_at" "1623008_at"   "1623019_a_at" "1623022_at"   "1623025_at"   "1623026_a_at" "1623030_at"   "1623031_a_at"
>>
>> and so on for 2638 entries.
>>
>> > featureNames(sigmiRNA)
>> [1] "dme-miR-1002_st" "dme-miR-1004_st" "dme-miR-1017_st" "dme-miR-124_st" "dme-miR-2500_st" "dme-miR-286_st"
>> [7] "dme-miR-2a_st"   "dme-miR-306_st"  "dme-miR-310_st"  "dme-miR-311_st" "dme-miR-312_st"  "dme-miR-313_st"
>>
>> etc. So I'm using mirna2mrna like this:
>>
>> test<-mirna2mrna(miRNAids=featureNames(sigmiRNA),
>>     miRNAannot="v5.txt.drosophila_melanogaster", #downloaded from the EBI website and saved in the working directory
>>     mRNAids=featureNames(sigmRNA),
>>     orgPkg="org.Dm.eg.db",chipPkg="drosophila2.db",
>>     sanger=T,miRNAcol=NULL,mRNAcol=NULL,transType="ensembl")
>>
>> and then I get:
>>
>> > test
>> named list()
>>
>> I've put the sessionInfo() output at the bottom of the email. I also looked through the source code on the Bioconductor code search website, pulled out the 'convertIDs' function, and ran this as an independent function on the lists of RNAs to check to see what it was doing, but I can't see anything that looks odd to me - it removes the '_st'/'_at' as I expected.
>>
>> So I'm a bit stuck. I'm sure I've misunderstood something, but can't pick out what it is myself. I suppose it's totally possible that the analysis is fine and there are just no correlations between the miRNAs and mRNAs of interest in my data - but I thought I would check. If you (or anyone) has any ideas, I'd really appreciate the help.
>>
>> Thanks again,
>>
>> Fiona
>>
>> > sessionInfo()
>> R version 2.15.2 (2012-10-26)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] drosophila2.db_2.8.1 org.Dm.eg.db_2.8.0 RSQLite_0.11.2 DBI_0.2-5 AnnotationDbi_1.20.7 Biobase_2.18.0
>> [7] BiocGenerics_0.4.0
>>
>> loaded via a namespace (and not attached):
>> [1] IRanges_1.16.6 parallel_2.15.2 stats4_2.15.2 tools_2.15.2
>>
>> On 28 Mar 2013, at 16:43, James W. MacDonald <jmacdon@uw.edu> wrote:
>>
>>> Hi Fiona,
>>>
>>> I have a function called mirna2mrna (yeah, I know, lame function name...) in my affycoretools package that does this, based on the sanger microcosm targets data that you can download here:
>>>
>>> http://www.ebi.ac.uk/enright-srv/microcosm/cgi-bin/targets/v5/download.pl
>>>
>>> there is also a function makeHmap() that will create a heatmap with the miRNA/mRNA pairs, where the color of the cells is based on the correlation between the two RNA species (with the intent to show negative correlations, indicating that the miRNA is hypothetically causing premature degradation of the mRNA).
>>>
>>> I think the help pages for these two functions are reasonable, but please let me know if you have any questions.
>>>
>>> Best,
>>>
>>> Jim
>>>
>>> On 3/28/2013 12:30 PM, Fiona Ingleby wrote:
>>>> Hi everyone,
>>>>
>>>> I am working with mRNA data from Affy 'drosophila2' arrays and miRNA data from Affy 'mirna3' arrays. I have identified a list of differentially expressed mRNAs and miRNAs. I'm having a bit of trouble with some downstream analyses and I'm hoping someone might be able to offer some help.
>>>>
>>>> I would like to use my list of differentially expressed miRNAs to access online databases (e.g. miRBase, microRNA.org) and extract the names of all the potential target mRNAs. Then I'd like to use this list of mRNAs to look through my mRNA expression data. I'm aware of packages like 'RmiR' and 'microRNA' which have built-in functions for finding miRNA targets, but as far as I can tell, 'RmiR' uses miRNA databases for humans only and 'microRNA' works with human and mouse data only. So is there a package I am unaware of (or another application of 'RmiR'/'microRNA' that I am unaware of) for looking at drosophila data?
>>>>
>>>> So far I have also considered the 'biomaRt' package to see if the database query function on there can help me, but I haven't had much luck. For instance, if I try an example list of miRNAs:
>>>>
>>>> mirna<-c("dme-miR-1002","dme-miR-312","dme-miR-973")
>>>> library(biomaRt)
>>>> ensembl<-useMart("ensembl",dataset="dmelanogaster_gene_ensembl")
>>>> getBM(attributes="mirbase_accession",filters="mirbase_id",values=mirna,mart=ensembl)
>>>>
>>>> then 'logical(0)' is returned, as if there are no records for those miRNAs - but by searching the database manually I know the records are there.
>>>>
>>>> Alternatively I can try:
>>>>
>>>> miRNA<- getBM(c("mirbase_accession","mirbase_id", "ensembl_gene_id", "start_position", "chromosome_name"), filters = c("with_mirbase"), values = list(T), mart = ensembl)
>>>>
>>>> which returns a table of various bits of information on miRNAs, but I cannot adapt this command to just look at my list of miRNAs of interest (i.e. the 'mirna' vector above). I've included the sessionInfo() output for these at the bottom of the email, but I suspect my problem is more to do with the fact I'm not going about this the right way (as opposed to a problem with package versions and coding etc.). I'm not even sure that using 'biomaRt' will give me the information I eventually want (the target mRNAs of these miRNAs); I was just trying it out to see what it was capable of in terms of querying these databases. So I apologise for the vagueness. Since I haven't managed to get very far by myself, it's difficult to be more specific, but I'd really appreciate it if anyone could offer some advice, even just to point me in the direction of a useful package which might have gone unnoticed by me.
>>>>
>>>> Many thanks,
>>>>
>>>> Fiona
>>>>
>>>> > sessionInfo()
>>>> R version 2.15.2 (2012-10-26)
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] biomaRt_2.14.0 affy_1.36.1 Biobase_2.18.0 BiocGenerics_0.4.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] affyio_1.26.0 BiocInstaller_1.8.3 grid_2.15.2 lattice_0.20-14 Matrix_1.0-11 MCMCglmm_2.17
>>>> [7] preprocessCore_1.20.0 RCurl_1.95-4.1 tools_2.15.2 XML_3.95-0.2 zlibbioc_1.4.0
>>>
>>> --
>>> James W. MacDonald, M.S.
>>> Biostatistician
>>> University of Washington
>>> Environmental and Occupational Health Sciences
>>> 4225 Roosevelt Way NE, # 100
>>> Seattle WA 98105-6099
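
For anyone following the thread: the fix above comes down to stripping the transcript suffix (e.g. "-RA") from the FlyBase CG IDs, looking up the Ensembl ID, and left-merging so that unmapped rows are kept. Here is a minimal base-R sketch of just that logic. The IDs and the `lookup` table are made up for illustration and stand in for the real microcosm file and the `select(org.Dm.eg.db, ...)` result; the `_st` stripping is only similar in spirit to what `convertIDs` does, not a copy of it.

```r
# Hypothetical toy data standing in for two columns of the microcosm file.
targets <- data.frame(
  mirna      = c("dme-miR-312", "dme-miR-312", "dme-miR-973"),
  transcript = c("CG1234-RA", "CG1234-RB", "CG9999-RA"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the FLYBASECG -> ENSEMBL mapping that
# select(org.Dm.eg.db, ...) would return; CG9999 is deliberately missing.
lookup <- data.frame(
  FLYBASECG = "CG1234",
  ENSEMBL   = "FBgn0000001",
  stringsAsFactors = FALSE
)

# Strip the transcript suffix so the key matches the gene-level CG ID.
targets$merge <- gsub("-[A-Z]+", "", targets$transcript)

# Left join: all.x = TRUE keeps target rows that have no Ensembl mapping.
merged <- merge(targets, lookup, by.x = "merge", by.y = "FLYBASECG", all.x = TRUE)

# Count rows that failed to map; a large number here would explain empty results.
unmapped <- sum(is.na(merged$ENSEMBL))

# Probe-level miRNA names need their array suffix removed before matching.
probe_ids   <- c("dme-miR-1002_st", "dme-miR-312_st")
mirna_names <- sub("_st$", "", probe_ids)
```

Checking `unmapped` (and how many of your own probe IDs survive the conversion) is a quick way to tell an ID-format problem apart from a genuine lack of miRNA/mRNA pairs.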
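
The idea behind looking for negative correlations between a miRNA and its predicted targets can also be sketched in a few lines of base R. This is not the affycoretools implementation, just an illustration with invented expression values for 12 samples; the gene names and the -0.5 cutoff are assumptions chosen for the example.

```r
# Hypothetical expression values across 12 samples (one row per feature).
samples <- paste0("s", 1:12)

mirna_expr <- matrix(1:12, nrow = 1,
                     dimnames = list("dme-miR-312", samples))

mrna_expr <- rbind(
  gene_down  = 24 - 2 * (1:12),    # falls as the miRNA rises
  gene_other = rep(c(5, 6), 6)     # no clear relationship to the miRNA
)
colnames(mrna_expr) <- samples

# cor() works column-wise, so transpose to put samples in rows.
cors <- cor(t(mirna_expr), t(mrna_expr))

# Keep predicted targets whose expression is negatively correlated
# with the miRNA (the cutoff is an arbitrary choice for this sketch).
candidates <- colnames(cors)[cors[1, ] < -0.5]
```

With real data the mRNA rows would be restricted to the predicted targets of each miRNA first, which is why the ID mapping has to work before any correlations show up.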