Question: Regarding extraction of 3' and 5'UTRs and exonic region of a gene.
abdul rawoof (United States) wrote, 4.4 years ago:
Hello everyone,

Could anyone show me how I can extract the *3' and 5' UTRs and exonic regions* of all *human genes* from the *Ensembl and KEGG databases* that are involved in a particular cancer, especially *breast cancer*, using R/Bioconductor?

Thanks in advance.

Abdul Rawoof
Hervé Pagès (United States) wrote, 4.4 years ago:
Hi Abdul,

Suggested workflow:

1. Build the list of genes involved in the particular cancer you're interested in. This could be a vector of gene ids or transcript ids (not all transcripts are necessarily linked to a gene).

   Suggested tools (not exhaustive): the GO.db and org.Hs.eg.db packages, maybe the DO.db package, etc. I'm not sure what would be the best tool for this. But maybe you already have your list of genes?

2. Use the TxDb.Hsapiens.UCSC.hg19.knownGene + GenomicFeatures packages to extract the coordinates of the 5' UTRs and 3' UTRs. Use the fiveUTRsByTranscript() and threeUTRsByTranscript() functions for this. They return the result in a GRangesList object (you'll have to become a bit familiar with those objects first).

3. Use the BSgenome.Hsapiens.UCSC.hg19 package and the extractTranscriptsFromGenome() function from the GenomicFeatures package to extract the UTR sequences. The name of the function is misleading, but it can be used to extract CDS or UTR sequences in addition to transcript sequences.

If you've never used those tools before, it will take you some time to get familiar with them. Your best friends are the man pages for the individual functions/classes you're going to run into (don't miss the examples section) and the vignettes in the GenomicRanges and GenomicFeatures packages.

Let us know if you have specific questions or run into specific problems (show us what you've done and explain the problem -- don't forget your sessionInfo()).

Good luck,
H.

(In reply to Abdul Rawoof's message of 06/27/2013 01:58 AM.)

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
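Steps 2 and 3 of the suggested workflow could look roughly like the sketch below. It is only an illustration under a few assumptions: the two Entrez gene ids (672 for BRCA1, 675 for BRCA2) are a hypothetical stand-in for whatever gene list step 1 produces, and the sequence extraction uses extractTranscriptSeqs(), which, as far as I know, replaced extractTranscriptsFromGenome() in later GenomicFeatures releases. Check the man pages for the version you have installed.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)

txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene
genome <- BSgenome.Hsapiens.UCSC.hg19

# Hypothetical gene list (Entrez gene ids): BRCA1 and BRCA2.
# Replace with the ids obtained in step 1.
gene_ids <- c("672", "675")

# Map genes to their transcripts; keep only ids known to this TxDb.
tx_by_gene <- transcriptsBy(txdb, by = "gene")
gene_ids   <- intersect(gene_ids, names(tx_by_gene))
tx_names   <- unlist(tx_by_gene[gene_ids], use.names = FALSE)$tx_name

# UTR coordinates for all transcripts, then restrict to our transcripts.
five_utrs  <- fiveUTRsByTranscript(txdb, use.names = TRUE)
three_utrs <- threeUTRsByTranscript(txdb, use.names = TRUE)
five_utrs  <- five_utrs[names(five_utrs) %in% tx_names]
three_utrs <- three_utrs[names(three_utrs) %in% tx_names]

# Exon coordinates grouped by gene, for the exonic regions.
exons_by_gene <- exonsBy(txdb, by = "gene")[gene_ids]

# UTR sequences from the hg19 genome (one sequence per transcript).
five_utr_seqs  <- extractTranscriptSeqs(genome, five_utrs)
three_utr_seqs <- extractTranscriptSeqs(genome, three_utrs)
```

Transcripts without an annotated UTR are simply absent from the corresponding GRangesList, so five_utrs and three_utrs may contain fewer elements than tx_names.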
Thanks for your kind suggestion. I will try to follow your suggested workflow; obviously it will take time to learn all these packages, as I have never used them.

One more thing I want to ask: how can I download the list of all available human cancer genes from the KEGG database for the Wnt signaling pathway? Please forgive me if this is a senseless question, as I have not yet tried the packages you mentioned.

Thanks,
Abdul Rawoof

(In reply to Hervé Pagès's message of Thu, Jun 27, 2013 at 11:01 PM.)
Reply written 4.4 years ago by abdul rawoof
Hi Abdul,

Good that you mention KEGG. I should probably have mentioned the KEGG.db package for step 1 of the proposed workflow, even though I have no direct experience with it. Unfortunately, my understanding is that it's about to be deprecated (because of licensing issues). I heard there are some alternatives though. Hopefully more knowledgeable people will chime in with helpful suggestions.

Cheers,
H.

(In reply to Abdul Rawoof's message of 06/27/2013 09:57 PM.)
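One possible alternative for the KEGG part of the question is the KEGGREST package, which queries the KEGG web service directly. This is not something suggested in the thread, only a minimal sketch; it assumes network access and that hsa04310 is the KEGG identifier of the human Wnt signaling pathway.

```r
library(KEGGREST)

# Assumed KEGG pathway id: human (hsa) Wnt signaling pathway.
wnt_pathway <- "hsa04310"

# keggLink() returns a named character vector of linked entries;
# the values are KEGG gene ids of the form "hsa:<number>".
wnt_links <- keggLink("hsa", wnt_pathway)

# Strip the organism prefix to obtain plain gene ids.
wnt_gene_ids <- unique(sub("^hsa:", "", wnt_links))
head(wnt_gene_ids)
```

For human, KEGG gene ids are Entrez gene ids, which is also the gene id type used by TxDb.Hsapiens.UCSC.hg19.knownGene, so a vector like wnt_gene_ids can serve as the gene list in step 1 of the workflow.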
Reply written 4.4 years ago by Hervé Pagès