Regarding extraction of 3' and 5'UTRs and exonic region of a gene.
abdul rawoof ▴ 60
@abdul-rawoof-5869
Last seen 5.3 years ago
United States
Hello everyone,

Could anyone show me how to extract the 3' and 5' UTRs and exonic regions of all human genes from the Ensembl and KEGG databases that are involved in a particular cancer, especially breast cancer, using R/Bioconductor?

Thanks in advance.

Abdul Rawoof
Cancer Cancer • 2.0k views
@herve-pages-1542
Last seen 8 hours ago
Seattle, WA, United States
Hi Abdul,

Suggested workflow:

1. Build the list of genes involved in the particular cancer you're interested in. This could be a vector of gene ids or transcript ids (not all transcripts are necessarily linked to a gene). Suggested tools (not exhaustive): the GO.db and org.Hs.eg.db packages, maybe the DO.db package, etc. I'm not sure what the best tool for this would be. But maybe you already have your list of genes?

2. Use the TxDb.Hsapiens.UCSC.hg19.knownGene and GenomicFeatures packages to extract the coordinates of the 5' UTRs and 3' UTRs. Use the fiveUTRsByTranscript() and threeUTRsByTranscript() functions for this. They return the result in a GRangesList object (you'll have to become a bit familiar with those objects first).

3. Use the BSgenome.Hsapiens.UCSC.hg19 package and the extractTranscriptsFromGenome() function from the GenomicFeatures package to extract the UTR sequences. The name of the function is misleading, but it can be used to extract CDS or UTR sequences in addition to transcript sequences.

If you've never used those tools before, it will take you some time to get familiar with them. Your best friends are the man pages for the individual functions/classes you're going to run into (don't miss the examples section) and the vignettes in the GenomicRanges and GenomicFeatures packages.

Let us know if you have specific questions or run into specific problems (show us what you've done, explain the problem, and don't forget your sessionInfo()).

Good luck,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
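The three steps of the workflow can be sketched roughly as follows. This is a minimal illustration and not Hervé's own code: the two gene symbols are a hand-picked placeholder list, the exonsBy() call is added only because the question also asked for exonic regions, and extractTranscriptSeqs() is used on the assumption of a recent GenomicFeatures release, where it replaces extractTranscriptsFromGenome().

```r
library(GenomicFeatures)
library(org.Hs.eg.db)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)

txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene
genome <- BSgenome.Hsapiens.UCSC.hg19

# Step 1 (assumption: a placeholder list, not a curated cancer gene set).
symbols  <- c("BRCA1", "BRCA2")
gene_ids <- mapIds(org.Hs.eg.db, keys = symbols,
                   column = "ENTREZID", keytype = "SYMBOL")

# Map the Entrez gene ids to the UCSC transcript names stored in the TxDb.
tx_map   <- AnnotationDbi::select(txdb, keys = unname(gene_ids),
                                  columns = "TXNAME", keytype = "GENEID")
tx_names <- unique(tx_map$TXNAME)

# Step 2: coordinates as GRangesList objects, one element per transcript.
utr5  <- fiveUTRsByTranscript(txdb, use.names = TRUE)
utr3  <- threeUTRsByTranscript(txdb, use.names = TRUE)
exons <- exonsBy(txdb, by = "tx", use.names = TRUE)

# Keep only the transcripts of interest. Non-coding transcripts have no
# UTRs, so %in% is used instead of indexing by name.
utr5_sel  <- utr5[names(utr5) %in% tx_names]
utr3_sel  <- utr3[names(utr3) %in% tx_names]
exons_sel <- exons[names(exons) %in% tx_names]

# Step 3: spliced sequences for each selected transcript's UTRs.
utr5_seqs <- extractTranscriptSeqs(genome, utr5_sel)
utr3_seqs <- extractTranscriptSeqs(genome, utr3_sel)
```

The resulting utr5_seqs and utr3_seqs are DNAStringSet objects named by transcript, so they can be written out with Biostrings::writeXStringSet() if a FASTA file is needed.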
Thanks for your kind suggestion. I will try to follow your suggested workflow; it will obviously take time to learn all these packages, as I have never used them.

One more thing I want to ask: how can I download the list of all available human cancer genes from the KEGG database for the Wnt signaling pathway?

Please forgive me if this is a senseless question, as I have not tried the mentioned packages yet.

Thanks,
Abdul Rawoof
Hi Abdul,

Good that you mention KEGG. I should probably have mentioned the KEGG.db package for step 1 of the proposed workflow, even though I have no direct experience with it. Unfortunately, my understanding is that it's about to be deprecated (because of licensing issues). I heard there are some alternatives, though. Hopefully more knowledgeable people will chime in with helpful suggestions.

Cheers,
H.
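One alternative that was not named in this thread is the KEGGREST package, which queries the KEGG web service directly. The sketch below is an assumption-laden illustration rather than a recommendation from the thread: it needs an internet connection, it assumes that "hsa04310" is the KEGG identifier of the human Wnt signaling pathway, and it assumes that KEGG human gene ids correspond to Entrez gene ids.

```r
library(KEGGREST)

# Ask KEGG for all human ("hsa") genes linked to the Wnt signaling pathway.
# The result is a character vector with entries of the form "hsa:<gene id>".
wnt_links <- keggLink("hsa", "path:hsa04310")

# Strip the organism prefix to obtain plain gene ids (assumed to be Entrez ids).
wnt_entrez <- unique(sub("^hsa:", "", unname(wnt_links)))

head(wnt_entrez)
```

The resulting vector could then serve as the gene list for step 1 of the workflow above. The same call with a different pathway identifier would give the genes of another KEGG pathway.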