Bioconductor Forum

I have a list of UniProt protein IDs and NCBI taxonomy IDs for the organism that contains the protein. For each combination, I want the DNA...sequence that codes for the protein (CDS) in the organism. I'm trying to find the most straightforward approach that can be automated...I'm good with the idea of reading the UniProt protein and NCBI taxonomy IDs into R as a table or a data frame. …

ensembldb biodbUniprot

updated 3.0 years ago • Peter

TU Graz) is seeking a highly motivated PhD candidate to join a research project in the field of enzyme engineering. This project leverages Deep Mutational Scanning (DMS) assays and Oxford Nanopore sequencing (ONT) data...to investigate, through machine learning, the impact of distant amino acid interactions on enzyme function. Candidates should have: - A Master's degree in Bioinformatics, Inf…

NGS AI Bioconductor

updated 17 months ago • Gerhard Thallinger

Hi, I am trying to get the amino acid sequence for a protein isoform from UniProt but am not able to find a solution. I can get the canonical sequence, but the method to obtain...isoform sequence is not available. ```r library(UniProt.ws) up <- UniProt.ws(taxId=9606) select(up, keys = "Q2V2M9", columns = c("SEQUENCE"), keytype...UNIPROTKB") # select(up, keys = "Q2V2M9-4", columns = c("SEQUE…

UniProt.ws biomaRt

updated 4.4 years ago • sgupt46

I am trying to do a "classical" match of UniProt IDs, using protein IDs identified in a zebrafish mass-spec experiment, to find the corresponding Ensembl gene IDs...for which my biomaRt query fails to retrieve any information, although they are present in the UniProt database with an attributed Ensembl gene ID. Am I missing something? Here is my code: prot_ids = c("F…

biomaRt uniprot ensembl

updated 11.1 years ago • António Miguel de Jesus Domingues

all, I'm new to this mailing list and have a very simple question about restriction enzyme digestion of the whole genome. I'm using the package BSgenome.Hsapiens.UCSC.hg18 to find the cut sites of the enzyme. Is there a faster...way (i.e. a pre-built package) to get the fragment sequence and its location in UCSC genome browser format (Chr#: start bp - end bp)? I'm planning on finding the restrictio…

BSgenome

updated 17.7 years ago • Anh Tran

In UniProt, an entry such as Q8N4C6 has Gene: NIN but in the Gene Names field...the genes column, I get both names. > select(up, "Q8N4C6", "GENES", "UNIPROTKB") Getting extra data for Q8N4C6 'select()' returned 1...1 mapping between keys and columns UNIPROTKB GENES 1 Q…

UniProt.ws

updated 7.4 years ago • Dario Strbenac

Is there a way to map UniProt names (e.g. RASH_HUMAN) to other identifiers using Bioconductor? I have tried biomaRt, the AnnotationDbi-based...numbers. I had more hope with UniProt.ws, since it is possible to do this conversion from the UniProt website itself

uniprot.ws

updated 4.8 years ago • Diego Diez

I am trying to map E. coli K12 UniProt IDs to ENTREZID and GENENAME using the annotation package org.EcK12.eg.db, and while available in most of the annotation...packages (to my surprise) the keytype UNIPROT is not available in org.EcK12.eg.db. Questions: - Why is it not available? Any plans to make it available? - Any alternative

annotation

updated 6.9 years ago • biodavidjm

Hi, I'm using tximport for assembling transcript-level expression data into gene-level expression data. The tool itself works very well, but in the output there are four columns: "abundance" "counts" "length...1.99999704253414 18.3494 "lengthScaledTPM" I couldn't figure out the exact definition of abundance and counts. Does it mean that the …

Tximport

updated 6.9 years ago • hong

full information that apparently is available in the UniProt database. To be specific: among other things, I would like to retrieve the preferred gene name, which is labeled "Gene names...primary)" in the resulting table when using UniProt's web-based ID mapping interface. As an example, when 'manually' retrieving annotation info for the (rat) UniProt ID "Q6MGA6...28GeneID%29%2Cyourlist%28M2015022613L2TB…

uniprot.ws uniprot

updated 10.9 years ago • Guido Hooiveld

On Sat, 8/31/13, Nick <edforum at gmail.com> wrote: Subject: Re: Change enzyme code (EC) into gene symbol Cc: "BioC Help" <bioconductor at r-project.org> Date: Saturday, August 31, 2013, 4:32 PM Hi I think I figured...email very well :( Thanks a lot! But one more question, how do you treat multiple isoforms of an enzyme for one EC code in the package? Weiwei …

Pathways graph pathview

updated 12.4 years ago • Luo Weijun

the documentation from the above link to see if there is any tool that can convert GO annotations to Enzyme Commission numbers automatically. I went through the file at http://www.geneontology.org/mapping/ec2go. I have a big...file which contains GO annotations, and my objective is to convert those GO annotations to Enzyme Commission numbers for further analysis. I would like to check if KEGG.db w…

GO convert

updated 12.9 years ago • Guest User

data to the KEGG map of my organism 'mez'. I want to know how pathview deals with the colour of an enzyme with several coding genes, for example: In pathway: Glycolysis / Gluconeogenesis, mez00010, the gene node (enzyme 4.1.2.13...has 3 genes related to it. Gene Mtc_1383/aroF; Mtc_2501/fbaA; Mtc_0384/fbaB; Log2fc -1.22;-0.118;1.645; aroF and fbaB all significantly...regulated and has a |log2 …

RNASeq Organism pathview

updated 11.9 years ago • 刘鹏飞

the original gene identifier that comes with the mm10 and make its own, but they do keep the gene name in its record, so I will just take gene...name as identifier in my process. I have already used the gene names for building the assayed gene vector and the DE gene vector...was built too. 2. Then it comes to the transcript length issue: I noticed one of the Cufflinks output files, genes.fpkm_tracking …

Annotation GO PROcess goseq

updated 12.6 years ago • Xulong Wang

uc002uyb.4 uc002uyc.2 # 16 8 ---- The strategy 1) DNA sequence for the introns and exons of a gene, is best understood as "sequence for the introns and exons for each known transcript...3) Once you have such a GRangesList, for one or more geneIDs, you want to ignore (for now) that gene information, and just extract the names of all of the transcripts. To do this extra…

updated 12.5 years ago • Paul Shannon

Hey everyone, I am trying to obtain exon sequences and ensembl_transcript_id_versions from the database. Depending on the time of day, most of the times I execute...0.6" header="1" requestid="biomaRt" uniquerows="1" virtualschemaname="default"> <dataset name="mmusculus_gene_ensembl"><attribute name="gene_exon"></attribute><attribute name="ensembl_transcript_id_version"&…

biomaRt getBM

updated 5.8 years ago • laurenz.holcik

I am a scientist specializing in [enzyme engineering][1], which is the technology for mass production and application of enzyme preparations. With the development...of technologies such as genetic engineering and enzyme isolation and purification, the high cost, complexity and slow reaction time of many enzymes have been improved. [1]: https...bio-fermen.bocsci.com/services/enzyme-enginee…

BioTIP

updated 21 months ago • bocchemistry

Hi, I'm unable to use the UniProt BioMart. Are there new settings to connect to this mart? Thanks! > require(biomaRt) > uniprotV <- useMart("unimart...uniprot") > d <- getBM(attributes=c("accession", "name"), mart = uniprotV) Error in getBM(attributes = c("accession", "name"), mart = uniprotV...Query error occurred at web service b…

biomaRt

updated 14.2 years ago • kenny daily

the original gene identifier that comes with the mm10 and make its own, but they do keep the gene name in its record, so I will just take gene...name as identifier in my process. I have already used the gene names for building the assayed gene vector and the DE gene vector...was built too. 2. Then it comes to the transcript length issue: I noticed one of the Cufflinks output files, genes.fpkm_tracking …

Annotation GO PROcess goseq

updated 12.6 years ago • Guest User

mice, at 4 different time points between P17 and P84. I'm attempting to analyze the differentially abundant taxa between genotypes at each postnatal age sampled. In addition, I'm hoping to make use of some previously acquired...metabolite data to extract some differentially abundant taxa using SCFA levels as a continuous predictor variable. I have a small sample size due to the pilot nature of th…

deseq2 microbiome R

updated 5.3 years ago • greenm11

Hello all, The brief communication describing the metagenomeSeq package is out in Nature Methods. metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are…

Normalization metagenomeSeq

updated 12.3 years ago • Hector Corrada Bravo

1.38.2) and org.Mm.eg.db (3.4.1) in R 3.3.3. I've tried to retrieve the SYMBOL matching a UNIPROT ID with this command: select(org.Mm.eg.db,"P53784",column="SYMBOL",keytype="UNIPROT") I've obtained this error: Error in...testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'UNIPROT'. Please use the keys method to see a listing of valid arguments…

Annotation mouse UNIPROT

updated 8.3 years ago • julie.chevalier

Dear list, Context: I'd like to calculate GO enrichments for a list of UniProt identifiers (note that they are "ID" or "Entry name" and NOT "AC" or "Accession"). So I tried to use BioMart to extract the GO-IDs...for my list of UniProt identifiers, see code below. Basically after calling getBM() R doesn't return the command-line any more for more than...require(biomaR…

GO biomaRt

updated 14.8 years ago • Wolfgang RAFFELSBERGER

Hi all, Question: is it possible to extract the "Gene type" information from (e.g.) the org.Mm.eg.db package? I would like to identify/extract all genes labelled "ncRNA" and "protein...coding" in the "Gene type" field at the summary section of the EntrezGene database. See e.g. (for ncRNA) http://www.ncbi.nlm.nih.gov/gene/?term...102638436 and (for protein coding) http://www.ncbi.nlm.nih.gov/ge…

annotation org.Mm.eg.db

updated 11.1 years ago • Guido Hooiveld

when using txOut = FALSE (default) because I want the values from StringTie quantification but gene level summarized (as they are more robust). I am using "StringTie" output files "t_data.ctab" as input files for tximport...gives the lists with matrices, “abundance”, “counts”, and “length” where the transcript level information is summarized to the gene-level. I want to ask whether...do they co…

tximport

updated 6.9 years ago • HKS

Is it possible to keep sequence names when haplotypes are collapsed using the Collapse tool of the QSutils package? I have a number of fasta formatted...sequences, some of which are the same, and therefore redundant for a phylogenetic analysis; since I would like to remove those...am using the Collapse function of QSutils to remove them. My issue is that after using the tool, the sequences hav…

QSutils Collapse Haplotypes

updated 5.9 years ago • al.gar.aber

Hi there, I am wondering if you can change the ECs for each enzyme on a KEGG graph, such as pyrimidine metabolism, into gene symbols, for example? Thanks, Weiwei

updated 12.4 years ago • Ed

ensembl") mart = useDataset("hsapiens_gene_ensembl", mart) db = getBM(attributes = c("uniprot_swissprot", "uniprot_genename", "illumina_humanht_12_v4"), filters = "uniprot_swissprot...I get this error: Error in getBM(attributes = c("UniProtKB/Swiss-Prot", "UniProtKB Gene Name", :…

r biomart

updated 8.5 years ago • al14

Is there a way for me to use Bioconductor to take a list of gene names and give me back a list of genomic sequences, preferably with the exons and introns easily differentiable (e.g. exons

updated 12.6 years ago • Guest User

I'm trying biomaRt (a Bioconductor package) for converting a column of UniProt IDs to their corresponding protein names using RStudio, but I'm still not able to see the names changing. I'd appreciate...ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl") # Get the protein names using biomaRt search_ids <- c("protein") protein_names <- biomaRt:…

Proteomics UniProtKeywords

updated 2.9 years ago • a.a.houfani

<- new_chr_names[valid_indices] # # # # # Confirm lengths match # # length(original_chr_names) == length(new_chr_names) # # # # Assign the new chromosome names # names(BSgenome.Mfascicularis.NCBI.6.0...tig00001446_obj # | # | Tips: call 'seqnames()' on the object to get all the sequence names, call 'seqinfo()' to get the full sequence info, use the '$' or '[[' operator to access a given sequenc…

BSgenome.Mfascicularis.NCBI.6.0

updated 11 months ago • Genevieve

Hi, I am doing an experiment and am at a loss how to model it with DESeq2. I want to __estimate abundance__ of genes across the three conditions (namely treatment1, treatment2 and treatment3), adjusted for covariates of age. I make DESeq…

deseq2

updated 7.9 years ago • migimimi0

How to convert multiple protein fasta sequences into their gene/protein name

GenePrediction

updated 3.2 years ago • AROCKIYA

check that I correctly understand each column of summary.xls in the output of CRISPRseek. Thank you. names: name of gRNA; forViewInUCSC: coordinates; extendedSequence: extended sequence of gRNA; gRNAefficacy...www.nature.com/articles/nbt.3026](Rational design of highly active sgRNAs for CRISPR-Cas9–mediated gene inactivation); gRNAsPlusPAM: sequen…

crisprseek

updated 7.6 years ago • Ou, Jianhong

dds <- ddsHTSeq dds <- DESeq(dds, quiet = TRUE) Then I used StringTie for the abundance. But I found some genes with low TPM and FPKM values in the StringTie output where the HTSeq count is zero. So my question

DESeq2

updated 3.3 years ago • Vapin

for the naive post but I'm wondering if there is a way of exporting information I have on a gene acquired from ensembldb to a GenBank file? I have the following code, where I fetch the sequence and annotation of the Actin...beta gene: library(EnsDb.Hsapiens.v86) library(dplyr) Hs_edb <- EnsDb.Hsapiens.v86 Hs_dna <- getGenomeTwoBitFile(Hs_edb...>% …

granges genomic ranges genbankr ensembldb

updated 5.2 years ago • biomiha

I have independent event sequences, for example as follows: Independent event sequence 1: A, B, C, D Independent event sequence 2: A, C, B Independent event...sequence 3: D, A, B, X, Y, Z Independent event sequence 4: C, A, A, B Independent event sequence 5: B, A, D I want to be able to find the most common...sequence patterns as {A, B } = > 3 from l…

spade

updated 13.5 years ago • Guest User

Hi all, I'm trying to find the names (e.g. SYMBOLS) corresponding to gene IDs using the "org.Dm.eg.db" and "TxDb.Dmelanogaster.UCSC.dm3.ensGene" libraries...1] "ENTREZID" "ACCNUM" "ALIAS" "CHR" "CHRLOC" [6] "CHRLOCEND" "ENZYME" "MAP" "PATH" "PMID" [11] "REFSEQ" "SYMBOL" "UNIGENE" "ENSEMBL" "…

annotationdbi org.mm.eg.db org.dm.eg.db select

updated 10.8 years ago • Patrick Schorderet

in processAmplicons("Index2.Plate10.fastq", barcodefile = "Samples1.txt", : Hairpin sequence length is set to 22, there are hairpin sequences not with specified length. I checked the length of my hairpin sequences

processamplicons

updated 9.7 years ago • lucia.caceres

enriched GO terms using the topGO package. However, I am having difficulty formatting the gene names for input. Here's what I have so far. ## create named vector of p values GO_genes = setNames(res$padj, row.names...res)) ## create a gene selection function to select significant genes sig_genes <- function(pval) {return (pval < 10^-5)} ## cr…

topgo biomart deseq2 gene ontology

updated 8.7 years ago • krc3004

curatedMetagenomicData uses HuMAnNx for functional analysis of the microbiome, mainly pathway abundance and pathway coverage. In the HuMAnNx docs, the pathway abundance values seem to be over 1 in most cases, but I am wondering...why the values in pathway abundance datasets acquired from curatedMetagenomicData all fall between 0 and 1. Is there any additional normalization

curatedMetagenomicData ExperimentHubData

updated 13 months ago • Edward

Hi, My question is whether the statistical methods used by edgeR are suitable for detecting significant differences in abundance of UniRef protein family features as generated by HUMAnN2? HUMAnN2 maps all the microbial (non-host) reads from my…

edgeR HUMAnN2.0

updated 5.1 years ago • Nick

table with samples as columns and tags as rows. In addition, how to normalize the counts to the length of each gene. That is, all gene counts should be normalized from 0 to 1 in gene length and then draw a distribution of counts

updated 14.7 years ago • Andrew Wang

Hi all, I have a sequence file (FASTA format) and want to calculate the rho statistic for dinucleotide abundance on my data. the code...cc cg ct ga gc gg gt ta tc tg tt I would be grateful if anyone could solve this. I've also attached the sequence below. Thanks in advance. >gi|270279749|gene0003 ATGTATATGAGAAAGGAAGAGCCTAGCGGCTCAGACAAGATTATGACTTCAGTTGTTG…

updated 14.0 years ago • Guest User

Dear all, We are pleased to announce that the new Ensembl marts for release 80 are now live on [www.ensembl.org](http://www.ensembl.org/index.html). If you are using biomaRt, you can change your host to access our most recent data (with R 2.2 and Bioconductor version 3.1): ensembl_mart_80 <- useEnsembl(biomart="ensembl") * Ensembl..…

biomart Ensembl zebrafish Rat 1000 genomes phase 3 News

updated 10.7 years ago • Thomas Maurel

Hi, I am currently working on some metagenome data and I'd like to use DESeq2 to find the genes that are differentially abundant in different treatments over time. Due to the type of the study, I need to normalize the...count data (gene abundance) per 16S rRNA copy number, which changes the distribution and nature of the data from counts to decimals. I know

deseq2 metagenomics

updated 8.4 years ago • ghanbari.msc

Hello everyone: I would like to add "ORF IDs to Chromosomal Location" to my data frame named genes in the DGEList-object. The organism is yeast, and I downloaded org.Sc.sgd.db.. ``` > columns(org.Sc.sgd.db) [1] "ALIAS...annotation data package org.Sc.sgdALIAS Map Open Reading Frame (ORF) Identifiers to Alias Gene Names org.Sc.sgdCHR Map ORF IDs to Chromosomes org.Sc.sg…

org.Sc.sgd.db

updated 2.1 years ago • Chih

Hello, How would I configure a hypergeom object to use UniProt IDs for both the universe of genes and the list of interesting genes? Thanks! Paul

updated 14.9 years ago • Paul Rigor

Hi people, I want to use DESeq2 for differential expression analysis of orthologous genes between two different species. I am not experienced at all using R and DESeq2, but I think at the moment I know most of the...what is the best way to compare the data between orthologs of two different species, with different gene and transcript names? How should I import them and then compare them? What I…

DESeq2 tximport

updated 3.1 years ago • Konstantinos

succeeded in conducting my DESeq2 analysis without any problem. However, I want to go further and show the 30 most expressed genes, so I used the following code: Note: The count.data.set.object is the DESeqDataSet vsd <- vst(count.data.set.object...count.data.set.object) norm.data = assay(vsd) #for the heat map for the most expressed genes library("pheatmap") select…

DESeq2 pheatmap apeglm

updated 4.9 years ago • amoaristotle

computer science, bioinformatics, much less R and DESeq2. I am having issues with R removing the gene ID names in my countData. My raw countData appears as follows: ``` 2416X12 2416X13 2416X10 ENSOARG00000000001 0 0 0 ENSOARG00000000002...0 0 2 319 30 524 3 0 0 0 ``` As you can see, the ge…

DESeq2

updated 17 months ago • Olivia

where a given event has taken place. These regions can be an exon, an intronic region, or similar. Most (all) of these events have taken place within the boundaries of genes, and I would like to retrieve the gene names (ensemble...t","end"), values=list(10,17317394,17317851), mart=ensembl) [1] ensembl_gene_id <0 rows> (or 0-length row.names) But since no whole GENE is within these…

updated 16.0 years ago • Boel Brynedal

__Background:__ I would like to identify differentially expressed gene orthologs across multiple related organisms. I predicted a set of single copy orthologs from transcriptome de...__The orthologs for which I want to perform differential expression analysis are not of the same sequence length - does the procedure above "normalize" the counts with respect to the different transcript len…

deseq2 sequence length normalization RSEM tximport differential expression

updated 7.6 years ago • al-ash

name="start_position"></attribute><attribute name="end_position"></attribute><filter name="chromosome_name" value="11,2"></filter><filter...0.6" requestid='\"biomaRt\"' uniquerows="1" virtualschemaname="default"> <dataset name="hsapiens_gene_ensembl"><attribute name="hgnc_symbol"></attribute><filter name="chromosome_name" value="11,2">…

Alignment convert biomaRt

updated 16.1 years ago • steffen@stat.Berkeley.EDU

I was wondering if you could help me with a query about the DESeq2 package in R please. I have sequenced RNA transcripts from six separate species and am comparing gene expression between pairs of species. The issue...I am facing is that I cannot work out how to account for gene length when normalising the read count data, but I understand this is essential to do so when comparing two differen…

Normalization DESeq2

updated 2.3 years ago • cm15245

Hi All, I am working with an Affy data set for which the max fold change is around 1.7. Most of the probesets have a fold change between 1.2 and 1.5. I have done the standard data analysis procedure recommended...2-groups, t-test, p-val & corrected p-val etc). After doing this, I have got only 20 significant genes, which have no biological significance for the experiment. Now, I am t…

affy

updated 20.1 years ago • Sharon Anbu

Dear all, could you please advise what the easiest way is to get the SEQUENCES of 3'UTRs and the GENE_NAMES for all mouse or human genes? After glancing through previous posts, I think I shall

GenomeFeatures

updated 7.1 years ago • Bogdan

the following tasks as a first contact with the bioconductor project: # Task 1: # find: # * mRNA sequence (5'UTR, Coding region, 3'UTR) # * position of start codon in sequence # * position of stop codon in sequence # * ID (Which ID(s) would...I choose to reference my # sequence hits? Embl, ensembl transcript id, # Entrez Gene id, RefSeq, etc.?) # * name of associated protein pr…

GLAD biomaRt

updated 16.5 years ago • Simon

Hello, I wonder if there is a way to use msaPrettyPrint() with the alignment sorted by sequence name without rewriting the output FASTA file. I know I can read the output FASTA file with Python and re-order the sequences...by the sequence name, but I'm pretty sure there is a quicker way to achieve that. I mean I want to see in the FASTA file and PDF file (output

msa alignement output sorted

updated 9.0 years ago • Olorin

I'm analysing a 16S microbial community dataset, and am using DESeq2 to test for differential abundances. When I do this, I supply raw count information to DESeq() as per the vignette rationale that the model fitting implicitly... The [PICRUSt](http://picrust.github.io/picrust/index.html) package provides inferred gene content for microbial communities, by referencing 16S taxo…

deseq2 picrust

updated 8.8 years ago • handibles