Bioconductor Forum

and WebGestaltR. In clusterProfiler: ```r x <- unique(unlist(as.list(org.Bt.egGO2ALLEGS))) length(x)#5586 ``` there are 5586 genes with GO annotation. In WebGestaltR: ```r enrichD_BP <- loadGeneSet(organism = "btaurus",enrichDatabase...geneontology_Biological_Process_noRedundant") geneSet_BP <- enrichD_BP$geneSet length(unique(geneSet_BP$gene))#9011 enrichD_CC &am…

clusterProfiler WebGestalR

updated 3.6 years ago • pengmin.wang.1

Hi community!!! This is my first post in this forum. I am working with WGS metagenome data profiled by MetaPhlAn software which gives relative abundance of OTUs (read count data of the OTUs is not provided). Now, I am doing statistical analysis with phyloseq package. I have…

metagenomics deseq2 PHYLOSEQ

updated 5.5 years ago • dc

For many reasons I'm interested in retrieving the annotation score used in UniProt. However, when using the connection to the species Rattus norvegicus I can't get any annotation score. select(up...SCORE", "RGD","ID"), "ENTREZ\_GENE" ,na.rm=T) If I check the website (http://www.uniprot.org/uniprot/P18088), the annotation score is available. Am I missing something? Or it's just the versio…

uniprot.ws uniprot

updated 10.1 years ago • cerikahp

Hi, suppose I have a list of gene names like below, is there a "fuzzy"-matching based algorithm to convert gene names to LocusLink IDs? > head(anno[,4]) [1] Comp Rheb

convert

updated 18.6 years ago • Weiwei Shi

and *ENSG00000258724*), I'm wondering about the right timing of the mapping from ENSGs to gene names, and the right time to filter out non-protein-coding genes. The way I'm currently doing this is: 1. Use `filterByExpr()` on...with **F** rows. 3. Using **biomaRt**, filtering the `MArrayLM` to include only protein-coding genes, shrinking it from **F** rows to **P** rows. 4. Using **biomaRt**, c…

limma edgeR biomaRt

updated 12 months ago • Jonathan

I have a few RefSeq protein IDs, e.g. `NP_853513.2, NP_000517.2`. Is there a way to find the corresponding UniProt IDs in Bioconductor?

AnnotationDbi

updated 5.1 years ago • sgupt46

Dear all, I have been using the biomaRt and DAVIDQuery R-packages to convert gene identifiers into different formats. Unfortunately, both of these packages require direct internet access for every...conversion. Is anybody aware of an R-package or script which supports "offline" gene name conversion, i.e. based on previously downloaded gene name database files (disk space …

biomaRt DAVIDQuery

updated 16.0 years ago • Rainer Tischler

Hi all, I'm trying to normalise my bulk RNAseq dataset for both GC-content and gene-length, as both are causing bias in my data. I have worked through the EDASeq vignette, which helps with doing one of these...by EDASeq I can't see any reason why I can't just sequentially do this for GC-content first and then gene-length, but would be great to hear if others agree this is acceptable, or if anyon…

norma EDASeq EDA Normalization

updated 22 months ago • samuel.channon

I did not explain my question very well. Maybe my question is not about the alignment. For each gene, the UCSC browser gives you a set of probeset IDs and their locations when you click the Affymetrix expression track. For my purpose...> wrote: > Hi Shirley > Many tools outside R are available to do this and are all free sequence > alignment software. Check out any of Blast, T-C…

Alignment probe

updated 16.8 years ago • shirley zhang

sample, and also with library concentration for sample. The PCA analysis is performed on read abundance summarized to gene data processed by *vst()* in DESeq2, which I understand is correcting for sequence depth. So why does...this correlation remain? Is it indicative of a problem with the sequencing, or expected for clinical samples (because RNA concentration might relate to real variation e…

RNA-seq PCA control DESeq2 Quality

updated 3.9 years ago • 9906201a

read_tsv 1 2 3 4 5 6 7 8 9 10 11 12 13 transcripts missing from tx2gene: 1 summarizing abundance summarizing counts summarizing length Error in rowsum.default(x[sub.idx, , drop = FALSE], geneId) : incorrect length...sample tsv file but apparently only 11 of them are in the tximport object when I check the infReps names(txi$infReps) I have checked the folder …

error tximport

updated 3.4 years ago • hana

I have functionally annotated these using Blast2GO, and thus have GO terms, possibly EC number and UniProt ID for the closest match for most of the CDSs. My question is thus: How do I best proceed with this data in the Bioconductor...framework, when I want to do things such as gene set enrichment analysis etc. Is the best approach to build my own Annotation packages for each strain or is there…

Proteomics GO genomes

updated 12.7 years ago • Thomas Lin Pedersen

15:21:33 To: "Mike Walter" <michael_walter at email.de> Subject: Re: [BioC] retrieve genes names after KEGG hypergeometric test > >Hi Mike, > > > >Could you explain to me the difference between the db and "db...a function exists. I use two little helper > >functions > >> to retrieve probe IDs or gene symbols of …

Annotation Pathways GO rat2302 probe affy Category

updated 15.2 years ago • Mike Walter

I have abundance data generated using RSEM. The data is organized into columns with the transcript and estimated counts. I am trying...them using edgeR. I'm having some problems using tximport. Has anyone used this pipeline for RSEM abundance data?

rnaseq rsem tximport

updated 7.5 years ago • raeganlynn123

been working with SeqGSEA and I was wondering if there were plans to add the ability to account for gene length similar to how it's done in GOseq. Thanks, Julie Julie Leonard Computational Biologist Global Bioinformatics

goseq SeqGSEA

updated 10.2 years ago • Julie Leonard

in these kinds of experiments. "pitfall" (one of them): try taking a look at a few probesets of this nature (multiple for each transcript) in some experiments and your favorite expression summary measure and see what you...at comcast.net> > Sent: Friday, August 19, 2005 5:25 AM > Subject: RE: [BioC] duplicate genes in Affy arrays > > > > Dear Sure…

hu6800 probe affy

updated 20.4 years ago • Suresh Gopalan

redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors...> > # Plotting tagwise dispersion estimates against log-concentration (i.e. tag abundance) > plot(log2(DGE_List_Object$conc$conc.common), DGE_List_Ob…

Normalization limma edgeR

updated 13.8 years ago • Javerjung Sandhu

it to MAList. After that I convert the MAList to exprSet for further analysis. Now, I want to filter genes in exprSet according to the names or probeID, but I found there is no geneNames slot for my exprSet. How can I find or keep the...geneNames or ID for each gene? And how do I filter by gene names? My partial code is as follows: rg<-read.maimages(...) ma.p<-normalizeWithinArrays(rg

Microarray convert

updated 19.4 years ago • yanju@liacs.nl

Hi, I have a fastq file named `` fastq1 `` which I upload in R using: `` StreamFastq1=ShortRead::FastqStreamer(con=fastq1,n=2000000) `` `` fastq1yield=ShortRead...narrow(fastq1yield@sread,start=1,end=Cut_pos) `` Where `` Cut_pos ``is a vector of same length as `` fastq1yield ``, but with different values for each entry. So I cut the sequences in different places resulting in different...len…

ShortRead FastqStreamer

updated 8.2 years ago • ioannis.vardaxis

Hello everyone, I am currently trying to get the gene type (e.g. miRNA, mRNA, lncRNA) from probe names. For example: 1007\_s\_at for a protein-coding gene (mRNA type), 225799\_at for

Map

updated 8.5 years ago • landscape95

are of 3 types: -1, 1 and 0. 0 means normal probe; -1 means negative control, I guess, and the probe names are like (-)3xSLv1, NC1_00000002, etc. [no corresponding probe sequence]; 1 means positive control, I guess, and the probe names…

aCGH Sequencing miRNA Microarray Annotation Normalization GO Preprocessing Visualization

updated 17.2 years ago • McGee, Monnie

avgTxLength ... H cooks rownames(11060): FBgn0000003 FBgn0000008 ... FBgn0288846 FBgn0288856 rowData names(66): baseMean baseVar ... deviance maxCooks colnames(198): 1 2 ... 1291 1425 colData names(10): library sample ... condition ``` the 'rownames...I added an extra column in res_log2FC which consists in a concatenation of the accession number and gene name : genes <- res_log2FC$row g…

DESeq2 ComplexHeatmap

updated 8 months ago • caroline.zanchi

frequency of only perfect matches between a data seq and seq target file both are set of nucleotide sequences but in large numbers. I tried for (i in 1:100) #for (i in 1:nrow(urfreq)) { pos1<-which(glr4[,1]==urfreq[i,1]) pos2<-which(glr5[,1]==urfreq...i,1]) pos3<-which(glr6[,1]==urfreq[i,1]) if(length(pos1>0)) { urfreq[i,2]<-length(pos1) } if(le…

updated 14.4 years ago • chawla
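
The per-row `which()` loop in the post above can usually be replaced by one tabulation. The following is a minimal base R sketch, not the poster's code: `queries` and `targets` are hypothetical stand-ins for the `urfreq` sequences and one of the `glr` columns, and the query sequences are assumed to be unique.

```r
# Hypothetical stand-ins for the query sequences and one target set
queries <- c("ACGT", "TTGA", "GGCC")
targets <- c("ACGT", "GGCC", "ACGT", "AAAA", "ACGT")

# Restricting the factor levels to the queries makes table() count only
# exact matches; targets that match no query are dropped, and queries
# with no match are reported as 0.
freq <- table(factor(targets, levels = queries))
freq
```

Because the targets are scanned once instead of once per query, this approach tends to scale better when both sets are large.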

<- keys(txdb, keytype = "TXNAME") tx2gene <- select(txdb, k, "GENEID", "TXNAME") dim(tx2gene) length(k) write.table(tx2gene, "tx2gene.gencode.v28.csv", sep = "\t", row.names = FALSE) files <- file.path(dir,"salmon_quant", samples...sample, "quant.sf") names(files) <- samples$sample tx2gene <- read_csv(file.path(dir, "tx2gene.gencode.v28.csv")) txi &lt…

software error RNAseq tximport R

updated 6.5 years ago • caranlove

I am trying to find out the gene names from a heatmap made by heatmap.2. Since I have many genes, the gene names (row labels) are not readable. I typed the following...Please teach me how to get the gene names from the heatmap. data <- read.table("sample.txt") data <- as.matrix(data) out\_f <- "cluster2.png" library(gplots) heatmap.2

R

updated 8.6 years ago • nakanomasayuki265

7 x 7 in image Using Sample as id variables Saving 7 x 7 in image Saving 7 x 7 in image Error in names(res) <- c("Reads", "Map%", "Filt%", "Dup%", "ReadL", "FragL", : 'names' attribute [9] must be the same length as the vector [7] In addition: Warning message

software error

updated 5.9 years ago • sonyuna90

my experiment, I have two age groups (6 months and 14 months) and one continuous variable (cell abundance). The bulk RNA sequencing samples and samples used to quantify cell abundance are biological replicates acquired...from mouse BXD strains. To compare these samples, I independently averaged the count data and cell abundance data for these samples by group (strain and age combinations). My g…

DESeq2

updated 3.2 years ago • bgurdon

different from control conditions. The interesting bit: when I manually search the unmapped sequences using BLAT, most seem to be stress-related genes (small sample size, so far). More perplexingly, they have 100% sequence...identity with a single gene. My question, then, is why featureCounts is failing to map these sequences? I've thought to change the stringency settings...subread-align -t 0 -…

featurecounts rnaseq subread

updated 7.4 years ago • rogangrant

I am trying to analyze RNA-seq data with kallisto to get the gene read counts. After I get them, how can I find the gene I want? Because it won't give me the name of every gene, right? Would

genetics GeneticsPed Genetics

updated 2.2 years ago • Gülsüm

I am looking for a way to convert geneid to gene name. Specifically, I am calling for variants and then using VariantAnnotation to output using a predictCoding() function...CONSEQUENCE" "REFCODON" "VARCODON" "REFAA" "VARAA". I am interested in gene name more so than GENEID and so I have been looking at how to do this including using the biomaRt p…

VariantAnnotation convert biomaRt

updated 13.0 years ago • Matthew Liebers

I am using DESeq, but I do not know how to set gene length information. Does it use length information in finding differentially expressed genes? And I also do not know

edgeR DESeq

updated 13.9 years ago • wang peter

TxDb.Btaurus.UCSC.bosTau9.refGene") # CanFam4 library(TxDb.Cfamiliaris.UCSC.canFam4.refGene) genes(TxDb.Cfamiliaris.UCSC.canFam4.refGene) |> length() # 53 genes were dropped because they have exons located on both...genes(TxDb.Cfamiliaris.UCSC.canFam3.refGene) |> length() # 54 genes were dropped because they have exons located on both...<- makeTxDbFromGFF("references/CanFa…

TxDb.Cfamiliaris.UCSC.canFam4.refGene GenomicFeatures

updated 2.5 years ago • Geert

Hi BioC, How can I access gene names from a KEGG Gene ID? For example: KEGG Gene ID: mmu:11479 Thank you in advance, Saurin

updated 20.9 years ago • SAURIN

probe set IDs. This is after normalisation and linear model fitting. However when I try to apply gene names the error below occurs. Error in text.default(fit$coef[topGenes],fit$lods[topGenes], labels=substring(genelist[topGenes...zero length 'labels' The volcano plot is produced but with no labels. Also the TopTable does not contain gene names or functions. This

affylmGUI

updated 14.1 years ago • Guest User

Hi all, Using Gviz I am trying to create a figure where I have the exons from a gene in one track and below that the protein domains mapped to the genome sequence. I am able to map the transcript/exon information...rstarts = "exonStarts", rends = "exonEnds", gene = "name", symbol = "name", …

Gviz protein visualization UcscTrack protein domains

updated 5.4 years ago • paul.jaschke

1. I am trying to add gene names to my result table from DESeq2 using the mapIds function as outlined in the tutorial for differential analysis...seq data. However I get the message: "could not find function "mapIds"" How do I add gene names to my table? 2. Is there a way to add the gene names to the dds generated using the DESeq2 function: dds = D…

deseq2

updated 10.4 years ago • mtsompana

components of my count data. I would like to extract the list of geneIDs that are contributing most to each component. I can get the value of PC1 and PC2 for each sample using returnData=TRUE, but I would like to extract the...top and bottom genes from each component. Any ideas for me? Thank you

deseq2 pca

updated 9.7 years ago • Emily
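
One way to approach the question above is to read the gene loadings off a PCA fit. This is a minimal base R sketch under stated assumptions: `mat` is simulated data standing in for a variance-stabilized expression matrix (genes in rows, samples in columns), and `prcomp()` is used directly rather than whatever gene selection the DESeq2 plotting helper applies internally.

```r
set.seed(1)
# Simulated stand-in for a transformed expression matrix: 50 genes x 6 samples
mat <- matrix(rnorm(300), nrow = 50,
              dimnames = list(paste0("gene", 1:50), paste0("s", 1:6)))

# PCA with samples as observations, so each gene receives a loading
pca <- prcomp(t(mat))

# Loadings of every gene on PC1, sorted from most negative to most positive
pc1 <- sort(pca$rotation[, "PC1"])

bottom_genes <- names(head(pc1, 5))  # strongest negative contributors
top_genes    <- names(tail(pc1, 5))  # strongest positive contributors
```

Replacing `"PC1"` with `"PC2"` gives the corresponding lists for the second component. The sign of a component is arbitrary, so "top" and "bottom" may swap between runs or software versions.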

I have always assumed that reads mapping to exons are used as the input for differential gene expression analysis in DESeq2 (and other DGE analysis packages) primarily because poly(A) capture protocols are favored...artifacts (Gaidatzis et al., Nature Biotechnology 2015; Ameur et al., Nat Struct Mol Biol, 2011), is there a reason why counting only exonic reads is still the...recommended approach …

deseq2 intron

updated 7.3 years ago • Mthabisi Moyo

hope you didn't take it that way. Simply I am currently in need of information regarding what probe length is best and I thought following up your comment might be a way to find references. Of course, Affy says 25-mers are best...> I should have said it was just a logical guess. What I meant was that if you had 2 homologous genes, obviously it is going to be harder to avoid homologous region…

SNP affy gcrma

updated 21.5 years ago • Michael Barnes

hope you didn't take it that way. Simply I am currently in need of information regarding what probe length is best and I thought following up your comment might be a way to find references. Of course, Affy says 25-mers are best...> I should have said it was just a logical guess. What I meant was that if you had 2 homologous genes, obviously it is going to be harder to avoid homologous region…

SNP probe affy gcrma

updated 21.5 years ago • Paul Boutros

It is mentioned that > One inherent bias of the Illumina platform is the preferential sequencing of longer genes. Hence, longer genes are more likely declared as DE. Is it true for the current Illumina platforms...as well? And as a result, we observe low counts for some of the genes. For example, I am looking at the gene PYCR1, and after performing DESeq2 I have got good Log2fold…

DESeq2 sequencing

updated 4.8 years ago • simplyphage

Hello, I am trying to convert a file of gene names to the corresponding Affy probe names. I managed to write a script that puts the genes in an array, then I use the feat = getFeature...mart = mart) in biomaRt; however, I seem to hit a snag when there is more than one probe for a gene name. Does anyone know of an existing script that can do this? thanks Ruppert ________________…

probe convert biomaRt

updated 17.7 years ago • Ruppert Valentino

Hi. How do I convert an 'lncRNA probe name (e.g. p2966)' or 'mRNA probe name (e.g. A_19_P00315593)' to a gene symbol? I use 'GSE115018' data for practice. To convert probe name to...gene symbol, I tried to use 'AnnotationHub', 'biomaRt', 'org.Hs.eg.db' and other packages. Which package should I use to convert probe...name into gene symbol?

lncRNA probeID mRNA genesymbol

updated 2.9 years ago • Sooni

I am hoping to get appropriate GO mappings for a list of genes used in a microarray experiment with a view to identifying significantly regulated processes. I was planning on using...study is not a supported organism. I have attempted to use the Blast2GO software to generate the gene to GO mapping, but this approach seems to be very time consuming (after generating t…

Microarray GO Organism PROcess GOstats

updated 11.9 years ago • Guest User

We used some methods very similar to what had been described in Tsai's 2015 GUIDE-seq Nature paper: break DNA, insert barcode, ligate together, PCR amplification. The major if not only difference is that we used…

annotation software software error

updated 9.6 years ago • mousheng xu

gene_id")) intragenic_seq<-getSeq(Amellifera,gaps(tx)) #####Error in .starfreeStrand(strand(names)) : cannot mix "*" with other strand values Yating > Hi, > > On Tue, Nov 29, 2011 at 6:25 AM, Yating Cheng <yating.cheng at="" charite.de...gt; wrote: >> Dear All, >> >> Does anyone know that how to extract intragenic sequ…

Annotation Cancer Organism BSgenome GenomicFeatures

updated 14.1 years ago • Yating Cheng

community, I'm currently exploring the implementation of edgeR for assessing differential abundance of metagenomics datasets. Briefly, I start with short Illumina reads and assemble them into large contigs. Gene calling...is done on these contigs and raw reads are mapped on these genes to get read count values for each gene. This gives large matrices of up to a couple million genes (1 gene = 1 ro…

edger metagenomics gene abundance gene filtering

updated 10.3 years ago • jtremblay514

Dear all, Is there a way with Bioconductor in which I can convert such Ensembl probe names into the standard gene names? AFFX-M27830_5_at AFFX-M27830_M_at ENSG00000000003_at ENSG00000000005_at ENSG00000000419_at

probe

updated 17.3 years ago • Gundala Viswanath
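
For probe set names of the kind listed in the post above, a first step is to recover the Ensembl gene ID by dropping the `_at` suffix. This is a minimal base R sketch that assumes the probe sets follow the `<Ensembl gene ID>_at` pattern shown in the post; the `AFFX-` control probe sets have no gene behind them and are filtered out.

```r
probes <- c("AFFX-M27830_5_at", "AFFX-M27830_M_at",
            "ENSG00000000003_at", "ENSG00000000005_at", "ENSG00000000419_at")

# Keep only probe sets named after an Ensembl gene, then drop the "_at" suffix
is_ensg <- grepl("^ENSG[0-9]+_at$", probes)
ensembl_ids <- sub("_at$", "", probes[is_ensg])
ensembl_ids
```

The resulting Ensembl gene IDs can then be looked up in an annotation resource (for example an organism annotation package or biomaRt) to obtain gene symbols.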

a question. What is the best way to filter my microarray data? If I use the UniProt ID to filter replicated genes, is that correct? Thanks in advance

Filtering Genes

updated 8.3 years ago • Sanches

Hello, I just started working with limma and so far followed the limma user's guide to receive a TestResults object. I've printed the results into a file using write.fit() and a question arose while looking at the data. Say I was looking at differentially expressed genes in samples X vs. Y. From my understanding, the values in column Res.X-Y indicate significant differentially expressed ge…

limma decidetests

updated 9.3 years ago • bi_Scholar

multiple RNAseq experiments using a targeted approach (custom Illumina panel looking at only genes of interest). I followed each run by salmon, then tximport. I used a targeted tx2gene file containing only the genes of interest...of a particular read to one transcript for a gene versus another for that same gene). It is possible I misunderstood the comment. So my main question is: - is it app…

tximport RNASeq salmon

updated 2.9 years ago • Ivana

I may get that Staphylococcus aureus is 3 log2FC upregulated in treatment compared to control? Instead of a gene, I would be saying that a particular organism has increased or decreased abundance with respect to fold change. Please do clarify

edger

updated 5.5 years ago • megha.hs28

> t( t(counts.mat / gene.length) * 1e6 / colSums(counts.mat / gene.length) ) I estimated gene length via the ensembldb::lengthOf function, where: > "the length is the sum of the lengths of all exons of a transcript or...> a gene. In the latter case the exons are first reduced so that the > length corresponds to the part of the genomic sequence covered

RNASeq

updated 3.0 years ago • SciencyKr
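
The TPM expression quoted in the post above can be checked on a toy matrix. This is a minimal base R sketch: `counts.mat` and `gene.length` reuse the names from the post, while the values are invented purely for illustration.

```r
# Toy data (hypothetical values): 3 genes x 2 samples
counts.mat <- matrix(c(10, 20, 30,
                       5, 10, 85),
                     nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
gene.length <- c(g1 = 1000, g2 = 2000, g3 = 3000)  # lengths in bp

# Reads per base: each row is divided by that gene's length
rate <- counts.mat / gene.length

# Scale each sample (column) so that its values sum to one million
tpm <- t(t(rate) * 1e6 / colSums(rate))

colSums(tpm)
```

The double transpose is needed because R recycles the per-sample scaling vector down columns; transposing first lines that vector up with samples. A quick sanity check on real data is that every column of the result sums to one million.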

I will have to remove any unknown AA from my input file. I have searched terms such as: gap sequences, unknown aa, ambiguous aa, translate() documentation, DNAstring documentation, and have not been able to come up with...a solution. I would appreciate any pointers or tips. ```r #Create DNAstring containing all 645k sequences and headers orthologs = readDNAStringSet('protein_coding_orthologs_…

Translation Biostrings

updated 4.4 years ago • emmanuel.verde2939

function to generate DESeq data set Question: Do I need to generate gene-level unnormalized counts during the import\*? or just to use transcript-level counts\*\*? --- I found the article "[Importing transcript...abundance datasets with tximport](https://bioconductor.org/packages/3.7/bioc/vignettes/tximport/inst/doc/tximport.html...paragraph: __Note__: there are two suggested ways of im…

deseq2 rnaseq kallisto tximport

updated 7.8 years ago • Yunlu Zhu

For example, both logCPM and logTPM are accepted with kcdf="Gaussian". However, TPM is normalized for gene length while CPM is not. I am not sure if this will cause any difference in terms of the output result. Really appreciate

GSVA

updated 4.1 years ago • xanthexu

Hello. I am trying to remove the gene names from a heat map I generated because there are many genes, and the gene names on the right side of my heatmap do not correspond

DESeq2

updated 4.5 years ago • Emma

for length bias, right? But if I got it right, because of DESeqDataSetFromTximport function: > Note: there are two suggested ways...of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated...counts from the quantification tools, and additionally to use the transcript…

kallisto tximport deseq2 counts

updated 5.8 years ago • Mozart

Hi to all, I am trying to use the R-devel branch for easyRNASeq to get gene counts from filtered BAM files and I get the following error: > exp<-simpleRNASeq(bamFiles = bfl, param = rParam, nnodes...BamFileList(), param = RnaSeqParam(), nnodes = 1, verbose = FALSE) When I check the nature of my bfl object, I get this: > bfl BamFileList o…

easyRNASeq

updated 11.8 years ago • Sylvain Foisy

There is a series of amino acid sequences I want to import into R for alignment, using DECIPHER, with some sequences I imported as FASTA sequences. The...___M1 type-specific region: mature product residues 1-50. Streptococcus pyogenes M type 1 gene (emm1) 5' partial sequence CDC re…

r biostrings decipher

updated 8.0 years ago • reubenmcgregor88