Bioconductor Forum

I want to save the assignment of each gene to its corresponding module after __cutreeDynamic()__, and similarly for the merged __moduleEigengenes()__. I used the __cbind...__ \[or __merge()__\] function to combine the gene names and the module/color symbols. Although in this case the order of the rows is always the same, this seems not safe to...me. My purpose in doing this is to get the hub genes …

WGCNA cbind merge match gene names and Eigengene/module/colorlabels

updated 8.1 years ago • yifangt

I’m trying to run goseq with human RNA-seq data that has been processed with Salmon, summarized to gene level with tximport, and DGE run using DESeq2. I was reading the vignette for goseq and got stuck/confused on the part...what human genome build you used... It says you need to know this to access the correct transcript length in their database. This may be a stupid question/lack of…

deseq2 tximport goseq

updated 7.8 years ago • casey.rimland

`Error in apply(reads.clr[these.rows, ], 2, function(x) { :` `dim(X) must have a positive length` `Calls: aldex.clr ... aldex.clr.function -> aldex.set.mode -> iqlr.features -> apply` *Execution halted* The full script...Reformat_Basepaws_WGS2_and_Combine/Additional_Alignments/Bacteria11.bed). However, I think the most relevant part of the script is as follows: ```r # …

Metagenomics IQLR ALDEx2

updated 3.1 years ago • cwarden45

and cigs\_week. The question is how to pick which variables to include in the model when different genes may show different effects of the variables? Instead of trying to pick one best model for all genes, I had the idea to first...some of the modules had the variable of interest, trt, included in the best model, and for these genes I want to get fold-change values for trt. My approach was to run…

WGCNA limma voom

updated 9.3 years ago • Jenny Drnevich

Salmon) as input to DESeq2. We are using GRCh38 to leverage the known variation (alternate sequences) together with the strength of Salmon that probabilistically assigns reads to sequences without double-counting...Thus, when supplied with alternate sequences, reads may still be mapped to an alternate sequence (annotated as patches in GRCh38) if the reads contain true variation...with re…

deseq2 tximport

updated 9.2 years ago • Chakravarthi Kanduri

Dear list, How do I load a custom annotation package into an R session? I've created a custom annotation database by following the steps laid out in the SQLForge vignette using the GUI version of R on Windows XP. Up until now, all I've had to do to load an annotation package into an R session is to use the convenient drop-down menus within the GUI to retrieve t…

Annotation GO GUI zebrafish probe

updated 17.0 years ago • Scott Ochsner

single-cell RNA-seq datasets. For example, Keren-Shaul et al. made count matrices with 34016 genes x 37248 samples (= cells) available on [NCBI GEO](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98969). I am interested...much more memory (and storage space). I can simply find a machine with lots of RAM or apply abundance filters before creating an SCESet, but I am curious…

scater simpleSingleCell workflows

updated 8.6 years ago • sandmann.t

Hi all, I am having two issues with DESeq2. One is related to its use in Galaxy (I did not get an answer on the Galaxy forum, so I thought why not ask here) and one is related to the introduction of coldata information in the matrix before running DESeq2 when using featureCounts data. 1) using Galaxy with 2 factors (2 batches/ 2 distinct studies from the literature), 3 levels in each factor that ar…

DESeq2

updated 5.1 years ago • NGS_enthusiast

that I got by using this function from the biomaRt package, getBM(). How can I know the sequence of the genes? And the promoters? Is there a function that does that? Thank you, Leonor

biomaRt

updated 13.1 years ago • lcarvalh@btk.fi

3000), TxDb=txdb, annoDb="org.Hs.eg.db") ``` As we know, 'intergenic' refers to DNA located between genes. This 'intergenic' region doesn't include the promoter, 5' UTR, and 3' UTR. Then, could you tell me how you defined "Distal Intergenic...It sounds like it excludes 5' UTR and 3' UTR, so it is named as 'distal intergenic.' Could you tell how you defined 'distal intergenic'? Do you have specific s…

ChIPseeker

updated 3.6 years ago • Sanghoon

Dear community, I am currently analysing a set of time-course (~10 time points) stranded paired-end RNA-seq data with the particular objective of identifying time-dependent changes in alternative splicing. However, I am still undecided whether _alpine_ or _Salmon_ or another method would be better suited for the estimation of transcript abundance. Up to this point, I used _STAR_ to al…

rna-seq salmon alpine transcript expression

updated 7.5 years ago • relathman

the left (nct 1-1039) footprints are mainly in the 0-frame (the plot is colored in red). The second most abundant reading frame is frame 2 (indicated by the horizontal blue line on the top) -on the right (nct 1477-3667) footprints...are mainly in the frame 1 (the plot is colored in green). The second most abundant reading frame is frame 0 (indicated by the horizontal red line on the top…

riboSeqR plotTranscript function

updated 11.2 years ago • Laura.Fancello

Dear Srinivas, Have you given all the column names which are in the dataset file? Files from ArrayExpress usually contain columns "Name" and "DBidentifier", in which case...you would use the argument annotation=c("Name","DBidentifier") when using read.maimages() to read them. Best wishes Gordon >To: Srinivas Iyyer <srini_iyyer_bio at yahoo.com.…

limma ArrayExpress

updated 19.8 years ago • Gordon Smyth

Hi, I'm trying to change the font size of the gene name and increase the arrowhead size. I've tried changing the cex and cex.id but it doesn't seem to be affecting the font...to=maxbase, trackType="GeneRegionTrack", rstarts="exonStarts", rends="exonEnds", gene="name", symbol="name2", transcript="name", strand="strand", fill="darkblu…

DMRcate Gviz

updated 4.1 years ago • Ahdee

The array is for Anopheles gambiae, and consists of about 13,500 cDNA spots from PCR plates - probe sequences between 150 and 500 bp in length. The manufacturer of my array provided a .GAL file with it - this was made in GenePix...and lists ensembl gene transcripts under the column "name" and ensembl gene identifiers under the column "ID". What I would really like is to add...use: I understand …

Microarray Annotation GO Anopheles gambiae probe annotate limma AnnBuilder biomaRt GO

updated 19.8 years ago • Amy Mikhail

ensembl_gene_id ensembl_exon_id 1 ENSG00000110395 ENSE00003791646;ENSE00003787952;ENSE00003787287 Sequence unavailable CBL 3;1;2 ENST00000634301 HGNC:1541 ``` So, the column names don't seem to be assigned in the correct order...Furthermore, there is "Sequence unavailable" and "HGNC:1541" instead of cDNA and peptide sequence. That seems odd, because if we go for ``` getBM(attr…

annotation

updated 5.5 years ago • sarah.sandmann

Illumina 2x150 bp I have cleaned reads and assembled reads into contigs. For each contig I have done gene prediction and extracted bacterial marker genes (based on the Campbell et al. paper). 2- With bacterial marker genes I have...using kaiju (actually diamond or blast will work as well). 3. So I end up building an abundance table for all my samples (normalized table in RPM). …

ape phyloseq dist metagenomics

updated 8.7 years ago • David

Hello, I want to add gene names to the rlog transformation for a heatmap. I'm trying to install biomaRt but I have some problems with it. So I'm asking for another

deseq2 annotation

updated 6.8 years ago • saida3112

derived from Salmon output): ``` > count a b c d A 27 50 67 36 B 0 0 0 0 > length a b c d A 310.08 395.79 504.53 342.48 B 1026.00 1008.00 1009.00 1021.00 > abundance [,1] [,2] [,3] [,4] [1,] 3.14 4.95 3.84 4.91 [2,] 0.00...0.00 0.00 0.00 > f=makeCountsFromAbundance(count,abundance,length,countsFromAbundance=…

tximport

updated 6.7 years ago • capricyg

getReadCountsFromBAM.R", I noticed the following code line: `for (i in 1:length(refSeqName))` It turned out that I am able to use the function with multiple reference sequence names like this, for...will give an output in the console as following: Identified the following reference sequences: 1,2,3,4,5,6,7,8,9,10,11,12,…

cn.mops

updated 8.9 years ago • Mohammad Alkhamis

there is any command in R/BioC that would allow to identify long consensus sites (16-20 nt length) in a set of DNA sequences. Thanks a lot! Arici ... Bioc-sig-sequencing mailing list Bioc-sig-sequencing at …

updated 16.1 years ago • Wolfgang Huber

KEGGREST bug

updated 8.4 years ago • olj23

AddModuleScore”) and UCell/ssGSEA (via “escape”) to try and look for differences in pathway/gene set representation between these groups. While looking at the results of hundreds of pathways/gene sets, I’ve noticed...that most of these results look very similar to one another. I am now quite certain that – in most cases – the (many) differences I see between...the experimental groups, in terms of…

escape UCell GSVA

updated 2.9 years ago • Omer

the way to the end, plots etc. Then I did a second try, where I wanted to trim my sequence by 14 bp (so the new length is 36 bp). I did the mapping again, created BAM files but then in the r3Cseq pipeline I started...No reads count per regions found in the r3Cseq object. Is there a specific limit to the sequence length I can use? I am also attaching part of my object's output: Slot…

r3cseq bam trimmed

updated 5.7 years ago • ch_el

is no longer part of BioMart, is there a way of using Bioconductor annotation packages to convert UniProt IDs (e.g. P15172) into protein names (e.g. MYOD1\_HUMAN)? These names don't seem to be available with `UniProt.ws`

uniprot uniprot.ws

updated 9.9 years ago • Dario Strbenac

Hi, Is there a way to find out whether a gene is contaminated by Alu sequence? Thanks, -Jack

updated 15.3 years ago • Jack Luo

be able to help suggest a good way to generate a mapping of human/mouse orthologs, using ENSEMBL gene IDs? I found a [very old post](https://support.bioconductor.org/p/36348/) which pointed me to the [inpIDMapper](https://www.rdocumentation.org...in the AnnotationDbi package. The example usage shows how this function can be used to map a gene-indexed list of UniProt protein IDs, however…

human mouse ensembl annotationdbi homologue

updated 8.6 years ago • Keith Hughitt

To understand this difference, I checked how many genes from my universe are properly annotated in the org.Mm.eg.db database using the following code: # list of gene symbols...keytype = "SYMBOL") # Retrieve the associated GO terms for each of the mouse gene symbols. go_terms <- mapIds(org.Mm.eg.db, keys = gene_symbols, column …

GO clusterProfiler enr

updated 16 months ago • Fatima

Hi, I am having a problem using the 'biomaRt' package to obtain the 3' UTR sequences of refseq predicted genes. I am using the following command: > getSequence (id=refseqID,type="refseq_dna_predicted",seqType="3utr",mart=ensembl), where refseqID is a list of my refseq predicted genes. I am getting the following error: Error in getBM(c(seqType, type), …

updated 17.2 years ago • Harpreet Saini

overwrite = T) ``` log gives a Warning, that apparently the extended Sequence is not right: ```r Loading required package: BiocGenerics Loading required package: parallel Attaching package: ‘BiocGenerics...Loading required package: Biobase [...] Validating input ... >>> Finding all hits in sequence chr1 ... >>> DONE searching Building feature vectors …

CRISPRseek

updated 3.6 years ago • Miguel

interactions between DNA segments. I got the data from the ENCODE publication of Shen et al 2012, Nature, in which, for each chromosome, interactions are represented as a file containing the matrix of interaction. Considering...that each bin represent 40kb in the chromosome, I could isolate rows and columns covering my sequence of interest and I used heatmap.2 to generate the 'half' heatmap of in…

Gviz Hi-C Heatmap

updated 9.7 years ago • Merienne Nicolas

Hi, Aedin: I am using the heatplot in made4, and want to output the re-ordered gene list and column names on the heatplot map. Do you know any commands to do it? Xiang

made4

updated 15.9 years ago • xiangxue Guo

ranges = IRanges(start = c(1, 10, 20), end = c(5, 15, 25)), strand = "+") names(ORF) <- c("tx1", "tx1", "tx1") grl <- GRangesList(tx1_1 = ORF) cov <- coverageByTranscript(riboseq, grl) # this is fast, < 1 min for whole...is only 450 MB, so in theory the frames should be 1/3 of that. A funny note on that is that the names of the Rle is 95% of …

Rle Big data coverageByTranscript

updated 7.7 years ago • hauken_heyken

to genes..." to "Limit to genes (external references)..." and "Limit to genes (microarray probes/probesets)..." * Renamed "ID list limit  \[Max...pages * Renamed "external\_transcript\_id" attribute internal name to "external\_transcript\_name" * Added "Transcript length" in the Features,…

biomart ensembl release news GRCh37 News

updated 10.8 years ago • Thomas Maurel

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060819/f4d88024/attachment.pl

updated 19.4 years ago • Marco Blanchette

is there a quicker way of doing this? What I've done is read the file as a DNAStringSet, select the sequences to keep, reorder the data and write sequences out. In addition, is there a way to read the sequence description separate...GCA_000001405.26_GRCh38.p11_genomic.fa") y <- data.frame(seqID=sapply(strsplit(names(x)," "), `[`, 1), …

biostrings R bsgenome

updated 8.6 years ago • rockyb

I am using the following code to retrieve the exon sequences of gene Tcfap2b with GeneID:21419. There are 8 exons for this gene. for (i in sequence(50)) { + x <- getSequence(id=21419,type...This gives 44 NULL results and 6 correct results. 'correct' means getSequence() outputs the sequences of the exons. > sessionInfo() R version 2.8.1 (2008-12-22) x86_…

updated 16.8 years ago • Straubhaar, Juerg

presenting with multiple areas of cancer. You will become part of a highly successful UK Genome DNA Sequencing (WGS) consortium (ICGC UK Prostate). You will learn to process next-generation sequencing data and call somatic variants...Bayesian Dirichlet analysis and non-negative matrix factorization that have underpinned our recent Nature Genetics and Nature publications. First, the project will i…

Phd NGS cancer prostate Job

updated 9.9 years ago • Daniel Brewer

Hi, I have my topGO results table, and I want to pull out the genes that are annotated to a specific GO term. I only want the significant genes, not all the genes annotated to the term. This...r library(topGO) go.mappings <- readMappings(file = "ALL_go_terms") all_genes <- names(go.mappings) #AGA aga_genes <- read.csv("treat_vs_control_aga.csv", header = TRUE) …

topGO

updated 3.5 years ago • Lucía

is.null(ext)) slides <- paste(slides, ext, sep = ".") nslides <- length(slides) if (is.null(names)) names <- removeExt(files) if (is.null(columns)) columns <- switch(source, agilent = list(Gf = "gMeanSignal...c("Block", "Row", "Column", "ID", "Name"), smd = c("Spot", "Clone ID", "Gene Symbol…

Annotation

updated 20.5 years ago • Gregory Lefebvre

I have paired-end RNA-seq data in BAM format. It was read into R using readGAlignmentPairs, named gal2. The compatible read pairs were parsed using the findCompatibleOverlaps function. The following code only specifically...examines compatible read pairs for the 4th gene in the dm3_transcripts list. There was only one read pair aligned to this gene. It's clear that the two mates overlap at...iranges> …

GenomicAlignments RNA-seq paired-end read coverage

updated 6.5 years ago • Jiping Wang

specific 454 pipelines, I am searching for a convenient method to delimit OTUs from a simple (Sanger) sequence fasta file such as this one (fungal ITS sequences), with the possibility to specify e.g. sequence similarity of 97% over...at least 90% length. The fasta header >VASmic02 says "sequence 02 from host plant VASmic". This example file consists of 10 sequences from

OTUbase

updated 13.4 years ago • Martin Unterseher

did not mention the use of tximport or any other tools but the read counts are already collapsed to gene-level, resulting in a matrix with gene names in the first column (e.g. TP53, A2M), genome coordinates in the second column...matrix) ##the matrix with gene names as row names and counts in columns norm-mat <- calcNormFactors(dgelist, method = "TMM") norm-mat <- cpm(nor…

DESeq2 tximport Kallisto TMM edgeR

updated 4.3 years ago • BioNovice247

science background), so I might have overlooked something really obvious. I have a couple of genes that I need to get the 1500 base pairs upstream and the first 500 bases of (so 2000 bases in total). I then want to find motifs...in those sequences. I want to do so using FIMO and MAST from the MEME suite. For this reason, I want my output to be [formatted appropriately...agat_sp_extract_sequences…

MAST

updated 22 months ago • maximilian.beikirch

1:3, end=4:6)) rbind(rd1, rd2) RangedData: 6 ranges by 0 columns columns(0): sequences(0): rd1 <- RangedData(IRanges(start=1:3, end=4:6, name=letters[1:3])) rd2 <- RangedData(IRanges(start=1:3, end=4:6, name=letters...6 1 [3, 6] | rd1 <…

IRanges

updated 16.0 years ago • Robert Castelo

redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors...iranges> <rle> | <factor> <factor> <numeric> <integer> [1] ctg123 1000-9000 + | NA gene NA &…

rtracklayer

updated 4.4 years ago • Charles

Hi list, I am wondering whether there is a package which can do sequence alignment for DNA sequences. I know the package pairseqsim, which is applied to amino acid sequences. And I believe it is...fairly simple to change it so that it will apply to nucleotide sequences. Has anybody done that? Can I ask for some code? Thank you very much. Fangxin -------------------- Fang…

pairseqsim

updated 20.7 years ago • Fangxin Hong

myDBx <- cdsBy(myDB,by = "tx",use.names = TRUE) Now, I would like to retrieve the external gene ids. Is this the most generic way? # select mart and dataset mymart = useMart("ENSEMBL_MART_ENSEMBL", dataset = "scerevisiae_gene_ensembl...host="www.ensembl.org") # just a selection of transcripts sel = names(myDBx)[5:6] getBM(attributes=c("ensembl_transcript_id","external_gene_id"), value…

Yeast TranscriptDb

updated 12.7 years ago • Stefanie

Hi, I am trying to plot some data in R using the Gviz package, but I can't seem to get the gene names to appear on the track nor can I get the ideogram function to work. For the ideogram, I am aware that the yeast genome...gen, chromosome = chr) Error in .local(.Object, ...) : Failed to obtain 'hguid' cookie ``` For gene names, I would like to have them on top of the yellow bars indi…

gviz Saccharomyces_cerevisiae Gviz ideogram

updated 2.3 years ago • Michael

In differential abundance modelling using DESeq2, given a single ASV (out of about 1000) with the following abundance values:

|    |      | Seq 3 |
|----|------|-------|
| 1  | A1-2 | 26250 |

DESeq2

updated 2.2 years ago • Roey Angel

pathview(), and only in some of the pathways (running the same code) I'm getting an error with "negative length vectors are not allowed". From what I have found, it seems to be a problem with the dimensions of the dataset exceeding...around this. Any help would be very appreciated! Mauricio ```r ## Fold change matrix with gene ID as row names. gene.m <- matrix(c(1,1,-1,-1,-1), …

pathview kegg KEGG

updated 2.3 years ago • Mauricio Diego

Dear list, I have expression data with Affymetrix hg u133 plus 2 probeset IDs & gene set data with gene names. So both gene set and expression data are not using the same gene ID system which is a requirement...to run the GAGE analysis. So the problem is that if I convert the Probeset IDs to gene name, I get a single gene name for multiple probes. So the expression data w…

convert gage

updated 13.9 years ago • Javerjung Sandhu

in case it should go to only one of the two). The task I'm trying to achieve is to align several sequences together. I don't have a basic pattern to match to. All that I know is that the "True" pattern should be of length "30" and...that the sequences I'm looking at, have had missing values introduced to them at random points. Here is an example of such sequences, were...on the left we see what…

GO

updated 15.1 years ago • Tal Galili

is the best approach to handle these non-integer numbers for DE analysis. I don't have gtf and gene id files for the db sequence. The database only contains the putative long non-coding RNA which were retrieved from...the genome. Thanks Name Length EffectiveLength TPM NumReads CUFF.47.1 1011 845.627 21.0942 250.461 CUF…

salmon rna seq deseq2

updated 7.2 years ago • Bob

Dear list, I have expression data with Affymetrix hg u133 plus 2 probeset IDs & gene set data with gene names. So both gene set and expression data are not using the same gene ID system. Both gene set and expression...data should use the same GENE ID system which is a requirement of the GAGE analysis. So the problem is that if I convert the Probeset IDs to gene name, I...…

convert gage

updated 13.9 years ago • Javerjung Sandhu

T, + testDirection="over") > > hgOver <- hyperGTest(params) > hgOver Gene to GO BP Conditional test for over-representation 356 GO BP ids tested (184 have p < 1) Selected gene set size: 18 Gene universe...> > dim(summary(hgOver)) [1] 184 7 As you can see, the hgOver object says that it tested 356 GO BP ids, but only 184 have p <…

Annotation GO hgu95av2 GOstats

updated 16.0 years ago • Jenny Drnevich

I'm working with a non-model species, so having gene names is a luxury I can't afford. I have my design matrix and chose one combination I'm interested in: cont.matrix1 <- makeContrasts...summa.fit) and when I want to see the volcano plot: volcanoplot(fit.cont,coef=1,highlight=100,names=fit.cont$ !!!!!! , main="B.PregVsLac") in fit.cont$, there is no list of the transcript IDs there so I …

limma DEGreport Volcanoplot

updated 3.0 years ago • Mohammad

else encountered any issues with using mm10 in the goseq package: > pwf <- nullp(genes, "mm10", "geneSymbol") Can't find mm10/geneSymbol length data in genLenDataBase... Found the annotaion package, TxDb.Mmusculus.UCSC.mm10.knownGene...Trying to get the gene lengths from it. Warning message: In pcls(G) : initial point very close to some inequality constrain…

goseq

updated 10.0 years ago • Bohdan Khomtchouk

data set using gcrma for the normalization step and limma for finding differentially expressed genes. One of the most significant probesets (ProbeSetID annotation "1375535_at") in terms of d.e is annotated as : Probeset "1375535_at...Gene Symbol: Lpin1 - Location: Chr 6 in the bioconductor package "rat2302" / "rat2302.db". We also looked at the Affymetrix web site...where the same probeset wa…

Microarray Annotation limma gcrma

updated 17.5 years ago • Christoph Preuss

is a simple solution wherein I can convert the .h5 file into an R data frame or data table and have gene names added as a separate column or as row names. I came up with this solution but am wondering if this works? I am assuming...here that the genes are indexed. Meaning: Gene ID in position 1 (in the gene list) corresponds to the gene ID labels = 1 in the mol.info table. ``` mol.i…

tidyverse R SingleCellExperiment SingleCellData

updated 2.1 years ago • Sitapriya