Bioconductor Forum

mst slingParams curves pathStats(2): pseudotime weights cellnames(1): OEP01_N704_S507 cellData names(2): reducedDim clusterLabels pathnames(4): Lineage1 Lineage2 Lineage3 Lineage4 pathData names(0): # my problematic code...in value[[3L]](cond) : 'assay( <pseudotimeordering>, i="character", ...)' invalid subscript 'I' length of 'dimnames' [1] not equal to array extent ``` …

SingleCellExperiment singlecelle scRNAseq

updated 4 months ago • Francesco Silvestro

I hope someone can help. I am trying to carry out some differential abundance analysis on some microbiome data that has come from a metabarcoding experiment using 16S illumina sequencing...a single data point) after I have made my phyloseq object but before working out the differential abundance. I have a column for the abundance, the taxa and the sample (and there are 9 of each sample which I w…

phyloseq r

updated 3.8 years ago • strep

Dear bioconductors, Obviously the database files accessible at the refseq, gene or locuslink ftp sites do not contain all ids which can be uniquely identified via the ncbi web interface. Where can I…

updated 18.9 years ago • Benjamin Otto

if I've done this optimally/right! My problem was to compute a series of sliding-window-based sequence identity scans between many pairs of sequences. To do this, I programmed two iterators. One iterates sliding windows...in a DNAStringSet - I get a neat sliding window scan of sequence identity! But now say I wanted to do many pairs of sequences, this may take some time, so I'd want to para…

dnastringset parallel iterators

updated 10.1 years ago • ben.ward

Hi, I've just realized that a call to unique on a DNAStringSet causes the names slot to disappear. There's nothing about this in the documentation, but if that's the desired effect, warning about it would…

updated 13.5 years ago • Nicolas Delhomme

Could you add selected row names to a `pheatmap` instead of including them all using `show_rownames = T`? Something similar was asked a few years ago here https...c("banana","lime","carrot","tomato"), c("A1","A2","B1","B2"))) labgenes<- c(row.names(x)[1], rep('', length(row.names(x))-1)) pheatmap(x, annotation_col = anno, fontsize_row = 10, show_colnames = F, show_rownames = labge…

R pheatmap

updated 5.1 years ago • ecg1g15

Hi, questions from a non-statistical expert... The PCA function within DESeq2 selects ntop genes before calculating PCA. This appears to make differences across samples in the 2D plot clearer. What is the statistical...versus, for example, coefficient of variation? And if I want to do a heatmap of the "top variable" genes is using absolute variance or CV only a matter of what genes we w…

DESeq2 ExpressionData heatmaps pca

updated 4.6 years ago • deut2016

Hello All! I am attempting to do differential abundance analysis using DESeq2 after doing the initial part of my analysis in QIIME 2. I was able to successfully export a...biom file, added sample metadata and taxonomy informati…

deseq2 R microbiome phyloseq qiime

updated 8.1 years ago • ssingh59

I am sure there is an elegant way to do this. Could somebody clue me in? I have (in a simple case) two exons for a gene on the + strand, and the full DNAString sequence of its chromosome. My naive technique for constructing a DNAString of the entire coding sequence is 1) paste together toString (subseq (seq.chrom, exon.start, exon.end)) for each exon 2)…

updated 16.8 years ago • Paul Shannon

Whoops yes I see that now, I guess you first need to figure out what kind of gene ids you have. That DAVID tool I spoke of can often autodetect the type of gene id you are trying to convert if they are in its...30 PM, Paul Geeleher <paulgeeleher at gmail.com> wrote: >> Are they affy ids? Maybe the gene ID conversion tool might be able to…

Annotation GO probe affy convert

updated 15.0 years ago • Paul Geeleher

warnings that appeared when running the _nullp_ command. It seems GOseq cannot find the gene lengths for my data ('hg38','ensGene') in genLenDataBase. I installed the TxDb.Hsapiens.UCSC.hg38.knownGene package...you please, help me to solve this problem? I did not use all the differentially expressed genes. Instead, for the analysis, I used a list of DEGs of my intere…

goseq

updated 9.3 years ago • webquelzinhablue

April 2009 Number of Pages: 368 Emphasizing the search for patterns within and between biological sequences, trees, and graphs, this book shows how combinatorial pattern matching algorithms can solve computational biology...71.96 / ?43.99 For more details and to order: http://www.crcpress.com/product/isbn/9781420069730 *** Gene Expression Studies Using Affymetrix Microarrays Hinrich Gohlman…

Cancer PROcess

updated 16.6 years ago • Calver, Rob

using "rlog(dds)"? In other words, is DESeq2 "rlog" acting on Transcript per million (tpm) or raw abundance values (est_counts) from the Kallisto abundance.tsv files? ```r txi<- tximport(files, type="kallisto", tx2gene = tx2gene

DESeq2 rlog

updated 3.1 years ago • Bioinformatics-414

How do I get the 3' UTR sequence per gene rather than transcript? Online resources suggest using columns="gene_name" parameter but it doesn't seem

GenomicFeatures Transcript gene threeUTRsByTranscript

updated 4.9 years ago • amgtech

topGO, to perform GO analysis in a RNA-seq data set. I have a small data set of 104 significant genes that I called “sigG”. Firstly, I used genefilter to find genes that have a similar level of expression to my “sigG”. For...each sigGene I got 10 genes which make a background set of 1040 genes (backG). I ran topGO but I found results that make me suspect that something is going...wrong.…

topGO gene ontolology rnaseq

updated 8.9 years ago • colaneri

Hi List, In the edgeR manual it is mentioned that the package may be useful in the analysis of other count based studies and I am wondering if the bulk segregation studies (based on tag counts) are such a case i.e. hoping for advice on the potential rightness or wrongness of doing this... The general experimental outline is as follows: From parents intro…

edgeR

updated 14.3 years ago • josquin.tibbits@dpi.vic.gov.au

and viral RNAs were quantified, and as can be seen in the table below, reads from viral RNA (= 10 genes) make up ~50% of the entire library in the infected condition. When running DESeq2 on this dataset, the estimated size...effectively double the counts for cellular RNAs because they capture the trend of the 20.000 human genes, however, we have reason to believe that cellular RN…

deseq2 normalization sizefactors

updated 8.0 years ago • René

rlog transformation using DESeq2: 1) This transformation normalizes for library size, which is sequencing depth or total number of mapped reads, correct? 2) This transformation does not normalize for gene or transcript...length, correct? If this is true, is there a program you would recommend to normalize transcript length using the output from

rlog transformation deseq2

updated 9.2 years ago • anpham

I want to align a DNA sequence by first translating it to a protein rather than aligning the DNA itself. Then using msa() I would align the sequences...but how would I convert it back to the original DNA sequences aligned on the basis of the AAs

msa decipher

updated 8.7 years ago • bbista91

Hi all. I am working with large data frames in R that contain a mix of numbers and variable-length strings. I've tried using the rhdf5 package to write and then read these and I haven't been able to figure out how to correctly...TRUE),collapse="")};return(rndString)} library(rhdf5) n <- 1000000 d <- data.frame(id=seq(n),name=rndString(n),val=rnorm(n),stringsAsFactors=FALSE) h5cre…

rhdf5

updated 12.2 years ago • Guest User

Length of a region is calculated as _end - start_, instead of _end - start + 1_. This causes problems when length of a region equals...on a bam file, using warnings instead of stop, or solving the problem properly for shorter genes (i.e. by assigning a position to multiple bins) In cov.bin.R: setMethod("cov.bin", signature(extend = "missing

coverageview bug

updated 10.0 years ago • balwierz

BSgenome.Hsapiens.UCSC.hg19) path <- system.file("extdata/Test_100.fasta", package="rGADEM") sequences <- readDNAStringSet(path, "fasta") length(sequences) # 49 # you can use the UCSC genome browser to BLAT the first 4 sequences...what TFs bind there # http://genome.ucsc.edu/cgi-bin/hgBlat?command=start as.character(substr(sequences[1:4], 1, 80)) # Once you blat, cli…

rGADEM MotifDb

updated 12.9 years ago • Paul Shannon

team will offer an introductory / intermediate course "R / Bioconductor for High-Throughput Sequence Analysis" May 10-11, 2012 Fred Hutchinson Cancer Research Center - Seattle, WA Overview R / Bioconductor for High-Throughput...Sequence Analysis introduces essential concepts and work flows in the manipulation and statistical analysis of sequence...data. Topics covered include: essential R skil…

Annotation Cancer genomes

updated 13.8 years ago • Martin Morgan

Dear list, I am trying to retrieve 5' flanking sequences and 5' utr for several genes. Doing this via biomart.org or, respectively, biomaRt yields different results. An example...I want to retrieve the 5' flanking sequences (3000 bases) plus the 5' utr for the gene with the EntrezID 23704. My R code: library(biomaRt) ensembl <- useMart("ensembl...c(23704), …

Homo sapiens biomaRt

updated 16.3 years ago • Tefina Paloma

I have tx2gene files that are output from Orthofinder. Kallisto abundances were generated for two different assemblies from different tissues. Ultimately I am hoping to use edgeR TMM means to get a normalization factor across the tissues. I was able to upload the abundance files according to the vignette https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/txi…

tximport

updated 4.3 years ago • nicolette.sipperly

Hi I remember there is a website where you can type in affy probe set ID and it will give you the gene name, could anybody tell me what it is? Thanks, -James

probe affy

updated 15.8 years ago • James Anderson

I am using the cosmo package in order to screen promoter alleles and the promoters of co-regulated genes for conserved motifs. By the main approach described in the reference manual (and using TCM as the model), only a single...as well. Is it possible in cosmo (like mentioned for MEME in D'haeseleer, P. (2006) How does DNA sequence motif discovery work? Nature Biotechnology 24, 959-961) to …

cosmo

updated 16.6 years ago • Thierry Janssens

Hello, I am running topGO over a genome that was sequenced, assembled and annotated by my group so I supply my own GO annotation. I am interested to see if there are enriched...GO terms in sub-group of genes of this genome and am a little confused by the topGO results - because the first 6 best results include GO terms that are...even annotated for this genome, not to mention in the subgroup of gene…

topgo

updated 9.5 years ago • pninasmail

Hi everybody, since this weekend I can no longer connect to uniprot Biomart services via biomaRt (which worked pretty fine before) receiving the following error: > uni <- useMart("unimart...dataset="uniprot") Fehler in useMart("unimart", dataset = "uniprot") : Incorrect BioMart name, use the listMarts function to see which BioMart

biomaRt

updated 12.6 years ago • Guest User

Could anybody give an explanation why many of the probe sequences for a gene overlap by quite a margin? For instance, on HGU133A there is hardly a probeset without overlapping oligos

hgu133a probe

updated 22.6 years ago • Johannes Hüsing

dear list, i need to get the intron sequences for a group of entrez gene ids. is there any way to do it using biomaRt? apparently there is no option in the getSequence

biomaRt

updated 18.6 years ago • Dario Greco

<- "U82759_at" absts <- pm.getabst(hoxa9, "hu6800") gives Warning message: the condition has length > 1 and only the first element will be used in: if (is.na(pm)) rval[[i]] <- NA else { Happens when there are multiple Pubmed IDs...corresponding to a gene. Thanks, Ritu

updated 22.7 years ago • ritur

task. It contains the database from miRBase along with the target genes. You can then use this database to find the sequence using biomaRt. This package does not have a vignette. To find the help...as big as possible) of experimentally Validated miRNAs from miRecords with their relative target genes > and the 3'UTR sequences, limited to Homo sapiens. > The XLS file from miReco…

Transcription miRNA Cancer Homo sapiens biomaRt microRNA

updated 16.3 years ago • mauede@alice.it

I've been looking into normalization more and more, and I was wondering about a few things that perhaps some of you might know the answer to or want to discuss. So there exists within-sample normalization (TPM or others), i.e. relative abundances, and between-sample normalization (TMM or others), but is it ever necessary to do both, i.e. is it ever necessary to normalize relative abundances acro…

rnaseq normalization tpm tmm

updated 10.3 years ago • nickbern92

with biomaRt and I can't figure out what's going on. I am using getBM to pull down a large number of gene coordinates, and filtering to restrict to chromosomes 1-22 and X,Y. For some reason this procedure (which is giving no errors...is not pulling down some genes that I think it should. My basic code for pulling down all of this information is: tempAll<-getBM(c("ensembl_gene_id...biotype…

biomaRt

updated 17.4 years ago • Elizabeth Purdom

Transcriptional amplification is the phenomenon where the majority of genes in a sample are increased in expression. An overview of it and the statistical implications is found in [Cell][1]. [Recent...samples have an estimated copy number of 4 for about 70% to 80% of their chromosomes from the DNA sequencing analysis. [1]: https://www.cell.com/cell/fulltext/S0092-8674(12)01226-3 [2]: http…

edgeR DESeq2 Transcriptional Amplification

updated 6.8 years ago • Dario Strbenac

HG-U133A and, as an initial step after creating an rmaset, I filter out the probesets without Entrez gene IDs as follows: arrayset <- ReadAffy() rmaset <- rma(arrayset) entrezIds <- mget(featureNames(rmaset), envir = hgu133aENTREZID...haveEntrezId <- names(entrezIds)[sapply(entrezIds, function(x) !is.na (x))] numNoEntrezId <- length(featureNames(rmaset)) - leng…

Annotation hgu133a limma

updated 17.6 years ago • David Fermin

proteins in my samples. I next mapped the identified proteins back to their transcript IDs and sequences and was able to get transcript coordinates for each protein sequence. Now, I would like to convert the transcript...Chromosome.scaffold.name'][[1]][i] rng_tx <- IRanges(start = start, end = end, names = tx_id) edbx <- filter(ahEdb, filter = ~ seq_name == chr…

AnnotationHub

updated 4.1 years ago • nattzy94

Dear BioConductor community, I've recently been trying to normalize large metagenome abundance matrices with the function cpm(y) of edgeR. I get the following error: Error in .isAllZero(counts) : long…

edger cpm

updated 8.3 years ago • jtremblay514

abundance.tsv file and abundance.h5 files were generated. Now in order to make use of kallisto abundance_estimation, 1) Should I take only the estimated counts column from each sample and make a data frame which is later

deseq2 kallisto

updated 8.3 years ago • deena

wondering a similar thing: suppose I use transcripts per million (TPM) as a coherent estimate of abundance and feed it to limma/voom in order to fit a multi-factorial blocked model in a poorly annotated organism (draft genome...Edger developers and users, > > I would like to compare transcription levels of orthologous genes belonging > to different species, in order to find s…

Transcription edgeR

updated 11.4 years ago • Tim Triche

NA,dataSource="gff file for test", species="test") extracting transcript information Extracting gene IDs extracting transcript information Processing splicing information for gff3 file. Deducing exon rank from relative...provided Prepare the 'metadata' data frame ... metadata: OK Now generating chrominfo from available sequence names. No chromosome length information is available. Warning messag…

updated 12.1 years ago • Guest User

that I'm interested in a certain region of the chromosome to analyze the expression of a particular gene. Where can I find the region that I'm looking for and how do I set up the code? Here is an example: > gr <- GRanges...Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),ranges = IRanges(101:110, end = 111:120, names = head(letters, 10)),stra…

bioconductor Grange

updated 7.7 years ago • Merlin

Is it possible to use DESeq2 for transposon insertion sequencing analysis? The experiment is to construct a library of transposon-insertion mutants and compare the relative...of mutants before and after experimental condition. Since the transposon is supposed to disrupt the gene function, enriched mutants after this selection suggests the gene is dele…

deseq2

updated 10.7 years ago • yifanz119

dear all, I think there is a problem with the reactome.db package: in version 1.50.0 most of the pathway IDs are not mapped to pathway names: library( reactome.db ) reactome.list <- as.list( reactomePATHID2EXTID...have.name <- names( reactome.list ) %in% ls( reactomePATHID2NAME ) sum( have.name ) length( have.name ) In 3.1.2 with reactome.db 1.50.0 only 1518...of 10373 p…

annotation pathways

updated 10.8 years ago • Johannes Rainer

GENEMODELS <- GenomicFeatures::genes(txdb) GENEMODELS GRanges object with 39017 ranges and 1 metadata column: seqnames ranges strand | gene_id <rle> <iranges...74635214-74635351 + | ENSMUSG00000099334 ------- seqinfo: 66…

plyranges

updated 6.3 years ago • Aditya

> data_counts<- DESeq(data_counts) estimating size factors estimating dispersions gene-wise dispersion estimates mean-dispersion relationship final dispersion estimates fitting model and testing >... "ENSEMBLPROT" [5] "ENSEMBLTRANS" "ENTREZID" "ENZYME" "E…

annotation RNASeq annotationdbi

updated 8.5 years ago • Alexandra

I am trying out msqRob2 for the first time and am following the tutorial at https://www.bioconductor.org/packages/release/bioc/vignettes/msqrob2/inst/doc/cptac.html. I am having trouble specifying the conditions for colData for the pe object. The MDS plot produced by this code returns blank, with no data points on it: ```r limma::plotMDS(assay(pe[["peptideNorm"]]), col = as.numeric(colData(…

msqRob2 msqrob2

updated 3.7 years ago • Jane

Dear All, I am trying to do differential gene expression analysis of Breast Cancer vs. normal patients using data set GSE62944. biocLite("AnnotationHub") library... reason: Forbidden (HTTP 403). Error: failed to load 'AnnotationHub' resource name: AH28855 title: RNA-Sequencing and clinical data for 7706 tumor samples from The Cancer Gen…

R limma cancer tcga bioconductor

updated 9.8 years ago • hamda.binte.ajmal

Dear list, I have tried to fetch 1000bp upstream sequences for a number of rat genes via biomart. When using the web-based biomart this seems to work alright, but when automating...the process I ran into a problem: for some genes, "Sequence unavailable" is returned. Exemplifying for a single gene that has this problem: > sqmrtjb<-useMart("ensembl...codi…

PROcess biomaRt

updated 17.1 years ago • Jorrit Boekel

Hi, Does anybody know how to control the order of the column names displayed in the heatmap (heatplot)? I was trying to compare two heatmaps with the same set of genes as row names and the same set...of patient samples as column names. The result shows a different order of column names. I want them displayed in the same order in the heatmap pictures. regards

updated 15.4 years ago • xiangxue Guo

Dear all, please could you advise: how does MAFTOOLS treat the annovar files with the gene names? For example, if I have a file (below), the questions would be: > head(var.annovar.maf

maftools

updated 8.9 years ago • Bogdan

Hi folks, I have a set of short DNA sequences containing the R character (for A or G). I used readDNAStringSet to convert my fasta input file to a DNAStringSet, then the Disambiguate function from the DECIPHER package to expand the set i…

DECIPHER FASTA disambiguate

updated 5.6 years ago • joannew

Tx, dropInfReps=TRUE, ignoreTxVersion = TRUE) But the result is weird. It is not counts for each gene, but seems to be counts for all the genes: > txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = Tx, dropInfReps=TRUE...in files with read_tsv 1 2 3 4 transcripts missing from tx2gene: 46826 summarizing abundance summarizing counts summariz…

tximport kallisto

updated 2.7 years ago • Yijing

To whom it may concern, I've been having some problems with consistency in my limma results for genes that are found to have significant differential transcript abundance. In a given example, I may have 4 different groups (a, b, c, and d) in an array set of 12. From here, I make a contrast matrix that has contrasts for a-b, a-c, and a-d. Eventually, I o…

limma

updated 15.7 years ago • Joseph Skaf

I was following the tutorials of WGCNA with a new dataset and I am getting an error about `sizeRestrictedClusterMerge` where the replacement is not of the same length as the values to replace: ```r > net <- blockwiseModules(biggermatrix, power = 7, + TOMType = "signed", + networkType = "unsigned", + minModul…

wgcna

updated 6.7 years ago • Lluís Revilla Sancho

written a script which reads in Affymetrix data, filters based on intensity values and then pulls sequence data for the probes which satisfy the intensity based filter. I notice that results given by the script are different...from the sequence data files which Affymetrix supplies! Here are top three seq. from my result and below these are the sequences for...the same probe ID's from Affymetrix …

probe

updated 21.0 years ago • hrishikesh deshmukh

Dear all, Is it possible to use a string as an environment name in mget? I was trying to do this with the following script, but it says that the second argument for mget is not an environment...Well, I know it isn't, since it was generated using paste function, so its class is character. > genes<-c("1007_s_at", "1053_at", "117_at", "121_at") > c…

updated 19.1 years ago • Jarno Tuimala

I want to obtain all human homologous genes from an Ensembl.Gene.ID vector of all Mouse genes. The df I obtain is missing many entries. Also if I use gene names instead... [6] HGNC.symbol <0 rows> (or 0-length row.names) getLDS(attributes, attributesL = attribut…

biomart

updated 9.2 years ago • alessandro.pastore

Recently I have been working on data analysis concerning RNA capture followed by high throughput sequencing. Four samples, 2 cases and 2 controls, were used for sequencing. My library preparation protocol is similar to workflow...synthesized by Agilent. After mapping, I want to do DEG analysis utilizing DESeq, and I found the gene number would affect the results given by DESeq. So my question i…

Sequencing DESeq

updated 12.4 years ago • 邵建明