Bioconductor Forum

using the makeTranscriptDbFromGFF() command. I can extract all of the transcripts by gene name. I get a GRanges list with each gene and then 2 columns of metadata including the transcripts ID and transcript names...associated with each gene. However, when I try to pull this information out with mcols, it is empty. I've included the code and output below. Am I doing...something wrong? Th…

genomicfeatures

updated 10.9 years ago • Jake

Dear all, Is there any well-established routine for quality assessment and preprocessing of array CGH data, especially tiling array-based CGH data? I found that most quality assessment of array data is about expression arrays, while little relates to array CGH data. We are using the Agilent 244k CGH array for rat, and now we have the text files produced by…

CGH probe

updated 17.2 years ago • Leon Yee

you'd expect from a call to unique() on a DNAStringSet, i.e., what is your use case? unique() on a named character vector drops names: chr <- c(a="A", c="C", aa="A", c="CC") > unique(chr) [1] "A" "C" "CC" Same for a named list: lst <- list(a="A", c="C", aa="A", c="CC") > unique...A" [[2]] [1] "C" [[3]] [1] "CC" unique() on a DNAStringSet was patterned after th…

updated 13.2 years ago • Valerie Obenchain
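
The base R behaviour described in this reply can be checked with a short sketch. It uses only base R and the example vector from the post; the `!duplicated()` alternative is a suggestion added here, not something the reply itself proposes.

```r
# Base R only: unique() discards names on a named character vector.
chr <- c(a = "A", c = "C", aa = "A", c = "CC")

deduped <- unique(chr)
names(deduped)            # NULL: the names are gone

# Subsetting with !duplicated() keeps the name of each first occurrence.
kept <- chr[!duplicated(chr)]
names(kept)               # "a" "c" "c"
```

If the names matter for the use case, subsetting with `!duplicated()` is one way to keep them.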

Dear Bioconductor users, I have RefSeq gene descriptions and also UniProt identifiers. How can I map these onto KO numbers (KEGG Orthology), e.g. K03120?

updated 15.8 years ago • Alla Bulashevska

I would like to know how to evaluate a character string as a variable name in R. Specifically, I need to "compute" the variable name in the phenoData slot of an ExpressionSet object. > varLabels(sample.ExpressionSet...variable in any given ExpressionSet object. Clearly I cannot do something like for (i in c(1:length(varLabels(sample.ExpressionSet))) { class_l…

updated 16.8 years ago • Kavitha Venkatesan
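
One common way to use a computed column name in R is `[[` indexing rather than `$`. The sketch below is only an illustration: it uses a plain data frame with made-up columns as a stand-in for the phenotype table, on the assumption that the data frame returned by `pData()` for an ExpressionSet can be indexed the same way.

```r
# Stand-in for a phenotype table; the column names are hypothetical.
pheno <- data.frame(sex = c("F", "M", "F"), score = c(0.1, 0.7, 0.4))

# `[[` accepts a character string, so the name can be computed at run time.
classes <- vapply(
  colnames(pheno),
  function(label) class(pheno[[label]])[1],
  character(1)
)

# pheno[["score"]] returns the same column as pheno$score.
same <- identical(pheno[["score"]], pheno$score)
```

This avoids building and evaluating code strings with `eval(parse(...))`.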

1,22)),"X","Y","MT") filtered<-.getGenes(chrs) chronly<-lapply(chrs, function(x){ length(.getGenes(x)$ensembl_gene_id) }) names(chronly)<-chrs allgenes<-table(unfiltered$chromosome_name) missinggenes<...9 1892 of individually 2303 all 2303" [10] "Missing genes on chr 10 0 of individually 2216 all 2216" [11] "Missing genes o…

biomaRt

updated 11.9 years ago • Guest User

files/authenticated%20user/0/gdc-client_v1.3.0_Windows_x64.zip' Content type 'application/zip' length 16576709 bytes (15.8 MB) downloaded 15.8 MB Error in unzip(basename(bin)) : invalid zip name argument In addition: Warning...message: In if (grepl("^https?://", url)) { : the condition has length > 1 and only the first element will be used Can anyone help…

tcgabiolinks gdcdownload unzip

updated 8.1 years ago • fawazfebin

like to know 2 things: -If you know if there is any code that I can use in order to obtain the gene name or description in the output file instead of the ensembl transcript ID -If there is any code to obtain more columns

deseq2 gene columns output

updated 10.2 years ago • amyfm

I have RNA-Seq data from a prokaryotic non-model organism (*Microbacterium*) and I am doing a gene set enrichment analysis. I mapped my amino acid sequences to KO annotations first. I then managed to do the gene set enrichment...and thus listed in https://www.genome.jp/kegg/catalog/org_list.html. However, I don't have the gene mapping between the reference and my sequences. For example, when…

clusterProfiler KEGG GeneSetEnrichment

updated 4.2 years ago • makrez

Just running through cummeRbund and it's fine, except when I come to pinpoint particular genes and extract the significant ones. In the gene_exp.diff and other files prior to loading into cummeRbund, the information...the data `` (cuff_data <- readCufflinks('CD-LD_out')) ``, the gene_id has been replaced with the other annotation. W…

cummerbund

updated 10.7 years ago • daniel.antony.pass

One usage of ssGSEA is to evaluate the abundance of multiple gene-sets in the same sample to figure out which gene set is more abundant. For those purposes would it

GSVA

updated 4.5 years ago • k.vitting.seerup

human reference genome version 38. I wrote an R script to access certain values for that data (FASTA sequences for the protein-coding genes). However it appears that biomaRt does not have access to an archived version of that...ensembl = useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl", version = "87") sequences = getSequence(id = data[2:length(data)], type='ensembl_transcri…

biomaRt ensembldb RNA ensembl

updated 2.8 years ago • Christopher

> I ran my R script again, aimed at extracting 3UTR sequences of validated > gene-targets. > Back to "hsa-mir-1" gene-targets ... I performed the following > verifications and testsS...> [1] TRUE > > is.vector(genes_map[,"ensembl_transcript_id"]) > [1] TRUE > > length(genes_map[,"ensembl_transcript_id"]) > [1] 1941 > >…

biomaRt

updated 15.6 years ago • mauede@alice.it

the BSgenome object. The initially reported error Calculating genomic coordinates...Error in vector(length = supersize_chr[length( chromosomes)], mode = "character") : vector size cannot be NA/NaN occurs because MEDIPS is trying to...Phytozome (JGI) > | provider version: 3.0 > | release date: January 2010 > | release name: Populus trichocarpa v3.0 > | > | sin…

GO BSgenome Rsamtools genomes MEDIPS

updated 11.8 years ago • Vining, Kelly

The GenomicFeatures library allows gene models to be stored in an SQLite database. Here is the full SQLite schema CREATE TABLE chrominfo ( _chrom_id INTEGER PRIMARY...library(GenomicFeatures) library(Rsamtools) library(Biostrings) # Build SQLite gene model database gff <- GenomicFeatures::makeTxDbFromGFF('example.gff') # Build index…

GenomicFeatures phase gene models GFF

updated 8.3 years ago • arendsee

Hi, I am trying to examine the change in abundance of OTUs along an elevational gradient. I have four 100m intervals of the gradient, with two sites per 100m interval

metagenomeseq

updated 9.9 years ago • craemuletz

of the enrichPathway function. I know that the default input requires that all genes should be 'translated' to ENTREZID coding. I have my data in UNIPROT code, and when I make the translation, almost 20% of the...a similar amount of identity loss. That is why I would like to make my enrichment analysis using Uniprot identities directly. I would like to make the enri…

reactomepa reactome.db reactome proteomics uniprot accessions

updated 8.2 years ago • Miguel.Cosenza

Hi there! I'm writing some code to extract the [Accumulated Natural Vectors](https://doi.org/10.3389/fgene.2019.00234) from all the sequences in a DNAStringSet object. To speed things...works as long as I convert each of the DNAString objects to character vectors first. For larger sequences this conversion is a bottleneck, and I was wondering if I can avoid it and pass the DNAString objec…

Biostrings DNAString Rcpp

updated 3.8 years ago • James

To make any plots of quality score or nucleotide frequency one should only look at reads of the same length. One output from PacBio filtering of the raw data is a FASTQ file. The nature of the PacBio reads is that they lack uniformity...FALSE, FALSE, TRUE)]) SeqClean <- ShortReadQ(sread=seqs, quality = qual, id = ids) #This worked!!: length(SeqClean) #484232 summary(width(SeqClean)) Mi…

Preprocessing ShortRead

updated 12.9 years ago • Noah Dowell

exonsBy(hgTxDb, 'tx') That the object returned (eTx in this case) doesn't contain the transcript names. But if we use e <- exonsBy(hgTxDb, 'gene') The ensembl gene names are returned as the names of GRangesList "e". I'm wondering if...there's a convenient way of annotating the eTx object with the transcript names also? Thanks, Paul. -- Paul Geeleher (PhD Student) School of Mathematic…

updated 15.0 years ago • Paul Geeleher

Hi all, I'm trying to combine two MALists, named RGAmodel and RGBmodel, using rbind and am running into the following problem: > rbind(RGAmodel, RGBmodel) Error in match.names...clabs, names(xi)) : names don't match previous names: Name In addition: Warning message: longer object length is not a multiple of shorter...object length in: clabs == …

updated 19.6 years ago • Matthew Scholz

of CCL2 inhibition accelerates breast cancer metastasis by promoting angiogenesis. Bonapace L, et al., Nature 2014. 2. PIK3CAH1047R induces multipotency and multi-lineage mammary tumors. Koren S, et al., Nature 2015. 3. Hippo kinases...LATS1/2 control human breast cell fate via crosstalk with ERα. Britschgi A, et al., Nature 2017. 4. Glucocorticoids promote breast cancer metastasis. Obradov…

JobPosting Basel Switzerland

updated 5.0 years ago • Robert Ivanek

to load it from a file, but I've noticed the number > of found gRNAs is always from the last sequence in the input fasta. > > In short the function you have is: > > findgRNAs <- function(inputFilePath, .......){ > > subjects...<- readDNAStringSet(inputFilePath, format > > > for(i in 1:length(subjects){ > #does s…

updated 11.5 years ago • Julie Zhu

MAE: > data A MultiAssayExperiment object of 4 listed experiments with user-defined names and respective classes. Containing an ExperimentList class object of length 4: [1] BRCA_miRNASeqGene-20160128: SummarizedExperiment...107__): > wideFormat(data[1:20, 1:108, ]) Error in dimnames(x) <- dn : length of 'dimnames' […

multiassayexperiment wide format TCGA

updated 7.3 years ago • mario.zanfardino

marts for release 76 are now live on www.ensembl.org. You can change your host to access our most recent data: mart <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="www.ensembl.org", path="/biomart/martservice") If you...biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice") Ensembl Genes 76 Updated human assembly to GRCh38 Retirement of t…

Transcription zebrafish

updated 11.4 years ago • Thomas Maurel

Hi, Why does Rkeys give all of the gene symbols, not just the first 6? > head(names(geneTranscripts)) # Entrez IDs. [1] "1" "10" "100" "1000" "10000" "100008586" > length(org.Hs.egSYMBOL...head(names(geneTranscripts))]) [1] 6 > length(Rkeys(org.Hs.egSYMBOL[head(names(geneTranscripts))])) [1] 42075 GenomicFeatures_1.…

updated 13.2 years ago • Dario Strbenac

name" but I'm already using the appropriate data frame to map them to the gene ids. However, I get a `rowsum` error, saying something...removing duplicated transcript rows from tx2gene transcripts missing from tx2gene: 92 summarizing abundance summarizing counts summarizing length Error in rowsum.default(x[sub.idx, , drop = FALSE], geneId) : 'x' must be numeric...specification ─────────…

tximport

updated 4.9 years ago • Ezgi

late time points, each with the same two treatments (A & B). The early time point was sequenced recently on a newer platform with much deeper coverage than the mid and late time points, which were sequenced ~10...time point data (older technology). I converted counts to CPM and determined the 10th percentile of genes detected in at least 50% of samples from these two time points. I then a…

DESeq2 Normalization

updated 6 months ago • Mara

data alone is as follows: Error in `rownames<-`(`*tmp*`, value = list(PROBE_ID = c("ILMN_2896528", : length of 'dimnames' [1] not equal to array extent When reading in the control data I get the following error: Error in readGenericHeader...There aren't any #'s in the file, but I'm curious whether read.ilmn handles these cases. The sample names do have '-' in the name, but I tried removin…

Microarray

updated 13.2 years ago • Mark Ebbert

Hi, When one has generated counts using featureCounts, it generates a list of transcripts and not genes. I understand the protocol as saying that "gene" counts should be used; however, how are gene counts defined? Can I use the transcript-count…

deseq2

updated 7.2 years ago • SannaG

the best path for a particular position in the score matrix # nucA: (character) nucleotide in sequence A # nucB: (character) nucleotide in sequence B # row: (numeric) row-wise position in the matrix # col: (numeric) column-wise position...in seqA and seqB should be added, # if the path is up or left, there is a gap in one of the sequences. # nucA: (character) nucleot…

SequenceMatching Rstudio sequencing Alignment

updated 3.2 years ago • chenxinyi

I'm working with a large data-set with multiple treatment time points and genotypes. To increase my sample size for one of the time points, I'd like to add in two samples that were collected and sequenced in a different run. Alignment methods are also different (STAR vs Illumina DRAGEN). I’ve merged the counts table (I had…

DESeq2

updated 2.8 years ago • Zainab

uses the TP53Genome() but not real rRNA data.) For example, do you provide a set of rRNA transcript sequences (e.g. extracted from a TxDb object or a GTF file with gene annotations) and build an index from those sequences? Or is...there an even better source for rRNA sequences, given that not all of the rRNA sequences may be annotated in the current reference genome? Thanks a lot for any pointe…

HTSeqGenie

updated 9.5 years ago • Thomas Sandmann

Ensembl Genes 96" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_ENSEMBL" path="/biomart/martservice" port="80...database="sequence_mart_96" default="" displayName="Sequence" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_SEQUENCE" path="/biomart/martservice" port...default="" displayName="Genomic features 96" host="www.ensembl.org" includeDatas…

biomaRt

updated 4 months ago • jcollidelcs

all. > > I am working with large data frames in R that contain a mix of numbers and variable-length strings. I've tried using the rhdf5 package to write and then read these and I haven't been able to figure out how to correctly...return(rndString)} > library(rhdf5) > n <- 1000000 > d <- data.frame(id=seq(n),name=rndString(n),val=rnorm(n),stringsAs…

updated 12.2 years ago • Bernd Fischer

might be able to help me out with. I'll be grateful. Affymetrix describes their 'GeneChip Human Gene 1.0 ST Array': Each of the 28,869 genes is represented on the array by approximately 26 probes spread across the full length...of the gene, providing a more complete and more accurate picture of gene expression than 3' based expression array designs. ... The Gene...Exon 1.0 ST Array and covers …

Annotation affy

updated 15.4 years ago • Paul Shannon

in the software, an error on my part, and most importantly if it compromises the analysis in the rest of the protocol. I am pasting in the script and then giving the...<- paste0(sra.numbers, ".fastq") by.group <- split(all.fastq, grouping) for (group in names(by.group)) { code <- system(paste(c("cat", by.group[[group]], ">", paste0(group, ".fastq")), collapse…

chipseq csaw; crosscorrelation

updated 4.7 years ago • richardallenfriedmanbrooklyn

Hi, I have GTF files from spliceR which tell me which transcripts of certain genes are differentially expressed between different conditions. However, I am unable to find what the ids assigned to the...transcripts (TCONS) correspond to. Is there a way to extract the transcript sequences corresponding to these ids using spliceR or any other tool? Thank you

splicer

updated 8.0 years ago • aditi

Hi All, I am interested in identifying bacterial genes that are differentially expressed in vivo in comparison to the in vitro bacterial culture. However, I encountered a challenge regarding the sequencing depth of my samples.…

HugevariationsinsequencingDepth DESeq2

updated 2.6 years ago • Gopinath

I have noticed that after I run tximeta's `summarizeToGene` function, I get the following warning: Warning message: call dbDisconnect() when finished working with a connection If I execute the same command the second time, I do not get a warning. What does the warning mean? Will it affect my analysis? Thank you! ```r library(DESeq2) library(tximeta) # import data se = tximeta(colda…

tximeta

updated 4.6 years ago • Nikolay Ivanov

I import data from a JSON file which contains sequence names, DNA sequences, and other things. How do I make a multiple sequence alignment and extract positions that have SNPs (flanking

msa Biostrings

updated 5.7 years ago • Vang Le

this basic question.... Say I have three nodes, A, B and C. How can I plot a graph in which the lengths of the edges between the nodes are user-defined? Say between A and B the length should be 1, between A and C this should be...2, and between B and C this should be 0.5? I found that the argument "len" is able to set the length of the edges, http://www.graphviz.org/pub/scm/graphviz2/doc/info/attr…

graph Rgraphviz

updated 15.6 years ago • Guido Hooiveld

r #### CHECK IF HK ARE ASSOCIATED WITH PRIMARY PHENO hk_raw = raw[cIdx,] pval = vector(length = nrow(hk_raw)) for (i in 1:nrow(hk_raw)){ reg = glm.nb(as.numeric(hk_raw[i,]) ~ as.factor(pData$Group)) pval[i] = coef(summary(reg))[2,4] } ``` This...values under .05, which points to an association between expression of these so called housekeeping genes and the phenotype. Foll…

NanoNormIter DifferentialExpression Housekeeping DESeq2 NanoString

updated 4.0 years ago • argonvibio

<- names(tab)[tab == 1] if (verbose) message(paste0(length(oneIsoGenes), " single-isoform genes found.")) ## Get transcript names for genes with...1" ## Subset to medium-to-high expressed genes if (verbose) message("Subsetting to medium-to-highly expressed genes...") txpsFit <- sort(txps[names(ebtFit)]) …

Alpine jcc

updated 6.4 years ago • bioinagesh

geneSel = topDiffGenes, nodeSize = 10, annot = annFUN.db, affyLib = affyLib) Building most specific GOs ..... ( 1670 GO terms found. ) Build GO DAG topology .......... Error in if (node == GENE.ONTO.ROOT) return(2) : argument is of length

GO.db topGO

updated 5.2 years ago • fklirono

with between group analysis (BGA) from the made4 package and I have a problem with some of the affy names: I tried bga.suppl(training.set[list,], supdata=test.set[list,], classvec=cl_train, supvec=cl_test) 'training.set' and 'test.set...202880_s_at" "1565905_at" "236347_at" ... I tried bga with different lists, and in most cases, this works fine. However, sometimes the following erro…

Classification affy made4

updated 19.1 years ago • Heike Pospisil

IRanges_1.0.9 > > > > Error in solveUserSEW(length(x), start = start, end = end, > width = > width) : > solving row 1: 'allow.nonnarrowing' is FALSE and the supplied > start (170899994...unmasked(x), start = start, end = end, width > = width) : > Inva…

Annotation Cancer BSgenome PROcess GLAD IRanges

updated 17.0 years ago • Hervé Pagès

I am getting an error while mapping the gene names to the StringDB identifiers using the map function. The code was working fine before with the StringDB version...species=9606,score_threshold=700, input_directory="") geneMapped <- string_db$map(data.frame(gene=geneList), "gene", removeUnmappedRows = TRUE) Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected s…

PPI STRINGdb

updated 4.4 years ago • Ashish Jain

20 Sal_Sed 22 Sal_Sed 43 Sal_Sed 15 Sed 21 Sed 27 Sed ... ``` and these are my result names: ``` > resultsNames(dds2) [1] "Intercept" "group_F_vs_control" "group_F_Sal_vs_control" [4] "group_F_Sal_Sed_vs_control...group_Sed_vs_control" ``` I understood that for my single effects (e.g. gene expression due to sediment addition), I just have to co…

Interactions design Multifactorial DESeq2

updated 4.2 years ago • Marie

measuring cellular abundance of mRNA in Condition: "Control" and "Treated" (different group of animals than dataset 1) Samples in both the datasets...24hr cycle at 6 fixed time points in the day at the interval of 4hrs. My goal is to identify genes which are differentially expressed or not expressed in both datasets to be able to comprehend at what post-transcriptional...level gene express…

DESeq2

updated 23 months ago • Kuldeep

__Is there a way to _automatically_ determine which package is preferable, either from the name of a platform (e.g. "[HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array", "Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133...uses 33% less memory than _affy_'s read.affybatch().*** Since this step is the most memory-demanding of a microarray analysis, and a big dataset can easil…

microarray affymetrix microarrays affy oligo

updated 4.9 years ago • stevessheridan

even if they are .csv are tab-delimited 3. they don't have an "effective_length" column only the "length" column, no "TPM" column only "FPKM" column, and the "gene_id" column is a number (question 1 - does the "gene_id" column have to...I don't have the "effective_length" column? 4. Header looks like this: gene_id transcript_id(s) length expected_count FPKM SymbolID Cellular Component Molecul…

rsem tximport

updated 2.1 years ago • apolitics

the examples but these use data that I'm not interested in. To use my own data for chromosome names and lengths and for gene names, positions and strands I gather I need to instantiate a chromLocation object. When I try...a Y axis. Please tell me where I'm going wrong! Thanks, Matthew library(geneplotter) # make a named vector of chromosome lengths: chromLengths <- c(10000,5000) nam…

geneplotter

updated 22.6 years ago • Matthew Hobbs

a specific command. My problem is that I am stuck. What I wish to do is this: I have a list of sequences (42 with a length of approx. 300-500 bp) in a FASTA file. I want to BLAST these in NCBI (blastn) and get a list of top hits corresponding... I could do this BLAST manually (and am at the moment); however, there will be many more sequences for me in the future and it feels so inefficient …

annotation blast biostrings

updated 10.5 years ago • Mathilde

following the instructions in the help-pages (exonmap.pdf) I get to a list of 'significant' exons/genes (sigs) but when interrogating xMap (which we have installed locally with the most recent version of mySQL) the query-functions...that by furnishing "Probeset ID" from Affymetrix gives correct results. However, the function "names" used with "names(fc(pc.exonmap))[.." returns different identifie…

Survival genefilter affy plier affyio exonmap

updated 18.4 years ago • Wolfgang Raffelsberger

When executing the example code below from the user's guide, the error below is returned after the last statement. I've noted posts regarding similar errors and so removed and reinstalled local libraries to no avail. Any guidance would be greatly appreciated. pwd<-"" #INPUT FILES- BedFiles, FASTA, etc. path<- system.file("extdata/Test_100.bed",package="rGADEM") BedFile&…

rgadem

updated 8.9 years ago • christopher.chaney

Hi, I have a dataset listing gene names and gene hits (the number of sequencing reads that mapped to each gene). This data set was generated by 454 sequencing...id. Columns 2-4 are 3 samples. I am interested in Sample1. I would like to find which functional genes in Sample1 are over-represented (i.e., I'm looking for genes which appear more often in Sample1 t…

Sequencing Category

updated 12.3 years ago • Xiaoben Jiang

following options to featureCounts to let users be able to extend reads and also control the overlap length between reads and featureCounts: readExtension5 readExtension3 minReadOverlap The featureCounts help page...of each read symmetrically. I want to keep the 5-prime position of the read the same, but change the length. So if the effective fragment length was set to 150, then a 100-bp read m…

updated 11.7 years ago • Wei Shi

read.table("path/to/file_list.txt", header=FALSE)$V1) #Import a file called samples with the sample names corresponding to each file in the file_list #eg below: #sample1 #sample2 #sample3 #sample4 #look at the data structures...abundance.tsv" samples<-as.character(read.table("/path/to/samples.txt",header=FALSE)$V1) names(files)<-samples #look at the data structures samp…

tximport DESeq2

updated 2.1 years ago • mckinlee.salazar
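
The naming step in the tximport snippet above (`names(files) <- samples`) can be sketched in base R. The sample names and the `quant` directory below are hypothetical, chosen only for illustration; `abundance.tsv` is the file name that appears in the post. The sanity checks are an addition here, not part of the original script.

```r
# Hypothetical sample names and directory layout, for illustration only.
samples <- c("sample1", "sample2", "sample3")
files <- file.path("quant", samples, "abundance.tsv")
names(files) <- samples

# Guard against a mismatch between the two lists before importing.
stopifnot(length(files) == length(samples), !anyDuplicated(names(files)))

# Paths that do not exist on disk; worth inspecting before any import call.
missing <- files[!file.exists(files)]
```

Naming the vector this way means the sample names, rather than the file paths, label the columns downstream.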