'GO_analyse' function error

Hello,

I'm running DE analyses on RNA-Seq data of 4 leukemia patients. Using the 'convert' package and the 'as' function, I convert the data to 'ExpressionSet' format. Then I run the 'GOexpress' package, but when I reach the 'GO_analyse' function I get the following error:

"Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate"

You can see the code I'm running below:

library("DESeq2")

list.files(directory)

sampleFiles<- c("N1.counts","N3.counts",
"L2.counts","L3.counts")

sampleNames <- c("Normal1","Normal3","Leukemia2","Leukemia3")
sampleCondition <- c("control","control","treated","treated")

sampleTable <- data.frame(sampleName = sampleNames,
fileName = sampleFiles,
condition = sampleCondition)

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
design = ~ condition)

library(GOexpress)
data(AlvMac)
AlvMac
library("Biobase")
library(convert)
z=as(ddsHTSeq,'ExpressionSet')
z
AlvMac=z


Whether or not I run AlvMac=z makes no difference to the error. I also ran the code with that line commented out (#AlvMac=z) and got the same error.

head(pData(z))
is.factor(z$condition)
z$condition
z$condition <- factor(z$condition)
AlvMac_results <- GO_analyse(eSet = z,
f = "condition",biomart_dataset= "hsapiens_gene_ensembl")

Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate


Any idea to help me understand where the problem is or how to solve it would be appreciated.

Thanks, Reza

goexpress ExpressionSet GO_analyse • 321 views

Hi,

Your code is very difficult to read without appropriate formatting. Please use the online editor to format text and code clearly, which helps me help you. See below for a rapid cleanup and comments. Note that you can use reprex::reprex({...}) to produce code and output that you can copy and paste.

Also, I appreciate that you sent me the sample files directly by email, but it would be more helpful for everyone (you, me, other readers) if you could use a publicly available example (e.g. AlvMac) to demonstrate the error. Using publicly available data avoids any privacy issue, and allows other people to run the code that you provide to reproduce the error.

Below, using local paths specific to your machine makes the example irreproducible. There is no need for this. When you prepare your code, just create a new directory, move there in your own R session, and give us code that works within that directory.

library("DESeq2")
directory <- "E:/Data/New Protocol/New folder"
setwd(directory)
list.files(directory) # output?


Clear code, but it would be great if it used public files or data.

sampleFiles<- c("N1.counts","N3.counts", "L2.counts","L3.counts")
sampleNames <- c("Normal1","Normal3","Leukemia2","Leukemia3")
sampleCondition <- c("control","control","treated","treated")
sampleTable <- data.frame(sampleName = sampleNames, fileName = sampleFiles, condition = sampleCondition)
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, design = ~ condition)


This code is useless in your post if you don't use AlvMac to illustrate your problem.

library(GOexpress)
data(AlvMac)
AlvMac


Good code. Thanks. If you used public data, it would be perfect: as it stands, no one knows what z looks like.

library("Biobase")
library(convert)
z=as(ddsHTSeq,'ExpressionSet')
z # output?


Why do you keep switching between AlvMac and z? In your GO_analyse example below, the only thing that seems to matter is z. But again, if you can show me that you can reproduce your error using the publicly available AlvMac, then it would be a lot more helpful for me and others to debug your situation.

AlvMac=z
is.factor(z$condition)
z$condition
z$condition <- factor(z$condition)
AlvMac # output?


This is the key command that produces your error.

results <- GOanalyse(eSet = z, f = "condition",biomartdataset= "hsapiensgene_ensembl")


For clarity, it would be better to show the error at this point.

Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate
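
For readers unfamiliar with this message: it comes from base R's aggregate(), which stops when the data frame it receives has no rows left. The toy sketch below uses made-up data and base R only (it is not GOexpress internals) to reproduce the message. In the GO_analyse context it would be consistent with, for example, no genes being matched to GO annotations, but that is an assumption rather than a diagnosis.

```r
# Toy data (made up for illustration): every score is missing, so the
# formula interface drops all rows (default na.action = na.omit)
# before aggregating.
toy <- data.frame(
  go_id = c("GO:0005515", "GO:0006661"),
  score = c(NA_real_, NA_real_)
)

# Capture the error message instead of stopping the session.
msg <- tryCatch(
  {
    aggregate(score ~ go_id, data = toy, FUN = mean)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If msg shows the same text as above, the thing to check in a real analysis is whether the table being aggregated is empty, and why.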

kevin.rue ▴ 300
@kevinrue-6757
Last seen 5 months ago
University of Oxford

results <- GOanalyse(eSet = z, f = "condition",biomartdataset= "hsapiensgene_ensembl")


There are two errors in this command:

• biomartdataset should be biomart_dataset
• hsapiensgene_ensembl should be hsapiens_gene_ensembl
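
One reason a misspelled argument name can go unnoticed is that the GO_analyse signature in the help page ends with `...`, so R does not necessarily complain about a name it does not recognise. The base-R sketch below uses a hypothetical stand-in function, demo_analyse (not part of GOexpress), to show how a misspelled name is absorbed by `...` while the real argument silently keeps its default.

```r
# Hypothetical stand-in with the same kind of signature as GO_analyse;
# it only reports which arguments it actually received.
demo_analyse <- function(eSet, f, biomart_dataset = "", ...) {
  extra <- list(...)
  list(biomart_dataset = biomart_dataset, ignored = names(extra))
}

# Misspelled argument name: it lands in `...`, not in biomart_dataset.
typo <- demo_analyse(eSet = NULL, f = "condition",
                     biomartdataset = "hsapiensgene_ensembl")
typo$biomart_dataset  # still the default, an empty string
typo$ignored          # the misspelled name ended up here

# Correct spelling: the value reaches the intended argument.
ok <- demo_analyse(eSet = NULL, f = "condition",
                   biomart_dataset = "hsapiens_gene_ensembl")
ok$biomart_dataset
ok$ignored
```

What the real function does with the extra arguments downstream is not shown here. The sketch only illustrates why the typo itself produces no "unused argument" error.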

Recommendations:

• Please read the help pages in the GOexpress package carefully. Here is the current signature of the GO_analyse function:
GO_analyse(
eSet, f, subset=NULL, biomart_name = "ENSEMBL_MART_ENSEMBL",
biomart_dataset="", microarray="",
method="randomForest", rank.by="rank", do.trace=100, ntree=1000,
mtry=ceiling(2*sqrt(nrow(eSet))), GO_genes=NULL, all_GO=NULL,
all_genes=NULL, FUN.GO=mean, ...)

• See the following reprex, which illustrates how to find valid BioMart data set names.
library(biomaRt)
#> Warning: package 'biomaRt' was built under R version 3.6.1
# listMarts()
mart = useMart("ENSEMBL_MART_ENSEMBL")

subset(listDatasets(mart = mart), grepl("sapiens", dataset))
#>                  dataset              description    version
#> 76 hsapiens_gene_ensembl Human genes (GRCh38.p12) GRCh38.p12

• I do not recommend using biomart_dataset as it will download annotations from BioMart every time you run GO_analyse. The recommended way to use the GO_analyse function is to use the GO_genes argument for providing the table mapping gene identifiers to gene ontology identifiers. See data(AlvMac_GOgenes) for an example. You can download the mapping table yourself from BioMart once, save it to your computer and then load it directly from your computer instead of downloading it again.
> head(AlvMac_GOgenes)
             gene_id      go_id
1 ENSBTAG00000020495 GO:0005515
2 ENSBTAG00000020495 GO:0006661
3 ENSBTAG00000020495 GO:1900027
4 ENSBTAG00000020495 GO:0032587
5 ENSBTAG00000020495 GO:0019902
6 ENSBTAG00000020495 GO:0035091
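
The "download once, then load locally" advice could be sketched as follows. The helper name load_go_genes and the cache file name are hypothetical. The biomaRt calls (useMart, getBM) and the attribute names "ensembl_gene_id" and "go_id" are the usual Ensembl ones, but please check them against your biomaRt version. The gene_id and go_id column names follow the AlvMac_GOgenes example above.

```r
# Hypothetical helper: query BioMart only if no local copy exists yet,
# otherwise read the saved mapping table from disk.
load_go_genes <- function(cache_file, dataset = "hsapiens_gene_ensembl") {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))  # no BioMart query needed
  }
  mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", dataset = dataset)
  go_genes <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "go_id"),
    mart = mart
  )
  # Use the column names shown in AlvMac_GOgenes.
  colnames(go_genes) <- c("gene_id", "go_id")
  # Drop genes without any GO annotation.
  keep <- !is.na(go_genes$go_id) & go_genes$go_id != ""
  go_genes <- go_genes[keep, , drop = FALSE]
  saveRDS(go_genes, cache_file)
  go_genes
}

# Intended use (commented out: it needs GOexpress, network access and your
# own ExpressionSet z):
# GO_genes <- load_go_genes("GO_genes_hsapiens.rds")
# results <- GO_analyse(eSet = z, f = "condition", GO_genes = GO_genes)
```

The all_GO and all_genes arguments in the GO_analyse signature can be prepared and cached in the same way. See the help page for the format they expect.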


Created on 2019-09-05 by the reprex package (v0.3.0)