Question: Problems with GOseq
2.2 years ago by
webquelzinhablue10 wrote:

I am writing about some warnings that appeared when I ran the nullp command. It seems GOseq cannot find the gene lengths for my data ('hg38', 'ensGene') in genLenDataBase. I installed the TxDb.Hsapiens.UCSC.hg38.knownGene package, which seemed to help a little. However, some warnings still appear (see below). Could you please help me solve these problems? I did not use all the differentially expressed genes. Instead, for the analysis, I used a list of the DEGs of interest plus all non-DEGs.

> pwf=nullp(genes,'hg38','ensGene',, = TRUE)
Can't find hg38/ensGene length data in genLenDataBase...
Found the annotaion package, TxDb.Hsapiens.UCSC.hg38.knownGene
Trying to get the gene lengths from it.
Warning messages:
1: In library() :
  libraries ‘/usr/local/lib/R/site-library’, ‘/usr/lib/R/site-library’ contain no packages
2: In getlength(names(DEgenes), genome, id) :
  More than 40% of gene names specified did not match the gene names for genome hg38 and ID ensGene.  No length data will be available for these genes.
	Gene names which failed to match were: ENSG00000002079, ENSG00000018607, ENSG00000020219, ENSG00000067601, ENSG00000078319, ENSG00000083622, ENSG00000088340, ENSG00000093100, ENSG00000101278, ENSG00000101898
	Required gene names are: ENSG00000000003, ENSG00000000005, ENSG00000000419, ENSG00000000457, ENSG00000000460, ENSG00000000938, ENSG00000000971, ENSG00000001036, ENSG00000001084, ENSG00000001167
3: In pcls(G) : initial point very close to some inequality constraints

Thank you!

2.2 years ago by
Nadia Davidson270 wrote:


This is probably happening because there are fewer "knownGene" entries than "ensGene" entries. How does your pwf plot look? You may still have enough gene lengths for the bias weighting. Otherwise, I would suggest using all of your genes (you could try treating the DEGs that are not of interest as non-DEGs). You could also pull the gene lengths out of the count tables if you used featureCounts or a similar tool to generate them. You can then supply the gene lengths to nullp through its bias.data argument.
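
To make the featureCounts suggestion concrete, here is a minimal sketch, not a definitive recipe. It assumes the counts file is standard featureCounts output (a leading '#' comment line, then a tab-delimited table with Geneid and Length columns) and that the Geneid values are in the same format as the names of your genes vector. The helper names read_fc_lengths and fit_pwf and the file name counts.txt are made up for illustration.

```r
# Read gene lengths from a featureCounts table (assumed layout: first line is
# a '#' comment, then a header row containing 'Geneid' and 'Length').
read_fc_lengths <- function(path) {
  fc <- read.delim(path, comment.char = "#", stringsAsFactors = FALSE)
  # Named numeric vector: names are gene IDs, values are gene lengths
  setNames(fc$Length, fc$Geneid)
}

# Fit the PWF with user-supplied lengths instead of the goseq database lookup.
# 'genes' is the named 0/1 vector from the question (1 = DEG of interest).
fit_pwf <- function(genes, gene_lengths) {
  # Keep only genes that have a length, in the order of 'genes'
  common <- intersect(names(genes), names(gene_lengths))
  goseq::nullp(genes[common], "hg38", "ensGene",
               bias.data = gene_lengths[common], plot.fit = TRUE)
}

# Usage (hypothetical file name):
# pwf <- fit_pwf(genes, read_fc_lengths("counts.txt"))
```

Note that restricting to the genes with a known length changes the background set slightly, so it is worth checking how many genes are dropped before trusting the fit.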


