Problems with GOseq
@webquelzinhablue-11485

I am writing because of some warnings that appeared when running the nullp command. It seems GOseq cannot find the gene lengths for my data ('hg38', 'ensGene') in genLenDataBase. I installed the TxDb.Hsapiens.UCSC.hg38.knownGene package and it seemed to help a little. However, some errors still appear (see below). Could you please help me solve this problem? I did not use all the differentially expressed genes. Instead, for the analysis, I used a list of the DEGs of interest plus all non-DEGs.

> pwf=nullp(genes,'hg38','ensGene', bias.data=NULL, plot.fit = TRUE)
Can't find hg38/ensGene length data in genLenDataBase...
Found the annotaion package, TxDb.Hsapiens.UCSC.hg38.knownGene
Trying to get the gene lengths from it.
Warning messages:
1: In library() :
  libraries ‘/usr/local/lib/R/site-library’, ‘/usr/lib/R/site-library’ contain no packages
2: In getlength(names(DEgenes), genome, id) :
  More than 40% of gene names specified did not match the gene names for genome hg38 and ID ensGene.  No length data will be available for these genes.
	Gene names which failed to match were: ENSG00000002079, ENSG00000018607, ENSG00000020219, ENSG00000067601, ENSG00000078319, ENSG00000083622, ENSG00000088340, ENSG00000093100, ENSG00000101278, ENSG00000101898
	Required gene names are: ENSG00000000003, ENSG00000000005, ENSG00000000419, ENSG00000000457, ENSG00000000460, ENSG00000000938, ENSG00000000971, ENSG00000001036, ENSG00000001084, ENSG00000001167
3: In pcls(G) : initial point very close to some inequality constraints

Thank you!

@nadia-davidson-5739

Hi,

This is probably happening because there are fewer "knownGene"s than "ensGene"s. How does your pwf plot look? You may still have enough gene lengths for the bias weighting. Otherwise, I would suggest that you use all your genes (you could try treating the DEGs not of interest as non-DEGs). You could also pull the gene lengths out of the count tables if you used featureCounts etc. to generate them. You can then supply the gene lengths to nullp through its bias.data argument.
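
A rough sketch of that last option, in case it helps. It assumes the count table has the usual featureCounts `Geneid` and `Length` columns. The file name `counts.txt`, the toy lengths, and the toy `genes` vector are made up for illustration, so swap in your own data.

```r
# Toy stand-in for a featureCounts table. With real data you would read it in,
# e.g.  fc <- read.delim("counts.txt", comment.char = "#")  # hypothetical file name
fc <- data.frame(
  Geneid = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
  Length = c(4535, 1610, 1207),  # made-up lengths, for illustration only
  stringsAsFactors = FALSE
)

# Named 0/1 vector as passed to nullp: 1 = DEG of interest, 0 = everything else
# (toy values; yours would cover all assayed genes)
genes <- c(ENSG00000000419 = 1, ENSG00000000003 = 0, ENSG00000000005 = 0)

# Put the lengths in the same order as `genes`; IDs missing from the table give NA
gene_lengths <- fc$Length[match(names(genes), fc$Geneid)]
names(gene_lengths) <- names(genes)

# Fraction of genes for which a length was found
mean(!is.na(gene_lengths))

# With a real, genome-wide vector, hand the lengths to nullp via bias.data
# (not run here because the toy vector is far too small to fit the PWF):
# library(goseq)
# pwf <- nullp(genes, 'hg38', 'ensGene', bias.data = gene_lengths, plot.fit = TRUE)
```

The `match()` step matters because bias.data has to line up element by element with the gene vector; the order of rows in the count table is not guaranteed to be the same.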

Cheers,

Nadia.
