@mehmet-ilyas-cosacak-9020
Germany/Dresden/ CRTD - DZNE
danRer10 is missing from "geneLenDataBase". In my analyses I always use the current database releases. danRer6 is available, but it does not work for me.
I am using GOSeq for gene ontology analyses and got the error below:
Can't find danRer10/ensGene length data in genLenDataBase...
Loading required package: rtracklayer
Trying to download from UCSC. This might take a couple of minutes.
Error in getlength(names(DEgenes), genome, id) :
  The gene names specified do not match the gene names for genome danRer10 and ID ensGene.
Gene names given were: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Required gene names are: ENSDART00000145068.2, ENSDART00000010526.9 ...
Hi Nadia,
Thanks for the reply. Yes, you are definitely right, the gene names are wrong.
I am using Ensembl gene IDs (e.g., ENSDARG00000000019, ...) to count the reads mapped to each gene from RNA-Seq data with featureCounts. However, GOSeq asks for transcript IDs, as you can see in the error message. Do you have any suggestions for that?
thanks,
ilyas.
Hi Ilyas,
It should be expecting gene names and not transcript IDs, so we will fix this in the next release of goseq. We're also planning to update geneLenDataBase (finally) with the next release. If you need a solution sooner than that and have the gene lengths in your count table (e.g. as output by featureCounts or other programs), you can pass them to the nullp function through its bias.data parameter.
Cheers,
Nadia.
Hi Nadia,
Thank you very much!
I solved the problem as below for the moment. Might be helpful!
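Ilyas's own code is not reproduced in this thread. What follows is only a minimal sketch of the workaround Nadia describes, namely passing gene lengths to nullp through bias.data. The table `fc`, its `Geneid` and `Length` columns (modelled on featureCounts output), the gene IDs, and the list of DE genes are all made-up stand-ins, so replace them with your own data.

```r
set.seed(1)
n <- 500

# Hypothetical stand-in for a featureCounts table (made-up IDs and lengths).
fc <- data.frame(
  Geneid = sprintf("gene%04d", seq_len(n)),
  Length = sample(300:8000, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical list of genes called differentially expressed.
de_ids <- sample(fc$Geneid, 50)

# Named 0/1 vector: the names must be gene IDs, not row numbers
# (the error above lists "1, 2, 3, ..." as the gene names given).
DEgenes <- as.integer(fc$Geneid %in% de_ids)
names(DEgenes) <- fc$Geneid

# Gene lengths in the same order as DEgenes.
gene_lengths <- fc$Length
stopifnot(length(gene_lengths) == length(DEgenes))

# With bias.data supplied, nullp does not need to look up lengths
# for danRer10 in geneLenDataBase or at UCSC.
if (requireNamespace("goseq", quietly = TRUE)) {
  pwf <- goseq::nullp(DEgenes, bias.data = gene_lengths, plot.fit = FALSE)
  print(head(pwf))
}
```

The resulting `pwf` can then be handed to goseq. For a genome that goseq cannot annotate automatically, you would presumably also have to supply your own gene-to-category mapping through its gene2cat argument.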
Updating GOSeq and geneLenDataBase will make this easier for R beginners.
best,
ilyas.