Problem finding "LocusLinkID" in an annotation file
modarzi ▴ 10


The annotation file for my TCGA data (link below) does not have "LocusLinkID" as a gene attribute. However, as the WGCNA tutorial code below shows, I need "LocusLinkID" to interface the network analysis with other data such as functional annotation and gene ontology:

# Read in the probe annotation
annot = read.csv(file = "GeneAnnotation.csv");
# Match probes in the data set to the probe IDs in the annotation file
probes = names(datExpr)
probes2annot = match(probes, annot$substanceBXH)
# Get the corresponding Locus Link IDs
allLLIDs = annot$LocusLinkID[probes2annot];
# Choose interesting modules
intModules = c("brown", "red", "salmon")
for (module in intModules)
{
  # Select module probes
  modGenes = (moduleColors==module)
  # Get their entrez ID codes
  modLLIDs = allLLIDs[modGenes];
  # Write them into a file
  fileName = paste("LocusLinkIDs-", module, ".txt", sep="");
  write.table(as.data.frame(modLLIDs), file = fileName,
              row.names = FALSE, col.names = FALSE)
}
# As background in the enrichment analysis, we will use all probes in the analysis.
fileName = paste("LocusLinkIDs-all.txt", sep="");
write.table(as.data.frame(allLLIDs), file = fileName,
            row.names = FALSE, col.names = FALSE)

I use "gene_id" instead of "substanceBXH", but I have no idea what to use for "LocusLinkID".

I would appreciate it if anybody could share comments on how to solve this problem.

Best Regards,
Mohammad Darzi


PS: my annotation file can be found at the link below:

wgcna package TCGA LocusLinkID genecode ensembl • 841 views

Although often asked about around here, WGCNA isn't actually a Bioconductor package. It's a CRAN package. Questions about CRAN packages should be asked at

Your main problem is that you are following a tutorial without understanding it well enough to apply it to your own data. The basic idea is to take the IDs from whatever data you have, and then map them to other IDs from a particular annotation service (and wow - LocusLink? that's a blast from the past). Anyway, matching is just something that you can do with base R, using match, or there is probably some spiffy way to do that using the tidyverse as well.
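As a minimal sketch of the base R approach mentioned above: the vectors, the data frame, and the column names `gene_id` and `entrez_id` below are made-up placeholders for illustration, not the poster's actual annotation file.

```r
# Hypothetical IDs taken from the expression data (the IDs you already have)
probes <- c("geneC", "geneA", "geneX")

# Hypothetical annotation table; the column names are assumptions
annot <- data.frame(
  gene_id   = c("geneA", "geneB", "geneC"),
  entrez_id = c(101L, 102L, 103L),
  stringsAsFactors = FALSE
)

# For each probe, match() gives the row of annot with the same ID (NA if absent)
idx <- match(probes, annot$gene_id)

# Indexing with idx returns the mapped IDs in the same order as probes
mapped <- annot$entrez_id[idx]
print(data.frame(probe = probes, entrez_id = mapped))
```

IDs that are absent from the annotation table come back as NA rather than being dropped, so the result stays aligned with the input.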

But again, how to do basic things with R is an R-help question, not Bioconductor.


Dear Dr. W.MacDonald


Thanks for your comment. You are right: WGCNA is not a Bioconductor package, and the way I presented my question was wrong. My actual problem is converting Ensembl IDs to Entrez IDs. I have Ensembl IDs and gene symbols, and based on this information I would like to retrieve Entrez IDs. I am currently doing this with the "biomaRt" package. My data has 56390 genes, but with the code below I get only 19457 Entrez IDs, and some of the genes have no Entrez ID at all.

> dim(DF)
[1] 56390    55
> head(DF$gene_id)
[1] "ENSG00000000003.13" "ENSG00000000005.5"  "ENSG00000000419.11" "ENSG00000000457.12"
[5] "ENSG00000000460.15" "ENSG00000000938.11"

library(biomaRt)
mart <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart)

x <- getBM(attributes = c("hgnc_symbol", "entrezgene", "gene_biotype"),
           values = ensembl_gene_id, mart = ensembl)

I would appreciate it if you could share your comments with me.

Best Regards,


There is usually little profit in trying to convert from Ensembl to Entrez. There are any number of differences between what the two annotation services think are the set of known genes, for myriad reasons, and trying to naively convert will simply show you just how extensive those differences are.

My general recommendation is to stick with one annotation service to limit these technicalities, which are usually unimportant to the analysis at hand.

If you do insist on mapping between them, do note that biomaRt will return results in random order, so you need to return the filter column in the attributes so you can reorder.
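A sketch of that reordering step, assuming the query result is a data frame that includes the filter column (`ensembl_gene_id`). The small `res` table below is invented stand-in data, not real biomaRt output; with biomaRt it would come from a getBM() call that uses "ensembl_gene_id" as the filter and also lists it in the attributes. The version suffix on the IDs (e.g. ".13") is stripped first, on the assumption that the annotation side stores unversioned IDs.

```r
# IDs as they appear in the expression data (versioned, as in the post above)
gene_id <- c("ENSG00000000003.13", "ENSG00000000005.5", "ENSG00000000419.11")

# Drop the trailing ".<version>" so the IDs can be compared with unversioned ones
ids <- sub("\\.[0-9]+$", "", gene_id)

# Stand-in for a query result: rows in arbitrary order, one ID missing.
# The entrez values are placeholders, not real Entrez Gene IDs.
res <- data.frame(
  ensembl_gene_id = c("ENSG00000000419", "ENSG00000000003"),
  entrez          = c(222L, 111L),
  stringsAsFactors = FALSE
)

# Reorder to the input order; IDs without a hit become NA
entrez <- res$entrez[match(ids, res$ensembl_gene_id)]
names(entrez) <- gene_id
print(entrez)
```

Note that match() keeps only the first hit per ID, so Ensembl IDs that map to several Entrez IDs would need to be handled separately.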

