Question: make TxDb from list of genes
10 days ago, rtwest wrote:

I am relatively new to bioinformatics. I have learned a lot from this site, but I can't find a solution to an issue I am having.

I am trying to create a TxDb from a specific list of genes.

I have tried several approaches to no avail. I converted the list to a GRanges object, but could never create the TxDb from it because no metadata was captured. I also tried building it from Ensembl and, most recently, from BioMart directly.

I entered the list, converted it to Ensembl IDs, converted those to a character vector, and finally tried to create the TxDb, but the transcript IDs are reported as invalid.

Long story short: how can I create a custom TxDb from a specific list of genes?

Also (second question): from the list of 70 genes, almost 700 Ensembl IDs are generated, and I am not sure why.
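
On the second question, a plausible explanation is that the `getBM()` call in the code below requests `ensembl_transcript_id`, not gene IDs. Most human genes have several annotated transcripts, so the result has one row per transcript rather than one per gene. The toy sketch below illustrates the effect in base R. The data frame only stands in for a `getBM()` result, and the transcript IDs are made-up placeholders, not real Ensembl accessions.

```r
# Toy stand-in for a getBM() result with two attributes.
# The transcript IDs are hypothetical placeholders.
mapping <- data.frame(
  hgnc_symbol = c("AXL", "AXL", "AXL", "DUT", "DUT"),
  ensembl_transcript_id = c("ENST_placeholder_1", "ENST_placeholder_2",
                            "ENST_placeholder_3", "ENST_placeholder_4",
                            "ENST_placeholder_5"),
  stringsAsFactors = FALSE
)

# Number of distinct genes versus number of returned rows (transcripts).
n_genes <- length(unique(mapping$hgnc_symbol))
n_tx <- nrow(mapping)

# Transcripts per gene: each gene contributes several rows.
tx_per_gene <- table(mapping$hgnc_symbol)
print(tx_per_gene)
```

Requesting `hgnc_symbol` and `ensembl_gene_id` alongside the transcript ID in the real query would make the same breakdown visible for the actual gene list.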

Below is my code after loading packages:

list <- c("AMOTL2",
             "ANKRD1",
             "ANLN",
             "ARHGAP29",
             "AXL",
             "NA",
             "BIRC5",
             "CCRN4L",
             "CDC20",
             "CDK6",
             "CDKN2C",
             "CENPF",
             "COL4A3",
             "CRIM1",
             "CTGF",
             "CYR61",
             "CYR61",
             "DAB2",
             "DDAH1",
             "ASAP1",
             "DLC1",
             "DUSP1",
             "DUT",
             "ECT2",
             "EMP2",
             "ETV5",
             "FGF2",
             "FLNA",
             "FSCN1",
             "FSTL1",
             "GADD45B",
             "GAS2L3",
             "GAS6",
             "GGH",
             "GKAP1",
             "GLIS2",
             "GLS",
             "HEXB",
             "HMMR",
             "AGFG2",
             "ITGB2",
             "ITGB5",
             "LHFP",
             "MACF1",
             "MARCKS",
             "MDFIC",
             "MSRB3",
             "MYO1C",
             "NDRG1",
             "PDLIM2",
             "PHGDH",
             "PMP22",
             "SCHIP1",
             "SDPR",
             "SERPINE1",
             "SERTAD4",
             "SFRS2IP",
             "SGK1",
             "SH2D4A",
             "SHCBP1",
             "SLIT2",
             "STMN1",
             "TGFB2",
             "TGM2",
             "THBS1",
             "TK1",
             "TNNT2",
             "TNS1",
             "TOP2A",
             "TSPAN3")

ids <- getBM(attributes="ensembl_transcript_id", filters = "hgnc_symbol", values = list, mart= ensembl)
ids
ids.c <- as.character(ids)
ids.c
yap_taz.c <- as.character(yap_taz)

txdb_YT <- makeTxDbFromBiomart(biomart="ensembl",
                               dataset="hsapiens_gene_ensembl",
                               transcript_ids=ids.c,  
                               circ_seqs=NULL,
                               host="www.ensembl.org",
                               port=80,
                               taxonomyId=NA,
                               miRBaseBuild=NA)

Download and preprocess the 'transcripts' data frame ... Error in .makeBiomartTranscripts(filter, mart, transcript_ids, recognized_attribs,  : 
  invalid transcript ids:
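
One likely cause of this error, assuming `ids` is the data frame that `getBM()` returns: `as.character()` applied to a whole data frame deparses each column into a single string, so `transcript_ids` receives one string that looks like `c("...", "...")` instead of a vector of IDs. A minimal base R sketch of the difference (placeholder IDs, no network access needed):

```r
# Stand-in for the one-column data frame returned by getBM();
# the IDs are made-up placeholders, not real Ensembl accessions.
ids <- data.frame(
  ensembl_transcript_id = c("ENST_placeholder_1",
                            "ENST_placeholder_2",
                            "ENST_placeholder_3"),
  stringsAsFactors = FALSE
)

# as.character() on the data frame collapses the whole column
# into a single deparsed string.
wrong <- as.character(ids)
print(length(wrong))

# Extracting the column keeps one element per transcript ID.
right <- unique(ids$ensembl_transcript_id)
print(length(right))
```

Passing a vector built like `right` as `transcript_ids` should at least get past the ID validation step, although subsetting an existing annotation database is probably the simpler route.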


Why do you want a TxDb for just a set of genes? It's simple enough to use a full sized one and subset after the fact.

written 10 days ago by James W. MacDonald
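
A rough sketch of the "subset after the fact" idea, under several assumptions not stated in the thread: the full-sized TxDb is `TxDb.Hsapiens.UCSC.hg38.knownGene` (keyed by Entrez gene IDs), symbols are mapped with `org.Hs.eg.db`, and `keep_named()` is a hypothetical helper introduced only for this illustration. The Bioconductor part runs only when both packages are installed.

```r
# Hypothetical helper: keep the elements of a named list-like object
# whose names appear in `ids`.
keep_named <- function(x, ids) x[names(x) %in% ids]

# A few symbols from the question's list.
gene_symbols <- c("AXL", "CDK6", "TOP2A")

# Assumption: these two annotation packages are the ones in use.
have_pkgs <- requireNamespace("TxDb.Hsapiens.UCSC.hg38.knownGene", quietly = TRUE) &&
  requireNamespace("org.Hs.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(TxDb.Hsapiens.UCSC.hg38.knownGene)
  library(org.Hs.eg.db)
  txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

  # Map gene symbols to the Entrez IDs that key this TxDb.
  entrez <- mapIds(org.Hs.eg.db, keys = gene_symbols,
                   column = "ENTREZID", keytype = "SYMBOL")
  entrez <- unname(entrez[!is.na(entrez)])

  # Extract everything once, then keep only the genes of interest.
  tx_by_gene <- transcriptsBy(txdb, by = "gene")
  tx_subset <- keep_named(tx_by_gene, entrez)
  print(length(tx_subset))
}
```

Outdated symbols may fail to map, so it is worth checking which entries come back as `NA` from `mapIds()` before subsetting.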

I agree with James. There is no need to create a new TxDb database. You could simply subset an EnsDb database to your list of input genes (even better if you have Ensembl IDs). Assuming your Ensembl gene IDs are in a variable called ensids:

library(EnsDb.Hsapiens.v86)
edb <- filter(EnsDb.Hsapiens.v86, filter = ~ gene_id == ensids)

On that edb you can then call the same functions you would use on a TxDb (such as genes, exonsBy, etc.), and you will only ever get results for the genes you provided.

The package loaded above contains annotations from Ensembl version 86. If you want more recent annotations, download an EnsDb from AnnotationHub instead.

written 10 days ago by Johannes Rainer
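
Since the question starts from gene symbols rather than Ensembl IDs, a variant of the same idea could pass a gene-name filter directly to the accessor functions. This is a sketch under assumptions: it uses a short excerpt of the question's list, cleans out the literal "NA" entry and the duplicated symbol first, and only queries the database when `EnsDb.Hsapiens.v86` is installed.

```r
# Excerpt of the symbol list from the question, including the
# literal "NA" entry and the duplicated CYR61.
raw_symbols <- c("AMOTL2", "ANKRD1", "NA", "CYR61", "CYR61", "TSPAN3")

# Drop the "NA" placeholder; setdiff() also removes duplicates.
gene_symbols <- setdiff(raw_symbols, "NA")

if (requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
  library(EnsDb.Hsapiens.v86)
  edb <- EnsDb.Hsapiens.v86

  # Filter on gene names instead of Ensembl gene IDs.
  sym_filter <- GeneNameFilter(gene_symbols)

  gene_ranges <- genes(edb, filter = sym_filter)
  tx_by_gene <- transcriptsBy(edb, by = "gene", filter = sym_filter)

  # Symbols with no match in this Ensembl release (for example,
  # symbols that have since been renamed).
  print(setdiff(gene_symbols, gene_ranges$gene_name))
}
```

Symbols reported as missing would need to be updated to the names used in the chosen Ensembl release, or replaced by stable Ensembl gene IDs.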

Thank you. I was overthinking it and going about it the wrong way. I appreciate the guidance.

written 9 days ago by rtwest