Genome annotation file in TxDb.Mmusculus.UCSC.mm10.knownGene
Kasit • 0
@kasit-24846
Last seen 3.7 years ago
United Kingdom

Dear all,

I'm currently trying to do an integrative analysis of ChIP-seq and RNA-seq data. I used the "ChIPseeker" and "TxDb.Mmusculus.UCSC.mm10.knownGene" packages to annotate ChIP-seq peaks. As I understand it, the TxDb package is built from the "mm10.knownGene.gtf" file from UCSC (https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/) (please correct me if I'm wrong). For my RNA-seq analysis, I therefore used this file for STAR mapping and RSEM quantification, together with the "knownIsoforms.txt" file downloaded from http://hgdownload.cse.ucsc.edu/goldenPath/mm10/database/.

The problem is that many of the gene IDs assigned to peaks by "ChIPseeker" and "TxDb.Mmusculus.UCSC.mm10.knownGene" are not present in the RNA-seq results from RSEM, and vice versa. Did I use the right files and analysis methods? If not, which annotation file should I use for my RNA-seq analysis so that the results are compatible with the ChIP-seq peak annotation from "TxDb.Mmusculus.UCSC.mm10.knownGene"?

Best regards, Kasit

You'll have to provide more information than that. You appear to be using transcripts (e.g., you pass a GTF to STAR and then use RSEM), but then you talk about Gene IDs, which are different from transcript IDs. Perhaps if you show some output of the peaks you are getting and the results from your RNA-Seq it would help clarify things.

Thank you James. I've just checked and found that all transcript IDs (in Ensembl format) from the ChIP-seq peak annotation are included in the transcript IDs (also in Ensembl format) from the RSEM output. Yet the gene IDs in the RSEM output do not appear to be Entrez IDs, and I have no idea why.

Here is an example of output from RSEM.

For example, "ENSMUST00000000001.4" actually corresponds to Entrez gene ID "14679", not "1" as shown here.
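
A check like this can be scripted. Below is a minimal sketch in Python, assuming the transcript-to-gene pairs from both analyses have already been read into plain dictionaries. The names `strip_version`, `compare_gene_ids`, `rsem_genes` and `peak_genes` are hypothetical, and the only data used is the single transcript quoted above. Version suffixes are stripped before matching in case the two sources report different versions.

```python
def strip_version(transcript_id: str) -> str:
    """Drop a trailing '.N' version suffix from a transcript ID, if present."""
    base, sep, suffix = transcript_id.rpartition(".")
    return base if sep and suffix.isdigit() else transcript_id


def compare_gene_ids(rsem, peaks):
    """Return the transcripts shared by both sources and those whose gene IDs disagree.

    Both arguments map transcript ID -> gene ID (as strings).
    """
    rsem_by_tx = {strip_version(tx): gene for tx, gene in rsem.items()}
    peak_by_tx = {strip_version(tx): gene for tx, gene in peaks.items()}
    shared = sorted(rsem_by_tx.keys() & peak_by_tx.keys())
    mismatched = {
        tx: (rsem_by_tx[tx], peak_by_tx[tx])
        for tx in shared
        if rsem_by_tx[tx] != peak_by_tx[tx]
    }
    return shared, mismatched


# Toy input: the one transcript mentioned in this thread.
rsem_genes = {"ENSMUST00000000001.4": "1"}      # gene_id as reported by RSEM
peak_genes = {"ENSMUST00000000001.4": "14679"}  # Entrez ID from the peak annotation

shared, mismatched = compare_gene_ids(rsem_genes, peak_genes)
print(shared)
print(mismatched)
```

A non-empty `mismatched` with a non-empty `shared` would point to the transcripts matching while the gene-level identifiers come from different ID systems.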

@james-w-macdonald-5106
Last seen 2 days ago
United States

All of the gene_ids you present there are human, not mouse. No idea how you got the two mixed up like that. Anyway, this looks like an issue with how you ran RSEM rather than anything Bioconductor related, so you should probably ask over on Biostars.org and provide more information about how you ran RSEM.
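
If the gene_id column in the RSEM output is taken from the transcript-to-gene map supplied when the reference was built (an assumption, since the RSEM command is not shown in this thread), one option may be to build that map from the same transcript-to-Entrez pairs that the peak annotation reports. A minimal sketch in Python follows. `tx_to_entrez` is hypothetical input that would have to be exported from the annotation, and `write_transcript_to_gene_map` is an illustrative helper, not part of RSEM. The two-column, tab-separated `gene_id transcript_id` layout is the one `rsem-prepare-reference --transcript-to-gene-map` expects.

```python
import io


def write_transcript_to_gene_map(tx_to_gene, handle):
    """Write one 'gene_id<TAB>transcript_id' line per transcript, sorted by transcript ID."""
    for tx in sorted(tx_to_gene):
        handle.write(f"{tx_to_gene[tx]}\t{tx}\n")


# Hypothetical input: transcript ID -> Entrez gene ID, using the pair from this thread.
tx_to_entrez = {"ENSMUST00000000001.4": "14679"}

# An in-memory buffer stands in for a real file opened with open(path, "w").
buffer = io.StringIO()
write_transcript_to_gene_map(tx_to_entrez, buffer)
map_text = buffer.getvalue()
print(map_text, end="")
```

The transcript IDs in the map would need to match those in the GTF exactly, including any version suffix.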

Ah, I see. Thank you very much James.

Best regards, Kasit
