Hello everyone! I am trying to annotate a list of regions as my regions are 502 bp long sometimes they overlap with more than 2 distinct genes , for example chr2:112541661_112542162
overlaps with both POLR1B and LOC105373562 ( the first gene is on + strand and the latter is on - strand).
I am using annotatePeak
function as follows:
txdb<- TxDb.Hsapiens.UCSC.hg38.knownGene
G.coor<- as_granges(coor , seqnames=seqnames , start=start , end=end)
annotation<- annotatePeak(G.coor , TxDb = txdb ,level = "gene" , addFlankGeneInfo=TRUE )
and this message appears
>> preparing features information... 2021-07-12 16:38:13
1613 genes were dropped because they have exons located on both strands of the same reference sequence or on more
than one reference sequence, so cannot be represented by a single genomic range.
Use 'single.strand.genes.only=FALSE' to get all the genes in a GRangesList object, or use suppressMessages() to
suppress this message.
>> identifying nearest features... 2021-07-12 16:38:15
>> calculating distance from peak to TSS... 2021-07-12 16:38:16
>> assigning genomic annotation... 2021-07-12 16:38:16
>> adding flank feature information from peaks... 2021-07-12 16:38:48
>> assigning chromosome lengths 2021-07-12 16:38:49
>> done... 2021-07-12 16:38:49
for my example (chr2:112541661_112542162
) this only shows POLR1B gene for annotation. how can I get annotations for all genes that overlap with my list of regions ?
also I'm not sure how to set 'single.strand.genes.only=FALSE'
as it is not a parameter of annotatePeak
I would appreciate your help.
Thank you for your response . Unfortunately when I try
I get this error :
any idea how to fix this ? btw i used
genes
function fromGenomicFeatures
package, is that the same function as you mentioned ?and when I try to unlist
txdb
:I get this message and the output does not have annotations ( intron , exon , ...)
Ah, yes. If you say single.strand.genes.only = FALSE, you get a
CompressedGRangesList
because of things likeWhere there are genes that occur on different chromosomes. All you have to do is
unlist
.Missed your second comment. Yes, using your own
GRanges
object disables the functionality to query aTxDb
object to get the gene parts because you no longer supply aTxDb
. You can do one or the other if you want to use the stock functionality.There is one other option - either run through
ChIPseeker:::getGene
under the debugger and change the lineto
and see if that change will work correctly, or if you really care, you could fork the GitHub repo, make changes to include genes on multiple chromosomes, and then send a PR to the authors. Or maybe just file an issue on the repo asking for them to make changes for you.
thank you very much for your response