How to generate custom annotation for rat genome rn6 in ChIPQC
0
Entering edit mode
@Frédéric-Crémazy-24161
Last seen 8 days ago
Paris

Hi,

I would like to use ChIPQC on rat ChIP-seq data that were generated using rn6 genome version. Despite another post explaining that it's possible to generate custom annotation (https://support.bioconductor.org/p/70893/), I haven't succeeded to do it myself for rn6.

Has anyone be able to do it?

Thanks, Fred

ChIPQC annotation rn6 • 139 views
ADD COMMENTlink
2
Entering edit mode
@James-W.-MacDonald-5106
Last seen 15 hours ago
United States

Is there something wrong with the existing TxDb for rn6?

0
Entering edit mode

I don't think so. I guess the problem is that I don't know how to generate a ChIPQC custom annotation from TxDb.Rnorvegicus.UCSC.rn6.refGene.

Could you tell me how to do that please?

ADD REPLYlink
0
Entering edit mode

Well it depends on what you want to use as your binding regions. If you want 'the usual' I guess there are two ways to go about it. You could 'cheat the system' (untested!) by doing

library(TxDb.Rnorvegicus.UCSC.rn6.refGene)
TxDb.Rnorvegicus.UCSC.rn4.ensGene <- TxDb.Rnorvegicus.UCSC.rn6.refGene
whatevs <- ChIPQC(samples, annotation = "rn4")

Which should then use the rn6 TxDb package instead of the rn4 version. Or you could just make your own annotation, using the existing getAnnotation function.

myNewGetAnnotation <- function(){
    txdb <- TxDb.Rnorvegicus.UCSC.rn6.refGene
    All5utrs <- reduce(unique(unlist(fiveUTRsByTranscript(txdb))))
    All3utrs <- reduce(unique(unlist(threeUTRsByTranscript(txdb))))
    Allcds <- reduce(unique(unlist(cdsBy(txdb,"tx"))))
    Allintrons <- reduce(unique(unlist(intronsByTranscript(txdb))))
    Alltranscripts <- reduce(unique(transcripts(txdb)))

    posAllTranscripts <- Alltranscripts[strand(Alltranscripts) == "+"]
    posAllTranscripts <- posAllTranscripts[!(start(posAllTranscripts)-20000 < 0)]
    negAllTranscripts <- Alltranscripts[strand(Alltranscripts) == "-"]
    chrLimits <- seqlengths(negAllTranscripts)[as.character(seqnames(negAllTranscripts))]
    negAllTranscripts <- negAllTranscripts[!(end(negAllTranscripts)+20000 > chrLimits)]
    Alltranscripts <- c(posAllTranscripts,negAllTranscripts)
    Promoters500 <-  reduce(flank(Alltranscripts,500))
    Promoters2000to500 <-  reduce(flank(Promoters500,1500))
    LongPromoter20000to2000  <- reduce(flank(Promoters2000to500,18000))
    if(!missing(AllChr) & !is.null(AllChr)){
      All5utrs <- GetGRanges(All5utrs,AllChr=AllChr)
      All3utrs <- GetGRanges(All3utrs,AllChr=AllChr)
      Allcds <- GetGRanges(Allcds,AllChr=AllChr)
      Allintrons <- GetGRanges(Allintrons,AllChr=AllChr)
      Alltranscripts <- GetGRanges(Alltranscripts,AllChr=AllChr)
      Promoters500 <- GetGRanges(Promoters500,AllChr=AllChr)
      Promoters2000to500 <-  GetGRanges(Promoters2000to500,AllChr=AllChr)
      LongPromoter20000to2000  <- GetGRanges(LongPromoter20000to2000,AllChr=AllChr)
    }
  }
  return(list(version=GeneAnnotation,LongPromoter20000to2000=LongPromoter20000to2000,
              Promoters2000to500=Promoters2000to500,Promoters500=Promoters500,
              All5utrs=All5utrs,Alltranscripts=Alltranscripts,Allcds=Allcds,
              Allintrons=Allintrons,All3utrs=All3utrs))
}

## after loading the rn6 TxDb, do
annot <- myNewGetAnnotation()
whatevs <- ChIPQC(samples, annotation = annot)

There may be a simpler, obvious way that I am missing, and perhaps Rory Stark or Tom Carroll will be along in a bit to clear that up. But I don't see a simple way to use a TxDb other than the choices they have hard-coded, so unfortunately it might take extra measures like this.

ADD REPLYlink
0
Entering edit mode

Hi James,

Sorry I was away for a while.

But unfortunately the code is not working in my hands. Can you still help to find what is wrong?

Could return(list(version=GeneAnnotation be wrong here (you spoke about the existing getAnnotation yourself).

I also got Error in missing(AllChr) : 'missing' peut seulement être utilisé pour des arguments eventually.

Well, as you can see, I'm pretty lost here! Thanks!

ADD REPLYlink
0
Entering edit mode

Ah. My bad. Try

myNewGetAnnotation <- function(GeneAnnotation = "rn6", AllChr){
    txdb <- TxDb.Rnorvegicus.UCSC.rn6.refGene
    All5utrs <- reduce(unique(unlist(fiveUTRsByTranscript(txdb))))
    All3utrs <- reduce(unique(unlist(threeUTRsByTranscript(txdb))))
    Allcds <- reduce(unique(unlist(cdsBy(txdb,"tx"))))
    Allintrons <- reduce(unique(unlist(intronsByTranscript(txdb))))
    Alltranscripts <- reduce(unique(transcripts(txdb)))

    posAllTranscripts <- Alltranscripts[strand(Alltranscripts) == "+"]
    posAllTranscripts <- posAllTranscripts[!(start(posAllTranscripts)-20000 < 0)]
    negAllTranscripts <- Alltranscripts[strand(Alltranscripts) == "-"]
    chrLimits <- seqlengths(negAllTranscripts)[as.character(seqnames(negAllTranscripts))]
    negAllTranscripts <- negAllTranscripts[!(end(negAllTranscripts)+20000 > chrLimits)]
    Alltranscripts <- c(posAllTranscripts,negAllTranscripts)
    Promoters500 <-  reduce(flank(Alltranscripts,500))
    Promoters2000to500 <-  reduce(flank(Promoters500,1500))
    LongPromoter20000to2000  <- reduce(flank(Promoters2000to500,18000))
    if(!missing(AllChr) & !is.null(AllChr)){
      All5utrs <- ChIPQC:::GetGRanges(All5utrs,AllChr=AllChr)
      All3utrs <- ChIPQC:::GetGRanges(All3utrs,AllChr=AllChr)
      Allcds <- ChIPQC:::GetGRanges(Allcds,AllChr=AllChr)
      Allintrons <- ChIPQC:::GetGRanges(Allintrons,AllChr=AllChr)
      Alltranscripts <- ChIPQC:::GetGRanges(Alltranscripts,AllChr=AllChr)
      Promoters500 <- ChIPQC:::GetGRanges(Promoters500,AllChr=AllChr)
      Promoters2000to500 <-  ChIPQC:::GetGRanges(Promoters2000to500,AllChr=AllChr)
      LongPromoter20000to2000  <- ChIPQC:::GetGRanges(LongPromoter20000to2000,AllChr=AllChr)
    }
  return(list(version=GeneAnnotation,LongPromoter20000to2000=LongPromoter20000to2000,
              Promoters2000to500=Promoters2000to500,Promoters500=Promoters500,
              All5utrs=All5utrs,Alltranscripts=Alltranscripts,Allcds=Allcds,
              Allintrons=Allintrons,All3utrs=All3utrs))
}

> zz <- myNewGetAnnotation(AllChr = "chr1")
> zz
$version
[1] "rn6"

$LongPromoter20000to2000
GRanges object with 1950 ranges and 0 metadata columns:
         seqnames              ranges strand
            <Rle>           <IRanges>  <Rle>
     [1]     chr1       376700-394699      +
     [2]     chr1       679413-697412      +
     [3]     chr1     1682696-1700695      +
     [4]     chr1     1751721-1782077      +
     [5]     chr1     1806170-1824169      +
     ...      ...                 ...    ...
  [1946]     chr1 281378835-281396834      -
  [1947]     chr1 281758160-281776159      -
  [1948]     chr1 281876512-281894511      -
  [1949]     chr1 282172018-282190017      -
  [1950]     chr1 282237892-282271193      -
  -------
  seqinfo: 1 sequence from rn6 genome

$Promoters2000to500
GRanges object with 2344 ranges and 0 metadata columns:
         seqnames              ranges strand
            <Rle>           <IRanges>  <Rle>
     [1]     chr1       394700-396199      +
     [2]     chr1       697413-698912      +
     [3]     chr1     1700696-1702195      +
     [4]     chr1     1769721-1771220      +
     [5]     chr1     1782078-1783577      +
     ...      ...                 ...    ...
  [2340]     chr1 281756660-281758159      -
  [2341]     chr1 281875012-281876511      -
  [2342]     chr1 282170518-282172017      -
  [2343]     chr1 282236392-282237891      -
  [2344]     chr1 282251694-282253193      -
  -------
  seqinfo: 1 sequence from rn6 genome
<snip>
ADD REPLYlink
0
Entering edit mode

It works great. Thanks a lot for your help James!

ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.3