Removing metadata rows containing NA values in GRanges
Timucin • 0

I'm extracting promoters for protein-coding transcripts as a GRanges object, and I end up with many NA values in the protein_id metadata column. Is there a way to remove the ranges (rows) whose protein_id is NA?

> library(AnnotationHub)
> ahub <- AnnotationHub()
> query(ahub, c("GRanges","Homo sapiens","GRCh38"))
> ahub['AH28674']
> gtf <- ahub[['AH28674']]

# Create a GRanges object for the promoters of all protein-coding transcripts

> gtf[,"protein_id"]
> promoters <- promoters(gtf[,"protein_id"], upstream = 1500, downstream = 500)
> promoters
GRanges object with 2661879 ranges and 1 metadata column:
            seqnames      ranges strand |  protein_id
               <Rle>   <IRanges>  <Rle> | <character>
        [1]        1 10369-12368      + |        <NA>
        [2]        1 10369-12368      + |        <NA>
        [3]        1 10369-12368      + |        <NA>
        [4]        1 11113-13112      + |        <NA>
        [5]        1 11721-13720      + |        <NA>
        ...      ...         ...    ... .         ...
  [2661875]       MT 14388-16387      + |        <NA>
  [2661876]       MT 14388-16387      + |        <NA>
  [2661877]       MT 15524-17523      - |        <NA>
  [2661878]       MT 15524-17523      - |        <NA>
  [2661879]       MT 15524-17523      - |        <NA>
  -------
  seqinfo: 270 sequences (1 circular) from GRCh38 genome
ATpoint ★ 4.0k

Something like this:

keep <- !is.na(mcols(promoters)$protein_id)
promoters[keep,]
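
For anyone who wants to try the idea without downloading the annotation, here is a minimal sketch on a toy GRanges. The coordinates and protein IDs are invented for illustration, and the names `gr`, `gr_filtered` and `gr_subset` are hypothetical, not from the question.

```r
library(GenomicRanges)

# Toy GRanges; ranges and protein IDs are made up for illustration only
gr <- GRanges(
  seqnames = c("1", "1", "MT"),
  ranges = IRanges(start = c(100, 500, 900), width = 50),
  strand = c("+", "-", "+"),
  protein_id = c("ENSP_A", NA, "ENSP_B")
)

# Logical vector: TRUE where protein_id is present
keep <- !is.na(mcols(gr)$protein_id)

# Subsetting the ranges carries the metadata columns along
gr_filtered <- gr[keep]

# subset() evaluates its condition against the metadata columns,
# so this should give the same result in one step
gr_subset <- subset(gr, !is.na(protein_id))
```

Note that subsetting returns a new object, so assign the result (as above) if you want to keep working with the filtered ranges.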