Question: Adding SummarizedExperiment function
rbronste wrote, 10 months ago:

Kind of a basic question: what's the easiest way to add one of these that represents a column in a custom GRanges object? Thanks.


What does 'add one of these' mean in this context?

written 10 months ago by James W. MacDonald

I mean a GRanges that has, for instance, columns like seqnames, start, and end, and that can be queried with something like:

stuff <- stuff.DB[seqnames(stuff.DB) == 'chrY']

I'm trying to figure out how to do the same for other columns, such as FDR, that are not in SummarizedExperiments.

written 10 months ago by rbronste
James W. MacDonald (United States) wrote, 10 months ago:

You can add anything you want in the mcols of the GRanges object and query on that at will. As a test, let's use the example for SummarizedExperiment:

> library(SummarizedExperiment)
> example("SummarizedExperiment")
> rse
class: RangedSummarizedExperiment
dim: 200 6
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(6): A B ... E F
colData names(1): Treatment
> rowRanges(rse)
GRanges object with 200 ranges and 1 metadata column:
        seqnames           ranges strand |  feature_id
           <Rle>        <IRanges>  <Rle> | <character>
    [1]     chr1 [556101, 556200]      - |       ID001
    [2]     chr1 [792975, 793074]      - |       ID002
    [3]     chr1 [263755, 263854]      - |       ID003
    [4]     chr1 [714331, 714430]      + |       ID004
    [5]     chr1 [900677, 900776]      - |       ID005
    ...      ...              ...    ... .         ...
  [196]     chr2 [495890, 495989]      - |       ID196
  [197]     chr2 [222582, 222681]      - |       ID197
  [198]     chr2 [666857, 666956]      + |       ID198
  [199]     chr2 [404246, 404345]      - |       ID199
  [200]     chr2 [540493, 540592]      - |       ID200
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths

> z <- rse[mcols(rse)$feature_id %in% paste0("ID", sprintf("%03d", 1:5)),]
> rowRanges(z)
GRanges object with 5 ranges and 1 metadata column:
      seqnames           ranges strand |  feature_id
         <Rle>        <IRanges>  <Rle> | <character>
  [1]     chr1 [556101, 556200]      - |       ID001
  [2]     chr1 [792975, 793074]      - |       ID002
  [3]     chr1 [263755, 263854]      - |       ID003
  [4]     chr1 [714331, 714430]      + |       ID004
  [5]     chr1 [900677, 900776]      - |       ID005
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
> assays(z)[[1]]
            A        B        C        D        E        F
[1,] 9.390704 9.088845 9.726846 9.569678 9.744423 9.664979
[2,] 9.823552 7.222012 5.752299 9.486667 9.746595 8.313257
[3,] 9.496478 7.672814 9.604351 8.800272 8.292126 9.857548
[4,] 8.580828 9.613288 9.681698 9.270826 8.690414 9.233475
[5,] 9.596227 8.729721 9.739728 8.628168 8.309004 6.797500
> colData(z)
DataFrame with 6 rows and 1 column
    Treatment
  <character>
A        ChIP
B       Input
C        ChIP
D       Input
E        ChIP
F       Input

You can have as many columns as you like in the mcols slot, and you can add them at any time:

> mcols(rse)$whatevs <- rnorm(nrow(rse))
> mcols(rse)$addonemore <- rnorm(nrow(rse))
> rowRanges(rse)
GRanges object with 200 ranges and 3 metadata columns:
        seqnames           ranges strand |  feature_id     whatevs  addonemore
           <Rle>        <IRanges>  <Rle> | <character>   <numeric>   <numeric>
    [1]     chr1 [556101, 556200]      - |       ID001 -0.05584487  -0.6773722
    [2]     chr1 [792975, 793074]      - |       ID002  1.01721394  -0.8628047
    [3]     chr1 [263755, 263854]      - |       ID003  0.67180836   0.4902122
    [4]     chr1 [714331, 714430]      + |       ID004  0.03497479  -2.5660873
    [5]     chr1 [900677, 900776]      - |       ID005 -1.58957034   1.3208983
    ...      ...              ...    ... .         ...         ...         ...
  [196]     chr2 [495890, 495989]      - |       ID196 -0.06389269 -2.75149592
  [197]     chr2 [222582, 222681]      - |       ID197 -1.55996247  1.27020433
  [198]     chr2 [666857, 666956]      + |       ID198  0.36173020  0.49610959
  [199]     chr2 [404246, 404345]      - |       ID199 -1.24144376 -0.31007126
  [200]     chr2 [540493, 540592]      - |       ID200 -0.60194563  0.02290882
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
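
The same kind of indexing works on numeric metadata columns. The following is a minimal sketch, not DiffBind output: it builds a small toy object, and the column names FDR and Fold, the object name rse2, and the cutoffs are all assumptions chosen for illustration.

```r
library(SummarizedExperiment)

# Toy RangedSummarizedExperiment. 'FDR' and 'Fold' are hypothetical
# columns filled with random numbers, purely for illustration.
set.seed(1)
n <- 20
gr <- GRanges(
  seqnames = rep(c("chr1", "chrY"), each = n / 2),
  ranges   = IRanges(start = seq(1, by = 1000, length.out = n), width = 100)
)
mcols(gr)$FDR  <- runif(n)
mcols(gr)$Fold <- rnorm(n)
counts <- matrix(rpois(n * 4, lambda = 10), nrow = n,
                 dimnames = list(NULL, c("A", "B", "C", "D")))
rse2 <- SummarizedExperiment(assays = list(counts = counts), rowRanges = gr)

# Subset rows on a numeric metadata column, just like on feature_id
sig <- rse2[mcols(rse2)$FDR < 0.5, ]

# Combine a range-based condition with a metadata condition;
# as.vector() turns the Rle comparison into a plain logical vector
on_y  <- as.vector(seqnames(rowRanges(rse2)) == "chrY")
sig_y <- rse2[on_y & mcols(rse2)$FDR < 0.5, ]
```

Because the subsetting is done on the whole object, the assay rows in sig and sig_y stay in step with the filtered ranges.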

 

rbronste wrote, 10 months ago:

I guess I am still a little confused. I am using a DiffBind output that has a number of columns according to the sampleSheet. The basic thing I want to do is filter and sort by any specific column, or by multiple columns simultaneously, such as FDR and fold change.

 


If you want to make a comment, please use the ADD COMMENT button rather than the 'Add your answer' box, which is intended for answers, not comments.

Your questions are needlessly mysterious. If you have an example of what you are trying to do, then maybe we can give some pointers.

But so far you are asking generalized questions like 'I want to filter and sort' which are just basic R manipulations. If you are having problems with basic R stuff, you should read 'An Introduction to R', and note that a SummarizedExperiment is intended to act as if it were a data.frame, so anything you can do with a data.frame will work pretty much the same way.
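
As a rough illustration of those basic manipulations, here is a sketch on a hand-built GRanges. The object peaks is a hypothetical stand-in for a differential binding report, and the metadata column names Fold and FDR, as well as the cutoffs, are assumptions; check the column names in your own object with names(mcols(x)) before adapting it.

```r
library(GenomicRanges)

# Hypothetical stand-in for a report: a GRanges with 'Fold' and 'FDR'
# metadata columns (names and values are made up for illustration).
set.seed(42)
n <- 10
peaks <- GRanges(
  seqnames = sample(c("chr1", "chr2", "chrY"), n, replace = TRUE),
  ranges   = IRanges(start = sample.int(1e6, n), width = 200),
  Fold     = rnorm(n, sd = 2),
  FDR      = runif(n)
)

# Filter on several columns at once with ordinary logical indexing
keep <- peaks$FDR < 0.5 & abs(peaks$Fold) > 1
hits <- peaks[keep]

# Sort by FDR, breaking ties by decreasing absolute fold change
hits <- hits[order(hits$FDR, -abs(hits$Fold))]

# Alternatively, convert to a plain data.frame and use base R tools
df      <- as.data.frame(peaks)
df_hits <- subset(df, FDR < 0.5 & abs(Fold) > 1)
df_hits <- df_hits[order(df_hits$FDR), ]
```

Staying with the GRanges keeps the range-aware methods available; the data.frame route is convenient if you only need a table to write out.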

written 10 months ago by James W. MacDonald