Question

CSAW: filtering prior information

sergio.espeso-gil • 0

@sergioespeso-gil-6997

Last seen 4.8 years ago

New York

Hi!

I am happy with the results from filtering with prior information, but it would be nice to also filter those regions by abundance. I read in the documentation that this is possible, but I am struggling to make it work. Should I use aveLogCPM within the overlapsAny function? I am a bit lost, sorry.

Thanks in advance

Sergio

csaw • 1.4k views

ADD COMMENT • link 9.5 years ago sergio.espeso-gil • 0

I'm not sure what you mean. Do you want to filter by the abundance of the windows in addition to their genomic location?

ADD REPLY • link 9.5 years ago Aaron Lun ★ 28k

Yes... is it possible?

ADD REPLY • link 9.5 years ago sergio.espeso-gil • 0

score 1 · Answer 1 · 2015-06-10

Let's say you've got a logical vector after filtering your windows on abundance. For example, if you have a SummarizedExperiment object named data:

library(csaw); library(edgeR)
# Average log-CPM of each window across all libraries.
ab <- aveLogCPM(asDGEList(data))
keep.ab <- ab > 0

Of course, you could use any of the other filtering strategies that are listed in chapter 3 of the user's guide, so long as you get a logical vector with length equal to the number of windows in data.

Now, let's say you want to filter to keep windows in a bunch of regions as well.

keep.reg <- overlapsAny(rowRanges(data), regions)

You can then combine the two filter vectors with a simple AND operation (see Section 3.7 of the guide). This will retain only those windows that are high-abundance and lie within one of the specified regions:

keep <- keep.ab & keep.reg
filtered.data <- data[keep,]

Of course, this is more stringent than filtering on abundance alone. Make sure that you have enough windows for stable calculation of downstream statistics (at least several thousand, usually).
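To make the steps above concrete, here is a minimal sketch on toy data. The window coordinates, counts, library sizes and the single region are all made up for illustration and are not from the original post; the threshold of 0 is the same arbitrary example value used above. It also reports how many windows survive each filter, which is one way to check that enough remain.

```r
library(csaw)                  # asDGEList()
library(edgeR)                 # aveLogCPM()
library(GenomicRanges)         # GRanges(), overlapsAny()
library(SummarizedExperiment)  # SummarizedExperiment(), rowRanges()

# Toy data (hypothetical): six 50-bp windows on chr1, counted in two libraries.
windows <- GRanges("chr1", IRanges(start = seq(1, 251, by = 50), width = 50))
counts <- matrix(c(0, 1, 50, 60, 2, 80,
                   1, 0, 55, 65, 1, 90), ncol = 2)
data <- SummarizedExperiment(assays = list(counts = counts),
                             rowRanges = windows,
                             colData = DataFrame(totals = c(2e7, 2e7)))

# Hypothetical prior information: one region of interest.
regions <- GRanges("chr1", IRanges(start = 90, end = 160))

# Filter 1: abundance (average log-CPM above an arbitrary threshold).
ab <- aveLogCPM(asDGEList(data))
keep.ab <- ab > 0

# Filter 2: overlap with any of the predefined regions.
keep.reg <- overlapsAny(rowRanges(data), regions)

# Combine both filters and subset the windows.
keep <- keep.ab & keep.reg
filtered.data <- data[keep, ]

# Number of windows retained by each filter and by their combination.
print(c(abundance = sum(keep.ab), region = sum(keep.reg), both = sum(keep)))
```

With real data, the counts would come from windowCounts and the regions from your own annotation; only the last few steps carry over unchanged.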

score 0 · Answer 2 · 2015-06-10

OK, perfect! Thanks a lot, I will try!

ADD COMMENT • link 9.5 years ago sergio.espeso-gil • 0

score 0 · Answer 3 · 2015-06-10

Good, it is exactly what I needed. Thanks a lot.

One more question, Aaron. When I predefine the regions, for example promoters, is there a way to avoid counting per window and just count per predefined region? Maybe that is actually what I am already doing; sorry if I misunderstood.

Thanks a lot!

Sergio

ADD COMMENT • link 9.5 years ago sergio.espeso-gil • 0

csaw is primarily designed for window-based counting, but you can also do region-based counting with the regionCounts function (there is a brief discussion of the differences between these two approaches in Section 6.3 of the user's guide). Of course, there are many other packages that can count over regions (e.g., the featureCounts function in the Rsubread package), so csaw is nothing special in this regard.
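As a rough sketch of region-based counting with regionCounts, assuming you already have sorted and indexed BAM files and a GRanges of regions: the file names, region coordinates, extension length and mapping-quality cutoff below are placeholders chosen for illustration, not values from this thread.

```r
library(csaw)                  # regionCounts(), readParam()
library(GenomicRanges)         # GRanges()
library(SummarizedExperiment)  # assay()

# Hypothetical inputs: replace with your own BAM files and regions (e.g., promoters).
# Sequence names in 'regions' must match those used in the BAM files.
bam.files <- c("sample1.bam", "sample2.bam")
regions <- GRanges(c("chr1", "chr2"),
                   IRanges(start = c(10001, 50001), width = 2000))

# Only attempt counting if the placeholder files actually exist.
if (all(file.exists(bam.files))) {
    param <- readParam(minq = 10)  # assumed mapping-quality cutoff
    # 'ext' is the assumed average fragment length used to extend reads.
    reg.counts <- regionCounts(bam.files, regions, ext = 100, param = param)
    # One row per region, one column per library.
    print(head(assay(reg.counts)))
}
```

The result is a RangedSummarizedExperiment with one row per region, so it can be passed to asDGEList in the same way as window counts.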