CSAW: filtering prior information
@sergioespeso-gil-6997
Last seen 4.8 years ago
New York

Hi! 

I am happy with the results of filtering with prior information, but it would be nice to also filter those regions by abundance. I read in the documentation that this is possible, but I am struggling to make it work. Should I incorporate aveLogCPM into the overlapsAny call? I am a bit lost, sorry.

 

Thanks in advance

 

Sergio 


I'm not sure what you mean. Do you want to filter by the abundance of the windows in addition to their genomic location?


Yes... is it possible? 

Aaron Lun ★ 28k
@alun
Last seen 8 hours ago
The city by the bay

Let's say you've got a logical vector after filtering your windows on abundance. For example, if you have a SummarizedExperiment object named data:

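library(csaw)   # provides asDGEList()
library(edgeR)  # provides aveLogCPM()
ab <- aveLogCPM(asDGEList(data))  # average log2-CPM of each window
keep.ab <- ab > 0                 # example threshold; tune for your data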

Of course, you could use any of the other filtering strategies that are listed in chapter 3 of the user's guide, so long as you get a logical vector with length equal to the number of windows in data.

Now, let's say you want to filter to keep windows in a bunch of regions as well.

keep.reg <- overlapsAny(rowRanges(data), regions)

You can then combine the two filter vectors with a simple AND operation (see Section 3.7 of the guide). This will retain only those windows that are high-abundance and lie within one of the specified regions:

keep <- keep.ab & keep.reg
filtered.data <- data[keep,]

Of course, this is more stringent than filtering on abundance alone. Make sure that you have enough windows for stable calculation of downstream statistics (at least several thousand, usually).
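
To make the steps above concrete, here is a minimal sketch on toy data. The six windows, the counts, the library sizes of 1e7, and the single region are all made up for illustration; with real data, `data` would come from windowCounts() and `regions` from your own annotation.

```r
library(csaw)                  # asDGEList()
library(edgeR)                 # aveLogCPM()
library(SummarizedExperiment)  # SummarizedExperiment(), rowRanges()

# Toy stand-in for windowCounts() output: six 100 bp windows, two libraries.
windows <- GRanges("chrA", IRanges(start = seq(1, 501, by = 100), width = 100))
counts <- cbind(c(50, 60, 55, 1, 0, 2), c(45, 70, 52, 0, 1, 1))
data <- SummarizedExperiment(
    assays = list(counts = counts),
    rowRanges = windows,
    colData = DataFrame(totals = c(1e7, 1e7))  # assumed library sizes
)

# Hypothetical region of interest spanning the middle windows.
regions <- GRanges("chrA", IRanges(start = 150, end = 450))

# Filter 1: abundance (the threshold of 0 is only an example).
keep.ab <- aveLogCPM(asDGEList(data)) > 0

# Filter 2: genomic location.
keep.reg <- overlapsAny(rowRanges(data), regions)

# Both vectors have one entry per window, so they combine element-wise.
keep <- keep.ab & keep.reg
filtered.data <- data[keep, ]

# Check how many windows survive before moving on to downstream analysis.
summary(keep)
```

Running summary(keep) on real data is a quick way to confirm that enough windows remain after applying both filters.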

@sergioespeso-gil-6997

OK, perfect! Thanks a lot, I will try it!

@sergioespeso-gil-6997

Good, it is exactly what I needed. Thanks a lot. 

One more question, Aaron. When I predefine the regions (for example, promoters), is there a way to avoid counting per window and just count per predefined region? Maybe that is actually what I am already doing; sorry if I misunderstood.

thanks a lot! 

Sergio

 


csaw is primarily designed for window-based counting, but you can also do region-based counting with the regionCounts function (there is a brief discussion of the differences between these two approaches in Section 6.3 of the user's guide). Of course, there are many other packages that can count over regions (e.g., the featureCounts function in the Rsubread package), so csaw is nothing special in this regard.
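
As a rough sketch of the region-based route, the helper below counts reads over predefined regions with regionCounts and then applies the same kind of abundance filter as before. The function name count_regions, its min.logcpm argument, and the default ext = 100 are assumptions made for illustration; bam.files and regions are whatever BAM paths and GRanges (e.g., promoters) you already have.

```r
library(csaw)   # regionCounts(), readParam(), asDGEList()
library(edgeR)  # aveLogCPM()

# Hypothetical helper: count reads per predefined region, then drop
# low-abundance regions. 'min.logcpm' is an illustrative threshold.
count_regions <- function(bam.files, regions, ext = 100, min.logcpm = 0) {
    # One row per region, one column per BAM file.
    reg.counts <- regionCounts(bam.files, regions, ext = ext,
                               param = readParam())
    # Average log2-CPM per region, as in the window-based filter.
    ab <- aveLogCPM(asDGEList(reg.counts))
    reg.counts[ab > min.logcpm, ]
}
```

It would be called as count_regions(bam.files, regions) and returns the same kind of object as windowCounts, just with one row per region instead of per window.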


OK, OK! Thanks a lot! Really useful support! :)
