Question: a filter works alone but not with other filters
4.1 years ago by
France
TimothéeFlutre70 wrote:

I wrote the following filter to discard variants for which fewer than 20% of samples have at least 10 reads:

filterDp <- function(x, min.dp=10, min.prop.dp=0.2){
dp <- geno(x)$DP
(rowSums(dp >= min.dp) / ncol(dp) < min.prop.dp)
}

When I use it alone, it works well:

filters <- FilterRules(list(dp=filterDp))
filterVcf(file=tabix.file, genome="test", destination=out.file,
index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
completed filtering
compressing and indexing '...'

However, when I combine it with others, it fails:

filterBiall <- function(x){
(elementLengths(alt(vcf)) > 1)
}
filterSnv <- function(x){
(! isSNV(x))
}
filters <- FilterRules(list(dp=filterDp, biall=filterBiall, snv=filterSnv))
filterVcf(file=tabix.file, genome="test", destination=out.file,
index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
Error in extractROWS(x, eval(filter, x)) :
error in evaluating the argument 'i' in selecting a method for function 'extractROWS': Error in eval(filter, x) : filter rule evaluated to inconsistent length:

Moreover, the error changes depending on the order of the filters:

filters <- FilterRules(list(dp=filterDp, biall=filterBiall, snv=filterSnv))
filterVcf(file=tabix.file, genome="test", destination=out.file,
index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
Error in extractROWS(x, eval(filter, x)) :
error in evaluating the argument 'i' in selecting a method for function 'extractROWS': Error in rowSums(dp >= min.dp)
'x' must be an array of at least two dimensions

Do you have any idea?

Answer: a filter works alone but not with other filters


4.1 years ago by
Martin Morgan ♦♦ 24k
United States
Martin Morgan ♦♦ 24k wrote:

The problem is that some of the filters remove all variants, and the remaining filters are not robust to receiving an empty set of variants:

> filters[[1]](vcf[FALSE,])
Error in rowSums(dp >= min.dp) :
'x' must be an array of at least two dimensions
> filters[[2]](vcf[FALSE,])
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[29] FALSE
> filters[[3]](vcf[FALSE,])
logical(0)

The second filter is incorrect because it references 'vcf' instead of its argument 'x'. The first filter is invalid because it tries something like

> rowSums(matrix(0, 0, 2) > 1)
Error in rowSums(matrix(0, 0, 2) > 1) :
'x' must be an array of at least two dimensions

Neat, eh? (thanks for your earlier suggestion about updating the document to use isSNV(); the vignette was written several years ago).
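One way to make the filters robust to an empty set of variants is sketched below. It is only an illustration: the helper name lowDepthRows is hypothetical, and the guard assumes that geno(x)$DP is a matrix with zero rows (or NULL) when no variants remain. The second filter is rewritten to use its argument 'x' rather than a global 'vcf'.

```r
# Core of the depth filter on a plain read-depth matrix (variants x samples).
# Returns TRUE for rows where the proportion of samples with depth >= min.dp
# is below min.prop.dp. 'lowDepthRows' is a hypothetical helper name.
lowDepthRows <- function(dp, min.dp = 10, min.prop.dp = 0.2) {
  # Guard: with no rows there is nothing to test, so return an empty
  # logical vector instead of calling rowSums() on a zero-extent input.
  if (NROW(dp) == 0L) return(logical(0))
  rowSums(dp >= min.dp) / ncol(dp) < min.prop.dp
}

# Filters in the form passed to FilterRules(). geno() and alt() come from
# VariantAnnotation and are only evaluated when the filters are called.
filterDp <- function(x, min.dp = 10, min.prop.dp = 0.2) {
  lowDepthRows(geno(x)$DP, min.dp, min.prop.dp)
}
filterBiall <- function(x) {
  # Uses the argument 'x', so the result always matches the records
  # actually passed in. (elementLengths() is kept from the original post;
  # to my knowledge later Bioconductor releases name it elementNROWS().)
  elementLengths(alt(x)) > 1
}

# Quick check of the helper on plain matrices, no VCF file needed.
empty  <- lowDepthRows(matrix(0, 0, 2))
dp     <- matrix(c(12, 3, 15, 2, 4, 11), nrow = 3)
result <- lowDepthRows(dp)
```

With this guard, each filter returns a logical vector whose length equals the number of records it was given, including zero, which is what the combined FilterRules evaluation expects.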

Could you also mention this in the vignette? That would be great. Thanks a lot!