Question: How to extract the wanted IntegerList more elegantly from a nested list?
19 months ago by
jian_liangli0 wrote:

Hi,

I have lists of position indices stored as IntegerList objects, and I filter them against a threshold, which works well. However, I want to extract one specific filtered set from each IntegerList for further use. I am aware that myList is a nested list; the data are simulated to closely resemble my real data set. Is there a way to retrieve the wanted IntegerList easily and elegantly? How can I do this extraction?

Mini example:

library(IRanges)  # provides IntegerList

myList <- list(f1=IntegerList(1,2,3,4,1,1,1,integer(0),1,2,4),
               f2=IntegerList(1,5,integer(0),integer(0),2,3,4,6,1,5,6),
               f3=IntegerList(1,4,6,7,2,3,3,7,2,5,7))

len <- Reduce('+', lapply(myList, lengths))
keepMe <- len >= length(myList)

I did the following filtering:

res.filt <- lapply(myList, function(elm) {
ans <- list(keep=elm[keepMe], droped=elm[!keepMe])
ans
})

My rough attempt at the output:

wantedKeep.list <- list(f1.kp=res.filt$f1$keep, f2.kp=res.filt$f2$keep, f3.kp=res.filt$f3$keep)
wantedDrop.list <- list(f1.dp=res.filt$f1$droped, f2.dp=res.filt$f2$droped, f3.dp=res.filt$f3$droped)

Based on my rough output, how can I get the same result more elegantly? Is there a more efficient way to achieve it? Can anyone point me in the right direction, or suggest how to get my expected output? Thanks in advance.

modified 19 months ago by Michael Lawrence10.0k • written 19 months ago by jian_liangli0
19 months ago by
Michael Lawrence10.0k
United States
Michael Lawrence10.0k wrote:

It looks like you are thinking of the data as a matrix. You can create a list matrix like this:

v <- unlist(lapply(myList, as.list), recursive=FALSE)
m <- matrix(v, length(myList), byrow=TRUE)
rownames(m) <- names(myList)


Then you can compute lengths and make the selection:

lens <- lengths(m)
dim(lens) <- dim(m)
keep <- colSums(lens) >= nrow(m)
keepList <- m[,keep,drop=FALSE]
dropList <- m[,!keep,drop=FALSE]

I will work on making some of these operations easier, but it's pretty straightforward as is.
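
As a rough illustration of the steps above, here is a sketch that runs the same list-matrix selection on the data from the question. It assumes plain base R lists of integer vectors as stand-ins for IntegerList, purely so that it runs without Bioconductor installed.

```r
# Assumption: plain lists of integer vectors stand in for IntegerList.
e <- integer(0)
myList <- list(
  f1 = list(1L, 2L, 3L, 4L, 1L, 1L, 1L, e, 1L, 2L, 4L),
  f2 = list(1L, 5L, e, e, 2L, 3L, 4L, 6L, 1L, 5L, 6L),
  f3 = list(1L, 4L, 6L, 7L, 2L, 3L, 3L, 7L, 2L, 5L, 7L)
)

# One row per original list, one column per position.
v <- unlist(myList, recursive = FALSE)
m <- matrix(v, nrow = length(myList), byrow = TRUE)
rownames(m) <- names(myList)

# Element lengths, reshaped explicitly to the matrix layout.
lens <- lengths(m)
dim(lens) <- dim(m)

# Keep a column only when every row has a non-empty element there.
keep <- colSums(lens) >= nrow(m)
keepList <- m[, keep, drop = FALSE]
dropList <- m[, !keep, drop = FALSE]

dim(keepList)  # 3 8
dim(dropList)  # 3 3
```

Each row of keepList and dropList is still a list, so zero-length elements survive in dropList as integer(0) rather than being turned into NA.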

Is it the case that all of the elements are of length either 1 or 0? If so, it might be easier to convert the zero-length elements to NAs and drop to a vector.
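
A minimal sketch of that NA-based alternative follows. It assumes every element really has length 0 or 1, uses plain base R lists in place of IntegerList, and introduces a hypothetical helper named toVector that is not part of any package.

```r
# Assumption: plain lists stand in for IntegerList, and every element
# has length 0 or 1.
e <- integer(0)
myList <- list(
  f1 = list(1L, 2L, 3L, 4L, 1L, 1L, 1L, e, 1L, 2L, 4L),
  f2 = list(1L, 5L, e, e, 2L, 3L, 4L, 6L, 1L, 5L, 6L),
  f3 = list(1L, 4L, 6L, 7L, 2L, 3L, 3L, 7L, 2L, 5L, 7L)
)

# Hypothetical helper: map integer(0) to NA and flatten to an integer vector.
toVector <- function(x) {
  vapply(x, function(el) if (length(el) == 0L) NA_integer_ else el[[1L]],
         integer(1))
}

# An ordinary integer matrix: one row per list, one column per position.
m <- do.call(rbind, lapply(myList, toVector))

# A column is kept only when no row is NA at that position.
keep <- colSums(!is.na(m)) >= nrow(m)
keepMat <- m[, keep, drop = FALSE]
dropMat <- m[, !keep, drop = FALSE]
```

The trade-off is that emptiness is now encoded as NA, so any later counting has to use is.na() rather than length().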

Here is a simpler solution in Bioc devel for generating the matrix "m" (first three lines above):

m <- do.call(rbind, myList)

Also, I will push a change to R devel so that lengths() preserves matrix dimensions, so they do not need to be carried over.


A variant is to use a DataFrame:

df = DataFrame(myList)
keep = rowSums(sapply(df, lengths)) >= ncol(df)
as.list(df[keep,, drop=FALSE])

but the short version of jian-liangli's original filtering + rough attempt seems clean enough

lapply(myList, `[`, keepMe, drop=FALSE)
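
To make the lapply route concrete, here is a sketch that builds both the keep and the drop lists without naming each element by hand. It assumes plain base R lists in place of IntegerList so that it runs without Bioconductor; the same subsetting calls would be expected to work on IntegerList, but that is not verified here. The .kp and .dp name suffixes follow the question's rough attempt.

```r
# Assumption: plain lists of integer vectors stand in for IntegerList.
e <- integer(0)
myList <- list(
  f1 = list(1L, 2L, 3L, 4L, 1L, 1L, 1L, e, 1L, 2L, 4L),
  f2 = list(1L, 5L, e, e, 2L, 3L, 4L, 6L, 1L, 5L, 6L),
  f3 = list(1L, 4L, 6L, 7L, 2L, 3L, 3L, 7L, 2L, 5L, 7L)
)

# Same threshold as in the question: every list must be non-empty at a position.
len <- Reduce(`+`, lapply(myList, lengths))
keepMe <- len >= length(myList)

# Subset every list in one call each, then rename as in the rough attempt.
wantedKeep.list <- setNames(lapply(myList, `[`, keepMe),
                            paste0(names(myList), ".kp"))
wantedDrop.list <- setNames(lapply(myList, `[`, !keepMe),
                            paste0(names(myList), ".dp"))

# If res.filt from the question already exists, the same lists can be
# pulled out of it by component name.
res.filt <- lapply(myList, function(elm) {
  list(keep = elm[keepMe], droped = elm[!keepMe])
})
keepFromRes <- lapply(res.filt, `[[`, "keep")
```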

Thank you Martin, both answers are very well done.


I edited my answer with a simplification possible with S4Vectors 0.11.18.

Dear Michael,

Thanks again for taking the time to address my issue. However, I must avoid converting to NA, because there are important details in my specific problem that have to be taken into account: length(NA) is different from length(integer(0)), and that matters when summing lengths across vectors. I need length(integer(0)) in order to make efficient use of the geometric properties of the vector list. The form of myList closely mimics my real data set. Is it possible to make your solution more compatible and easier to use? Thank you so much.
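
The difference mentioned here can be seen in a few lines of base R. This is only an illustration with made-up vectors (x_empty and x_na are hypothetical names), not the poster's data.

```r
length(NA)          # 1
length(integer(0))  # 0

# The same three positions, with the missing one encoded two ways.
x_empty <- list(1L, integer(0), 2L)
x_na    <- list(1L, NA_integer_, 2L)

n_empty <- sum(lengths(x_empty))  # 2: the empty element contributes nothing
n_na    <- sum(lengths(x_na))     # 3: NA still counts as one element

# With the NA encoding, counting has to skip NA explicitly.
n_non_na <- sum(!is.na(unlist(x_na)))  # 2
```

So length-based sums only stay correct with integer(0); with NA, the equivalent count needs an explicit is.na() test.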