Feasibility of adapting IRanges to count concomitant drug therapy
glipori • 0
@glipori-9378

I’m working on a problem where I need to look at simultaneity across drug regimens. I’ve coded a solution in SQL. It works, but it is woefully inefficient. I was hoping to adapt the IRanges package to solve the problem instead. Since the package was designed with other uses in mind, I was wondering whether you could chime in on whether this is a reasonable thing to attempt.

By way of example, I need to count the maximum number of simultaneous drug regimens a patient was on over the period covered by the input file (for this use case, therapies count as simultaneous when they overlap for more than 30 days). The input looks like the synthetic example below, in which the subject was on 5 drugs simultaneously.

DRUGGROUP                IntervalStart  IntervalEnd  IntervalWidth
ACE_ARB                  7/18/2011      9/30/2015    1535
Aldosterone_antagonists  7/21/2011      9/30/2015    1532
Beta_Blockers            8/11/2011      12/3/2011    114
Beta_Blockers            10/19/2012     9/30/2015    1076
Dihydro_CCB              7/23/2011      9/30/2015    1530
Diuretics                7/21/2011      9/30/2015    1532

I truly appreciate any thoughts.


Hi,

It's not clear what you mean by "counting the maximum number of simultaneous drug regimens a patient was on". Let's say you've managed to store the above input in an IRanges object where the start and end are the IntervalStart and IntervalEnd counted in number of days since January 1st, 2011. The object would look something like this:

> therapies
IRanges of length 6
    start  end width
[1]   198 1732  1535
[2]   201 1732  1532
[3]   222  335   114
[4]   657 1732  1076
[5]   203 1732  1530
[6]   201 1732  1532

Note that you don't see it here but let's say that this object has a metadata column showing the drug group used on each time interval:

> mcols(therapies)
DataFrame with 6 rows and 1 column
                DRUGGROUP
              <character>
1                 ACE_ARB
2 Aldosterone_antagonists
3           Beta_Blockers
4           Beta_Blockers
5             Dihydro_CCB
6               Diuretics
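
For reference, here is a minimal sketch of how such an object could be built from the input table. The variable names (input, origin, start_day) are made up for illustration, and the ranges are built from IntervalStart and IntervalWidth, which is an assumption chosen because it reproduces the starts, ends and widths shown above:

```r
library(IRanges)

# The synthetic input from the question, typed in by hand.
input <- data.frame(
  DRUGGROUP = c("ACE_ARB", "Aldosterone_antagonists", "Beta_Blockers",
                "Beta_Blockers", "Dihydro_CCB", "Diuretics"),
  IntervalStart = c("7/18/2011", "7/21/2011", "8/11/2011",
                    "10/19/2012", "7/23/2011", "7/21/2011"),
  IntervalWidth = c(1535L, 1532L, 114L, 1076L, 1530L, 1532L),
  stringsAsFactors = FALSE
)

# Number of days between each start date and January 1st, 2011.
origin <- as.Date("2011-01-01")
start_day <- as.integer(as.Date(input$IntervalStart, format = "%m/%d/%Y")) -
  as.integer(origin)

# One range per therapy, with the drug group as a metadata column.
therapies <- IRanges(start = start_day, width = input$IntervalWidth)
mcols(therapies)$DRUGGROUP <- input$DRUGGROUP
therapies
```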

Can you clarify the output you would expect from "counting the maximum number of simultaneous drug regimens a patient was on"?

FWIW here are some basic operations you can do on this IRanges object. For example, you can get the number of drug regimens the patient was on at any given time with coverage():

> coverage(therapies)
integer-Rle of length 1732 with 7 runs
  Lengths:  197    3    2   19  114  321 1076
  Values :    0    1    3    4    5    4    5

The run lengths of this Rle object are numbers of days. A more user-friendly representation of this is with the following data.frame:

cvg <- coverage(therapies)
nb_drug_regimens <- runValue(cvg)
IntervalEnd <- as.Date("2011/1/1") + cumsum(runLength(cvg))
IntervalStart <- c(as.Date("2011/1/1"),
                   IntervalEnd[-length(IntervalEnd)] + 1L)

data.frame(IntervalStart, IntervalEnd, nb_drug_regimens)
#   IntervalStart IntervalEnd nb_drug_regimens
# 1    2011-01-01  2011-07-17                0
# 2    2011-07-18  2011-07-20                1
# 3    2011-07-21  2011-07-22                3
# 4    2011-07-23  2011-08-10                4
# 5    2011-08-11  2011-12-02                5
# 6    2011-12-03  2012-10-18                4
# 7    2012-10-19  2015-09-29                5
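
If the "overlapping for more than 30 days" criterion from the question is read as "the largest number k such that the patient was on at least k regimens for more than 30 consecutive days", then the coverage vector is enough to answer it. The sketch below follows that reading, which is only one possible interpretation. The names min_days, sustained and max_simultaneous are made up for illustration, and the code counts intervals rather than distinct drug groups:

```r
library(IRanges)

# Same ranges as the 'therapies' object above (days since 2011-01-01).
therapies <- IRanges(start = c(198L, 201L, 222L, 657L, 203L, 201L),
                     width = c(1535L, 1532L, 114L, 1076L, 1530L, 1532L))
cvg <- coverage(therapies)

min_days <- 30L                       # threshold taken from the question
levels <- seq_len(max(runValue(cvg))) # candidate numbers of regimens

# For each k, check whether some stretch with coverage >= k lasts
# longer than min_days. slice() merges adjacent runs that are all >= k.
sustained <- vapply(levels, function(k) {
  any(width(slice(cvg, lower = k)) > min_days)
}, logical(1))

# Largest k that passes the check, or 0 if none does.
max_simultaneous <- max(c(0L, levels[sustained]))
max_simultaneous
```

On the synthetic example this gives 5, which matches the number stated in the question.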

You can also use slice() on cvg to find the intervals of time when the patient was on a given number of drug regimens:

slice(cvg, lower=4, upper=4)  # exactly 4 drug regimens
# Views on a 1732-length Rle subject
#
# views:
#     start end width
# [1]   203 221    19 [4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4]
# [2]   336 656   321 [4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ...]

slice(cvg, lower=4)  # at least 4 drug regimens
# Views on a 1732-length Rle subject
#
# views:
#     start  end width
# [1]   203 1732  1530 [4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 ...]

Note that you can use start() and end() on the output of slice() to get the interval starts and ends, then add as.Date("2011/1/1") to them to turn them back into dates.
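
A minimal sketch of that conversion, using the same origin as in the data.frame above (the names origin, v and res are made up for illustration):

```r
library(IRanges)

# Same ranges as the 'therapies' object above (days since 2011-01-01).
therapies <- IRanges(start = c(198L, 201L, 222L, 657L, 203L, 201L),
                     width = c(1535L, 1532L, 114L, 1076L, 1530L, 1532L))
cvg <- coverage(therapies)
origin <- as.Date("2011-01-01")

# Stretches of time with exactly 4 drug regimens.
v <- slice(cvg, lower = 4, upper = 4)

# Shift the integer positions by the origin to get calendar dates.
res <- data.frame(IntervalStart = origin + start(v),
                  IntervalEnd   = origin + end(v),
                  days          = width(v))
res
```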

Cheers,

H.

@steve-lianoglou-2771

You might try changing your IntervalStart and IntervalEnd to date objects (POSIXct or whatever), and then further into "seconds after the unix epoch" -- I don't have the commands to do that handy right off the top of my head, but you can sort it out.

Using those values (seconds after unix epoch) as your start and ends would help fit this more cleanly into an "overlaps query" scenario with IRanges.
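
As a rough sketch of what that could look like: since the input has daily resolution, the code below uses days since the Unix epoch (as.integer() on a Date) rather than seconds, which is a substitution on my part. It then asks findOverlaps() for pairs of therapies that overlap by at least 31 days, i.e. more than 30. The variable names are made up for illustration, and the ranges are built from IntervalStart and IntervalWidth as an assumption:

```r
library(IRanges)

# Start dates and widths from the synthetic example in the question.
start_date <- as.Date(c("7/18/2011", "7/21/2011", "8/11/2011",
                        "10/19/2012", "7/23/2011", "7/21/2011"),
                      format = "%m/%d/%Y")
width_days <- c(1535L, 1532L, 114L, 1076L, 1530L, 1532L)

# as.integer() on a Date gives the number of days since 1970-01-01.
therapies_epoch <- IRanges(start = as.integer(start_date), width = width_days)

# Self-overlaps with a minimum overlap of 31 days; drop.self and
# drop.redundant keep each unordered pair of therapies only once.
hits <- findOverlaps(therapies_epoch, minoverlap = 31L,
                     drop.self = TRUE, drop.redundant = TRUE)
hits
```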

You might also want to consider looking at the lubridate package and its vignette for working with dates.

Lastly, the data.table package also supports overlaps-like queries through its `foverlaps` function, and I know it has some handy facilities that people working with times find useful, such as rolling joins.
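
A minimal, untested-on-real-data sketch of an `foverlaps` self-join on the synthetic example. The column names (start, end, width, id) and the choice of integer day numbers are assumptions made for illustration, and the overlap length is computed by hand after the join:

```r
library(data.table)

dt <- data.table(
  DRUGGROUP = c("ACE_ARB", "Aldosterone_antagonists", "Beta_Blockers",
                "Beta_Blockers", "Dihydro_CCB", "Diuretics"),
  start = as.integer(as.Date(c("7/18/2011", "7/21/2011", "8/11/2011",
                               "10/19/2012", "7/23/2011", "7/21/2011"),
                             format = "%m/%d/%Y")),
  width = c(1535L, 1532L, 114L, 1076L, 1530L, 1532L)
)
dt[, end := start + width - 1L]  # inclusive end, in days since 1970-01-01
dt[, id := .I]                   # row id, used to keep each pair once

# foverlaps() needs the second table keyed on its interval columns.
setkey(dt, start, end)
ov <- foverlaps(dt, dt, type = "any", nomatch = NULL)

# Keep each unordered pair once and require more than 30 days of overlap.
pairs <- ov[id < i.id &
            (pmin(end, i.end) - pmax(start, i.start) + 1L) > 30L]
pairs[, .(DRUGGROUP, i.DRUGGROUP)]
```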

I'm intentionally not providing any solutions here because I don't actually work with dates much, so I can't help you with exact code. I'm just pointing to things that should help you figure out how to get this done yourself.

Update

Arun has a presentation he did on `foverlaps` with data.table, and apparently you can use POSIXct values directly, so you might want to look at that some more. You can find that PDF here.
