Question: Plotting with NA values in Gviz (generic function from plot)
3.0 years ago by
Brazil

I would like to plot a DataTrack (generic function from plot) with the types shown in the script below. My real data contain many NA values, so I modified the Gviz package example to reproduce my problem.

First, this is what the plot looks like without NAs:

library(Gviz)

data(twoGroups)

## Plot data without NAs
dTrack <- DataTrack(twoGroups, name = "uniform")
tiff("Gviz_original.tiff", units="in", width=11, height=8.5, res=200, compress="lzw")
plotTracks(dTrack, groups = rep(c("control", "treated"), each = 3),
           type = c("a", "p", "confint"))
graphics.off()

Now, the plot with NAs:

## Convert to a data frame
df <- as.data.frame(twoGroups)

## Introduce NAs so the data resemble my real data
df[ df <= 0 ] <- NA
## Drop the width and strand columns (columns 4 and 5)
df <- df[, -(4:5)]
names(df) <- c("chr", "start", "end", "control", "control.1", "control.2", "treated", "treated.1", "treated.2")

## Plot with NA
df <- makeGRangesFromDataFrame(df, keep.extra.columns = TRUE)
dftrack <- DataTrack(df, name = "uniform")
tiff("Gviz_NA.tiff", units="in", width=11, height=8.5, res=200, compress="lzw")
plotTracks(dftrack, groups = rep(c("control", "treated"), each = 3),
           type = c("a", "p", "confint"))
graphics.off()

I am aware that the two plots now come from different data. However, even in the plot with NAs I expect lines that follow the group means (even if a mean is estimated from just 1 animal!). Any ideas why the lines are not completely there? Thank you!

3.0 years ago by
United States
James W. MacDonald wrote:

Note that plotTracks() has an ellipsis (...) argument, which allows you to pass arbitrary arguments to lower-level functions (so long as they have ellipsis arguments as well). For example, the mean() function defaults to na.rm = FALSE, so any vector containing NA observations has a mean of NA as well.

> mean(c(1,2,NA,4,5,6))
[1] NA
> mean(c(1,2,NA,4,5,6), na.rm = TRUE)
[1] 3.6
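The forwarding through the ellipsis can be illustrated with a minimal base R sketch. The helper row_means() is hypothetical and made up for this illustration; it is not a Gviz function, and it only shows the general mechanism by which an na.rm argument travels down to mean().

```r
# Hypothetical helper: summarise each row of a matrix, forwarding any
# extra arguments (such as na.rm) to mean() through '...'.
row_means <- function(x, ...) {
  apply(x, 1, function(v) mean(v, ...))
}

m <- rbind(c(1, 2, NA), c(4, 5, 6))
row_means(m)                # first row is NA because na.rm defaults to FALSE
row_means(m, na.rm = TRUE)  # first row is 1.5, second row is 5
```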

If you add na.rm = TRUE to your call to plotTracks(), the means are calculated even when there are NA observations. But since you need more than one observation to get a CI, you get only the lines, not the CI bands.

plotTracks(dftrack, groups = rep(c("control", "treated"), each = 3),
           type = c("a", "p", "confint"), na.rm = TRUE)

Edit: It is more likely that the code that computes the confidence intervals has no 'na.rm' argument, and that this, rather than the number of non-NA observations, is why you don't get a CI band.
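To see what an NA-aware interval would look like outside of Gviz, here is a minimal base R sketch. The helper ci_na_rm() is hypothetical, the t-based 95% interval is an assumption and not necessarily the calculation Gviz uses for type "confint", and the values are made up.

```r
# Hypothetical helper, not part of Gviz: mean and t-based confidence
# interval for one vector, ignoring NA values.
ci_na_rm <- function(v, level = 0.95) {
  v <- v[!is.na(v)]
  n <- length(v)
  m <- if (n > 0) mean(v) else NA_real_
  if (n < 2) {
    # A single observation gives a mean but no estimate of spread
    return(c(mean = m, lower = NA_real_, upper = NA_real_))
  }
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(mean = m, lower = m - half, upper = m + half)
}

# One row per genomic position, one column per sample (made-up values)
vals <- rbind(c(1, 2, NA), c(NA, NA, 3), c(4, 5, 6))
res <- t(apply(vals, 1, ci_na_rm))
res
```

In this sketch a position with a single non-NA value still gets a mean, but its interval bounds stay NA, so a band could only be drawn where at least two observations remain.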

Thank you very much, your solution worked very well!
However, regarding the CI computation, is there any way to pass a similar 'na.rm' argument?