Question

bumphunter with multiple covariates

Asma rabe ▴ 290

@asma-rabe-4697

Last seen 6.2 years ago

Japan

Hi,

I have used the bumphunter function to compare two populations (control and disease, encoded in a status column) with the following design matrix:

designMatrix <- model.matrix(~ status)

Within each of the control and disease populations, I have subpopulations treated with different drugs.

I would like to identify DMRs between control and disease while taking these conditions (drugs) into account.

I made the design matrix as follows:

designMatrix <- model.matrix(~ status+condition)

I found that the output is very similar. Is it possible to use bumphunter to identify DMRs between disease and normal while taking the different treatments into account?

Thank you.

bumphunter • 1.5k views

ADD COMMENT • link 7.7 years ago Asma rabe ▴ 290

Thank you James.

I have two questions.

1) If I have more than two conditions, will it be

(Drug1_disease - Drug1_control) - (Drug2_disease - Drug2_control) - (Drug3_disease - Drug3_control) - (Drug4_disease - Drug4_control)?

2) Regarding the original design matrix, which has only the disease/control status:

designMatrix <- model.matrix(~status)

I want to confirm that the reference (control or disease) is determined by the order of the data in the status vector. If the vector starts with disease, is disease the reference, so that the coefficient is

methylation_control - methylation_disease?

Thank you very much

ADD COMMENT • link 7.7 years ago Asma rabe ▴ 290

Please don't use the answer box to post another question! It's intended for people to provide answers, hence the name.

1.) That is something you could do, if you want to test that any of the four drugs has a different effect. At this point we are just doing simple algebra. You are testing the hypotheses:

H0: (Drug1_disease - Drug1_control) - (Drug2_disease - Drug2_control) - (Drug3_disease - Drug3_control) - (Drug4_disease - Drug4_control) = 0

HA: (Drug1_disease - Drug1_control) - (Drug2_disease - Drug2_control) - (Drug3_disease - Drug3_control) - (Drug4_disease - Drug4_control) != 0

So what would it mean if you rejected the null hypothesis?
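One way to explore that question is to plug numbers into the contrast. The sketch below is only an illustration in base R: the per-drug differences, the weights vector `w`, and the `pheno4` sample sheet are all made up for the example. It shows that this single contrast can be non-zero when every drug has the same effect, and zero when the drugs clearly differ. It also shows that a four-level condition in `~ status * condition` yields three interaction columns, each comparing one drug against the reference drug.

```r
# Weights of the contrast as written: D1 - D2 - D3 - D4,
# where Dk = (Drugk_disease - Drugk_control).
w <- c(1, -1, -1, -1)

# Hypothetical case 1: every drug shows the same disease-vs-control difference.
d_same <- c(Drug1 = -0.1, Drug2 = -0.1, Drug3 = -0.1, Drug4 = -0.1)
contrast_same <- sum(w * d_same)   # non-zero although the effects are identical

# Hypothetical case 2: the drugs differ, yet the contrast cancels out.
d_diff <- c(Drug1 = -0.6, Drug2 = -0.1, Drug3 = -0.2, Drug4 = -0.3)
contrast_diff <- sum(w * d_diff)   # zero up to floating-point error

# With four drugs, the interaction model has one interaction column per
# non-reference drug (made-up sample sheet, one row per status/drug cell).
pheno4 <- expand.grid(status = c("control", "disease"),
                      condition = paste0("Drug", 1:4))
design4 <- model.matrix(~ status * condition, data = pheno4)
interactionCols <- grep(":", colnames(design4), fixed = TRUE)
colnames(design4)[interactionCols]
```

Each of those interaction columns could be examined separately, which is usually easier to interpret than one combined contrast.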

2.) This is something you can figure out for yourself. How do you determine the order of factors? Is it the order of the vector supplied (as you assume), or something else? If you are planning to use R long term, you need to learn to figure out simple things like that yourself. There is only a certain amount of goodwill that is extended to people on forums such as this. If you ask simple questions that are easily checked (or googled) yourself, you run the risk of appearing like you are taking advantage of those who would help you, and they will stop doing so.

ADD REPLY • link 7.7 years ago James W. MacDonald 65k

Thank you James.

The reference level is the first element of levels() of the factor, which by default is the value that sorts first numerically or alphabetically. It does not depend on the order of the data in the factor object.
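This can be checked in a few lines of base R. The sketch below uses a made-up `status` vector that deliberately starts with "disease"; the variable names `f` and `f2` are only for illustration.

```r
# Made-up status vector whose first element is "disease".
status <- c("disease", "control", "disease", "control")

# factor() sorts the unique values, so "control" becomes the first level.
f <- factor(status)
levels(f)

# The design matrix column is therefore disease relative to control.
colsDefault <- colnames(model.matrix(~ f))

# relevel() changes the reference explicitly when the other direction is wanted.
f2 <- relevel(f, ref = "disease")
colsReleveled <- colnames(model.matrix(~ f2))
```

With the default levels the second column is `fdisease`, so the coefficient is disease minus control. After `relevel()` it is `f2control`, that is, control minus disease.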

Thanks for the advice.

ADD REPLY • link 7.7 years ago Asma rabe ▴ 290

score 1 · Accepted Answer · 2016-08-18

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 2 hours ago

United States

To do that you need an interaction term, so you would do

designMatrix <- model.matrix(~status*condition)

And the interaction term will be the last column of your design matrix.
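As a minimal sketch of what the two designs look like, assuming a made-up sample sheet `pheno` with two statuses, two drugs, and two replicates per cell (none of these names come from the original data):

```r
# Hypothetical sample sheet: 2 statuses x 2 drugs, 2 replicates each.
pheno <- expand.grid(
  replicate = 1:2,
  condition = c("Drug1", "Drug2"),
  status    = c("control", "disease")
)
pheno$status    <- factor(pheno$status,    levels = c("control", "disease"))
pheno$condition <- factor(pheno$condition, levels = c("Drug1", "Drug2"))

# Additive model: status effect adjusted for drug.
designAdditive <- model.matrix(~ status + condition, data = pheno)

# Interaction model: lets the status effect differ between drugs.
designInteraction <- model.matrix(~ status * condition, data = pheno)

colnames(designAdditive)
colnames(designInteraction)

# Index of the interaction column (the last one in this 2 x 2 layout).
interactionCol <- ncol(designInteraction)
```

In both designs the second column is `statusdisease`. bumphunter picks the column to test through its `coef` argument, which defaults to 2, so running either design with the default tests the status term. That may be why the two earlier runs looked so similar. To test the interaction, the column index would have to be passed explicitly, for example `coef = interactionCol`.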

ADD COMMENT • link 7.7 years ago James W. MacDonald 65k

Thank you James.

Using the coefficient, which is the value column in the bumphunter output table, the sign tells whether a region is hypo- or hypermethylated in disease relative to normal. How can I infer which regions are hypomethylated in disease under one condition relative to the other condition?

Would you please explain a bit how the coefficient (value) is calculated?

Thank you again for your help.

ADD REPLY • link 7.7 years ago Asma rabe ▴ 290

It's a bit complicated for an interaction term. As you note already, for the design matrix you originally specified, the sign of the coefficient indicates up or down methylation. But the interaction is computed as:

(Drug1_disease - Drug1_control) - (Drug2_disease - Drug2_control)

And a positive sign simply indicates that the term in the first set of parentheses is larger than the term in the second set of parentheses. The methylation may be lower in disease vs control in both instances (just less so when treating with Drug1), and you would still get a positive sign. So the only way to interpret a coefficient for an interaction is to either look at the underlying data, or to do plots.
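A small worked example of that situation, using made-up mean methylation values for the four groups (the `cells` data frame and its numbers are purely hypothetical):

```r
# Hypothetical mean methylation per group: lower in disease under both drugs,
# but less so under Drug1.
cells <- data.frame(
  status    = factor(c("control", "disease", "control", "disease"),
                     levels = c("control", "disease")),
  condition = factor(c("Drug1", "Drug1", "Drug2", "Drug2"),
                     levels = c("Drug1", "Drug2")),
  meth      = c(0.80, 0.70, 0.80, 0.50)
)

# Disease-minus-control difference within each drug.
d1 <- 0.70 - 0.80   # Drug1: about -0.10
d2 <- 0.50 - 0.80   # Drug2: about -0.30

# Interaction in the order written above: positive although both are negative.
interactionAsWritten <- d1 - d2

# The same quantity as estimated by R's default treatment coding, where
# Drug1 is the reference level: this is the Drug2 difference minus the
# Drug1 difference, so the sign is flipped.
fit <- lm(meth ~ status * condition, data = cells)
interactionR <- unname(coef(fit)["statusdisease:conditionDrug2"])
```

Here `interactionAsWritten` is about 0.2 and `interactionR` is about -0.2. The sign therefore also depends on which drug is the reference level of the factor, which is one more reason to check the group means or plots before interpreting it.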

I personally like to use Gviz for that sort of thing, as you can easily plot any region with differential methylation (genomic region I mean), along with the methylation status of each group.

ADD REPLY • link 7.7 years ago James W. MacDonald 65k