Design/contrast matrix for Multivariate experimental design in limma
@1234anjalianjali1234-10835

Hello,

I have four two-color array chips (GenePix .gpr files) covering different time points and infecting organisms, with dye swaps and, in some cases, biological replicates. I am finding it complicated to build the design/contrast matrix needed to find differentially expressed genes.

I have normalized all of these chips and also merged them with the InSilicoDb package using the ComBat method. Could anyone help me make a design matrix for these chips in combined form?

The sample descriptions, with Cy3/Cy5 information, are given below. [WT = Wounded and Treated; T = Treated; CDLV = Carborundum-dusted leaves treated with Virus; CDLW = Carborundum-dusted leaves treated with Water]

 

Sample_title             Organism    Geo      protocol_1  Label_1  protocol_2  Label_2
ctrl_vs_race0_Scjm_72h_1    Fungus   GSM200000  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Scjm_72h_2    Fungus   GSM200001  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Scjm_72h_3    Fungus   GSM200002  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Scjm_96h_1    Fungus   GSM200003  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Scjm_96h_2    Fungus   GSM200004  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_72h_1    Fungus   GSM200005  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_72h_2    Fungus   GSM200006  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_72h_3    Fungus   GSM200007  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_96h_1    Fungus   GSM200008  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_96h_2    Fungus   GSM200009  Treated      Cy3    Control       Cy5
ctrl_vs_race0_Stbr_96h_3    Fungus   GSM200010  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_72h_1  Fungus   GSM200011  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_72h_2  Fungus   GSM200012  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_72h_3  Fungus   GSM200013  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_96h_1  Fungus   GSM200014  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_96h_2  Fungus   GSM200015  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Scjm_96h_3  Fungus   GSM200016  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_72h_1  Fungus   GSM200017  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_72h_2  Fungus   GSM200018  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_72h_3  Fungus   GSM200019  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_96h_1  Fungus   GSM200020  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_96h_2  Fungus   GSM200021  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_96h_3  Fungus   GSM200022  Treated      Cy3    Control       Cy5
ctrl_vs_complex_Stbr_72h_1FD Fungus  GSM200023  Treated      Cy5    Control       Cy3

Sample                   Organism     Geo    protocol_1  Label_1  protocol_2 label_2
ctrl_vs_1h_infested1           Insect GSM200024  T 1hr     Cy3   Control 4 hr  Cy5
ctrl_vs_1h_infested2           Insect GSM200025  T 1hr     Cy3   Control 4 hr  Cy5
ctrl_vs_1h_infested3           Insect GSM200026  T 1 hr    Cy3   Control 4 hr  Cy5
ctrl_vs_1h_infested1swap       Insect GSM200027  T 1hr     Cy5   Control 4 hr  Cy3
ctrl_vs_1h_infested2swap       Insect GSM200028  T 1hr     Cy5   Control 4 hr  Cy3
ctrl_vs_1h_infested3swap       Insect GSM200029  T 1 hr    Cy5   Control 4 hr  Cy3
wounded_vs_wounded_spit1       Insect GSM200030  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wounded_vs_wounded_spit2       Insect GSM200031  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wounded_vs_wounded_spit3       Insect GSM200032  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wounded_vs_wounded_spit1swap   Insect GSM200033  WT 4 hr   Cy5   Wounded 4 hr  Cy3
wounded_vs_wounded_spit2swap   Insect GSM200034  WT 4 hr   Cy5   Wounded 4 hr  Cy3
wounded_vs_wounded_spit3swap   Insect GSM200035  WT 4 hr   Cy5   Wounded 4 hr  Cy3
ctrl_vs_4h_infested1           Insect GSM200036  T 4 hr    Cy3   Control 4 hr  Cy5
ctrl_vs_4h_infested2           Insect GSM200037  T 4 hr    Cy3   Control 4 hr  Cy5
ctrl_vs_4h_infested3           Insect GSM200038  T 4 hr    Cy3   Control 4 hr  Cy5
ctrl_vs_4h_infested1swap       Insect GSM200039  T 4 hr    Cy5   Control 4 hr  Cy3
ctrl_vs_4h_infested2swap       Insect GSM200040  T 4 hr    Cy5   Control 4 hr  Cy3
ctrl_vs_4h_infested3swap       Insect GSM200041  T 4 hr    Cy5   Control 4 hr  Cy3
wd_systemic_vs_spit_systemic1  Insect GSM200042  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wd_systemic_vs_spit_systemic2  Insect GSM200043  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wd_systemic_vs_spit_systemic3  Insect GSM200044  WT 4 hr   Cy3   Wounded 4 hr  Cy5
wd_systemic_vs_spit_systemic1s Insect GSM200045  WT 4 hr   Cy5   Wounded 4 hr  Cy3
wd_systemic_vs_spit_systemic2s Insect GSM200046  WT 4 hr   Cy5   Wounded 4 hr  Cy3
wd_systemic_vs_spit_systemic3s Insect GSM200047  WT 4 hr   Cy5   Wounded 4 hr  Cy3


Sample_title              Organism    Geo     protocol_1  Label_1  protocol_2   label_2
non_GM_D_vs_GM_A_Pinf_1    Fungus  GSM200048  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_1   Fungus  GSM200049  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_1    Fungus  GSM200050  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_1   Fungus  GSM200051  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_1    Fungus  GSM200052  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_1   Fungus  GSM200053  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_Pinf_2    Fungus  GSM200054  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_2   Fungus  GSM200055  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_2    Fungus  GSM200056  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_2   Fungus  GSM200057  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_2    Fungus  GSM200058  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_2   Fungus  GSM200059  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_Pinf_3    Fungus  GSM200060  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_3   Fungus  GSM200061  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_3    Fungus  GSM200062  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_3   Fungus  GSM200063  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_3    Fungus  GSM200064  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_3   Fungus  GSM200065  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_water_3FD Fungus  GSM200066  Control_24h  Cy5  Control_24h  Cy3

Sample_title                Organism    Geo    treatment_1  Label_1 treatment_2 Label_2
mock_vs_pvy_infected_1dpi_1.1  Virus GSM200067 CDLV_1dpi    Cy3 CDLW_1dpi    Cy5
mock_vs_pvy_infected_1dpi_1.2  Virus GSM200068 CDLV_1dpi    Cy3 CDLW_1dpi    Cy5
mock_vs_pvy_infected_1dpi_2.1  Virus GSM200069 CDLV_1dpi    Cy3 CDLW_1dpi    Cy5
mock_vs_pvy_infected_1dpi_2.2  Virus GSM200070 CDLV_1dpi    Cy3 CDLW_1dpi    Cy5
mock_vs_pvy_infected_3dpi_1.1  Virus GSM200071 CDLV_3dpi    Cy3 CDLW_3dpi    Cy5
mock_vs_pvy_infected_3dpi_1.2  Virus GSM200072 CDLV_3dpi    Cy3 CDLW_3dpi    Cy5
mock_vs_pvy_infected_3dpi_1_FD Virus GSM200073 CDLV_3dpi    Cy5 CDLW_3dpi    Cy3
mock_vs_pvy_infected_3dpi_2.1  Virus GSM200074 CDLV_3dpi    Cy3 CDLW_3dpi    Cy5
mock_vs_pvy_infected_3dpi_2.2  Virus GSM200075 CDLV_3dpi    Cy3 CDLW_3dpi    Cy5
mock_vs_pvy_infected_6dpi_1.1  Virus GSM200076 CDLV_6dpi    Cy3 CDLW_6dpi    Cy5
mock_vs_pvy_infected_6dpi_1.2  Virus GSM200077 CDLV_6dpi    Cy3 CDLW_6dpi    Cy5
mock_vs_pvy_infected_6dpi_2.1  Virus GSM200078 CDLV_6dpi    Cy3 CDLW_6dpi    Cy5
mock_vs_pvy_infected_6dpi_2.2  Virus GSM200079 CDLV_6dpi    Cy3 CDLW_6dpi    Cy5
mock_vs_pvy_systemic_1dpi_1.1  Virus GSM200080 Treated_1dpi Cy3 Treated_1dpi Cy5
mock_vs_pvy_systemic_1dpi_1.2  Virus GSM200081 Treated_1dpi Cy3 Treated_1dpi Cy5
mock_vs_pvy_systemic_1dpi_2.1  Virus GSM200082 Treated_1dpi Cy3 Treated_1dpi Cy5
mock_vs_pvy_systemic_1dpi_2.2  Virus GSM200083 Treated_1dpi Cy3 Treated_1dpi Cy5
mock_vs_pvy_systemic_3dpi_1.1  Virus GSM200084 Treated_3dpi Cy3 Treated_3dpi Cy5
mock_vs_pvy_systemic_3dpi_1.2  Virus GSM200085 Treated_3dpi Cy3 Treated_3dpi Cy5
mock_vs_pvy_systemic_3dpi_1_FD Virus GSM200086 Treated_3dpi Cy5 Treated_3dpi Cy3
mock_vs_pvy_systemic_3dpi_2.1  Virus GSM200087 Treated_3dpi Cy3 Treated_3dpi Cy5
mock_vs_pvy_systemic_3dpi_2.2  Virus GSM200088 Treated_3dpi Cy3 Treated_3dpi Cy5
mock_vs_pvy_systemic_6dpi_1.1  Virus GSM200089 Treated_6dpi Cy3 Treated_6dpi Cy5
mock_vs_pvy_systemic_6dpi_1.2  Virus GSM200090 Treated_6dpi Cy3 Treated_6dpi Cy5
mock_vs_pvy_systemic_6dpi_2.1  Virus GSM200091 Treated_6dpi Cy3 Treated_6dpi Cy5
mock_vs_pvy_systemic_6dpi_2.2  Virus GSM200092 Treated_6dpi Cy3 Treated_6dpi Cy5
 
microarray limma design and contrast matrix

Cross posted on Biostars: https://www.biostars.org/p/309475/


Sorry for the inconvenience. I am getting replies on Biostars and hope to find my solution there.

Thank you

@gordon-smyth
WEHI, Melbourne, Australia

Making the design matrix is easy. If you are doing a classical two-color analysis of log-ratios, then you just give the Cy3 and Cy5 columns to modelMatrix() and it forms the design matrix for you. If you are doing a separate channel analysis using lmscFit() as described here:

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-165

then you use targetsA2C() to expand the Cy3 and Cy5 columns into a single column and then use model.matrix() as you would do for any single channel analysis.
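As a rough illustration of the two routes described above, the sketch below builds both kinds of design matrix from a small, made-up targets frame. The file names, the `Treated`/`Control` labels and the choice of `Control` as reference are assumptions for illustration only, not the poster's actual data; the functions `modelMatrix()`, `targetsA2C()` and `model.matrix()` are the ones named in this answer.

```r
library(limma)

# Hypothetical targets frame: three arrays plus one dye swap (array4).
targets <- data.frame(
  FileName = paste0("array", 1:4, ".gpr"),   # made-up file names
  Cy3 = c("Treated", "Treated", "Treated", "Control"),
  Cy5 = c("Control", "Control", "Control", "Treated"),
  row.names = paste0("array", 1:4),
  stringsAsFactors = FALSE
)

# Route 1: log-ratio analysis.
# Each coefficient is a log-ratio relative to `ref`. limma's M-values are
# log2(Cy5/Cy3), so the dye-swapped array gets the opposite sign.
design <- modelMatrix(targets, ref = "Control", verbose = FALSE)
print(design)

# Route 2: separate channel analysis.
# targetsA2C() converts the array-wise frame (one row per array) into a
# channel-wise frame (one row per channel) with a single Target column.
targets2 <- targetsA2C(targets)
f <- factor(targets2$Target, levels = c("Control", "Treated"))
design2 <- model.matrix(~0 + f)
colnames(design2) <- levels(f)
print(design2)
```

With real data, `design` would be passed to `lmFit()` together with the normalized `MAList`, while `design2` would be passed to `lmscFit()` along with an intra-spot correlation estimated by `intraspotCorrelation()`.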

There are however quite a few puzzling things about your post that made me hesitant to reply to it:

  1. You say that you have used the InSilicoDb package, but that package was removed from Bioconductor long ago. Even when it did exist, it was only for single channel platforms (Affy, Illumina & RNA-seq), not for two-color arrays such as you have here. The InSilico DB service that the package was designed to work with has also closed.
  2. You give GEO GSM identifiers, but the sample information you show here does not match the information given by GEO for the same IDs. The experiments corresponding to these GSM IDs on GEO could not possibly be merged in any sensible way.
  3. You say that you've batch corrected using ComBat, but there are several reasons why that is problematic. First, it makes no sense to batch correct before you've formed a design matrix. Second, ComBat is not designed for two-color data. Third, it isn't a statistically valid approach to batch correct data and then to do a DE analysis as if the data had not been batch corrected. Finally, it isn't possible to batch correct when the treatment conditions are completely confounded with the batches, at least not without negative control probes.
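The confounding problem in the last point can be checked before any modelling. The following is a minimal sketch in base R using a made-up sample sheet; the `batch` and `condition` labels are hypothetical and stand in for "which study an array came from" and "which infecting organism it measures".

```r
# Hypothetical sample sheet: every condition occurs in exactly one batch.
samples <- data.frame(
  batch     = rep(c("study1", "study2", "study3"), each = 4),
  condition = rep(c("fungus", "insect", "virus"), each = 4)
)

# Cross-tabulate batch against condition. With complete confounding,
# each row and each column has only one non-zero cell.
print(table(samples$batch, samples$condition))

# A model containing both terms then has fewer independent columns than
# coefficients, so batch and condition effects cannot be separated.
X <- model.matrix(~ batch + condition, data = samples)
rank_X <- qr(X)$rank
cat("columns:", ncol(X), " rank:", rank_X, "\n")
```

When the rank is lower than the number of columns, no batch-correction method can tell a batch effect apart from a condition effect using these samples alone.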

All in all, I have a lot of reservations about the analysis you are doing. It would seem that people on Biostars are telling you the same thing.

 


Hello Smyth,

Here are the answers to your questions:

1. To use the InSilicoDb package I was running an older version of R (R 3.2.3), but I was not aware that the package is not applicable to two-color arrays.

2. You cannot find those identifiers in GEO because I changed the IDs. If you need the original IDs, I can provide them. They all belong to the same GPL platform.

3. I realize that your point is valid.

Now I am doing a meta-analysis using these 4 chips and more to increase my sample size.


I was wondering how we could make the design matrix for the chip given below:

 

Sample_title              Organism    Geo     protocol_1  Label_1  protocol_2   label_2
non_GM_D_vs_GM_A_Pinf_1    Fungus  GSM200048  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_1   Fungus  GSM200049  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_1    Fungus  GSM200050  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_1   Fungus  GSM200051  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_1    Fungus  GSM200052  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_1   Fungus  GSM200053  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_Pinf_2    Fungus  GSM200054  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_2   Fungus  GSM200055  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_2    Fungus  GSM200056  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_2   Fungus  GSM200057  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_2    Fungus  GSM200058  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_2   Fungus  GSM200059  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_Pinf_3    Fungus  GSM200060  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_D_vs_GM_A_water_3   Fungus  GSM200061  Control_24h  Cy3  Control_24h  Cy5
non_GM_E_vs_GM_B_Pinf_3    Fungus  GSM200062  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_E_vs_GM_B_water_3   Fungus  GSM200063  Control_24h  Cy3  Control_24h  Cy5
non_GM_F_vs_GM_C_Pinf_3    Fungus  GSM200064  Treated_24h  Cy3  Treated_24h  Cy5
non_GM_F_vs_GM_C_water_3   Fungus  GSM200065  Control_24h  Cy3  Control_24h  Cy5
non_GM_D_vs_GM_A_water_3FD Fungus  GSM200066  Control_24h  Cy5  Control_24h  Cy3

 

I am unable to find this kind of experiment in the limma User's Guide. Or should I treat it as a one-color array?

Thank you


This type of experiment is pretty standard for limma two-color analyses. You form the design matrix using modelMatrix(), as I said in my answer. No, it would be incorrect to treat it as one-color, and the correct analysis is easier anyway.
