Need help: no MTC possible

0

Entering edit mode

suparna mitra ▴ 290

@suparna-mitra-5328

Last seen 9.7 years ago

Hi all, I have been working in a project where I have Affymetrix Hgene 1.0 St V1 data. And I have tree groups of patients having 6 samples each. I tried to perform rma normalization and to filter my data based on expression values 20%. After that went for unpaired t-test to test each two combination of groups. But the problem is my data is extremely variable. I have tried to filter my genes based on variance and/or CV before testing, to try to reduce the number of genes entering your test and multiple correction. But with different reasonable filtering also I am with no luck. And I don't have the option to increase sample size of my project. Further I tried to check for the bad samples and bad probes from experimentand remove outlier if these are not of interest. Still the same when run t-test (and other possible test like Mann-Whitney) with MTC there are no genes. On the other hand if I go on with out MTC and select a good p value cutoff and reasonable fold change I get a list of significant gene which may be good or reasonable for my study. but the problem is I somehow need to justify the method for my finding. Do you know any study or paper where anybody has treated their data without MTC? My main concern is if I find a good story matching biological prospective, would it be anyhow possible to justify the method without MTC? Thanks a lot, Suparna. [[alternative HTML version deleted]]

Normalization GO Normalization GO • 1.4k views

ADD COMMENT • link updated 11.6 years ago by James W. MacDonald 65k • written 11.6 years ago by suparna mitra ▴ 290

0

Entering edit mode

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 22 hours ago

United States

Hi Suparna, On 10/15/2012 7:01 AM, suparna mitra wrote: > Hi all, > I have been working in a project where I have Affymetrix Hgene 1.0 St V1 > data. And I have tree groups of patients having 6 samples each. I tried to > perform rma normalization and to filter my data based on expression values > 20%. After that went for unpaired t-test to test each two combination of > groups. But the problem is my data is extremely variable. > I have tried to filter my genes based on variance and/or CV before testing, > to try to reduce the number of genes entering your test and multiple > correction. But with different reasonable filtering also I am with no > luck. And I don't have the option to increase sample size of my project. > Further I tried to check for the bad samples and bad probes from > experimentand remove outlier if these are not of interest. Still the same > when run t-test (and other possible test like Mann-Whitney) with MTC there > are no genes. > On the other hand if I go on with out MTC and select a good p value cutoff > and reasonable fold change I get a list of significant gene which may be > good or reasonable for my study. but the problem is I somehow need to > justify the method for my finding. Do you know any study or paper where > anybody has treated their data without MTC? > My main concern is if I find a good story matching biological prospective, > would it be anyhow possible to justify the method without MTC? It's not clear to me what you are doing here - when you filter on variance are you keeping or removing the high variability genes (keeping, I hope)? I am also not sure what MTC stands for - is this multiple test correction? Anyway, assuming I have things correct, some suggestions. First, you might want to use array weights when fitting your model. If you have a lot of intra-group variability, this will tend to help. Second, the t-statistic is the universally most powerful test (assuming the underlying data are relatively hump-shaped), so going to a non-parametric test will usually reduce rather than increase power to detect differences. Third, univariate tests are arguably not the most sophisticated way of analyzing expression data, and you might get better (or at least more satisfactory) results if you instead looked at analyzing for groups of genes rather than individually. Depending on your experiment, you could accomplish this task with a gene set analysis (there are multiple ways of doing this - perhaps the easiest being romer() and roast() in limma), or if you have phenotypic data, especially continuous measures, a WGCNA analysis might be of some use. Best, Jim > Thanks a lot, > Suparna. > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD COMMENT • link 11.6 years ago James W. MacDonald 65k

0

Entering edit mode

Dear James, Thanks for your suggestion. I was reading arrayWeights package in limma. But being novice in bioC I have one confusion. Should I perform arrayWeights on normalized (rmaOligo) expression data or on the raw data? This is what i have done so far: > sessionInfo() R version 2.15.1 (2012-06-22) Platform: i386-apple-darwin9.8.0/i386 (32-bit) locale: [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] statmod_1.4.15 limma_3.12.1 annotate_1.34.1 hugene10stprobeset.db_8.0.1 org.Hs.eg.db_2.7.1 [6] BiocInstaller_1.4.7 affycoretools_1.28.0 KEGG.db_2.7.1 GO.db_2.7.1 AnnotationDbi_1.18.1 [11] affy_1.34.0 Biobase_2.16.0 BiocGenerics_0.2.0 pd.hugene.1.0.st.v1_3.6.0 RSQLite_0.11.1 [16] DBI_0.2-5 oligo_1.20.4 oligoClasses_1.18.0 rmaOligoinvivo = oligo::rma(InVivodat1) Background correcting Normalizing Calculating Expression > maplot(rmaOligoinvivo) >hist(rmaOligoinvivo) > InVivoTargets=readTargets("~/Desktop/Recent/Liverpool-work- related/Micro_RawData/InVivoTargets.txt") > InVivoTargets FileName Treatment 1 MC1 A 2 MC2 A 3 MC3 A 4 MC4 A 5 MC5 A 6 MC6 A 7 MC7 R 8 MC8 R 9 MC9 R 10 MC10 R 11 MC11 R 12 MC12 R 13 MC13 T 14 MC14 T 15 MC15 T 16 MC16 T 17 MC17 T 18 MC18 T f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", "R", "T")) design.invivo <- model.matrix(~0 + f.invivo) > colnames(design.invivo) <- c("A", "R", "T") > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo) > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = design.invivo) > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo) > fit2.invivo <-eBayes(fit2.invivo) Thanks a lot, Suparna. On 15 October 2012 14:33, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Suparna, > > > On 10/15/2012 7:01 AM, suparna mitra wrote: > >> Hi all, >> I have been working in a project where I have Affymetrix Hgene 1.0 St >> V1 >> data. And I have tree groups of patients having 6 samples each. I tried to >> perform rma normalization and to filter my data based on expression values >> 20%. After that went for unpaired t-test to test each two combination of >> groups. But the problem is my data is extremely variable. >> I have tried to filter my genes based on variance and/or CV before >> testing, >> to try to reduce the number of genes entering your test and multiple >> correction. But with different reasonable filtering also I am with no >> luck. And I don't have the option to increase sample size of my project. >> Further I tried to check for the bad samples and bad probes from >> experimentand remove outlier if these are not of interest. Still the same >> when run t-test (and other possible test like Mann-Whitney) with MTC there >> are no genes. >> On the other hand if I go on with out MTC and select a good p value cutoff >> and reasonable fold change I get a list of significant gene which may be >> good or reasonable for my study. but the problem is I somehow need to >> justify the method for my finding. Do you know any study or paper where >> anybody has treated their data without MTC? >> My main concern is if I find a good story matching biological prospective, >> would it be anyhow possible to justify the method without MTC? >> > > It's not clear to me what you are doing here - when you filter on variance > are you keeping or removing the high variability genes (keeping, I hope)? I > am also not sure what MTC stands for - is this multiple test correction? > > Anyway, assuming I have things correct, some suggestions. First, you might > want to use array weights when fitting your model. If you have a lot of > intra-group variability, this will tend to help. > > Second, the t-statistic is the universally most powerful test (assuming > the underlying data are relatively hump-shaped), so going to a > non-parametric test will usually reduce rather than increase power to > detect differences. > > Third, univariate tests are arguably not the most sophisticated way of > analyzing expression data, and you might get better (or at least more > satisfactory) results if you instead looked at analyzing for groups of > genes rather than individually. > > Depending on your experiment, you could accomplish this task with a gene > set analysis (there are multiple ways of doing this - perhaps the easiest > being romer() and roast() in limma), or if you have phenotypic data, > especially continuous measures, a WGCNA analysis might be of some use. > > Best, > > Jim > > > Thanks a lot, >> Suparna. >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5394, Internal ext: 55394 M: +44 (0) 7511387895 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]

ADD REPLY • link 11.6 years ago suparna mitra ▴ 290

0

Entering edit mode

Hello group, Related to my previous post, I further tried arrayweight as: > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", "R", "T")) > design.invivo <- model.matrix(~0 + f.invivo) > colnames(design.invivo) <- c("A", "R", "T") > design.invivo A R T 1 1 0 0 2 1 0 0 3 1 0 0 4 1 0 0 5 1 0 0 6 1 0 0 7 0 1 0 8 0 1 0 9 0 1 0 10 0 1 0 11 0 1 0 12 0 1 0 13 0 0 1 14 0 0 1 15 0 0 1 16 0 0 1 17 0 0 1 18 0 0 1 attr(,"assign") [1] 1 1 1 attr(,"contrasts") attr(,"contrasts")$f.invivo [1] "contr.treatment" > > arrayw <- arrayWeightsSimple(rmaOligoinvivo, design.invivo) > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw) > arrayw 1 2 3 4 5 6 7 8 9 10 11 12 13 14 0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502 1.0741852 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790 0.8309346 0.9328655 15 16 17 18 1.1926458 0.7280477 0.5130294 1.8503073 > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = design.invivo) > fit2<-contrasts.fit(fit, contrast.matrix.invivo) > fit2 <- eBayes(fit2) > Can anybody please suggest if I am doing it right? Actually being new in this I am bit afraid to make errors. Thanks, Suparna. On 16 October 2012 10:36, suparna mitra <smitra@liverpool.ac.uk> wrote: > Dear James, > Thanks for your suggestion. I was reading arrayWeights package in limma. > But being novice in bioC I have one confusion. Should I > perform arrayWeights on normalized (rmaOligo) expression data or on the raw > data? > > This is what i have done so far: > > > sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] statmod_1.4.15 limma_3.12.1 > annotate_1.34.1 hugene10stprobeset.db_8.0.1 org.Hs.eg.db_2.7.1 > > [6] BiocInstaller_1.4.7 affycoretools_1.28.0 KEGG.db_2.7.1 > GO.db_2.7.1 AnnotationDbi_1.18.1 > [11] affy_1.34.0 Biobase_2.16.0 > BiocGenerics_0.2.0 pd.hugene.1.0.st.v1_3.6.0 RSQLite_0.11.1 > > [16] DBI_0.2-5 oligo_1.20.4 > oligoClasses_1.18.0 > > > rmaOligoinvivo = oligo::rma(InVivodat1) > Background correcting > Normalizing > Calculating Expression > > > maplot(rmaOligoinvivo) > >hist(rmaOligoinvivo) > > InVivoTargets=readTargets("~/Desktop/Recent/Liverpool-work- > related/Micro_RawData/InVivoTargets.txt") > > InVivoTargets > FileName Treatment > 1 MC1 A > 2 MC2 A > 3 MC3 A > 4 MC4 A > 5 MC5 A > 6 MC6 A > 7 MC7 R > 8 MC8 R > 9 MC9 R > 10 MC10 R > 11 MC11 R > 12 MC12 R > 13 MC13 T > 14 MC14 T > 15 MC15 T > 16 MC16 T > 17 MC17 T > 18 MC18 T > > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", "R", "T")) > > design.invivo <- model.matrix(~0 + f.invivo) > > > colnames(design.invivo) <- c("A", "R", "T") > > > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo) > > > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = > design.invivo) > > > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo) > > > fit2.invivo <-eBayes(fit2.invivo) > > Thanks a lot, > Suparna. > > > On 15 October 2012 14:33, James W. MacDonald <jmacdon@uw.edu> wrote: > >> Hi Suparna, >> >> >> On 10/15/2012 7:01 AM, suparna mitra wrote: >> >>> Hi all, >>> I have been working in a project where I have Affymetrix Hgene 1.0 St >>> V1 >>> data. And I have tree groups of patients having 6 samples each. I tried >>> to >>> perform rma normalization and to filter my data based on expression >>> values >>> 20%. After that went for unpaired t-test to test each two combination of >>> groups. But the problem is my data is extremely variable. >>> I have tried to filter my genes based on variance and/or CV before >>> testing, >>> to try to reduce the number of genes entering your test and multiple >>> correction. But with different reasonable filtering also I am with no >>> luck. And I don't have the option to increase sample size of my project. >>> Further I tried to check for the bad samples and bad probes from >>> experimentand remove outlier if these are not of interest. Still the same >>> when run t-test (and other possible test like Mann-Whitney) with MTC >>> there >>> are no genes. >>> On the other hand if I go on with out MTC and select a good p value >>> cutoff >>> and reasonable fold change I get a list of significant gene which may be >>> good or reasonable for my study. but the problem is I somehow need to >>> justify the method for my finding. Do you know any study or paper where >>> anybody has treated their data without MTC? >>> My main concern is if I find a good story matching biological >>> prospective, >>> would it be anyhow possible to justify the method without MTC? >>> >> >> It's not clear to me what you are doing here - when you filter on >> variance are you keeping or removing the high variability genes (keeping, I >> hope)? I am also not sure what MTC stands for - is this multiple test >> correction? >> >> Anyway, assuming I have things correct, some suggestions. First, you >> might want to use array weights when fitting your model. If you have a lot >> of intra-group variability, this will tend to help. >> >> Second, the t-statistic is the universally most powerful test (assuming >> the underlying data are relatively hump-shaped), so going to a >> non-parametric test will usually reduce rather than increase power to >> detect differences. >> >> Third, univariate tests are arguably not the most sophisticated way of >> analyzing expression data, and you might get better (or at least more >> satisfactory) results if you instead looked at analyzing for groups of >> genes rather than individually. >> >> Depending on your experiment, you could accomplish this task with a gene >> set analysis (there are multiple ways of doing this - perhaps the easiest >> being romer() and roast() in limma), or if you have phenotypic data, >> especially continuous measures, a WGCNA analysis might be of some use. >> >> Best, >> >> Jim >> >> >> Thanks a lot, >>> Suparna. >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________**_________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.="" ethz.ch="" mailman="" listinfo="" bioconductor=""> >>> Search the archives: http://news.gmane.org/gmane.** >>> science.biology.informatics.**conductor<http: news.gmane.org="" gman="" e.science.biology.informatics.conductor=""> >>> >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> > > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394, Internal ext: 55394 > M: +44 (0) 7511387895 > Email id: smitra@liverpool.ac.uk > Alternative Email id: suparna.mitra.sm@gmail.com > > -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5394, Internal ext: 55394 M: +44 (0) 7511387895 Email id: smitra@liverpool.ac.uk Alternative Email id: suparna.mitra.sm@gmail.com [[alternative HTML version deleted]]

ADD REPLY • link 11.6 years ago suparna mitra ▴ 290

0

Entering edit mode

Hi Suparna, On 10/16/2012 5:58 AM, suparna mitra wrote: > Hello group, > Related to my previous post, I further tried arrayweight as: > > > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", "R", "T")) > > > design.invivo <- model.matrix(~0 + f.invivo) > > > colnames(design.invivo) <- c("A", "R", "T") > > > design.invivo > > A R T > > 1 1 0 0 > > 2 1 0 0 > > 3 1 0 0 > > 4 1 0 0 > > 5 1 0 0 > > 6 1 0 0 > > 7 0 1 0 > > 8 0 1 0 > > 9 0 1 0 > > 10 0 1 0 > > 11 0 1 0 > > 12 0 1 0 > > 13 0 0 1 > > 14 0 0 1 > > 15 0 0 1 > > 16 0 0 1 > > 17 0 0 1 > > 18 0 0 1 > > attr(,"assign") > > [1] 1 1 1 > > attr(,"contrasts") > > attr(,"contrasts")$f.invivo > > [1] "contr.treatment" > > > > > > > arrayw <- arrayWeightsSimple(rmaOligoinvivo, design.invivo) > > > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw) > > > arrayw > > 1 2 3 4 5 6 7 > 8 9 10 11 12 13 14 > > 0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502 1.0741852 > 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790 0.8309346 0.9328655 > > 15 16 17 18 > > 1.1926458 0.7280477 0.5130294 1.8503073 > > > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = > design.invivo) > > > fit2<-contrasts.fit(fit, contrast.matrix.invivo) > > > fit2 <- eBayes(fit2) > Looks good to me. Best, Jim > > > > Can anybody please suggest if I am doing it right? Actually being new > in this I am bit afraid to make errors. > Thanks, > Suparna. > > On 16 October 2012 10:36, suparna mitra <smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk="">> wrote: > > Dear James, > Thanks for your suggestion. I was reading arrayWeights package > in limma. > But being novice in bioC I have one confusion. Should I > perform arrayWeights on normalized (rmaOligo) expression data or > on the raw data? > > This is what i have done so far: > > > sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] statmod_1.4.15 limma_3.12.1 > annotate_1.34.1 hugene10stprobeset.db_8.0.1 > org.Hs.eg.db_2.7.1 > [6] BiocInstaller_1.4.7 affycoretools_1.28.0 > KEGG.db_2.7.1 GO.db_2.7.1 > AnnotationDbi_1.18.1 > [11] affy_1.34.0 Biobase_2.16.0 > BiocGenerics_0.2.0 pd.hugene.1.0.st.v1_3.6.0 > RSQLite_0.11.1 > [16] DBI_0.2-5 oligo_1.20.4 > oligoClasses_1.18.0 > > > rmaOligoinvivo = oligo::rma(InVivodat1) > Background correcting > Normalizing > Calculating Expression > > > maplot(rmaOligoinvivo) > >hist(rmaOligoinvivo) > > InVivoTargets=readTargets("~/Desktop/Recent/Liverpool-work- related/Micro_RawData/InVivoTargets.txt") > > InVivoTargets > FileName Treatment > 1 MC1 A > 2 MC2 A > 3 MC3 A > 4 MC4 A > 5 MC5 A > 6 MC6 A > 7 MC7 R > 8 MC8 R > 9 MC9 R > 10 MC10 R > 11 MC11 R > 12 MC12 R > 13 MC13 T > 14 MC14 T > 15 MC15 T > 16 MC16 T > 17 MC17 T > 18 MC18 T > > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", "R", "T")) > > design.invivo <- model.matrix(~0 + f.invivo) > > > colnames(design.invivo) <- c("A", "R", "T") > > > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo) > > > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = > design.invivo) > > > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo) > > > fit2.invivo <-eBayes(fit2.invivo) > > Thanks a lot, > Suparna. > > > On 15 October 2012 14:33, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Suparna, > > > On 10/15/2012 7:01 AM, suparna mitra wrote: > > Hi all, > I have been working in a project where I have > Affymetrix Hgene 1.0 St V1 > data. And I have tree groups of patients having 6 samples > each. I tried to > perform rma normalization and to filter my data based on > expression values > 20%. After that went for unpaired t-test to test each two > combination of > groups. But the problem is my data is extremely variable. > I have tried to filter my genes based on variance and/or > CV before testing, > to try to reduce the number of genes entering your test > and multiple > correction. But with different reasonable filtering also > I am with no > luck. And I don't have the option to increase sample size > of my project. > Further I tried to check for the bad samples and bad > probes from > experimentand remove outlier if these are not of interest. > Still the same > when run t-test (and other possible test like > Mann-Whitney) with MTC there > are no genes. > On the other hand if I go on with out MTC and select a > good p value cutoff > and reasonable fold change I get a list of significant > gene which may be > good or reasonable for my study. but the problem is I > somehow need to > justify the method for my finding. Do you know any study > or paper where > anybody has treated their data without MTC? > My main concern is if I find a good story matching > biological prospective, > would it be anyhow possible to justify the method without MTC? > > > It's not clear to me what you are doing here - when you filter > on variance are you keeping or removing the high variability > genes (keeping, I hope)? I am also not sure what MTC stands > for - is this multiple test correction? > > Anyway, assuming I have things correct, some suggestions. > First, you might want to use array weights when fitting your > model. If you have a lot of intra-group variability, this will > tend to help. > > Second, the t-statistic is the universally most powerful test > (assuming the underlying data are relatively hump-shaped), so > going to a non-parametric test will usually reduce rather than > increase power to detect differences. > > Third, univariate tests are arguably not the most > sophisticated way of analyzing expression data, and you might > get better (or at least more satisfactory) results if you > instead looked at analyzing for groups of genes rather than > individually. > > Depending on your experiment, you could accomplish this task > with a gene set analysis (there are multiple ways of doing > this - perhaps the easiest being romer() and roast() in > limma), or if you have phenotypic data, especially continuous > measures, a WGCNA analysis might be of some use. > > Best, > > Jim > > > Thanks a lot, > Suparna. > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394 <tel:%2b44%20%280%29151%20795%205394>, > Internal ext: 55394 > M: +44 (0) 7511387895 <tel:%2b44%20%280%29%207511387895> > Email id: smitra at liverpool.ac.uk <mailto:smitra at="" liverpool.ac.uk=""> > Alternative Email id: suparna.mitra.sm at gmail.com > <mailto:suparna.mitra.sm at="" gmail.com=""> > > > > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394, Internal ext: 55394 > M: +44 (0) 7511387895 > Email id: smitra at liverpool.ac.uk <mailto:smitra at="" liverpool.ac.uk=""> > Alternative Email id: suparna.mitra.sm at gmail.com > <mailto:suparna.mitra.sm at="" gmail.com=""> > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 11.6 years ago James W. MacDonald 65k

0

Entering edit mode

Dear James, Thanks for the support. But after doing these step also still no significant genes (see attached ven diagram as all 0). I realize my data is very variable. But isn't there any fix? Thanks a lot, Suparna. > topTable(fit2, coef = 1, adjust = "fdr") ID logFC AveExpr t P.Value adj.P.Val B 6238 7917530 0.6251124 11.170012 5.592012 2.536230e-05 0.6518774 -0.0171572 11556 7970507 0.9123944 7.490579 5.057525 7.969967e-05 0.6518774 -0.5140431 15234 8007228 0.6400697 9.710164 4.854888 1.239777e-04 0.6518774 -0.7143350 8819 7943047 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 -0.8148014 9675 7950951 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 -0.8148014 18889 8043581 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 -0.8148014 19899 8053785 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 -0.8148014 6239 7917532 0.7207256 10.449577 4.677368 1.831220e-04 0.6518774 -0.8950366 25845 8113130 0.6264274 8.957173 4.640816 1.984975e-04 0.6518774 -0.9328368 3759 7896284 -0.8301988 5.727302 -4.616141 2.096126e-04 0.6518774 -0.9584680 > topTable(fit2, coef = 2, adjust = "fdr") ID logFC AveExpr t P.Value adj.P.Val B 2088 7894602 0.5606499 2.841855 5.401484 3.800711e-05 0.9778984 -1.825861 685 7893190 -0.5281344 6.726990 -4.966811 9.708481e-05 0.9778984 -2.059423 6238 7917530 -0.5550876 11.170012 -4.786129 1.441524e-04 0.9778984 -2.162851 621 7893126 -0.6332961 4.412764 -4.785753 1.442714e-04 0.9778984 -2.163070 26642 8120756 -1.2198288 5.439265 -4.615075 2.101065e-04 0.9778984 -2.264206 1687 7894197 -1.0441762 2.631359 -4.526834 2.553958e-04 0.9778984 -2.317791 20947 8065084 -0.4297158 6.630412 -4.274936 4.470448e-04 0.9778984 -2.475534 154 7892657 0.9444466 3.997249 4.150578 5.900236e-04 0.9778984 -2.555951 20151 8056222 -0.8638926 7.892249 -4.144942 5.974996e-04 0.9778984 -2.559635 7851 7933619 0.4101773 8.480778 4.128728 6.195398e-04 0.9778984 -2.570249 > topTable(fit2, coef = 3, adjust = "fdr") ID logFC AveExpr t P.Value adj.P.Val B 6210 7917182 -0.3334646 3.273225 -5.847896 1.483281e-05 0.2755645 2.2740621 27812 8132245 -0.5028082 5.409405 -5.795271 1.655191e-05 0.2755645 2.1981127 2366 7894884 0.6507323 8.436001 5.335322 4.378172e-05 0.3234851 1.5133904 26802 8122099 -0.4655070 4.548920 -5.279910 4.930640e-05 0.3234851 1.4284277 587 7893092 -1.0604644 6.013864 -5.143614 6.613900e-05 0.3234851 1.2172775 2562 7895081 0.6962641 6.898546 4.999306 9.045119e-05 0.3234851 0.9904391 867 7893372 1.2334593 3.017891 4.971552 9.608676e-05 0.3234851 0.9464374 685 7893190 -0.5196216 6.726990 -4.948033 1.011423e-04 0.3234851 0.9090560 808 7893313 0.9743437 7.938503 4.893437 1.139496e-04 0.3234851 0.8219582 15234 8007228 0.6486240 9.710164 4.801424 1.393931e-04 0.3234851 0.6741646 > > results <- decideTests(fit2) > > vennDiagram(results) On 16 October 2012 14:48, James W. MacDonald <jmacdon at="" uw.edu=""> wrote: > Hi Suparna, > > > On 10/16/2012 5:58 AM, suparna mitra wrote: > >> Hello group, >> Related to my previous post, I further tried arrayweight as: >> >> > f.invivo <- factor(InVivoTargets$**Treatment, levels = c("A", "R", >> "T")) >> >> > design.invivo <- model.matrix(~0 + f.invivo) >> >> > colnames(design.invivo) <- c("A", "R", "T") >> >> > design.invivo >> >> A R T >> >> 1 1 0 0 >> >> 2 1 0 0 >> >> 3 1 0 0 >> >> 4 1 0 0 >> >> 5 1 0 0 >> >> 6 1 0 0 >> >> 7 0 1 0 >> >> 8 0 1 0 >> >> 9 0 1 0 >> >> 10 0 1 0 >> >> 11 0 1 0 >> >> 12 0 1 0 >> >> 13 0 0 1 >> >> 14 0 0 1 >> >> 15 0 0 1 >> >> 16 0 0 1 >> >> 17 0 0 1 >> >> 18 0 0 1 >> >> attr(,"assign") >> >> [1] 1 1 1 >> >> attr(,"contrasts") >> >> attr(,"contrasts")$f.invivo >> >> [1] "contr.treatment" >> >> >> > >> >> > arrayw <- arrayWeightsSimple(**rmaOligoinvivo, design.invivo) >> >> > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw) >> >> > arrayw >> >> 1 2 3 4 5 6 7 >> 8 9 10 11 12 13 14 >> >> 0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502 1.0741852 >> 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790 0.8309346 0.9328655 >> >> 15 16 17 18 >> >> 1.1926458 0.7280477 0.5130294 1.8503073 >> >> > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = >> design.invivo) >> >> > fit2<-contrasts.fit(fit, contrast.matrix.invivo) >> >> > fit2 <- eBayes(fit2) >> >> > Looks good to me. > > Best, > > Jim > > > > >> >> Can anybody please suggest if I am doing it right? Actually being new in >> this I am bit afraid to make errors. >> Thanks, >> Suparna. >> >> On 16 October 2012 10:36, suparna mitra <smitra at="" liverpool.ac.uk="" <mailto:="">> smitra at liverpool.ac.uk**>> wrote: >> >> Dear James, >> Thanks for your suggestion. I was reading arrayWeights package >> in limma. >> But being novice in bioC I have one confusion. Should I >> perform arrayWeights on normalized (rmaOligo) expression data or >> on the raw data? >> >> This is what i have done so far: >> >> > sessionInfo() >> R version 2.15.1 (2012-06-22) >> Platform: i386-apple-darwin9.8.0/i386 (32-bit) >> >> locale: >> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.**UTF-8/C/en_GB.UTF-8/en_GB.UTF-**8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] statmod_1.4.15 limma_3.12.1 >> annotate_1.34.1 hugene10stprobeset.db_8.0.1 >> org.Hs.eg.db_2.7.1 >> [6] BiocInstaller_1.4.7 affycoretools_1.28.0 >> KEGG.db_2.7.1 GO.db_2.7.1 >> AnnotationDbi_1.18.1 >> [11] affy_1.34.0 Biobase_2.16.0 >> BiocGenerics_0.2.0 pd.hugene.1.0.st.v1_3.6.0 RSQLite_0.11.1 >> [16] DBI_0.2-5 oligo_1.20.4 >> oligoClasses_1.18.0 >> >> >> rmaOligoinvivo = oligo::rma(InVivodat1) >> Background correcting >> Normalizing >> Calculating Expression >> >> > maplot(rmaOligoinvivo) >> >hist(rmaOligoinvivo) >> > InVivoTargets=readTargets("~/**Desktop/Recent/Liverpool- work-** >> related/Micro_RawData/**InVivoTargets.txt") >> > InVivoTargets >> FileName Treatment >> 1 MC1 A >> 2 MC2 A >> 3 MC3 A >> 4 MC4 A >> 5 MC5 A >> 6 MC6 A >> 7 MC7 R >> 8 MC8 R >> 9 MC9 R >> 10 MC10 R >> 11 MC11 R >> 12 MC12 R >> 13 MC13 T >> 14 MC14 T >> 15 MC15 T >> 16 MC16 T >> 17 MC17 T >> 18 MC18 T >> >> f.invivo <- factor(InVivoTargets$**Treatment, levels = c("A", "R", >> "T")) >> >> design.invivo <- model.matrix(~0 + f.invivo) >> >> > colnames(design.invivo) <- c("A", "R", "T") >> >> > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo) >> >> > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = >> design.invivo) >> >> > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo) >> >> > fit2.invivo <-eBayes(fit2.invivo) >> >> Thanks a lot, >> Suparna. >> >> >> On 15 October 2012 14:33, James W. MacDonald <jmacdon at="" uw.edu="">> <mailto:jmacdon at="" uw.edu="">> wrote: >> >> Hi Suparna, >> >> >> On 10/15/2012 7:01 AM, suparna mitra wrote: >> >> Hi all, >> I have been working in a project where I have >> Affymetrix Hgene 1.0 St V1 >> data. And I have tree groups of patients having 6 samples >> each. I tried to >> perform rma normalization and to filter my data based on >> expression values >> 20%. After that went for unpaired t-test to test each two >> combination of >> groups. But the problem is my data is extremely variable. >> I have tried to filter my genes based on variance and/or >> CV before testing, >> to try to reduce the number of genes entering your test >> and multiple >> correction. But with different reasonable filtering also >> I am with no >> luck. And I don't have the option to increase sample size >> of my project. >> Further I tried to check for the bad samples and bad >> probes from >> experimentand remove outlier if these are not of interest. >> Still the same >> when run t-test (and other possible test like >> Mann-Whitney) with MTC there >> are no genes. >> On the other hand if I go on with out MTC and select a >> good p value cutoff >> and reasonable fold change I get a list of significant >> gene which may be >> good or reasonable for my study. but the problem is I >> somehow need to >> justify the method for my finding. Do you know any study >> or paper where >> anybody has treated their data without MTC? >> My main concern is if I find a good story matching >> biological prospective, >> would it be anyhow possible to justify the method without MTC? >> >> >> It's not clear to me what you are doing here - when you filter >> on variance are you keeping or removing the high variability >> genes (keeping, I hope)? I am also not sure what MTC stands >> for - is this multiple test correction? >> >> Anyway, assuming I have things correct, some suggestions. >> First, you might want to use array weights when fitting your >> model. If you have a lot of intra-group variability, this will >> tend to help. >> >> Second, the t-statistic is the universally most powerful test >> (assuming the underlying data are relatively hump-shaped), so >> going to a non-parametric test will usually reduce rather than >> increase power to detect differences. >> >> Third, univariate tests are arguably not the most >> sophisticated way of analyzing expression data, and you might >> get better (or at least more satisfactory) results if you >> instead looked at analyzing for groups of genes rather than >> individually. >> >> Depending on your experiment, you could accomplish this task >> with a gene set analysis (there are multiple ways of doing >> this - perhaps the easiest being romer() and roast() in >> limma), or if you have phenotypic data, especially continuous >> measures, a WGCNA analysis might be of some use. >> >> Best, >> >> Jim >> >> >> Thanks a lot, >> Suparna. >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor at r-project.org <mailto:bioconductor at="" r-**="">> project.org <bioconductor at="" r-project.org="">> >> >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<ht tps:="" stat.ethz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: >> http://news.gmane.org/gmane.**science.biology.informatics.** >> conductor<http: news.gmane.org="" gmane.science.biology.informatics.c="" onductor=""> >> >> >> -- James W. MacDonald, M.S. >> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> >> >> >> -- Dr. Suparna Mitra >> Wolfson Centre for Personalised Medicine >> Department of Molecular and Clinical Pharmacology >> Institute of Translational Medicine University of Liverpool >> Block A: Waterhouse Buildings, L69 3GL Liverpool >> >> Tel. +44 (0)151 795 5394 <tel:%2b44%20%280%29151%20795%**205394>, >> Internal ext: 55394 >> M: +44 (0) 7511387895 <tel:%2b44%20%280%29%**207511387895> >> Email id: smitra at liverpool.ac.uk <mailto:smitra at="" liverpool.ac.uk**=""> >> Alternative Email id: suparna.mitra.sm at gmail.com >> <mailto:suparna.mitra.sm@**gmail.com <suparna.mitra.sm="" at="" gmail.com="">> >> >> >> >> >> >> -- >> Dr. Suparna Mitra >> Wolfson Centre for Personalised Medicine >> Department of Molecular and Clinical Pharmacology >> Institute of Translational Medicine University of Liverpool >> Block A: Waterhouse Buildings, L69 3GL Liverpool >> >> Tel. +44 (0)151 795 5394, Internal ext: 55394 >> M: +44 (0) 7511387895 >> Email id: smitra at liverpool.ac.uk <mailto:smitra at="" liverpool.ac.uk**=""> >> Alternative Email id: suparna.mitra.sm at gmail.com <mailto:>> suparna.mitra.sm@**gmail.com <suparna.mitra.sm at="" gmail.com="">> >> >> > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > -- Dr. Suparna Mitra Wolfson Centre for Personalised Medicine Department of Molecular and Clinical Pharmacology Institute of Translational Medicine University of Liverpool Block A: Waterhouse Buildings, L69 3GL Liverpool Tel. +44 (0)151 795 5394, Internal ext: 55394 M: +44 (0) 7511387895 Email id: smitra at liverpool.ac.uk Alternative Email id: suparna.mitra.sm at gmail.com -------------- next part -------------- A non-text attachment was scrubbed... Name: Rplot.pdf Type: application/pdf Size: 75900 bytes Desc: not available URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20121016="" 549060c6="" attachment.pdf="">

ADD REPLY • link 11.6 years ago suparna mitra ▴ 290

0

Entering edit mode

Hi Suparna, On 10/16/2012 11:57 AM, suparna mitra wrote: > Dear James, > Thanks for the support. But after doing these step also still no > significant genes (see attached ven diagram as all 0). I realize my > data is very variable. But isn't there any fix? As disheartening as this is, there is no magic bullet to 'fix' your results. You should note two things, however. First, there is a difference between statistical difference and no differences. There may be some genes in the top N that your collaborators might be interested in, given the goals of the experiment. Remember that a microarray experiment isn't the end - it is the beginning. You still have some hints here that might end up being validated, and could make a compelling story. Second, just because individual gene comparisons are not working out doesn't mean that there aren't sets of genes that are significant. You can always do gene set analyses (as I already recommended, IIRC). Note that there are literally thousands of gene sets these days, so you might be better off selecting a few that are biologically plausible rather than just trying them all. See ?romer or ?roast. Best, Jim > Thanks a lot, > Suparna. > > > topTable(fit2, coef = 1, adjust = "fdr") > > ID logFC AveExpr t P.Value adj.P.Val > B > > 6238 7917530 0.6251124 11.170012 5.592012 2.536230e-05 0.6518774 > -0.0171572 > > 11556 7970507 0.9123944 7.490579 5.057525 7.969967e-05 0.6518774 > -0.5140431 > > 15234 8007228 0.6400697 9.710164 4.854888 1.239777e-04 0.6518774 > -0.7143350 > > 8819 7943047 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 > -0.8148014 > > 9675 7950951 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 > -0.8148014 > > 18889 8043581 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 > -0.8148014 > > 19899 8053785 -0.3189082 4.177681 -4.755607 1.541497e-04 0.6518774 > -0.8148014 > > 6239 7917532 0.7207256 10.449577 4.677368 1.831220e-04 0.6518774 > -0.8950366 > > 25845 8113130 0.6264274 8.957173 4.640816 1.984975e-04 0.6518774 > -0.9328368 > > 3759 7896284 -0.8301988 5.727302 -4.616141 2.096126e-04 0.6518774 > -0.9584680 > > > topTable(fit2, coef = 2, adjust = "fdr") > > ID logFC AveExpr t P.Value adj.P.Val > B > > 2088 7894602 0.5606499 2.841855 5.401484 3.800711e-05 0.9778984 > -1.825861 > > 685 7893190 -0.5281344 6.726990 -4.966811 9.708481e-05 0.9778984 > -2.059423 > > 6238 7917530 -0.5550876 11.170012 -4.786129 1.441524e-04 0.9778984 > -2.162851 > > 621 7893126 -0.6332961 4.412764 -4.785753 1.442714e-04 0.9778984 > -2.163070 > > 26642 8120756 -1.2198288 5.439265 -4.615075 2.101065e-04 0.9778984 > -2.264206 > > 1687 7894197 -1.0441762 2.631359 -4.526834 2.553958e-04 0.9778984 > -2.317791 > > 20947 8065084 -0.4297158 6.630412 -4.274936 4.470448e-04 0.9778984 > -2.475534 > > 154 7892657 0.9444466 3.997249 4.150578 5.900236e-04 0.9778984 > -2.555951 > > 20151 8056222 -0.8638926 7.892249 -4.144942 5.974996e-04 0.9778984 > -2.559635 > > 7851 7933619 0.4101773 8.480778 4.128728 6.195398e-04 0.9778984 > -2.570249 > > > topTable(fit2, coef = 3, adjust = "fdr") > > ID logFC AveExpr t P.Value adj.P.Val > B > > 6210 7917182 -0.3334646 3.273225 -5.847896 1.483281e-05 0.2755645 > 2.2740621 > > 27812 8132245 -0.5028082 5.409405 -5.795271 1.655191e-05 0.2755645 > 2.1981127 > > 2366 7894884 0.6507323 8.436001 5.335322 4.378172e-05 0.3234851 > 1.5133904 > > 26802 8122099 -0.4655070 4.548920 -5.279910 4.930640e-05 0.3234851 > 1.4284277 > > 587 7893092 -1.0604644 6.013864 -5.143614 6.613900e-05 0.3234851 > 1.2172775 > > 2562 7895081 0.6962641 6.898546 4.999306 9.045119e-05 0.3234851 > 0.9904391 > > 867 7893372 1.2334593 3.017891 4.971552 9.608676e-05 0.3234851 > 0.9464374 > > 685 7893190 -0.5196216 6.726990 -4.948033 1.011423e-04 0.3234851 > 0.9090560 > > 808 7893313 0.9743437 7.938503 4.893437 1.139496e-04 0.3234851 > 0.8219582 > > 15234 8007228 0.6486240 9.710164 4.801424 1.393931e-04 0.3234851 > 0.6741646 > > > > > > results <- decideTests(fit2) > > > > > > vennDiagram(results) > > > On 16 October 2012 14:48, James W. MacDonald <jmacdon at="" uw.edu=""> <mailto:jmacdon at="" uw.edu="">> wrote: > > Hi Suparna, > > > On 10/16/2012 5:58 AM, suparna mitra wrote: > > Hello group, > Related to my previous post, I further tried arrayweight as: > > > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A", > "R", "T")) > > > design.invivo <- model.matrix(~0 + f.invivo) > > > colnames(design.invivo) <- c("A", "R", "T") > > > design.invivo > > A R T > > 1 1 0 0 > > 2 1 0 0 > > 3 1 0 0 > > 4 1 0 0 > > 5 1 0 0 > > 6 1 0 0 > > 7 0 1 0 > > 8 0 1 0 > > 9 0 1 0 > > 10 0 1 0 > > 11 0 1 0 > > 12 0 1 0 > > 13 0 0 1 > > 14 0 0 1 > > 15 0 0 1 > > 16 0 0 1 > > 17 0 0 1 > > 18 0 0 1 > > attr(,"assign") > > [1] 1 1 1 > > attr(,"contrasts") > > attr(,"contrasts")$f.invivo > > [1] "contr.treatment" > > > > > > > arrayw <- arrayWeightsSimple(rmaOligoinvivo, design.invivo) > > > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw) > > > arrayw > > 1 2 3 4 5 6 > 7 8 9 10 11 12 > 13 14 > > 0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502 > 1.0741852 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790 > 0.8309346 0.9328655 > > 15 16 17 18 > > 1.1926458 0.7280477 0.5130294 1.8503073 > > > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels > = design.invivo) > > > fit2<-contrasts.fit(fit, contrast.matrix.invivo) > > > fit2 <- eBayes(fit2) > > > Looks good to me. > > Best, > > Jim > > > > > > Can anybody please suggest if I am doing it right? Actually > being new in this I am bit afraid to make errors. > Thanks, > Suparna. > > On 16 October 2012 10:36, suparna mitra > <smitra at="" liverpool.ac.uk="" <mailto:smitra="" at="" liverpool.ac.uk=""> > <mailto:smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk="">>> wrote: > > Dear James, > Thanks for your suggestion. I was reading arrayWeights > package > in limma. > But being novice in bioC I have one confusion. Should I > perform arrayWeights on normalized (rmaOligo) expression > data or > on the raw data? > > This is what i have done so far: > > > sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: i386-apple-darwin9.8.0/i386 (32-bit) > > locale: > [1] > en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets > methods base > > other attached packages: > [1] statmod_1.4.15 limma_3.12.1 > annotate_1.34.1 hugene10stprobeset.db_8.0.1 > org.Hs.eg.db_2.7.1 > [6] BiocInstaller_1.4.7 affycoretools_1.28.0 > KEGG.db_2.7.1 GO.db_2.7.1 > AnnotationDbi_1.18.1 > [11] affy_1.34.0 Biobase_2.16.0 > BiocGenerics_0.2.0 pd.hugene.1.0.st.v1_3.6.0 > RSQLite_0.11.1 > [16] DBI_0.2-5 oligo_1.20.4 > oligoClasses_1.18.0 > > > rmaOligoinvivo = oligo::rma(InVivodat1) > Background correcting > Normalizing > Calculating Expression > > > maplot(rmaOligoinvivo) > >hist(rmaOligoinvivo) > > > InVivoTargets=readTargets("~/Desktop/Recent/Liverpool-work- related/Micro_RawData/InVivoTargets.txt") > > InVivoTargets > FileName Treatment > 1 MC1 A > 2 MC2 A > 3 MC3 A > 4 MC4 A > 5 MC5 A > 6 MC6 A > 7 MC7 R > 8 MC8 R > 9 MC9 R > 10 MC10 R > 11 MC11 R > 12 MC12 R > 13 MC13 T > 14 MC14 T > 15 MC15 T > 16 MC16 T > 17 MC17 T > 18 MC18 T > > f.invivo <- factor(InVivoTargets$Treatment, levels = > c("A", "R", "T")) > > design.invivo <- model.matrix(~0 + f.invivo) > > > colnames(design.invivo) <- c("A", "R", "T") > > > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo) > > > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels = > design.invivo) > > > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo) > > > fit2.invivo <-eBayes(fit2.invivo) > > Thanks a lot, > Suparna. > > > On 15 October 2012 14:33, James W. MacDonald > <jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu=""> > <mailto:jmacdon at="" uw.edu="" <mailto:jmacdon="" at="" uw.edu="">>> wrote: > > Hi Suparna, > > > On 10/15/2012 7:01 AM, suparna mitra wrote: > > Hi all, > I have been working in a project where I have > Affymetrix Hgene 1.0 St V1 > data. And I have tree groups of patients having 6 > samples > each. I tried to > perform rma normalization and to filter my data > based on > expression values > 20%. After that went for unpaired t-test to test > each two > combination of > groups. But the problem is my data is extremely > variable. > I have tried to filter my genes based on variance > and/or > CV before testing, > to try to reduce the number of genes entering your > test > and multiple > correction. But with different reasonable > filtering also > I am with no > luck. And I don't have the option to increase > sample size > of my project. > Further I tried to check for the bad samples and bad > probes from > experimentand remove outlier if these are not of > interest. > Still the same > when run t-test (and other possible test like > Mann-Whitney) with MTC there > are no genes. > On the other hand if I go on with out MTC and select a > good p value cutoff > and reasonable fold change I get a list of significant > gene which may be > good or reasonable for my study. but the problem is I > somehow need to > justify the method for my finding. Do you know any > study > or paper where > anybody has treated their data without MTC? > My main concern is if I find a good story matching > biological prospective, > would it be anyhow possible to justify the method > without MTC? > > > It's not clear to me what you are doing here - when > you filter > on variance are you keeping or removing the high > variability > genes (keeping, I hope)? I am also not sure what MTC > stands > for - is this multiple test correction? > > Anyway, assuming I have things correct, some suggestions. > First, you might want to use array weights when > fitting your > model. If you have a lot of intra-group variability, > this will > tend to help. > > Second, the t-statistic is the universally most > powerful test > (assuming the underlying data are relatively > hump-shaped), so > going to a non-parametric test will usually reduce > rather than > increase power to detect differences. > > Third, univariate tests are arguably not the most > sophisticated way of analyzing expression data, and > you might > get better (or at least more satisfactory) results if you > instead looked at analyzing for groups of genes rather > than > individually. > > Depending on your experiment, you could accomplish > this task > with a gene set analysis (there are multiple ways of doing > this - perhaps the easiest being romer() and roast() in > limma), or if you have phenotypic data, especially > continuous > measures, a WGCNA analysis might be of some use. > > Best, > > Jim > > > Thanks a lot, > Suparna. > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> > <mailto:bioconductor at="" r-project.org=""> <mailto:bioconductor at="" r-project.org="">> > > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > > -- Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394 > <tel:%2b44%20%280%29151%20795%205394> > <tel:%2b44%20%280%29151%20795%205394>, > Internal ext: 55394 > M: +44 (0) 7511387895 <tel:%2b44%20%280%29%207511387895> > <tel:%2b44%20%280%29%207511387895> > Email id: smitra at liverpool.ac.uk > <mailto:smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk="">> > Alternative Email id: suparna.mitra.sm at gmail.com > <mailto:suparna.mitra.sm at="" gmail.com=""> > <mailto:suparna.mitra.sm at="" gmail.com=""> <mailto:suparna.mitra.sm at="" gmail.com="">> > > > > > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394 > <tel:%2b44%20%280%29151%20795%205394>, Internal ext: 55394 > M: +44 (0) 7511387895 <tel:%2b44%20%280%29%207511387895> > Email id: smitra at liverpool.ac.uk > <mailto:smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk=""> <mailto:smitra at="" liverpool.ac.uk="">> > Alternative Email id: suparna.mitra.sm at gmail.com > <mailto:suparna.mitra.sm at="" gmail.com=""> > <mailto:suparna.mitra.sm at="" gmail.com=""> <mailto:suparna.mitra.sm at="" gmail.com="">> > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > > -- > Dr. Suparna Mitra > Wolfson Centre for Personalised Medicine > Department of Molecular and Clinical Pharmacology > Institute of Translational Medicine University of Liverpool > Block A: Waterhouse Buildings, L69 3GL Liverpool > > Tel. +44 (0)151 795 5394, Internal ext: 55394 > M: +44 (0) 7511387895 > Email id: smitra at liverpool.ac.uk <mailto:smitra at="" liverpool.ac.uk=""> > Alternative Email id: suparna.mitra.sm at gmail.com > <mailto:suparna.mitra.sm at="" gmail.com=""> > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099

ADD REPLY • link 11.6 years ago James W. MacDonald 65k

Login before adding your answer.