differentially expressed gene


Prasad Siddavatam ▴ 150

@prasad-siddavatam-4508

Last seen 9.4 years ago

United States

Hi,

I sincerely thank Dr. Brown for his response to my earlier post on the design matrix.

I have two questions:
1. the discrepancies in my DE results
2. B-values

These are my target files:

FileName         Cy3   Cy5      Original Names
HIDEN_1.gpr      Ref   HI_Inf   Heat_Inactivated_1
HIDEN_2.gpr      Ref   HI_Inf   Heat_Inactivated_2
HIDEN_3.gpr      Ref   HI_Inf   Heat_Inactivated_3
infected_1.gpr   Ref   Infect   Live_Infection_1
infected_2.gpr   Ref   Infect   Live_Infection_2
infected_3.gpr   Ref   Infect   Live_Infection_3

design:

HI_Inf   Infect
1        0
1        0
1        0
0        1
0        1
0        1

contrast:

         Contrasts
Levels   HI_INF   INF   INFvsHI_INF
HI_Inf   1        0     -1
Infect   0        1      1

When I used the above matrix and contrasts, I found 270 and 2484 DE genes (for HI_INF and INF, respectively).

But when I divided the data into two separate analyses:

1. For HI_INF I found 608 DE genes

HIDEN_1.gpr      Ref   HI_Inf   Heat_Inactivated_1
HIDEN_2.gpr      Ref   HI_Inf   Heat_Inactivated_2
HIDEN_3.gpr      Ref   HI_Inf   Heat_Inactivated_3

2. For INF I found 868 DE genes

infected_1.gpr   Ref   Infect   Live_Infection_1
infected_2.gpr   Ref   Infect   Live_Infection_2
infected_3.gpr   Ref   Infect   Live_Infection_3

Why is there a difference? Technically the results should be the same, because the rest of the steps were the same in both analyses.

I also found that some genes have negative "B values" even though their p-values and adjusted p-values are < 0.05. See below:

logFC        t           P.Value       adj.P.Val    B
-0.6740520   -3.655211   0.006576695   0.04978582   -2.453619
-0.3866386   -3.655013   0.006578564   0.04978582   -2.453912
-0.6554844   -3.652845   0.006599049   0.04992410   -2.457116

In this case, can I delete the genes with negative B-values even though their p-values and adjusted p-values are < 0.05?

Your suggestions are highly appreciated.

-regards
Prasad
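As a rough illustration of what the design and contrast matrices above encode, here is a minimal Python sketch. It is not the limma implementation: the log-ratio values in `y` are invented for one hypothetical gene, and the helper names `fit_group_means` and `apply_contrasts` are made up for this example. With a one-hot design like this one, the least-squares coefficients are simply the group means, and each contrast is a weighted sum of those coefficients.

```python
# Illustrative only: made-up log-ratios for one gene, not data from this thread.
# Rows follow the targets file: three HI_Inf arrays, then three Infect arrays.
design = [
    [1, 0],
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
    [0, 1],
]

# Contrast weights on the (HI_Inf, Infect) coefficients, one entry per column
# of the contrast matrix in the post.
contrasts = {
    "HI_INF": [1, 0],
    "INF": [0, 1],
    "INFvsHI_INF": [-1, 1],
}


def fit_group_means(design, y):
    """Least-squares fit for a one-hot design: each coefficient is a group mean."""
    coefs = []
    for j in range(len(design[0])):
        values = [yi for row, yi in zip(design, y) if row[j] == 1]
        coefs.append(sum(values) / len(values))
    return coefs


def apply_contrasts(coefs, contrasts):
    """Evaluate each contrast as a weighted sum of the fitted coefficients."""
    return {
        name: sum(w * c for w, c in zip(weights, coefs))
        for name, weights in contrasts.items()
    }


y = [0.5, 0.7, 0.6, 1.4, 1.6, 1.5]  # hypothetical log2(Cy5/Cy3) values
coefs = fit_group_means(design, y)
estimates = apply_contrasts(coefs, contrasts)
print(coefs)
print(estimates)
```

The HI_INF and INF contrasts each test one group against the common reference, while INFvsHI_INF is the difference between the two group means.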

updated 13.2 years ago by James W. MacDonald • written 13.2 years ago by Prasad Siddavatam


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 12 hours ago

United States

Hi Prasad,

On 3/9/2011 11:56 AM, Prasad Siddavatam wrote:
> When I used the above matrix and contrasts, I found 270 and 2484 DE genes (for
> HI_INF and INF, respectively).
>
> But when I divided the data into two separate analyses
> 1. For HI_INF I found 608 DE genes
> 2. For INF I found 868 DE genes
>
> Why is this difference? technically those should be same because rest of the
> steps were similar between the two.

Actually they shouldn't be the same. This has to do with the denominator of the t-statistic you are computing. Recall that the denominator of a t-statistic acts as a 'yardstick', allowing us to determine if a given difference in means is larger than expected under the null.

In the first case above, the denominator is computed using all six arrays (in a conventional ANOVA, it is based on the error sum of squares, or SSE). In the second case, the denominator is computed using just the three arrays under consideration (the standard error of the mean, or SEM). Because there are fewer arrays, this estimator has fewer degrees of freedom, and hence the test is less powerful.

As for why the counts move in opposite directions, I wonder if the live_infection arrays are much noisier than the heat_inactivated arrays. If so, the variance estimate pooled over all six arrays would be larger than the one from the heat_inactivated arrays alone (fewer HI_INF genes called: 270 vs. 608) and smaller than the one from the live_infection arrays alone (more INF genes called: 2484 vs. 868). This could explain the varying number of significant genes.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
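To make the 'yardstick' point concrete, here is a minimal Python sketch comparing the two denominators for a single gene. It uses ordinary (unmoderated) t-statistics, not limma's empirical Bayes moderated ones, and the log-ratios are invented so that one group is much noisier than the other. The function names `one_sample_t` and `pooled_t` are hypothetical helpers for this example only.

```python
import math
import statistics


def one_sample_t(values):
    """t-statistic for mean != 0 using only this group's own variance."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) / se, n - 1


def pooled_t(group, other):
    """t-statistic for mean(group) != 0 using variance pooled over both groups."""
    df = (len(group) - 1) + (len(other) - 1)
    # Residual sum of squares around each group's own mean (the SSE).
    sse = sum((v - statistics.mean(g)) ** 2 for g in (group, other) for v in g)
    pooled_var = sse / df
    se = math.sqrt(pooled_var / len(group))
    return statistics.mean(group) / se, df


# Hypothetical log-ratios for one gene (not data from this thread).
hi_inf = [0.50, 0.60, 0.70]  # low-noise group
infect = [0.20, 1.00, 1.80]  # noisy group

for name, group, other in (("HI_INF", hi_inf, infect), ("INF", infect, hi_inf)):
    t_sep, df_sep = one_sample_t(group)
    t_pool, df_pool = pooled_t(group, other)
    print(f"{name}: separate t={t_sep:.2f} (df={df_sep}), "
          f"pooled t={t_pool:.2f} (df={df_pool})")
```

With these made-up numbers, pooling shrinks the t-statistic of the quiet group and inflates that of the noisy group, while raising the residual degrees of freedom from 2 to 4. That is the same direction of change as the DE counts reported in the question.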
