After reading about filtering methods for DEG analysis, I finally decided to use the panp package for my work. I excluded the probe IDs that are present in less than 10% of each group (control vs. basal-like breast cancer). Among the top 100 probe IDs returned by the topTable function in limma, many genes are common to the two lists obtained from (1) the unfiltered and (2) the filtered dataset, although their positions in the ranked lists differ. After validating these two gene lists with survival analysis on an independent dataset, I found that the filtered data gives a better result, but its p-values look insignificant.
Output from the unfiltered dataset:
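To make the filtering rule concrete, here is a minimal sketch in base R, not the code I actually ran. The call matrix, the sample names and the group labels are made up for illustration; in practice the calls might come from something like `panp::pa.calls(eset)$Pcalls`, which is an assumption here. It also assumes that "present in less than 10% of each group" means a probe is dropped only when it falls below the threshold in every group.

```r
# Hypothetical presence/absence calls: rows are probe IDs, columns are samples.
# "P" = present, "M" = marginal, "A" = absent (made-up values for illustration).
calls <- matrix(
  c("P", "P", "A", "A", "P", "A",
    "A", "A", "A", "A", "A", "A",
    "A", "A", "A", "P", "P", "P",
    "P", "M", "A", "A", "A", "A"),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("probe1", "probe2", "probe3", "probe4"),
                  paste0("s", 1:6))
)

# Hypothetical group labels, one per column of `calls`.
group <- factor(c("control", "control", "control", "basal", "basal", "basal"))

# Fraction of samples called present ("P") within each group, per probe.
present_frac <- sapply(levels(group), function(g) {
  rowMeans(calls[, group == g, drop = FALSE] == "P")
})

min_frac <- 0.10  # the 10% threshold described above

# Assumption: keep a probe if it reaches the threshold in at least one group,
# i.e. drop it only when it is below 10% in every group.
keep <- rowSums(present_frac >= min_frac) > 0
filtered_ids <- rownames(calls)[keep]
print(filtered_ids)
```

With this rule a probe that is expressed in only one of the two groups is kept, which seems desirable for a differential expression analysis.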
P.Value adj.P.Val B
213706_at 4.95E-49 1.10E-44 100.5409
204388_s_at 1.98E-48 2.21E-44 99.17014
43427_at 5.43E-48 4.03E-44 98.17258
221928_at 9.04E-48 5.04E-44 97.66887
201890_at 2.02E-47 9.02E-44 96.87277
204997_at 4.50E-47 1.67E-43 96.08317
207092_at 1.36E-46 4.31E-43 94.99363
205913_at 2.05E-46 5.70E-43 94.58617
49452_at 2.58E-46 6.39E-43 94.35622
204389_at 2.01E-45 4.48E-42 92.32877
218039_at 3.72E-45 7.53E-42 91.72103
206030_at 1.46E-44 2.70E-41 90.37272
212741_at 1.89E-44 3.24E-41 90.11262
208383_s_at 5.34E-44 8.50E-41 89.08835
Output from the filtered dataset:
design <- model.matrix(~factor(filtereddata$Disease))
fit1 <- lmFit(filtereddata, design)
ebayes1 <- eBayes(fit1)
tab1 <- topTable(ebayes1, coef=2, adjust="fdr", n=150)
P.Value adj.P.Val B
221928_at 0.097476 0.999917 -4.29479
204570_at 0.10648 0.999917 -4.35075
49452_at 0.116574 0.999917 -4.40763
212741_at 0.116969 0.999917 -4.40974
205913_at 0.124306 0.999917 -4.44763
210298_x_at 0.128623 0.999917 -4.46879
213071_at 0.128718 0.999917 -4.46925
221747_at 0.128972 0.999917 -4.47046
216331_at 0.135798 0.999917 -4.50225
205382_s_at 0.13713 0.999917 -4.50825
221748_s_at 0.137664 0.999917 -4.51064
203548_s_at 0.140671 0.999917 -4.52388
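The identical adj.P.Val of 0.999917 in every row is what the Benjamini-Hochberg adjustment (`adjust="fdr"` is an alias for "BH") produces when no raw p-value is small relative to the number of probes tested. The following base R sketch illustrates this with simulated p-values; the number of probes and the p-value range are assumptions chosen only to resemble the table above, not values from my data.

```r
# Hypothetical raw p-values: none is small relative to the number of tests.
set.seed(1)
n_probes <- 20000                       # assumed number of probes tested
p <- sort(runif(n_probes, min = 0.09, max = 1))

# Benjamini-Hochberg adjustment, as used by topTable(adjust = "fdr").
adj <- p.adjust(p, method = "BH")

# The same adjustment written out: for the i-th smallest p-value,
# take the minimum of p_j * n / j over all j >= i, capped at 1.
manual <- pmin(1, rev(cummin(rev(p * n_probes / seq_along(p)))))
stopifnot(isTRUE(all.equal(adj, manual)))

# The top-ranked probes all share one adjusted value close to 1.
print(head(cbind(P.Value = p, adj.P.Val = adj), 12))
```

So the constant adjusted value is not an error in itself; it only reflects that the smallest raw p-values (around 0.1) are far too large to survive multiple-testing correction.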
Now I would like to know your opinion about these results.
Thanks in advance.
After I merged two esets with the ComBat method in the inSilicoMerging package, my merged eset had 30 control, 60 basal-like breast cancer and 60 luminal A samples. In the next step I split the merged eset into 3 esets, one each for control, basal-like and luminal A, because I thought it would be better to filter non-informative genes within each group than to perform a blind filtering across all groups.
Here is my code. Maybe in the next step I should not use the ComBat method for merging my esets.
########### After running the same code for the basal group, I merged them again to get an eset with 2 filtered groups (control vs. basal-like).
Thank you very much