limma for identifying differentially expressed genes from illumina data

0

Entering edit mode

Md.Mamunur Rashid ▴ 260

@mdmamunur-rashid-3595

Last seen 11.2 years ago

Dear All, I am working with a set of illumina microarray data (96 samples) from three groups (i.e. group-1(X) group-2 (Y), group-3(Z)). I have read the data using lumiR methoda and normalized the data using lumi Methods. Now I need to identify the differentially expressed genes by comparing each of these groups with each other. I am using linear model fit in limma package and topTable method to identify top N differentially expressed genes. 1. When I am adjusting the p value using "BH" method in the topTable method the adj.p.value is getting too high as a result none of the genes are getting selected with threshold p.value = 0.05 . 2.* *The logfold change values are very low. I have tried comparing all the 3 combination and the situation is more or less similar. Does this indicate that none of the genes are not differentially expressed at all!!! (Which might be a odd) or I am doing something wrong???!!! Please I will really appreciate if any body can give any advice. Thanks in advance. regards, Md. Mamunur Rashid **************************************************** I have attached the code and the result below. ****************************************************** ## norm_object is the normalized object d_Matrix <- exprs(norm_object) probeList <- rownames(d_Matrix) ## 32 samples from each group without any pair sampleType <- c("X","X","Y","Y",..........96 samples.......... ,"Z","X","Y","Y","X","X","Z","I","X") design <- model.matrix(~0+sampleType) colnames(design_norm_test) <- c('X','Y','Z') fit1 <- lmFit(d_Matrix,design) constrast.matrix <- makeContrasts (Y-X , Z-Y , Z-X, levels=design) fit1_2 <- contrasts.fit(fit1,contrast.matrix) fit1_2 <- eBayes(fit1_2) topTable(fit1_2,coef=1, adjust="BH") > ID logFC AveExpr t P.Value adj.P.Val 6284 ILMN_1111111 0.11999169 6.341387 4.828711 5.237786e-06 0.2325975 12919 ILMN_2222222 -0.05966259 6.187268 -4.678886 9.532099e-06 0.2325975 6928 ILMN_3333333 -0.31283278 6.881315 -4.561366 1.513503e-05 0.2462115 42428 ILMN_4444444 -0.13036276 6.815443 -4.288051 4.321272e-05 0.3964163 36153 ILMN_5555555 0.25070344 6.487644 4.190735 6.220719e-05 0.3964163 36152 ILMN_6666666 0.21502145 6.470917 4.158153 7.019901e-05 0.3964163 28506 ILMN_7777777 0.13918530 6.616036 4.158140 7.020219e-05 0.3964163 11763 ILMN_8888888 -0.17331384 7.322021 -4.154668 7.110990e-05 0.3964163 38906 ILMN_9999999 0.05532714 6.224477 4.093623 8.903425e-05 0.3964163 4728 ILMN_0000000 0.05371882 6.177268 4.081921 9.293339e-05 0.3964163 B 12919 3.236579 6928 2.801832 42428 1.818263 36153 1.477781 36152 1.364969 28506 1.364927 11763 1.352940 38906 1.143329 4728 1.103392

Microarray limma lumi Microarray limma lumi • 1.5k views

ADD COMMENT • link updated 16.2 years ago by Wei Shi ★ 3.6k • written 16.2 years ago by Md.Mamunur Rashid ▴ 260

0

Entering edit mode

Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 3 months ago

Australia/Melbourne

Dear Md.Mamunur Rashid: Perhaps you can try filter out probes which do not express in all the arrays. You can use the detection p value to do this. The detection p value of a probe is the proportion of negative control probes which have intensities larger than that probe. You can filter out those probes which have detection p values larger than a cutoff (e.g. 0.01) in all the arrays. This might help you find some differentially expressed genes. Cheers, Wei Md.Mamunur Rashid wrote: > Dear All, > > I am working with a set of illumina microarray data (96 samples) from > three groups (i.e. group-1(X) group-2 (Y), group-3(Z)). I have read > the data using lumiR methoda and normalized the data using lumi > Methods. Now I need to identify the differentially expressed genes by > comparing each of these groups with each other. I am using linear > model fit in limma package and topTable method to identify top N > differentially > expressed genes. > > > 1. When I am adjusting the p value using "BH" method in the topTable > method the adj.p.value is getting too high > as a result none of the genes are getting selected with threshold > p.value = 0.05 . 2.* *The logfold change values are very low. > I have tried comparing all the 3 combination and the situation is more > or less similar. > Does this indicate that none of the genes are not differentially > expressed at all!!! > (Which might be a odd) or I am doing something wrong???!!! > Please I will really appreciate if any body can give any advice. > > Thanks in advance. > > regards, > Md. Mamunur Rashid > > **************************************************** > I have attached the code and the result below. > ****************************************************** > > ## norm_object is the normalized object > > d_Matrix <- exprs(norm_object) > probeList <- rownames(d_Matrix) > > ## 32 samples from each group without any pair > sampleType <- c("X","X","Y","Y",..........96 samples.......... > ,"Z","X","Y","Y","X","X","Z","I","X") design <- > model.matrix(~0+sampleType) > colnames(design_norm_test) <- c('X','Y','Z') > fit1 <- lmFit(d_Matrix,design) > constrast.matrix <- makeContrasts (Y-X , Z-Y , Z-X, levels=design) > fit1_2 <- contrasts.fit(fit1,contrast.matrix) > fit1_2 <- eBayes(fit1_2) > topTable(fit1_2,coef=1, adjust="BH") > >> > > ID logFC AveExpr t P.Value adj.P.Val > 6284 ILMN_1111111 0.11999169 6.341387 4.828711 5.237786e-06 > 0.2325975 12919 ILMN_2222222 -0.05966259 6.187268 -4.678886 > 9.532099e-06 0.2325975 > 6928 ILMN_3333333 -0.31283278 6.881315 -4.561366 1.513503e-05 0.2462115 > 42428 ILMN_4444444 -0.13036276 6.815443 -4.288051 4.321272e-05 0.3964163 > 36153 ILMN_5555555 0.25070344 6.487644 4.190735 6.220719e-05 0.3964163 > 36152 ILMN_6666666 0.21502145 6.470917 4.158153 7.019901e-05 0.3964163 > 28506 ILMN_7777777 0.13918530 6.616036 4.158140 7.020219e-05 0.3964163 > 11763 ILMN_8888888 -0.17331384 7.322021 -4.154668 7.110990e-05 0.3964163 > 38906 ILMN_9999999 0.05532714 6.224477 4.093623 8.903425e-05 0.3964163 > 4728 ILMN_0000000 0.05371882 6.177268 4.081921 9.293339e-05 0.3964163 > B > > 12919 3.236579 > 6928 2.801832 > 42428 1.818263 > 36153 1.477781 > 36152 1.364969 > 28506 1.364927 > 11763 1.352940 > 38906 1.143329 > 4728 1.103392 > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD COMMENT • link 16.2 years ago Wei Shi ★ 3.6k

0

Entering edit mode

With Illumina arrays I usually filter out all the un-annotated probes before doing the statistical analysis. Especially in the 48k versions of these arrays about half the probes are not annotated, and for most purposes these are not interesting. Furthermore you should have a look at the quality of the arrays. -The average expression seems a bit low, but this could also be caused by the normalization method. -Check if the order of samples in the lumi object is the same as you define in the script. It is usually better to incorporate information like this in the phenoData slot of your object, and pull it out when defining groups for the analysis. Jan In your case filtering the probes would look something like this: library(illuminaHumanv3BeadID.db) # get annotations x <- illuminaHumanv3BeadIDSYMBOL annotated_ids <- mappedkeys(x) d_Matrix <- exprs(norm_object) probeList <- rownames(d_Matrix) idx<- probeList %in% annotated_ids d_matrix<-d_matrix[idx,] ## 32 samples from each group without any pair design <-model.matrix(~0+pData(norm_object)$sampleType) colnames(design) <- c('X','Y','Z') fit1 <- lmFit(d_Matrix,design) constrast.matrix <- makeContrasts (Y-X , Z-Y , Z-X, levels=design) fit1_2 <- contrasts.fit(fit1,contrast.matrix) fit1_2 <- eBayes(fit1_2) topTable(fit1_2,coef=1, adjust="BH") > Dear Md.Mamunur Rashid: > > Perhaps you can try filter out probes which do not express in all > the arrays. You can use the detection p value to do this. The detection > p value of a probe is the proportion of negative control probes which > have intensities larger than that probe. You can filter out those probes > which have detection p values larger than a cutoff (e.g. 0.01) in all > the arrays. This might help you find some differentially expressed genes. > > Cheers, > Wei > > Md.Mamunur Rashid wrote: > > Dear All, > > > > I am working with a set of illumina microarray data (96 samples) from > > three groups (i.e. group-1(X) group-2 (Y), group-3(Z)). I have read > > the data using lumiR methoda and normalized the data using lumi > > Methods. Now I need to identify the differentially expressed genes by > > comparing each of these groups with each other. I am using linear > > model fit in limma package and topTable method to identify top N > > differentially > > expressed genes. > > > > > > 1. When I am adjusting the p value using "BH" method in the topTable > > method the adj.p.value is getting too high > > as a result none of the genes are getting selected with threshold > > p.value = 0.05 . 2.* *The logfold change values are very low. > > I have tried comparing all the 3 combination and the situation is more > > or less similar. > > Does this indicate that none of the genes are not differentially > > expressed at all!!! > > (Which might be a odd) or I am doing something wrong???!!! > > Please I will really appreciate if any body can give any advice. > > > > **************************************************** > > I have attached the code and the result below. > > ****************************************************** > > > > ## norm_object is the normalized object > > > > d_Matrix <- exprs(norm_object) > > probeList <- rownames(d_Matrix) > > > > ## 32 samples from each group without any pair > > sampleType <- c("X","X","Y","Y",..........96 samples.......... > > ,"Z","X","Y","Y","X","X","Z","I","X") design <- > > model.matrix(~0+sampleType) > > colnames(design_norm_test) <- c('X','Y','Z') > > fit1 <- lmFit(d_Matrix,design) > > constrast.matrix <- makeContrasts (Y-X , Z-Y , Z-X, levels=design) > > fit1_2 <- contrasts.fit(fit1,contrast.matrix) > > fit1_2 <- eBayes(fit1_2) > > topTable(fit1_2,coef=1, adjust="BH") > > > >> > > > > ID logFC AveExpr t P.Value adj.P.Val > > 6284 ILMN_1111111 0.11999169 6.341387 4.828711 5.237786e-06 > > 0.2325975 12919 ILMN_2222222 -0.05966259 6.187268 -4.678886 > > 9.532099e-06 0.2325975 > > 6928 ILMN_3333333 -0.31283278 6.881315 -4.561366 1.513503e-05 0.2462115 > > 42428 ILMN_4444444 -0.13036276 6.815443 -4.288051 4.321272e-05 0.3964163 > > 36153 ILMN_5555555 0.25070344 6.487644 4.190735 6.220719e-05 0.3964163 > > 36152 ILMN_6666666 0.21502145 6.470917 4.158153 7.019901e-05 0.3964163 > > 28506 ILMN_7777777 0.13918530 6.616036 4.158140 7.020219e-05 0.3964163 > > 11763 ILMN_8888888 -0.17331384 7.322021 -4.154668 7.110990e-05 0.3964163 > > 38906 ILMN_9999999 0.05532714 6.224477 4.093623 8.903425e-05 0.3964163 > > 4728 ILMN_0000000 0.05371882 6.177268 4.081921 9.293339e-05 0.3964163 > > B > > > > 12919 3.236579 > > 6928 2.801832 > > 42428 1.818263 > > 36153 1.477781 > > 36152 1.364969 > > 28506 1.364927 > > 11763 1.352940 > > 38906 1.143329 > > 4728 1.103392

ADD REPLY • link 16.2 years ago Oosting, J. PATH ▴ 550

Login before adding your answer.