Repeated gene entries on Toptable by limma

Marcos Pinho ▴ 200

@marcos-pinho-3584

Last seen 9.6 years ago

Dear list,

I am a new user of the limma package for differential expression analysis. When I generated my topTable with the 50 most differentially expressed genes, I noticed that the same gene appears more than once, with different p-values. Could someone suggest how to handle this situation? Which values should I consider? Is there a way to condense these multiple entries into a single value for differential expression?

Regards,

Marcos B. Pinho
Programa de Engenharia Química - PEQ
Laboratório de Engenharia de Cultivos Celulares - LECC
Universidade Federal do Rio de Janeiro - UFRJ
Instituto Nacional de Câncer - INCA
Rio de Janeiro - Brasil

limma limma • 940 views

updated 14.6 years ago by Tefina Paloma ▴ 220 • written 14.6 years ago by Marcos Pinho ▴ 200

Chao-Jen Wong ▴ 580

@chao-jen-wong-3603

Last seen 9.3 years ago

USA/Seattle/Fred Hutchinson Cancer Rese…

Hi Marcos,

There are several ways to do this. The easiest is to perform non-specific filtering with the 'nsFilter' or 'featureFilter' functions from the genefilter package. Assuming each probe set maps to a single Entrez ID (there are exceptions, but they are rare), you can remove probe sets that share an Entrez ID with

nsFilter(eset, remove.dupEntrez=TRUE, ...)

or

featureFilter(eset, remove.dupEntrez=TRUE, ...)

You can also manually pick the probe set that has the highest variation among its duplicates before performing the downstream analysis (limma).

Hope this helps.

Cheers,
Chao-Jen

--
Chao-Jen Wong
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., M2-B876
PO Box 19024
Seattle, WA 98109
206.667.4485
cwon2 at fhcrc.org
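For readers who want to try the manual route mentioned above (keeping the most variable probe per gene), here is a minimal base R sketch. It is only an illustration under stated assumptions: the expression values, the probe names, the `gene_of_probe` mapping, and the helper `keep_most_variable` are all made up for this example and are not part of limma or genefilter. In practice the mapping would come from your chip's annotation package.

```r
# Toy expression matrix: 5 probes x 4 samples (made-up values, for illustration only).
expr <- rbind(
  probe_1 = c(5.0, 5.1, 4.9, 5.0),
  probe_2 = c(2.0, 6.0, 3.0, 7.0),
  probe_3 = c(8.0, 8.2, 7.9, 8.1),
  probe_4 = c(1.0, 1.5, 1.2, 1.1),
  probe_5 = c(3.0, 9.0, 2.0, 8.0)
)
colnames(expr) <- paste0("sample_", 1:4)

# Hypothetical probe-to-gene mapping: geneA and geneC are each covered by two probes.
gene_of_probe <- c(probe_1 = "geneA", probe_2 = "geneA",
                   probe_3 = "geneB",
                   probe_4 = "geneC", probe_5 = "geneC")

# Hypothetical helper: for each gene, keep the probe with the largest
# variance across samples and drop the other probes of that gene.
keep_most_variable <- function(expr, genes) {
  stopifnot(nrow(expr) == length(genes))
  probe_var <- apply(expr, 1, var)
  # Group row indices by gene, then pick the most variable row in each group.
  idx_by_gene <- split(seq_len(nrow(expr)), genes)
  keep <- vapply(idx_by_gene,
                 function(i) i[which.max(probe_var[i])],
                 integer(1))
  expr[sort(keep), , drop = FALSE]
}

filtered <- keep_most_variable(expr, gene_of_probe[rownames(expr)])
print(rownames(filtered))
```

The filtered matrix has one row per gene and can then be passed to the usual limma steps, so each gene shows up only once in the resulting table. Variance is used here as the measure of variation; another statistic such as the IQR could be swapped in inside the helper.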

written 14.6 years ago by Chao-Jen Wong ▴ 580

Tefina Paloma ▴ 220

@tefina-paloma-3676

Last seen 9.6 years ago

Hi,

if you are working with Affymetrix arrays, you might also consider a custom CDF. Custom CDFs are an alternative to the Affymetrix CDF: they are more up to date, and the probe mapping is based on, e.g., RefSeq or Ensembl IDs. As far as I know, these CDFs do not contain duplicate entries.

Best,
Tefina

written 14.6 years ago by Tefina Paloma ▴ 220
