Choosing the transcripts for data analysis from GeneChip


John Antonydas Gaspar ▴ 130

@john-antonydas-gaspar-3144

Last seen 11.3 years ago

Dear Sir/Madam,

I have just been introduced to Affymetrix GeneChip technology and am supposed to do the analysis.

I read in the .cel files, which are saved in one directory:

    a-MHC A.cel
    a-MHC B.cel
    a-MHC C.cel
    ctrl-mES-02.cel
    ctrl-mES-02.cel
    ctrl-mES-03.cel

    library(affy)
    Data <- ReadAffy()
    list.celfiles()  # shows the CEL files located in the current working directory
    # The CDF package is loaded automatically; it contains the information about the PM and MM probes
    eset_mas5 <- mas5(Data)
    write.exprs(eset_mas5, file="Data_all.xls")

In the Excel table I have two columns for each condition: the Avg-signal value and the detection call.

Based on the detection call, I wished to keep only the genes with a P (Present) detection call. However, some transcripts are called Present on three chips (conditions) but Absent on the rest. How should I deal with transcripts that behave this way?

Kindly guide me if there is any special way to solve it.

With kind regards,

antony

--
John Antonydas Gaspar, PhD Student
AG: Prof. A. Sachinidis
Institute of Neurophysiology
University of Cologne
Robert-Koch-Str. 39
50931 Cologne/Germany
Tel: 004922125918042
Handy: 004917683142627

GO cdf • 1.4k views

updated 16.8 years ago by Björn Usadel ▴ 250 • written 16.8 years ago by John Antonydas Gaspar ▴ 130


James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 14 days ago

United States

Hi Antony,

John Antonydas Gaspar wrote:
> Based on the detection call, I wished to filter the genes that is with
> P(Present) of the detection call. However for a single transcript in three
> chips(conditions) they are considered to be Present but at the rest they are
> with Absent call. Now how shall I go about in dealing with those transcripts
> with these behaviour?.

You keep them. Really the only transcripts you want to remove are those that are absent in _all_ conditions. The transcripts that are absent in some but not all conditions are usually the ones you are most interested in.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
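As a rough illustration of the filtering rule in this reply, here is a minimal base-R sketch. The probeset IDs, sample names and calls are made up for the example; with the affy package, a call matrix of this shape would typically come from `exprs(mas5calls(Data))`, since `mas5()` itself returns only signal values. Keeping probesets with a Marginal ("M") call is an assumption made here, not something stated in the thread.

```r
# Toy detection-call matrix: hypothetical probeset IDs, sample names and calls.
# With affy, a matrix like this would typically come from exprs(mas5calls(Data)).
calls <- matrix(
  c("P", "P", "P", "A", "A", "A",   # present in one condition only
    "A", "A", "A", "A", "A", "A",   # absent on every chip
    "P", "P", "P", "P", "P", "P",   # present on every chip
    "A", "M", "A", "A", "A", "A"),  # marginal once, so not absent on all chips
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("probe_1", "probe_2", "probe_3", "probe_4"),
    c("aMHC_A", "aMHC_B", "aMHC_C", "ctrl_1", "ctrl_2", "ctrl_3")
  )
)

# Drop only the probesets that are called Absent on all chips.
# Assumption: a Marginal call counts as "not absent", so probe_4 is kept.
keep <- rowSums(calls != "A") > 0
filtered <- calls[keep, , drop = FALSE]

print(rownames(filtered))
```

The same logical vector `keep` could then be used to subset the rows of the expression matrix, provided both matrices list the probesets in the same order.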

written 16.8 years ago by James W. MacDonald 68k


prsmra01@uniroma2.it ▴ 50

@prsmra01uniroma2it-2079

Last seen 11.3 years ago

Dear list,

I cannot use getGeneSim() with yeast genes (the code in the vignette works properly for me, but the Entrez IDs in the provided examples are from human). The problem seems to be linked to the mapping from yeast Entrez IDs to GO terms, as you can see by trying the chunk of code below:

    library(org.Sc.sgd.db)
    library("GOSim")
    x <- org.Sc.sgdENTREZID
    setEvidenceLevel(evidences = c("IMP","IGI","ISS","IDA","IEP","IEA"), organism="yeast", gomap=NULL)
    calcICs()
    # now I move the ICs file in GOSim/data
    setOntology("BP", loadIC=FALSE)
    filterGO(Rkeys(x))

    filtering out genes not mapping to the currently set GO category ... ===> list of 6199 genes reduced to 0
    list()

    > sessionInfo()
    R version 2.9.0 Under development (unstable) (2009-03-03 r48046)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=it_IT.UTF-8;LC_NUMERIC=C;LC_TIME=it_IT.UTF-8;LC_COLLATE=it_IT.UTF-8;LC_MONETARY=C;LC_MESSAGES=it_IT.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] GOSim_1.1.5.4         corpcor_1.5.2          Matrix_0.999375-21
     [4] lattice_0.17-20       RBGL_1.19.2            mclust_3.1-10
     [7] cluster_1.11.12       topGO_1.11.1           SparseM_0.79
    [10] graph_1.21.4          org.Rn.eg.db_2.2.6     org.Pf.plasmo.db_2.2.6
    [13] org.Mm.eg.db_2.2.6    org.Hs.eg.db_2.2.6     org.Dm.eg.db_2.2.6
    [16] annotate_1.21.4       GO.db_2.2.5            org.Sc.sgd.db_2.2.8
    [19] RSQLite_0.7-1         DBI_0.2-4              AnnotationDbi_1.5.17
    [22] Biobase_2.3.10

    loaded via a namespace (and not attached):
    [1] grid_2.9.0   tools_2.9.0  xtable_1.5-4

Could you give me some suggestions?

Thanks for your attention,

Best,

Maria

--
Maria Persico, PhD student
http://cbm.bio.uniroma2.it/~maria/
MINT database group
Universita' di Tor Vergata, via della Ricerca scientifica 11
00133 Roma, Italy
Tel +39 0672594315 (Supervisor's room)
Fax +39 0672594766
Mobile phone: +393479715662
e-mail maria.persico at uniroma2.it

written 16.8 years ago by prsmra01@uniroma2.it ▴ 50


Björn Usadel ▴ 250

@bjorn-usadel-1492

Last seen 11.3 years ago

Hi,

Just make sure when you interpret the results that you don't get inflated fold changes: mas5 can give very low expression estimates, which become strongly negative values on the log scale.

I would also suggest using (gc)rma by default if you are new to microarray analysis.

Best wishes,

Bjoern

----- Original message -----
From: bioconductor-bounces at stat.math.ethz.ch
To: bioconductor at stat.math.ethz.ch
Sent: Wed Mar 04 17:03:30 2009
Subject: [BioC] Choosing the transcripts for dataanalysis from GeneChip
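To make the fold-change caveat concrete, here is a small base-R sketch with made-up signal values. The probe names, the numbers and the floor of 16 are all hypothetical. Flooring values before taking logs is shown only as one possible workaround and is not suggested anywhere in this thread; the alternative recommended above is to summarise with `rma()` from affy or `gcrma()` from the gcrma package.

```r
# Hypothetical MAS5-like signal values on the raw (unlogged) scale.
signal <- rbind(
  low_probe  = c(treated = 0.5,  control = 8),     # both close to background
  high_probe = c(treated = 2000, control = 4000)   # clearly expressed
)

# log2 fold change, treated versus control.
# The low probe has a log2 value below zero for "treated", which
# makes the difference look large even though both values are tiny.
lfc_raw <- log2(signal[, "treated"]) - log2(signal[, "control"])

# Assumption for illustration: raise everything below an arbitrary floor
# to that floor before taking logs.
floor_value <- 16
floored <- pmax(signal, floor_value)
lfc_floored <- log2(floored[, "treated"]) - log2(floored[, "control"])

print(cbind(lfc_raw, lfc_floored))
```

In this toy data the low-intensity probe shows the larger log fold change before flooring, although neither of its values is distinguishable from background, while the flooring step leaves the well-expressed probe unchanged.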

written 16.8 years ago by Björn Usadel ▴ 250
