Bioconductor Digest, Vol 87, Issue 10
avinash gupta ▴ 130
@avinash-gupta-4042
Sir,

I am working on microarray analysis using R and Bioconductor. I have read the data from ArrayExpress, created the object and normalized it. Now I want to cluster it with the HOPACH method.

In the hopach manual, the first step loads the necessary package (hopach) and the second step loads the data set; they use the golub data. I don't understand how that data set was prepared or what type of data is stored in it. In my project, what type of object or data should I use in place of golub: the plain object, or the object after normalization?

Please mail me the code or a manual for making the golub dataset, or for making such a data set from any data object.

-- avinash
Clustering hopach
@saroj-k-mohapatra-3419
Hi Avinash:

I have never used hopach. But as the vignette suggests, you can get the information from the documentation for the golub data.

> require(hopach)
> ?golub

gives me the details of the golub data:

-----------------------
golub                package:hopach                R Documentation

Gene expression dataset from Golub et al. (1999)

Description:
  Gene expression data (3051 genes and 38 tumor mRNA samples) from the leukemia microarray study of Golub et al. (1999). Pre-processing was done as described in Dudoit et al. (2002). The R code for pre-processing is available in the file golub.R in the docs directory.

Usage:
  data(golub)

Value:
  golub: matrix of gene expression levels for the 38 tumor mRNA samples, rows correspond to genes (3051 genes) and columns to mRNA samples.
  golub.cl: numeric vector indicating the tumor class, 27 acute lymphoblastic leukemia (ALL) cases (code 0) and 11 acute myeloid leukemia (AML) cases (code 1).
  golub.gnames: a matrix containing the names of the 3051 genes for the expression matrix 'golub'. The three columns correspond to the gene 'index', 'ID', and 'Name', respectively.
----------------------------

To make your data similar to this, the following 3 objects need to be created:

yourdata: a numerical matrix, rows for genes, columns for samples
yourclassvec: a vector with the same length as the number of columns, made up of 0s and 1s
yourgeneann: a matrix with the same number of rows as yourdata, and 3 columns

After that it should be straightforward to run hopach. Hope that helps.

Best wishes,
Saroj
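A minimal sketch of the three objects described in the reply above, written in base R. The expression values are simulated stand-ins (an assumption for illustration); in practice yourdata would be the normalized matrix, for example the result of exprs() on the normalized object. The probe and gene names are hypothetical.

```r
# Stand-in for the normalized expression matrix: 20 probes x 6 samples of
# simulated values (hypothetical data; replace with your own matrix).
set.seed(42)
yourdata <- matrix(rnorm(20 * 6, mean = 8), nrow = 20,
                   dimnames = list(paste0("probe_", 1:20),
                                   paste0("sample_", 1:6)))

# Class labels, one per column: 0 for the first group, 1 for the second.
yourclassvec <- rep(c(0, 1), each = 3)

# Annotation matrix with the three columns documented for golub.gnames.
yourgeneann <- cbind(index = seq_len(nrow(yourdata)),
                     ID    = rownames(yourdata),
                     Name  = paste0("gene_", seq_len(nrow(yourdata))))

# Consistency checks mirroring the shapes of golub, golub.cl, golub.gnames.
stopifnot(is.matrix(yourdata), is.numeric(yourdata),
          length(yourclassvec) == ncol(yourdata),
          nrow(yourgeneann) == nrow(yourdata),
          ncol(yourgeneann) == 3)
```

Only the numeric matrix is a stand-in for golub itself; the class vector and the annotation matrix play the roles of golub.cl and golub.gnames.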
Saroj,

Thank you.

Class vector: my data has 6 columns, so what is the code to make it?

Gene names: I downloaded my data as raw files, loaded the .CEL files into R, then normalized them and made the object. To get the data matrix I used this command:

> expr <- exprs(obj)

After that, what do I do to create the gene names matrix object? How do I create the 3 objects, where do I download the gene names, and how do I load them into the gene names matrix object? Please mail me the code or a manual for it.

Thank you.

Regards,
avinash
Hi Avinash:

You can create a vector using c(). For example, if the first 3 samples are control and the last 3 are cancer:

> myclassvec <- c(0, 0, 0, 1, 1, 1)

For creating the gene name matrix, I hope you have the probe annotation file in which each probe (or probe set, for Affymetrix) is listed along with the gene name, etc. From that file, select the index, gene name and ID columns and save these three columns to another tab-delimited file (named, say, myfile.tab). Keep no header. Read this file using read.table, e.g.:

> mygnames <- read.table(file = "myfile.tab", header = FALSE, sep = "\t")

Have a look at ?read.table for more help.

Good luck!
Saroj
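The read.table step can be tried end to end with a small stand-in file. This is a sketch under assumptions: the three-row annotation table, the probe IDs and the expression matrix are all hypothetical, and the extra match() step (not mentioned in the reply) is one way to make sure the annotation rows are in the same order as the rows of the expression matrix.

```r
# Hypothetical annotation table: index, probe ID, gene name (no header row).
ann_lines <- c("1\tprobe_c\tGene C",
               "2\tprobe_a\tGene A",
               "3\tprobe_b\tGene B")
ann_file <- tempfile(fileext = ".tab")
writeLines(ann_lines, ann_file)

# Read the tab-delimited file as in the reply above, then name the columns.
mygnames <- read.table(file = ann_file, header = FALSE, sep = "\t",
                       quote = "", stringsAsFactors = FALSE)
colnames(mygnames) <- c("index", "ID", "Name")

# Stand-in for expr <- exprs(obj): 3 probes x 6 samples.
expr <- matrix(as.numeric(1:18), nrow = 3,
               dimnames = list(c("probe_a", "probe_b", "probe_c"),
                               paste0("sample_", 1:6)))

# Put the annotation rows in the same order as the rows of expr,
# and fail early if any probe has no annotation.
mygnames <- mygnames[match(rownames(expr), mygnames$ID), ]
stopifnot(!anyNA(mygnames$ID))
mygnames <- as.matrix(mygnames)

# Class vector for 3 controls followed by 3 cancer samples.
myclassvec <- c(0, 0, 0, 1, 1, 1)
```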
Thank you, Saroj.

Sorry to say that I don't have a probe annotation file, so please tell me where and how to download this file.

Thank you.

Regards,
avinash
Earlier you mentioned downloading the data from ArrayExpress. That should have the probe annotation file. Another option is to download the annotation file from the chip manufacturer's web site.

For example, I am investigating the ArrayExpress entry E-MEXP-2620. The array design file from http://www.ebi.ac.uk/microarray-as/ae/files/A-MEXP-1444/A-MEXP-1444.adf.txt contains the probe annotation. Of course this file needs to be downloaded and formatted in a suitable program (e.g., Excel), the extra header lines need to be removed, etc. Keep three columns as suggested.

Best,
Saroj
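The same clean-up can also be done in R instead of a spreadsheet. The sketch below is only an illustration on a made-up file: the header lines, the "[main]" marker and the column names ("Reporter Name", "Reporter Database Entry", "Reporter Comment") are assumptions, so check the actual array design file and substitute the column names it really uses.

```r
# Made-up stand-in for a downloaded array design file: a few header lines,
# then a tab-delimited table. All names and values here are hypothetical.
adf_lines <- c("Array Design Name\tExample design",
               "Provider\tExample provider",
               "[main]",
               "Reporter Name\tReporter Database Entry\tReporter Comment",
               "probe_1\tID001\tgene A",
               "probe_2\tID002\tgene B",
               "probe_3\tID003\tgene C")
adf_file <- tempfile(fileext = ".adf.txt")
writeLines(adf_lines, adf_file)

# Locate the row holding the column names and skip everything before it.
raw <- readLines(adf_file)
header_row <- grep("^Reporter Name\t", raw)[1]
adf <- read.table(adf_file, header = TRUE, sep = "\t",
                  skip = header_row - 1, quote = "", comment.char = "",
                  stringsAsFactors = FALSE, check.names = FALSE)

# Keep three columns: a running index, an ID and a name.
mygnames <- cbind(index = seq_len(nrow(adf)),
                  ID    = adf[["Reporter Database Entry"]],
                  Name  = adf[["Reporter Comment"]])
```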
List,

Sir, I have created the object from CEL files downloaded from ArrayExpress. Now I want to use the puma method, and I am following "puma: a Bioconductor package for propagating uncertainty in microarray analysis" as a guideline. I have done the pumaPCA step, but the second step, "Identifying differentially expressed genes", shows an error. The command in the guide is:

limmaRes <- calculateLimma(eset_estrogen_rma)

With my data it shows this error:

limmaRes <- calculateLimma(eset_rma)
Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim) :
  No residual degrees of freedom in linear model fits

I don't understand it. Please mail me any help or suggestion about it.

Regards,
avinash
Hi Avinash

Please read the posting guide and provide a reproducible example and the output of sessionInfo(). I'm guessing you don't have replicates of your conditions, hence the error, but it is difficult to say without knowing exactly what you've done. Showing the output of pData() on your ExpressionSet would help confirm this, as in the following example:

  library(puma)
  library(pumadata)
  data(eset_estrogen_rma)
  limmaRes <- calculateLimma(eset_estrogen_rma)
  pData(eset_estrogen_rma)

               estrogen time.h
  low10-1.cel    absent     10
  low10-2.cel    absent     10
  high10-1.cel  present     10
  high10-2.cel  present     10
  low48-1.cel    absent     48
  low48-2.cel    absent     48
  high48-1.cel  present     48
  high48-2.cel  present     48

Best wishes
Richard

--
Dr Richard D Pearson
richard.pearson at well.ox.ac.uk
http://www.well.ox.ac.uk/~rpearson
Wellcome Trust Centre for Human Genetics
University of Oxford
Roosevelt Drive
Oxford OX3 7BN, UK
Tel: +44 (0)1865 617890
Mob: +44 (0)7971 221181
Fax: +44 (0)1865 287664
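The link between missing replicates and the "No residual degrees of freedom" message can be checked with plain arithmetic: in a linear model, the residual degrees of freedom are the number of arrays minus the rank of the design matrix. The sketch below uses base R only and does not call limma or puma; residual_df is a hypothetical helper written for this illustration, and the one-condition-per-array layout is the guess made in the reply above, not something confirmed from the poster's data.

```r
# Residual degrees of freedom for a one-factor design:
# number of arrays minus the rank of the design matrix.
residual_df <- function(condition) {
  design <- model.matrix(~ 0 + factor(condition))
  length(condition) - qr(design)$rank
}

# Six arrays, each treated as its own condition (no replicates).
no_reps <- paste0("array", 1:6)

# Six arrays in two conditions with three replicates each.
with_reps <- rep(c("Condition1", "Condition2"), each = 3)

# The estrogen example: 8 arrays, 4 estrogen/time combinations, 2 replicates.
estrogen <- paste(rep(c("absent", "present"), each = 2, times = 2),
                  rep(c(10, 48), each = 4), sep = ".")

residual_df(no_reps)    # 0: nothing left to estimate the variance from
residual_df(with_reps)  # 4
residual_df(estrogen)   # 4
```

With zero residual degrees of freedom there is no information left to estimate the within-condition variance, which is consistent with the error reported in the question.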
Sir,

I have completed k-means clustering, identified 19 clusters and plotted them. How can I extract the gene IDs or probe IDs for each separate cluster? Please mail me as soon as possible.

Thank you.

Regards,
avinash gupta
On Tue, May 25, 2010 at 7:07 AM, avinash gupta wrote:

> I have completed k-means clustering, identified 19 clusters and plotted them.
> How can I extract the gene IDs or probe IDs for each separate cluster?

Hi, Avinash. You'll need to read the help for kmeans. It is pretty clear:

Value:
  An object of class "kmeans" which is a list with components:
  cluster: A vector of integers indicating the cluster to which each point is allocated.

If you have problems, please read the posting guide and repost your question. Finally, please post your question directly rather than replying to another email.

Sean
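Following the kmeans help quoted above, the cluster component can be used to group the row names of the clustered matrix. A minimal sketch in base R, assuming the probe IDs are the row names of the matrix passed to kmeans; the data are simulated, and 3 clusters are used instead of 19 to keep the example small.

```r
# Simulated stand-in for the expression matrix: 60 probes x 6 samples.
set.seed(1)
expr <- matrix(rnorm(60 * 6), nrow = 60,
               dimnames = list(paste0("probe_", 1:60),
                               paste0("sample_", 1:6)))

# k-means on the rows (probes); km$cluster has one integer per row.
km <- kmeans(expr, centers = 3)

# List of probe IDs per cluster, named by cluster number.
ids_by_cluster <- split(rownames(expr), km$cluster)

# Probe IDs belonging to cluster 1 only.
cluster1_ids <- rownames(expr)[km$cluster == 1]

# Number of probes in each cluster.
lengths(ids_by_cluster)
```

Each element of ids_by_cluster can then be written to a file with writeLines() or used to subset the expression matrix.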
@richard-pearson-3213
Avinash

Please keep replies on list (reply to all).

You've still not given a reproducible example: I don't know what your object eset_rma contains. Also, you've not provided the output of sessionInfo(). However, if eset_rma is the output of pumaComb, it will contain a single combined expression value for each condition, and therefore I would expect calculateLimma to return an error. If you look in the puma User Guide, you should see the following two lines:

> pumaDERes <- pumaDE(eset_estrogen_comb)
> limmaRes <- calculateLimma(eset_estrogen_rma)

pumaDE is being applied to a combined ExpressionSet (eset_estrogen_comb is the output from pumaComb), whereas calculateLimma is being applied to the non-combined ExpressionSet (eset_estrogen_rma is NOT the output from pumaComb). In short, don't run calculateLimma on the output of pumaComb.

Best wishes
Richard

avinash gupta wrote:
> Sir,
> In the puma method for "Identifying differentially expressed (DE) genes with
> PPLR method", I made the pumaComb object and the pumaDERes object, but for
> limmaRes I get this error:
>
> limmaRes <- calculateLimma(eset_rma)
> Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim) :
>   No residual degrees of freedom in linear model fits
>
> Please mail me as soon as possible.
>
> Regards
> avinash
Sir,

This is about the puma method. I made the object from my 6 CEL files with these commands:

> dat <- ReadAffy()
> library(pumadata)
> eset_mmgmos <- mmgmos(dat)
> eset_rma <- rma(dat)

After that I followed the "Identifying differentially expressed genes" section, where I used these commands:

> eset_comb <- pumaComb(eset_mmgmos)
> pumaDERes <- pumaDE(eset_comb)
> limmaRes <- calculateLimma(eset_rma)

The last command shows this error:

Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim) :
  No residual degrees of freedom in linear model fits

I don't understand it. Please solve this and mail me.

One more thing. In the code

> toppumaDEIntGene <- topGenes(pumaDERes, contrast = 7)

I don't understand the value of contrast or how its value is defined. My object has 6 CEL files, so please help me to define its value.
Hi Avinash

It looks from your code like you haven't provided any phenotype information about your CEL files, and therefore each CEL file will be treated as a different condition. Because you have only 1 array per condition (i.e. you have no replicates), limma is going to give an error.

Please note the following line from the puma User Guide (bottom of page 4 if you're using puma 2.0.0, but you still haven't given me the output from sessionInfo() so I don't know what version you're using): "The easiest way to supply phenotype information is in a text file that is loaded using the phenotype parameter of the ReadAffy function". Unfortunately, my documentation is incorrect here: ReadAffy has no phenotype parameter! It should say phenoData parameter instead.

As I suggested in my first reply, showing the output of pData(eset_rma) would confirm that the above is true. From reading ?ReadAffy, my guess is that pData(eset_rma) will give you a data.frame with a column called sample containing the numbers 1 to 6. To provide phenotype data for your eset_rma object you could do something like the following:

  pData(eset_rma) <- data.frame(AvinashCondition = c("Condition1", "Condition1", "Condition1", "Condition2", "Condition2", "Condition2"))

"contrast" is a standard statistical term. I've tried to explain this in section 4.6 of the puma manual, but you could also look at the limma user guide, or google this (e.g. to get here: http://en.wikipedia.org/wiki/Contrast_%28statistics%29), or ask a local statistician.

Please do read the posting guide. It can be found here: http://www.bioconductor.org/docs/postingGuide.html

Best wishes
Richard
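To make the statistical meaning of "contrast" concrete, here is a small base R sketch. It is only an illustration of the general term: a contrast is a set of weights on the condition means that sum to zero. It does not reproduce how puma numbers its contrasts, so the meaning of a value such as contrast = 7 still has to be taken from the puma documentation. The sample names and the expression values for the single gene are hypothetical.

```r
# Phenotype table in the shape suggested above: one row per CEL file.
pheno <- data.frame(
  AvinashCondition = rep(c("Condition1", "Condition2"), each = 3),
  row.names = paste0("sample", 1:6, ".cel")
)

# Hypothetical log-expression values of one gene across the six arrays.
expr_one_gene <- c(5.0, 5.2, 4.8, 7.1, 6.9, 7.0)

# Mean expression per condition.
group_means <- tapply(expr_one_gene, pheno$AvinashCondition, mean)

# A contrast comparing Condition2 with Condition1: weights sum to zero.
contrast <- c(Condition1 = -1, Condition2 = 1)
stopifnot(sum(contrast) == 0)

# Estimated contrast: difference between the two condition means.
estimate <- sum(contrast * group_means[names(contrast)])
estimate  # 2 on this toy data (mean 7.0 minus mean 5.0)
```

With two conditions there is essentially one contrast of interest (the difference between them); designs with more conditions or factors allow several, which is why tools usually ask which one to report.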