Adding annotations to fit2 - still having problems


Sally ▴ 250


Hello,

I am using Limma for microarray analysis. Ultimately I want to add/combine/append (not sure what word is correct) a file named MGL.txt (mastergene list) to fit2 so that when topTable prints it will have the annotation information for each gene. The first few lines of MGL.txt look like this (there are blank cells/missing data). There are 17,328 rows and 6 columns in this file.

    well accession bitscore evalue symbol
    U179971039 GFP
    empty1 Blank Well
    empty2 Blank Well
    empty3 Blank Well
    empty4 Blank Well
    empty5 Blank Well
    empty6 Blank Well
    empty7 Blank Well
    empty8 Blank Well
    CA054869 UNKNOWN
    CB490276 G2/mitotic-specific cyclin-B2 Q60FX9 524.24 3.58E-148 ccnb2
    CA769480 Endothelial differentiation-related factor 1 homolog Q6PBY3 214.16 8.45E-55 edf1

MGL.txt was read in with:

    MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=1)

When MGLnew is printed on screen it looks like this (first few lines):

               accession bitscore     evalue symbol
    U179971039      <NA>       NA         NA   <NA>
    empty1          <NA>       NA         NA   <NA>
    empty2          <NA>       NA         NA   <NA>
    empty3          <NA>       NA         NA   <NA>
    empty4          <NA>       NA         NA   <NA>
    empty5          <NA>       NA         NA   <NA>
    empty6          <NA>       NA         NA   <NA>
    empty7          <NA>       NA         NA   <NA>
    empty8          <NA>       NA         NA   <NA>
    CA054869        <NA>       NA         NA   <NA>
    CB490276      Q60FX9   524.24  3.58e-148  ccnb2
    CA769480      Q6PBY3   214.16   8.45e-55   edf1

When I then check the # of rows using nrow(MGLnew) it says there are 3312 rows, even though the original file had 17,328 rows. Also it doesn't include the column "well" in MGLnew.

Why are these two things happening?

Sally Goldes

----- Original Message -----
From: "john seers (IFR)" <john.seers@bbsrc.ac.uk>
To: "Sally" <sagoldes at shaw.ca>; <bioconductor at stat.math.ethz.ch>
Sent: Tuesday, February 17, 2009 8:35 AM
Subject: RE: [BioC] Adding annotations to fit2

Hi Sally

Is fit2 by any chance the output from limma fitting a linear model? If not this is not relevant.

If it is so I think you may be able to do something like

    fit2$genes<-cbind(fit2$genes, MGLnew)

If that does not work, if you are using topTable a bit later, you can do something like this:

    tt<-topTable(eb, number=ngenes, genelist=cbind(eb$genes, MGLnew))

Regards
John

---

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Sally
Sent: 17 February 2009 03:35
To: bioconductor at stat.math.ethz.ch
Cc: Sally
Subject: [BioC] Adding annotations to fit2

I want to merge fit2 with a txt file I'll call MGL (mastergenelist) which contains gene id information. I am using a custom cDNA array. The reason is that I want the gene ID information along-side the rownames (which are the accession IDs). Both have identical row names. MGL has missing data.

I have tried:

    MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE) #WORKS
    write.table(MGLnew,file="MGLnew.txt",sep="\t") #WORKS
    fit2<-merge(fit2,MGLnew,by="row.names") #NOT WORKING

When I run this I get the following error message:

    Error in dim(data) <- dim : attempt to set an attribute on NULL

What does this error message mean? How do I fix the problem?

Sally Goldes
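A note on the cbind() suggestion: cbind() pairs rows purely by position, so it only gives correct annotation when the table is in the same order as the fitted object and has the same number of rows. A minimal sketch of aligning by ID first, using base R only. The objects `coefs` and `annot` are hypothetical toy stand-ins for the coefficient rows of fit2 and for MGLnew; they are not taken from the poster's data.

```r
# Stand-in for the rows of a fitted model: row names are the probe IDs.
# (Hypothetical toy values; in the thread the real object is fit2 from limma.)
coefs <- matrix(c(0.5, -1.2, 0.1),
                ncol = 1,
                dimnames = list(c("CB490276", "empty1", "CA769480"), "s0vss24"))

# Stand-in for MGLnew: annotation keyed by the same IDs, in a different order.
annot <- data.frame(
  accession = c("Q6PBY3", "Q60FX9", NA),
  symbol    = c("edf1", "ccnb2", NA),
  row.names = c("CA769480", "CB490276", "empty1"),
  stringsAsFactors = FALSE
)

# For each fitted row, find the position of its ID in the annotation table.
idx <- match(rownames(coefs), rownames(annot))
stopifnot(!anyNA(idx))  # stop early if any ID has no annotation row

# Reorder the annotation so that row i describes row i of the fit.
genes <- cbind(ID = rownames(coefs), annot[idx, , drop = FALSE])
rownames(genes) <- NULL
print(genes)
```

With limma loaded, a table built this way could then be assigned to fit2$genes as John suggests, on the assumption that fit2 is a limma fit object whose row names are the probe IDs. merge() is written for data frames, which may be why calling it directly on the fit object fails.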

Microarray limma • 912 views

updated 16.2 years ago by Sean Davis 21k • written 16.2 years ago by Sally ▴ 250


Sean Davis 21k


On Tue, Feb 17, 2009 at 6:57 PM, Sally <sagoldes@shaw.ca> wrote:

> [...]
> When I then check the # of rows using nrow(MGLnew) it says there are 3312
> rows, even though the original file had 17,328 rows.
> Also it doesn't include the column "well" in MGLnew.
>
> Why are these two things happening?

It looks like you might want to set row.names=FALSE. Does that fix the column name problem?

As for the other, try setting quote="",comment.char="" and see if that helps in the read.table. Sometimes, a character in the file ends up causing R to parse the line in a manner that you didn't expect.

Sean
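To illustrate the parsing point: by default read.table() treats both ' and " as quote characters and # as a comment character, so an apostrophe in a gene description (for example 5'-nucleotidase) can swallow several lines into one field and lower the row count. The sketch below uses a small made-up file, not the poster's MGL.txt, and compares the default call with one that reads every character literally. It also uses row.names = NULL, which the read.table documentation gives as the way to force plain row numbering. row.names = FALSE is not among the documented forms.

```r
# Toy tab-delimited file (hypothetical contents) with apostrophes and a '#'.
lines <- c(
  "id\twell\tsymbol",
  "CA1\t5'-nucleotidase\tnt5e",
  "CA2\tcyclin-B2\tccnb2",
  "CA3\t3'-phosphoadenosine kinase\tpapss1",
  "CA4\tfactor #1 homolog\tedf1"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Default quote and comment handling: apostrophes and '#' are special.
loose <- read.table(path, sep = "\t", header = TRUE, fill = TRUE,
                    na.strings = "", stringsAsFactors = FALSE)

# Read every character literally and keep the ID as an ordinary column.
strict <- read.table(path, sep = "\t", header = TRUE, fill = TRUE,
                     na.strings = "", quote = "", comment.char = "",
                     row.names = NULL, stringsAsFactors = FALSE)

# Compare both results against the number of data lines in the raw file.
n_expected <- length(readLines(path)) - 1L
cat("expected:", n_expected,
    " loose:", nrow(loose),
    " strict:", nrow(strict), "\n")
```

Comparing nrow() of the result with the line count of the raw file, as in the last step, is a quick check that no rows were merged or dropped.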

16.2 years ago • Sean Davis 21k


Sean Davis 21k


On Tue, Feb 17, 2009 at 9:34 PM, Sally <sagoldes@shaw.ca> wrote:

> Hi Sean,
>
> Thanks so much for the help! Unfortunately...
>
> Using:
> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=FALSE,quote="",comment.char="")

Hi, Sally. Please keep replies on the list so that we can all learn from the questions and answers. If you run this by itself, do you get what you expect?

> This script does not work. No MGLnew is created.
> Produces the following error message (refers to another part of the script):
> Error in getClass(Class, where = topenv(parent.frame())) :
>   "AnnotatedDataFrame" is not a defined class

Did you forget to load Biobase? If you are running this as a "script", then the error probably comes before the read.table(), so the read.table() will never run.

Sean

> Here is my entire post-preprocessing script:
>
> exprdata<-read.table("exprsData.txt", header=TRUE,sep="\t",row.names=1,as.is=TRUE,fill=TRUE)
> class(exprdata)
> #[1] "data.frame"
> dim(exprdata)
> #[1] 17328 28
> colnames(exprdata)
> head(exprdata)
> #printout too long to paste
>
> phenotypicdata<-read.table("phenotypicdata.txt",row.names=1,header=TRUE,sep="\t")
> class(phenotypicdata)
> #returns: [1] "data.frame"
> dim(phenotypicdata)
> #returns: [1] 28 2
> colnames(phenotypicdata)
> #returns: [1] "Species" "Time"
> rownames(phenotypicdata)
> #Coerce exprdata into a matrix
> myexprdata<-as.matrix(exprdata)
> write.table(myexprdata,file="myexprdata.txt",sep="\t",col.names=NA)
> class(myexprdata)
> #[1] "matrix"
> rownames(myexprdata)
> colnames(myexprdata)
> #Coerce phenotypicdata into a data frame
> myphenotypicdata<-as.data.frame(phenotypicdata)
>
> write.table(myphenotypicdata,file="myphenotypicdatacheck.txt",sep="\t",col.names=NA)
> rownames(myphenotypicdata)
> colnames(myphenotypicdata)
> #[1] "species" "time"
> summary(myphenotypicdata)
> all(rownames(myphenotypicdata)==colnames(myexprdata))
> #[1] TRUE
> #Create annotated Data Frame
> adf<-new("AnnotatedDataFrame",data=phenotypicdata)
> #dim means: dimension of an object.
> dim(adf)
> #rowNames columnNames
> # 28 2
> rownames(adf)
> #NULL
> #read in galfile
> readGAL("Galfile.gal")
> #Create eset object
> eSet<-new("ExpressionSet",exprs=myexprdata,phenoData=adf,annotation="Galfile.gal")
>
> #Read in targets file
> targets <- readTargets("targets.txt")
> targets
> # Set up character list defining your arrays, include replicates
> TS <- paste(targets$Species, targets$Time, sep=".")
> #This script returns the following:
> TS
> # Turn TS into a factor variable which facilitates fitting
> TS <- factor(TS)
> #This script returns the following
> design <- model.matrix(~0+TS)
> #write design object to text file
> write.table(design,file="design.txt",sep="\t",col.names=NA)
> colnames(design) <- levels(TS)
> #for eset put in your M values - see ?lmFit for object types
> fit <- lmFit(eSet, design)
> cont.matrix<-makeContrasts(s0vss24=s.0-s.24, s24vss48=s.24-s.48, s48vss96=s.48-s.96, c0vsc24=c.0-c.24, c24vsc48=c.24-c.48, c48vsc96=c.48-c.96, levels=design)
> write.table(cont.matrix,file="cont.matrix.txt",sep="\t",col.names=NA)
> # estimate the contrasts and put in fit2
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
> #END OF STATISTICAL ANALYSIS
>
> write.table(fit2, "fit2.txt",sep="\t") #WORKS
> #fit2<-read.table(fit2, file="fit2.txt",sep="\t") #WORKS
>
> #read in mastergenelist (MGL)
> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=FALSE,quote="",comment.char="") #WORKS
> write.table(MGLnew,file="MGLnew.txt",sep="\t") #works
>
> #combine fit2$genes with MGLnew so that annotations show alongside IDs
> fit2genes<-cbind(fit2$genes, MGLnew)
> write.table(fit2genes,file="fit2genes.txt",sep="\t")
>
> #this script removes all rows from fit2 where the F pvalue is greater than 0.0001 WORKS
> #> eb <- eb[eb$F.p.value < 0.1,] # r-help suggest
> #fit2<- fit2[fit2$F.p.value < 0.0001,] #WORKS
>
> #remove empty rows
> #fit2<-fit2[-grep('^empty',row.names(fit2)),] # WORKS
>
> #remove rows which have a U (GFP) WORKS
> #fit2<-fit2[-grep('^U',row.names(fit2)),] #WORKS ONLY TAKES OUT ROW NAMES STARTING WITH U
> #write.table(fit2,file="fit2.txt",sep="\t")
>
> #s0vss24<-topTable(fit2,coef="s0vss24",number=3434,adjust.method="BH",p.value=1,genelist=cbind(fit2$genes,MGLnew,stringsAsFactors = FALSE))
> #write.table(s0vss24,file="s0vss24.txt",sep="\t")
>
> #s24vss48<-topTable(fit2,coef="s24vss48",number=3434,adjust.method="BH",p.value=1)
> #write.table(s24vss48,file="s24vss48.txt",sep="\t")
>
> #s48vss96<-topTable(fit2,coef="s48vss96",number=3434,adjust.method="BH",p.value=1)
> #write.table(s48vss96,file="s48vss96.txt",sep="\t")
>
> #c0vsc24<-topTable(fit2,coef="c0vsc24",number=3434,adjust.method="BH",p.value=1)
> #write.table(c0vsc24,file="c0vsc24.txt",sep="\t")
>
> #c24vsc48<-topTable(fit2,coef="c24vsc48",number=3434,adjust.method="BH",p.value=1)
> #write.table(c24vsc48,file="c24vsc48.txt",sep="\t")
>
> #c48vsc96<-topTable(fit2,coef="c48vsc96",number=3434,adjust.method="BH",p.value=1)
> #write.table(c48vsc96,file="c48vsc96.txt",sep="\t")
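The "is not a defined class" error is what new() reports when the package that defines the class has not been attached. A minimal sketch of a script header that attaches the two packages this analysis relies on before anything else runs. The installed-package check and the message wording are illustrative assumptions, not part of the original script.

```r
# Packages whose classes and functions the analysis script relies on.
needed <- c("Biobase", "limma")

# requireNamespace() returns FALSE instead of stopping when a package is absent.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (all(installed)) {
  library(Biobase)  # defines AnnotatedDataFrame and ExpressionSet
  library(limma)    # defines readGAL, readTargets, lmFit, eBayes, topTable
  # The class is now visible, so new("AnnotatedDataFrame", ...) can find it.
  stopifnot(methods::isClass("AnnotatedDataFrame"))
} else {
  message("Install first: ", paste(needed[!installed], collapse = ", "))
}
```

Placing these calls at the very top matters when the file is run as a whole: as Sean notes, an error early in a script stops it, so later lines such as the read.table() call never execute.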

16.2 years ago • Sean Davis 21k
