On Tue, Feb 17, 2009 at 9:34 PM, Sally <sagoldes@shaw.ca> wrote:
> Hi Sean,
>
> Thanks so much for the help! Unfortunately...
>
> Using:
> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=FALSE,quote="",comment.char="")
>
>
Hi, Sally.
Please keep replies on the list so that we can all learn from the questions
and answers.
If you run this by itself, do you get what you expect?
> This script does not work. No MGLnew is created.
> Produces the following error message (refers to another part of the
> script):
> Error in getClass(Class, where = topenv(parent.frame())) :
> "AnnotatedDataFrame" is not a defined class
>
>
Did you forget to load Biobase? If you are running this as a "script", then
the error probably comes before the read.table(), so the read.table() will
never run.
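
To illustrate the point about scripts: when a file is run with source(), evaluation stops at the first error, so later assignments never happen. This is a minimal sketch, not the real script. The class name "NoSuchClassForDemo" is made up to trigger the same kind of error as an unloaded Biobase would. In the real script the likely fix is a library(Biobase) call (and library(limma)) at the top.

```r
library(methods)

# Two-line stand-in for the real script: the first line fails in the same
# way as new("AnnotatedDataFrame", ...) does when Biobase is not loaded.
# "NoSuchClassForDemo" is a made-up class name used only for illustration.
script <- c(
  'adf <- new("NoSuchClassForDemo", data = data.frame())',
  'MGLnew <- "this assignment is never reached"'
)
script_file <- tempfile(fileext = ".R")
writeLines(script, script_file)

# Run the script in its own environment and capture the error message.
env <- new.env()
msg <- tryCatch(
  {
    source(script_file, local = env)
    "script completed"
  },
  error = function(e) conditionMessage(e)
)

msg                                              # error message from line 1
exists("MGLnew", envir = env, inherits = FALSE)  # FALSE: line 2 never ran
```
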
Sean
> Here is my entire post-preprocessing script:
>
> exprdata<-read.table("exprsData.txt",header=TRUE,sep="\t",row.names=1,
>   as.is=TRUE,fill=TRUE)
> class(exprdata)
> #[1] "data.frame"
> dim(exprdata)
> #[1] 17328 28
> colnames(exprdata)
> head(exprdata)
> #printout too long to paste
>
> phenotypicdata<-read.table("phenotypicdata.txt",row.names=1,header=TRUE,sep="\t")
> class(phenotypicdata)
> #returns: [1] "data.frame"
> dim(phenotypicdata)
> #returns: [1] 28 2
> colnames(phenotypicdata)
> #returns: [1] "Species" "Time"
> rownames(phenotypicdata)
> #Coerce exprdata into a matrix
> myexprdata<-as.matrix(exprdata)
> write.table(myexprdata,file="myexprdata.txt",sep="\t",col.names=NA)
> class(myexprdata)
> #[1] "matrix"
> rownames(myexprdata)
> colnames(myexprdata)
> #Coerce phenotypicdata into a data frame
> myphenotypicdata<-as.data.frame(phenotypicdata)
>
> write.table(myphenotypicdata,file="myphenotypicdatacheck.txt",sep="\t",col.names=NA)
> rownames(myphenotypicdata)
> colnames(myphenotypicdata)
> #[1] "Species" "Time"
> summary(myphenotypicdata)
> all(rownames(myphenotypicdata)==colnames(myexprdata))
> #[1] TRUE
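
The all(... == ...) check above only passes when the sample names are in the same order in both objects. If they ever differ in order, match() can realign them before the ExpressionSet is built. A small sketch with invented array names (a1, a2, a3) and toy values, not the real data:

```r
# Toy stand-ins for myexprdata (genes x arrays) and myphenotypicdata
# (arrays x covariates); the array names here are invented.
expr <- matrix(1:6, nrow = 2,
               dimnames = list(c("gene1", "gene2"), c("a1", "a2", "a3")))
pheno <- data.frame(Species = c("c", "s", "s"), Time = c(24, 0, 0),
                    row.names = c("a3", "a1", "a2"))

# Same names but a different order: the element-wise comparison is FALSE.
all(rownames(pheno) == colnames(expr))

# Reorder the phenotype rows to follow the expression columns.
idx <- match(colnames(expr), rownames(pheno))
stopifnot(!anyNA(idx))          # every array must have a phenotype row
pheno <- pheno[idx, , drop = FALSE]

identical(rownames(pheno), colnames(expr))  # TRUE after reordering
```
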
> #Create annotated Data Frame
> adf<-new("AnnotatedDataFrame",data=phenotypicdata)
> #dim means: dimension of an object.
> dim(adf)
> #rowNames columnNames
> # 28 2
> rownames(adf)
> #NULL
> #read in galfile
> readGAL("Galfile.gal")
> #Create eset object
>
>
> eSet<-new("ExpressionSet",exprs=myexprdata,phenoData=adf,annotation="Galfile.gal")
>
> #Read in targets file
> targets <- readTargets("targets.txt")
> targets
> # Set up character list defining your arrays, include replicates
> TS <- paste(targets$Species, targets$Time, sep=".")
> #This script returns the following:
> TS
> # Turn TS into a factor variable which facilitates fitting
> TS <- factor(TS)
> #This script returns the following
> design <- model.matrix(~0+TS)
> #write design object to text file
> write.table(design,file="design.txt",sep="\t",col.names=NA)
> colnames(design) <- levels(TS)
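
For reference, here is what the design-matrix steps produce on a small, hypothetical targets table (two species, two time points, two replicates each; the real targets.txt has more). Note that model.matrix() prefixes the column names with "TS" until they are replaced by the factor levels, so a design.txt written before the rename carries the prefixed names.

```r
# Hypothetical targets table: two species (c, s) at two time points,
# two replicate arrays each. The real targets.txt has more rows and times.
targets <- data.frame(Species = rep(c("c", "s"), each = 4),
                      Time    = rep(c(0, 0, 24, 24), times = 2))

TS <- factor(paste(targets$Species, targets$Time, sep = "."))
design <- model.matrix(~ 0 + TS)

colnames(design)            # "TSc.0" "TSc.24" "TSs.0" "TSs.24"
colnames(design) <- levels(TS)
colnames(design)            # "c.0" "c.24" "s.0" "s.24"

# Each array belongs to exactly one group, so every row sums to 1.
rowSums(design)
```

The renamed columns are what the names used in makeContrasts (s.0, s.24, and so on) have to match.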
> #for eset put in your M values - see ?lmFit for object types
> fit <- lmFit(eSet, design)
> cont.matrix<-makeContrasts(s0vss24=s.0-s.24, s24vss48=s.24-s.48,
> s48vss96=s.48-s.96, c0vsc24=c.0-c.24, c24vsc48=c.24-c.48,
> c48vsc96=c.48-c.96, levels=design)
>
> write.table(cont.matrix,file="cont.matrix.txt",sep="\t",col.names=NA)
> # estimate the contrasts and put in fit2
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
> #END OF STATISTICAL ANALYSIS
>
> write.table(fit2, "fit2.txt",sep="\t") #WORKS
> #fit2<-read.table(fit2, file="fit2.txt",sep="\t") #WORKS
>
> #read in mastergenelist (MGL)
>
>
> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=FALSE,quote="",comment.char="")
> #WORKS
> write.table(MGLnew,file="MGLnew.txt",sep="\t") #works
>
> #combine fit2$genes with MGLnew so that annotations show alongside IDs
>
>
> fit2genes<-cbind(fit2$genes, MGLnew)
> write.table(fit2genes,file="fit2genes.txt",sep="\t")
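
One thing to keep in mind with cbind(): it pairs rows by position, so it only gives the right answer if the annotation table has the same number of rows, in the same order, as the fit. A hedged sketch of aligning by ID first, using invented IDs and symbols; 'ids' stands in for the probe IDs of the fit and 'annot' for MGLnew:

```r
# Toy stand-ins: 'ids' plays the role of the probe IDs stored with the fit,
# 'annot' the role of MGLnew. All IDs and symbols here are invented.
ids <- c("CB000003", "CB000001", "CB000002")
annot <- data.frame(well   = c("CB000001", "CB000002", "CB000003", "CB000004"),
                    symbol = c("aaa1", NA, "ccc3", "ddd4"),
                    stringsAsFactors = FALSE)

# cbind() would pair rows by position and needs equal row counts;
# match() looks up each ID explicitly instead.
idx <- match(ids, annot$well)
aligned <- annot[idx, , drop = FALSE]
rownames(aligned) <- NULL

nrow(aligned) == length(ids)      # TRUE
identical(aligned$well, ids)      # TRUE
aligned$symbol                    # "ccc3" "aaa1" NA
```

A table aligned this way could then be combined with fit2$genes or passed as the genelist argument of topTable.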
>
> #this script removes all rows from fit2 where the F p-value is greater than 0.0001 WORKS
>
> #> eb <- eb[eb$F.p.value < 0.1,] # r-help suggest
> #fit2<- fit2[fit2$F.p.value < 0.0001,] #WORKS
>
>
>
> #remove empty rows
>
> #fit2<-fit2[-grep('^empty',row.names(fit2)),] # WORKS
>
>
>
> #remove rows which have a U (GFP) WORKS
>
> #fit2<-fit2[-grep('^U',row.names(fit2)),] #WORKS ONLY TAKES OUT ROW NAMES STARTING WITH U
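
A caution on the -grep() idiom used in these filters: it works as long as the pattern matches at least one row, but if nothing matches, grep() returns integer(0) and the negative index selects no rows at all. Negating grepl() avoids that. A sketch on a toy matrix whose row names are borrowed from the gene list excerpt (the values are invented):

```r
# Toy matrix with row names in the style of the gene list (invented values).
m <- matrix(1:8, ncol = 2,
            dimnames = list(c("U179971039", "empty1", "CA054869", "CB490276"),
                            c("x", "y")))

# Negative indexing works when something matches ...
nrow(m[-grep("^empty", rownames(m)), , drop = FALSE])   # 3

# ... but when nothing matches, grep() returns integer(0) and
# m[-integer(0), ] selects zero rows instead of all of them.
nrow(m[-grep("^nomatch", rownames(m)), , drop = FALSE]) # 0

# grepl() returns a logical vector, so negating it is safe in both cases.
keep <- !grepl("^(empty|U)", rownames(m))
rownames(m[keep, , drop = FALSE])                       # "CA054869" "CB490276"
```
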
> #write.table(fit2,file="fit2.txt",sep="\t")
>
>
> #s0vss24<-topTable(fit2,coef="s0vss24",number=3434,adjust.method="BH",p.value=1,
> #  genelist=cbind(fit2$genes,MGLnew,stringsAsFactors=FALSE))
> #write.table(s0vss24,file="s0vss24.txt",sep="\t")
>
>
>
> #s24vss48<-topTable(fit2,coef="s24vss48",number=3434,adjust.method="BH",p.value=1)
> #write.table(s24vss48,file="s24vss48.txt",sep="\t")
>
>
>
> #s48vss96<-topTable(fit2,coef="s48vss96",number=3434,adjust.method="BH",p.value=1)
> #write.table(s48vss96,file="s48vss96.txt",sep="\t")
>
>
>
> #c0vsc24<-topTable(fit2,coef="c0vsc24",number=3434,adjust.method="BH",p.value=1)
> #write.table(c0vsc24,file="c0vsc24.txt",sep="\t")
>
>
>
> #c24vsc48<-topTable(fit2,coef="c24vsc48",number=3434,adjust.method="BH",p.value=1)
> #write.table(c24vsc48,file="c24vsc48.txt",sep="\t")
>
>
>
> #c48vsc96<-topTable(fit2,coef="c48vsc96",number=3434,adjust.method="BH",p.value=1)
> #write.table(c48vsc96,file="c48vsc96.txt",sep="\t")
>
>
>
>
>
>
>
> ----- Original Message -----
> *From:* Sean Davis <seandavi@gmail.com>
> *To:* Sally <sagoldes@shaw.ca>
> *Cc:* bioconductor@stat.math.ethz.ch
> *Sent:* Tuesday, February 17, 2009 4:03 PM
> *Subject:* Re: [BioC] Adding annotations to fit2 - still having problems
>
>
>
> On Tue, Feb 17, 2009 at 6:57 PM, Sally <sagoldes@shaw.ca> wrote:
>
>> Hello,
>>
>> I am using Limma for microarray analysis. Ultimately I want to
>> add/combine/append (not sure which word is correct) a file named MGL.txt
>> (master gene list) to fit2 so that when topTable prints, it will have the
>> annotation information for each gene. The first few lines of MGL.txt look
>> like this (there are blank cells/missing data). There are 17,328 rows and
>> 6 columns in this file.
>>
>> well accession bitscore evalue symbol
>> U179971039 GFP
>> empty1 Blank Well
>> empty2 Blank Well
>> empty3 Blank Well
>> empty4 Blank Well
>> empty5 Blank Well
>> empty6 Blank Well
>> empty7 Blank Well
>> empty8 Blank Well
>> CA054869 UNKNOWN
>> CB490276 G2/mitotic-specific cyclin-B2 Q60FX9 524.24 3.58E-148 ccnb2
>> CA769480 Endothelial differentiation-related factor 1 homolog Q6PBY3 214.16 8.45E-55 edf1
>>
>>
>>
>> MGL.txt was read in with:
>>
>> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE,row.names=1)
>> When MGLnew is printed on screen it looks like this (first few lines)
>>
>> accession bitscore evalue symbol
>> U179971039 <NA> NA NA <NA>
>> empty1 <NA> NA NA <NA>
>> empty2 <NA> NA NA <NA>
>> empty3 <NA> NA NA <NA>
>> empty4 <NA> NA NA <NA>
>> empty5 <NA> NA NA <NA>
>> empty6 <NA> NA NA <NA>
>> empty7 <NA> NA NA <NA>
>> empty8 <NA> NA NA <NA>
>> CA054869 <NA> NA NA <NA>
>> CB490276 Q60FX9 524.24 3.58e-148 ccnb2
>> CA769480 Q6PBY3 214.16 8.45e-55 edf1
>>
>>
>> When I then check the number of rows using nrow(MGLnew), it says there
>> are 3312 rows, even though the original file had 17,328 rows.
>> Also, it doesn't include the column "well" in MGLnew.
>>
>>
>> Why are these two things happening?
>>
>
> It looks like you might want to set row.names=FALSE. Does that fix the
> column name problem?
>
> As for the other, try setting quote="",comment.char="" in the read.table
> call and see if that helps. Sometimes a character in the file ends up
> causing R to parse the line in a manner that you didn't expect.
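
A minimal sketch of that suggestion on a made-up, four-line stand-in for MGL.txt (the "description" column name, the accessions and the symbols are all invented). It turns off quote and comment processing so that an apostrophe or a "#" inside a field is read literally. The read.table help page gives row.names=NULL as the way to force plain row numbering, so NULL is used here. Comparing nrow() with the number of lines in the file, and tabulating count.fields(), are quick ways to see whether lines are being merged or split.

```r
# Hypothetical miniature of MGL.txt: tab-separated, with blank cells,
# an apostrophe and a '#' inside the (invented) description field.
lines <- c(
  "well\tdescription\taccession\tbitscore\tevalue\tsymbol",
  "empty1\tBlank Well\t\t\t\t",
  "CA000001\t5'-nucleotidase\tP00001\t100.5\t1e-20\tnt5e",
  "CA000002\tprotein #2\tP00002\t200.25\t1e-40\tprt2"
)
f <- tempfile(fileext = ".txt")
writeLines(lines, f)

# quote = "" and comment.char = "" make read.table treat ' " and # as
# ordinary characters; row.names = NULL keeps "well" as a regular column.
MGLdemo <- read.table(f, sep = "\t", header = TRUE, na.strings = "",
                      fill = TRUE, row.names = NULL,
                      quote = "", comment.char = "",
                      stringsAsFactors = FALSE)

# One data row per line of the file (minus the header line).
nrow(MGLdemo) == length(readLines(f)) - 1
"well" %in% colnames(MGLdemo)

# Number of fields found on each line; anything unexpected shows up here.
table(count.fields(f, sep = "\t", quote = "", comment.char = ""))
```
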
>
> Sean
>
>
>
>>
>>
>>
>>
>> ----- Original Message -----
>> From: "john seers (IFR)" <john.seers@bbsrc.ac.uk>
>> To: "Sally" <sagoldes@shaw.ca>; <bioconductor@stat.math.ethz.ch>
>> Sent: Tuesday, February 17, 2009 8:35 AM
>> Subject: RE: [BioC] Adding annotations to fit2
>>
>>
>>
>>
>> Hi Sally
>>
>> Is fit2 by any chance the output from limma fitting a linear model? If
>> not, this is not relevant.
>>
>> If it is, I think you may be able to do something like
>>
>> fit2$genes<-cbind(fit2$genes, MGLnew)
>>
>> If that does not work, and you are using topTable a bit later, you can do
>> something like this:
>>
>> tt<-topTable(eb, number=ngenes, genelist=cbind(eb$genes, MGLnew))
>>
>>
>> Regards
>> John
>>
>>
>> ---
>>
>> -----Original Message-----
>> From: bioconductor-bounces@stat.math.ethz.ch
>> [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Sally
>> Sent: 17 February 2009 03:35
>> To: bioconductor@stat.math.ethz.ch
>> Cc: Sally
>> Subject: [BioC] Adding annotations to fit2
>>
>> I want to merge fit2 with a txt file I'll call MGL (master gene list),
>> which contains gene ID information. I am using a custom cDNA array. The
>> reason is that I want the gene ID information alongside the row names
>> (which are the accession IDs). Both have identical row names. MGL has
>> missing data.
>>
>> I have tried:
>>
>> MGLnew<-read.table(file="MGL.txt",sep="\t",header=TRUE,na.strings="",fill=TRUE) #WORKS
>> write.table(MGLnew,file="MGLnew.txt",sep="\t") #WORKS
>> fit2<-merge(fit2,MGLnew,by="row.names") #NOT WORKING
>>
>> When I run this I get the following error message:
>>
>> Error in dim(data) <- dim : attempt to set an attribute on NULL
>>
>> What does this error message mean? How do I fix the problem?
>>
>> Sally Goldes
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>
>