Scott Ochsner
Thomas,
I wanted to assess the performance of random gene lists that do not
have any overlap with myCuratedList, hence the step to remove those
genes from the universe of possible genes prior to random gene
selection. If I leave the curated genes in, random lists could
potentially be produced with significant similarity to myCuratedList.
I'm interested in the chance occurrence of unique gene lists with
classification performance similar to that of myCuratedList. I
certainly have an open mind on this point if others can come up with
good reasons why this may be a bad idea.
Scott
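
For readers who want to try the procedure without an ExpressionSet at hand, here is a minimal sketch in base R. It assumes myCuratedList holds row indices (not probe names) and uses a simulated universe of 22277 features, which is simply 20277 + 2000 from the lengths quoted below; both are stand-ins, not the real data. The resample() helper is the one given in the examples of ?sample.

```r
set.seed(1)  # only so this illustration is reproducible

# Hypothetical stand-ins for the real data: 22277 features in total,
# 2000 of which form the curated list (stored as row indices).
nGenes <- 22277
myCuratedList <- sample.int(nGenes, 2000)

# resample() as defined in the examples of ?sample; unlike sample(),
# it behaves sensibly even when x has length 1.
resample <- function(x, ...) x[sample.int(length(x), ...)]

# Universe of candidate genes with the curated genes removed.
Index <- setdiff(seq_len(nGenes), myCuratedList)

# 1000 random lists of 2000 genes each; sampling is without
# replacement within a list, so no list repeats a gene.
randomMatrix <- replicate(1000, resample(Index, 2000))

dim(randomMatrix)                                # 2000 rows, 1000 columns
any(apply(randomMatrix, 2, anyDuplicated) > 0)   # FALSE: no repeats within a list
any(randomMatrix %in% myCuratedList)             # FALSE: no overlap with curated genes
```

Each column of randomMatrix is one random gene list. Genes can still recur across columns, since the lists are drawn independently of each other.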
________________________________
From: Thomas Hampton [mailto:Thomas.H.Hampton@Dartmouth.EDU]
Sent: Wed 9/10/2008 3:40 PM
To: Ochsner, Scott A
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] Generating random gene lists: does sample/resample
generate random sets
I would not have taken the curated list out. That strikes me as
a significant bias. Am I missing something?
Tom
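
One way to put numbers on this question is to ask how much overlap a random list would have with the curated list if the curated genes were left in the universe. The sketch below treats the overlap as hypergeometric and assumes a universe of 22277 genes (20277 + 2000, inferred from the lengths quoted below, assuming every curated gene is on the array). It is a back-of-the-envelope check, not part of the original analysis.

```r
# Assumed sizes, inferred from the lengths quoted in the original post.
N <- 22277   # universe size if the curated genes are left in
m <- 2000    # size of the curated list
k <- 2000    # size of each random list

# Expected number of curated genes in one random list (hypergeometric mean).
expectedOverlap <- k * m / N
expectedOverlap          # about 180 genes, i.e. roughly 9% of a list

# Probability that a random list shares at least half of its genes
# (1000 or more) with the curated list: P[X > 999].
pHalf <- phyper(999, m, N - m, k, lower.tail = FALSE)
pHalf                    # effectively zero
```

Under these assumptions a typical random list would share about 9% of its genes with the curated list, and lists that are mostly curated genes would essentially never arise by chance.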
On Sep 10, 2008, at 4:03 PM, Ochsner, Scott A wrote:
> Dear BioC,
>
> I would like feedback as to the appropriateness of the following
> procedure to produce a set of 1000 random gene lists, each list of
> length 2000. The idea is to use the set of random gene lists to
> assess how often random gene lists of size x can reproduce or
> improve the classification performance of
> myCuratedList.
>
>
> #remove myCuratedList from the universe of possible genes. The
> "eset" object is your standard ExpressionSet object.
>> length(myCuratedList)
> [1] 2000
>> Index<-setdiff(1:length(rownames(exprs(eset))),myCuratedList)
>> length(Index)
> [1] 20277
> #generate 1000 random gene lists using the genes in Index. The
> code for resample is taken from the help pages for sample.
>
>> randomMatrix<-replicate(1000,resample(Index,2000))
>> dim(randomMatrix)
> [1] 2000 1000
>
>
> I've verified that each column does not contain repeated genes as
> should be the case with resample without replacement.
>
> Is there a standard procedure for doing the above or is what I've
> done kosher?
>
>
> Scott A. Ochsner, Ph.D.
> NURSA Bioinformatics
> Molecular and Cellular Biology
> Baylor College of Medicine
> Houston, TX. 77030
> phone: 713-798-6227
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor