I also have a data set with differing numbers of spot replications. I
used lme to analyze these data, gene by gene.
Basically, I wrote a little function that pulls the spot information
out of the array, removes the flagged spots and does other data
cleaning, and then runs lme (using "try" in case it bombs). Then I use
"split" to split the array data by geneID, and lapply to apply the
function to every gene.
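
A minimal sketch of that workflow, for illustration only. The data frame
"spots", its column names (geneID, slide, treatment, logratio, flagged), the
simulated values, and the model formula are all assumptions made up for this
example; the real cleaning steps and model depend on the experiment.

```r
library(nlme)  # lme() comes from nlme, a recommended package shipped with R

# Hypothetical long-format spot data: one row per spot per slide.
set.seed(1)
spots <- expand.grid(slide = factor(1:6), rep = 1:4,
                     geneID = c("g1", "g2", "g3"),
                     stringsAsFactors = FALSE)
# g2 is printed only twice per slide, so replication differs between genes.
spots <- spots[!(spots$geneID == "g2" & spots$rep > 2), ]
spots$treatment <- factor(ifelse(as.integer(spots$slide) <= 3, "ctl", "trt"))
slide_effect <- rnorm(6, sd = 0.3)
spots$logratio <- rnorm(nrow(spots), sd = 0.2) +
  slide_effect[as.integer(spots$slide)] +
  ifelse(spots$geneID == "g1" & spots$treatment == "trt", 1, 0)
spots$flagged <- runif(nrow(spots)) < 0.1

# Clean one gene's spots and fit a mixed model; return NULL if lme fails.
fit_one_gene <- function(d) {
  d <- d[!d$flagged & is.finite(d$logratio), ]  # drop flagged or bad spots
  # Assumed model: fixed treatment effect, random intercept per slide.
  fit <- try(lme(logratio ~ treatment, random = ~ 1 | slide, data = d),
             silent = TRUE)
  if (inherits(fit, "try-error")) return(NULL)
  summary(fit)$tTable["treatmenttrt", ]  # estimate, SE, DF, t and p value
}

by_gene <- split(spots, spots$geneID)      # one data frame per gene
results <- lapply(by_gene, fit_one_gene)   # gene-by-gene fits
```

Because flagged spots are simply dropped before the fit, each gene can keep
whatever number of replicate spots survives cleaning. Genes for which lme
fails come back as NULL and can be filtered out afterwards.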
Is this slow? Yes. But once it is tested I just get it started on
Friday at 5, and by Monday at 9 I have my results.
The major drawback is that I am doing a gene by gene ANOVA. The major
advantage is that I can safely remove flagged spots, instead of trying
to fudge in some values to maintain the balance.
--Naomi Altman
At 11:40 PM 10/16/2003, Gordon Smyth wrote:
>At 11:53 PM 16/10/2003, Jason Skelton wrote:
>>Gordon Smyth wrote:
>>>
>>>I would use the limma commands lmFit (or lm.series or gls.series)
>>>followed by makeContrasts, eBayes and classifyTests. See the
>>>earlier posts:
>>Thanks for this information, Gordon. I'll try this and see what
>>results I get...
>>
>>On a different note
>>The arrays I have tested LIMMA on have 2 duplicates, spaced evenly
>>throughout the array, and so I have no problems running your functions.
>>
>>Someone else at the Sanger Institute would like to be able to use
>>LIMMA, but the number of duplicates for each gene differs on their
>>array, e.g. for some genes there are two copies and for others there
>>would be four copies or more, which in turn obviously affects spacing
>>etc. between replicates.
>>I'm not sure why they would want differing numbers of copies of
>>genes, but they would like to be able to estimate the correlation
>>between these genes anyway and obviously see the results as one data
>>point per merged gene.
>
>I haven't implemented this in limma because it seems to me that it
>might invalidate the assumptions behind the duplicate correlation
>approach. See the earlier post:
>
>https://stat.ethz.ch/pipermail/bioconductor/2003-August/002224.html
>
>>I've tried to think of how this can be done but it seems overly
>>complex and I'm not sure if it is at all possible in R or Limma.
>>
>>I'm guessing there is no way of carrying out the correlation, series
>>model fits etc. based simply on the "Name" specified in the GAL files?
>
>No.
>
>Cheers
>Gordon
>
>>or somehow specifying the duplicate number for each gene separately
>>and somehow merging this information for use as a parameter?
>>
>>I'm doubting very much that this can be done at all but it's worth
>>asking ;-)
>>
>>thanks
>>
>>Jason
>
Naomi S. Altman                  814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics              814-863-7114 (fax)
Penn State University            814-865-1348 (Statistics)
University Park, PA 16802-2111
