Bob
Thanks for your reply.
I'm still a bit puzzled over the range of expression values returned
from RMA.
Using the following:
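library(affy)            # provides ReadAffy(), rma() and exprs()
my.affy <- ReadAffy()    # reads the CEL files in the working directory
my.eset <- rma(my.affy)
summary(exprs(my.eset[,1]))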
Returns:
Min. : 3.083
1st Qu.: 5.711
Median : 6.973
Mean : 7.144
3rd Qu.: 8.491
Max. :13.848
This shows the minimum expression value is 3.083 - but the tissue I'm
using cannot be expressing all of these genes (I'm hybridizing to the
U133A chip).
So, I guess there are two main questions:
* Should RMA only be used for comparative studies? What if someone
wanted to create a database of all genes expressed in tissue X? (not
that I'm doing this, but what if?) What I'd like to do is filter the
gene list so I can cut down on the number of tests in the multiple
testing routine (and hence get better numbers).
* What exactly is the expression value that RMA returns? I know it is
log2 transformed, but I don't understand what it corresponds to.
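To make the filtering idea in the first question concrete, here is a minimal sketch in base R. It uses simulated values in place of exprs(my.eset), and the filter (keep probe sets whose IQR across samples is above the median IQR) is only one arbitrary, assumed choice of nonspecific filter, not a recommended threshold. The last step illustrates the log2 scale mentioned in the second question: 2^x maps a value back to the intensity scale.

```r
# Sketch only: simulated log2 expression values stand in for exprs(my.eset).
set.seed(1)
n.probes  <- 1000   # hypothetical; a U133A chip has ~22k probe sets
n.samples <- 14
expr <- matrix(rnorm(n.probes * n.samples, mean = 7, sd = 0.2),
               nrow = n.probes,
               dimnames = list(paste0("probe", seq_len(n.probes)), NULL))
# Give the first 100 probe sets extra between-sample variation.
expr[1:100, ] <- expr[1:100, ] + rnorm(100 * n.samples, sd = 1.5)

# Nonspecific filter: keep the more variable half of the probe sets,
# without using any information about sample groups.
probe.iqr <- apply(expr, 1, IQR)
keep <- probe.iqr > median(probe.iqr)   # assumed, arbitrary cut-off
expr.filtered <- expr[keep, ]

# Values are on the log2 scale; 2^x returns to the intensity scale.
intensity.range <- 2^range(expr.filtered)

dim(expr.filtered)
```

Because the filter ignores the group labels, fewer hypotheses go into the multiple-testing step afterwards. Whether such a cut-off is appropriate for a given data set remains a judgment call.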
Sorry if these questions are answered somewhere - I've looked but
maybe not looked well enough.
Thanks in advance.
"Rafael A. Irizarry" <ririzarr@jhsph.edu> wrote:
hi! i don't know of any good references. in practice i don't like to
arbitrarily decide on such cut-offs. this could be very problematic if
you use MAS 5.0, but with other expression measures such as PM-only
Li-Wong and RMA you usually don't need this filtering step.
sorry i can't be of more help,
rafael
On Tue, 26 Aug 2003, Bob wrote:
> Hello,
> I have started using bioconductor (which is great, by
> the way), and I have a question regarding how to
> choose a minimum expression threshold.
>
> I have read in the Affymetrix cel files, calculated
> expression using rma(), and now have a data frame with
> ~22k expression values across 14 samples (using the
> U133A chip). There are expression values for each
> Affy spot - although it is probably not true that this
> tissue expresses all 22k genes. My question is how do
> I choose a threshold above which I consider the gene
> to be expressed?
>
> In addition (please correct me if I'm wrong), using
> only the number of expressed genes (or at least not
> all of the spots) will make for better values using
> the multtest package.
>
> Can someone point me in the right direction, or point
> out some good references on this topic?
>
> Thanks.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
Dear Dr. Huber,
I downloaded Affymetrix microarray data from the GEO database. After playing around in RStudio, I finally managed to get a CSV file from the CEL files. The file has the following columns: Affymetrix ID (sometimes duplicated, with suffixes such as s_at, x_at and a_at), logFC, AverageExpression, t, P.value, adj.p.val, B, ensembl-gene-id, gene biotype and external gene name. I do not understand what threshold I should use to get a list of upregulated and downregulated protein-coding genes from this table. Which column should I consider?
Thank you,
Amruta
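As a rough illustration of how such a table is often split, the sketch below filters a small made-up data frame in base R. The column names follow the output of limma's topTable() (logFC, adj.P.Val), which the columns described in the question resemble; gene_biotype, probe_id, and the cut-offs of 0.05 on the adjusted p-value and 1 on |logFC| are assumptions made for illustration, not fixed rules.

```r
# Hypothetical table mimicking the columns described above; all values are made up.
results <- data.frame(
  probe_id     = c("p1_at", "p2_s_at", "p3_x_at", "p4_at", "p5_at"),
  logFC        = c(2.1, -1.8, 0.3, 1.5, -2.4),
  adj.P.Val    = c(0.001, 0.020, 0.400, 0.200, 0.004),
  gene_biotype = c("protein_coding", "protein_coding", "protein_coding",
                   "protein_coding", "lincRNA"),
  stringsAsFactors = FALSE
)

alpha   <- 0.05  # assumed cut-off on the multiplicity-adjusted p-value
min.lfc <- 1     # assumed minimum absolute log2 fold change

# Restrict to protein-coding genes, then apply both cut-offs.
coding <- results[results$gene_biotype == "protein_coding", ]
significant <- coding[coding$adj.P.Val < alpha &
                      abs(coding$logFC) >= min.lfc, ]

# The sign of logFC gives the direction relative to the reference group.
up   <- significant[significant$logFC > 0, ]
down <- significant[significant$logFC < 0, ]

up$probe_id
down$probe_id
```

Which group counts as "up" depends on how the contrast was defined when the table was produced, so the sign of logFC should be checked against that design.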