This is probably a general R question, but I couldn't find anything useful. I
found all sorts of material on filtering with functions based on the values
within the matrix, but nothing like this.
I have a list of genes in a file that I want to look at. How can I filter my
matrix of genes to match the ones in the list?
gene_list.tab with 250 genes:
probe{tab}description
affy_blah1{tab}affy gene of interest 1
affy_blah2{tab}affy gene of interest 2
..
dim(my.metric)
[1] 22625 11
mmfun <- function() # to filter
ffun <- filterfun(mmfun)
my.fmetric <- genefilter(my.metric,ffun)
dim(my.fmetric) ## This should give 250 and 11
I would use the %in% function. This assumes that your matrix of gene
values has the gene names appended somehow (row.names, or the first
column). Since you are doing affy stuff, the easiest way is to use the
exprSet holding your data.
index <- gene_list.tab[,1] %in% geneNames(eset)
-or-
index <- gene_list.tab[,1] %in% row.names(my.metric)
Then subset using the index.
subset.data <- my.metric[index,]
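A minimal sketch of the %in% approach on toy data, assuming the probe IDs are the row names of the matrix. The toy matrix, the sample names and the "not_on_chip" entry are made up for illustration. Note that the logical index needs one entry per matrix row, so the sketch tests the row names against the list rather than the list against the row names.

```r
# Toy stand-in for my.metric: 6 probes x 3 samples (values are made up).
my.metric <- matrix(seq_len(18), nrow = 6,
                    dimnames = list(paste0("affy_blah", 1:6),
                                    paste0("sample", 1:3)))

# Toy stand-in for gene_list.tab, e.g. as returned by read.delim().
gene_list.tab <- data.frame(
  probe = c("affy_blah2", "affy_blah5", "not_on_chip"),
  description = c("gene of interest 2", "gene of interest 5", "absent"),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE per matrix row: is this row's probe ID in the list?
index <- rownames(my.metric) %in% gene_list.tab[, 1]

# drop = FALSE keeps a matrix even if only one row matches.
subset.data <- my.metric[index, , drop = FALSE]
dim(subset.data)

# Probes in the list that are missing from the matrix, worth checking
# when the result has fewer rows than the list has genes.
setdiff(gene_list.tab[, 1], rownames(my.metric))
```

If the rows should come back in the order of the gene list instead of the order of the matrix, match() on the same two vectors is the usual alternative.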
HTH,
Jim
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
>>> <mcolosim@brandeis.edu> 08/18/04 11:29AM >>>
_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Jim,
Thanks for the hint about %in%. Where did you find this function? I couldn't
find anything about it.
Also, it works the other way:
index <- geneNames(eset) %in% gene_list.tab[,1]
Marc
mcolosim@brandeis.edu writes:
> Jim,
>
> Thanks for the hint about %in%, where did you find this function? I couldn't
> find anything about it.
?%in%
should provide the help page. It does for me (though under Emacs).
(i.e. help("%in%") )
Interesting that help.search("in") isn't too useful (in fact, it seems
to miss it).
best,
-tony
--
Anthony Rossini Research Associate Professor
rossini@u.washington.edu
http://www.analytics.washington.edu/
Biomedical and Health Informatics University of Washington
Biostatistics, SCHARP/HVTN Fred Hutchinson Cancer Research
Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email
I'm using an old version of R (1.8.1) and ?%in% doesn't work. However,
help("%in%") does. I know it is time to update everything.
Marc
Marc,
I think finding the function you are looking for is usually more of an
art than a science. I have no idea how I found %in%, but my usual method
for finding functions that I think probably exist goes like this:
1.) help.search("something that I think might be a reasonable name for
the function")
2.) Google it to within an inch of its life ;-D. Usually I prepend an R
to the Google search to limit the results to actual R functions. There
are also search pages on www.r-project.org and www.bioconductor.org
that will search the mailing list archives.
3.) Look at the code of functions that I already know might do something
similar and see how they do it.
By this time I have usually found what I am looking for, plus a bunch
of other stuff that may come in handy in the future. However, if I am
still hitting a wall, I ask on either the BioC or R-help listserv.
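A small sketch of two of these habits in base R. It uses apropos(), which searches object names and is a companion to help.search() rather than the call named in step 1. For step 3 it relies on the fact that printing a function shows its source. The search patterns are only examples.

```r
# Search the names of all visible objects with a regular expression.
# Handy when you can guess part of the name.
apropos("^match")

# Operators are ordinary objects too, so a name search finds them.
hits <- apropos("%in%")
hits

# Step 3: print a function to read how it is implemented. Backquotes
# are needed because %in% is not a syntactic name.
print(`%in%`)
```

Reading the printed definition shows that %in% is a thin wrapper around match(), which is also the help topic where it is documented.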
Best,
Jim
>>> <mcolosim@brandeis.edu> 08/18/04 01:25PM >>>
Dear group,
I ran SAM and t-test analyses and obtained p-values. The
results look like this:
T-test Values:
> Gli_X0_X1_pvals[1:5]
100_g_at 1000_at 1001_at 1002_f_at 1003_s_at
0.80033009 0.31943016 0.33078591 0.05216239 0.08957325
Fold change(Avg. Diff):
> MyBrain_X0_X1_Exp_FCs[1:5]
100_g_at 1000_at 1001_at 1002_f_at 1003_s_at
1.0176023 0.9274588 0.8752550 1.1056984 1.1096023
My questions:
1. Using the annotation package, how can I convert the probe
IDs to gene names? How do I put the gene name in
place of 100_g_at?
2. How can I select the p-values from the t-test that
are between 0 and 0.01 (i.e. less than 0.01)?
3. How can I write the values into a table with 3
column names:
Gene, P-value, Fold change
I am doing this for the first time. Please help me.
Thank you.
Regards,
PS
See comments below.
On Wed, 2004-08-18 at 18:54, S Peri wrote:
> My question:
>
> Using annotation package how can I convert the probe
> ID's to Gene names. how do i incorporate gene name in
> place of 100_g_at?
>
There are annotation packages in Bioconductor. But if you want a
quick and dirty solution, get the CDF file from Affymetrix and
merge it in Excel. It will have all the information you need
but may be slightly outdated.
Can anyone on the list comment on the merits of doing this
versus using the Bioconductor annotation packages?
> 2. How can I choose/filter P-values from T-test that
> are less than 0.01 to 0 ?
You can write this yourself with a few ifelse(), which(), subset()
commands.
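A minimal sketch of that filtering step, assuming the p-values are held in a numeric vector named by probe ID as in the printout above. The values here are made up so that some fall below the cutoff.

```r
# Made-up p-values named by probe ID (not the poster's real results).
pvals <- c("100_g_at" = 0.80033, "1000_at" = 0.00420, "1001_at" = 0.33079,
           "1002_f_at" = 0.00051, "1003_s_at" = 0.08957)

# Logical indexing keeps the probe IDs as names of the result.
sig <- pvals[pvals < 0.01]
names(sig)

# which() returns the positions instead and silently drops NA p-values,
# which is convenient for indexing other vectors in the same order.
sig.pos <- which(pvals < 0.01)
sig.pos
```

The same positions or names can then be used to pull the matching fold changes, provided both vectors are in the same probe order or are indexed by name.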
> 3. How can write the values into a table with 3
> colnames:
> Gene, P-value, Fold change
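# genename, pvalue and foldchange must be equal-length vectors in the same probe order
mat <- cbind( genename, pvalue, foldchange )
# row.names=FALSE keeps the header aligned with the three columns
write.table(mat, file="aaa.txt", sep="\t", quote=FALSE, row.names=FALSE)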
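A sketch of the same table built as a data frame, which keeps the numeric columns numeric and lets the vectors be aligned by probe ID before they are combined. The gene names below are hypothetical placeholders, the numbers are made up, and the output goes to a temporary file rather than a real path.

```r
# Made-up inputs named by probe ID; fcs is deliberately in another order.
pvals <- c("100_g_at" = 0.80033, "1000_at" = 0.00420, "1001_at" = 0.33079)
fcs   <- c("1001_at" = 0.87526, "100_g_at" = 1.01760, "1000_at" = 0.92746)
# Hypothetical gene names standing in for a real annotation lookup.
genes <- c("100_g_at" = "GENE_A", "1000_at" = "GENE_B", "1001_at" = "GENE_C")

# Index every vector by the same IDs so the rows line up.
ids <- names(pvals)
tab <- data.frame(Gene = genes[ids],
                  P.value = pvals[ids],
                  Fold.change = fcs[ids],
                  stringsAsFactors = FALSE)

# Write a tab-delimited file with a three-column header.
out <- tempfile(fileext = ".txt")
write.table(tab, file = out, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout.
back <- read.delim(out, stringsAsFactors = FALSE)
back
```

The column names use dots because data.frame() and read.delim() make names syntactic by default; headers such as "P-value" would need check.names = FALSE on both sides.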
On Aug 19, 2004, at 6:36 AM, Adaikalavan Ramasamy wrote:
> See comments below.
>
> On Wed, 2004-08-18 at 18:54, S Peri wrote:
>>
>> Using annotation package how can I convert the probe
>> ID's to Gene names. how do i incorporate gene name in
>> place of 100_g_at?
>>
>
> There are annotation packages in BioConductor. But if you want a
> quick and dirty solution, get the CDF file from affymetrix and
> merge it in excel. These will have all the information you need
> but may be slightly outdated.
>
> Can anyone on the list comment the merits of doing this
> versus using the BioConductor annotation package ?
>
The annotation packages contain much more information than just the
gene name (like gene ontology, homologous genes, etc.). If one has a
vector of affy IDs from, for example, the hgu95av2 chip, getting the
gene symbol is as simple as:
getSYMBOL(myaffyids,"hgu95av2")
See ?getSYMBOL for more help on getting ids of various types from the
affy identifiers.
Also, many of the other packages (GOstats, ontoTools, etc.) make
extensive use of the annotation packages. So while the "quick and
dirty" approach will give you simple information, learning to use the
annotation packages pays off if you are going to post-process results
in R. If all you need is a gene name, either way works (but I still
think the annotation package is the more robust solution).
Sean
Dear group,
I have a list of genes (say ~120) from a pathway. Can I
use 'genefilter' functions or any other function to
pick out the fold-change values, p-values and LocusIDs for
only those genes from the output table that
I created using write.table?
Thank you all for your valuable suggestions on my
previous query about the annotate package and writing the
output to a table (REF: Annotate Package: How do I get
the gene names and how do I write my matrix). I got
things working on my desk. My mistake was to
increment the loop variable myself even though I was
already using a 'for' loop.
Eg: for (i in x){
y <- do something
i = i+1
}
I realized later that i = i+1 is not needed.
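A short sketch of why the manual increment is unnecessary: an R for loop assigns the next element of the vector to the loop variable at the start of every pass, so changing it inside the body does not affect which elements are visited. The vector and the variable names are made up.

```r
x <- c(10, 20, 30)
seen <- numeric(0)

for (i in x) {
  seen <- c(seen, i)  # record the value the loop handed us
  i <- i + 1          # only changes i for the rest of this pass
}

# Every element was visited exactly once, in order.
seen
```

After the loop, i still holds the last value assigned in the body, which is the only lasting effect of the extra line.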
Thank you
PS
mcolosim@brandeis.edu writes:
> I'm using an old version of R (1.8.1) and ?%in% doesn't work. However,
> help("%in%") does. I know it is time to update everything.
?"%in%" might be the right thing. Emacs takes care of that for me.
best,
-tony
?%in% won't work with any version of R (except it appears to work under
Emacs - more Rossini magic, I assume?).
At the R prompt you have to use ?"%in%"
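A brief sketch of the quoting rule, using only base R. The help object is assigned instead of printed so that the example does not open a pager when run as a script.

```r
# A bare ?%in% does not parse at the R prompt, so the operator name
# has to be quoted. help("%in%") is the function form of ?"%in%".
h <- help("%in%")

# h records where the help file was found; printing h displays the page.
class(h)

# Backquotes turn the operator into an ordinary function call.
`%in%`(2, 1:3)
```

The same quoting applies to other operators and reserved words, for example help("[") or help("if").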
Best,
Jim
>>> <mcolosim@brandeis.edu> 08/18/04 01:58PM >>>