GO annotation in R

Guest User ★ 13k

@guest-user-4897

Last seen 11.2 years ago

I have the matrix as follows:

    probes  GSM362180    GSM362181    GSM362188    GSM362189    GSM362192
    244901  5.094871713  4.626623079  4.554272515  4.748604391  4.759221647
    244902  5.194528083  4.985930299  4.817426064  5.151654407  4.838741605
    244903  5.412329253  5.352970877  5.06250609   5.305709079  8.365082403
    244904  5.529220594  5.28134657   5.467445095  5.62968933   5.458388909
    244905  5.024052699  4.714631878  4.792865831  4.843975286  4.657188246
    244906  5.786557533  5.242403911  5.060605782  5.458148567  5.890061836

I would like to extract only the first column as follows:

    ids <- scr[,1]

and then

    biocLite("GO.db")
    library("AnnotationDbi")
    biocLite("org.At.tair.db")
    biocLite("ath1121501.db")
    library("ath1121501.db")
    genenames <- org.At.tairGENENAME[ids]
    number <- org.At.tairENTREZID[ids]
    xx <- toTable(entrez)
    yy <- toTable(number)
    complete <- merge(xx,yy)

I get an error in this step and am unable to proceed further. Is it because ids <- scr[,1] is a factor?

I am not sure how to store the ID names to carry out the annotation correctly. I would like to use GO.db to find the terms associated with the GO IDs, displaying the result as a data frame with my probes and their corresponding TAIR ID, TAIR gene name and annotation.

-- output of sessionInfo():

R version 2.15
Linux.

-- Sent via the guest posting facility at bioconductor.org.

Annotation GO • 1.3k views

updated 13.0 years ago by Marc Carlson ★ 7.2k • written 13.0 years ago by Guest User ★ 13k

Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 9.3 years ago

United States

Hi Priya,

If I assume that your initial extraction worked (and that you didn't just slice out a column of scores), then you should have something like this:

    ids <- c("244901","244902","244903")

Now, right off the bat, those IDs are not probe IDs, and they are not standard TAIR IDs either. So what are they? I can't assume that they are Entrez Gene IDs because, even though they look like Entrez Gene IDs, they actually map to mouse (or at least these first three do). So it's hard for me to help you with these IDs. But for the sake of giving you some kind of answer that may help you, I will continue on.

    ## So let's suppose that you did have some real probe IDs, and that since
    ## you are working on Arabidopsis, you have probe IDs like this:
    ids <- c("261585_at","261568_at","261584_at")

    ## And then let's suppose that you wanted to get the GO IDs, the TAIR IDs
    ## and gene names. Well then I could just use select() like this:
    library(ath1121501.db)
    res1 <- select(ath1121501.db, keys=ids, cols=c("GO","TAIR","GENENAME"), keytype="PROBEID")
    res1

    ## And then separately, I could use the GO.db package to also look up the
    ## term names for these GO IDs
    library(GO.db)
    res2 <- select(GO.db, keys=res1$GO, cols="TERM", keytype="GOID")
    res2

    ## And then you could merge the two results together like this
    ## (please note that whenever you use merge you should try to specify
    ## the merge columns)
    res3 <- merge(res1, res2, by.x="GO", by.y="GOID")
    res3

Anyhow, I hope this helps you; please let me know if it doesn't.

Marc

On 11/06/2012 06:15 AM, priya [guest] wrote:
> I have the matrix as follows :
> [...]
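On the factor question, here is a minimal base R sketch rather than a definitive fix. It assumes the first column of scr holds the IDs, converts it to a plain character vector with as.character(), and then shows the explicit by.x/by.y merge on two small hand-made tables that only imitate the shape of the select() results. The toy scr, the placeholder GO IDs and terms, and the "_at" suffix step are all assumptions made for illustration; whether the real IDs are ATH1 probe set IDs with the suffix stripped would need checking against keys(ath1121501.db).

```r
# Toy copy of the first rows of the table from the question, with the ID
# column deliberately stored as a factor (assumption for illustration).
scr <- data.frame(
  probes    = factor(c("244901", "244902", "244903")),
  GSM362180 = c(5.094871713, 5.194528083, 5.412329253)
)

# as.character() returns the labels of a factor (or the digits of a number),
# so ids is a plain character vector whichever way the column was read in.
ids <- as.character(scr[, 1])

# Assumption: the IDs lost an "_at" suffix somewhere upstream.
probe_ids <- paste0(ids, "_at")

# Hand-made stand-ins for the two lookup results; the GO IDs and terms are
# placeholders, not real annotation.
res1 <- data.frame(
  PROBEID = c("244901_at", "244901_at", "244902_at"),
  GO      = c("GO:0000001", "GO:0000002", "GO:0000001"),
  stringsAsFactors = FALSE
)
res2 <- data.frame(
  GOID = c("GO:0000001", "GO:0000002"),
  TERM = c("placeholder term A", "placeholder term B"),
  stringsAsFactors = FALSE
)

# Name the join columns explicitly, since they differ between the tables.
res3 <- merge(res1, res2, by.x = "GO", by.y = "GOID")
print(res3)
```

With real data, probe_ids would take the place of the hard-coded probe IDs passed as keys. Note also that more recent AnnotationDbi releases name the select() argument columns rather than cols, so the call may need adjusting depending on the installed version.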

13.0 years ago by Marc Carlson ★ 7.2k
