Hello
I recently had to review an analysis I did some time ago. It was
done on Affymetrix hgu133plus2 chips with R 2.4 and BioC 1.9. I
have re-run the analysis using R 2.9 and BioC 2.4 (sessionInfo below).
I was surprised by the changes in the annotations: many
probesets that used to have an annotation are now NA, whereas others
have changed their symbol and their Entrez Gene ID.
To be specific, I illustrate my question with the top genes of my list.
The list I obtained two years ago is:
probeset locuslink symbol
238900_at 3123 HLA-DRB1
232583_at 8440 NCK2
236307_at 60468 BACH2
223620_at 2857 GPR34
219759_at 64167 LRAP
201702_s_at 5514 PPP1R10
232882_at 2308 FOXO1A
213446_s_at 8826 IQGAP1
234033_at 9693 RAPGEF2
243006_at 2534 FYN
244648_at 54520 CCDC93
243691_at 23142 DCUN1D4
239264_at 60412 EXOC4
243546_at 143686 SESN3
205239_at 374 AREG
1565703_at 55520 ELAC1
244061_at 55843 ARHGAP15
230505_at 26037 SIPA1L1
242688_at 9320 TRIP12
1556474_a_at 285097 FLJ38379
232614_at 596 BCL2
1565689_at 3839 KPNA3
236685_at NA NA
225173_at 93663 ARHGAP18
241893_at 4249 MGAT5
I used the following code to reproduce the issue with the annotations:
#####################################################################
## Verification using R 2.9 & BioC 2.4
#####################################################################
> probes <- c("238900_at", "232583_at", "236307_at", "223620_at", "219759_at",
+             "201702_s_at", "232882_at", "213446_s_at", "234033_at", "243006_at",
+             "244648_at", "243691_at", "239264_at", "243546_at", "205239_at",
+             "1565703_at", "244061_at", "230505_at", "242688_at", "1556474_a_at",
+             "232614_at", "1565689_at", "236685_at", "225173_at", "241893_at")
>
> library(hgu133plus2.db)
> library(annotate)
>
> entrezs<- getEG(probes, "hgu133plus2")
> symbols<- getSYMBOL(probes, "hgu133plus2")
> sel2<- cbind(probes, entrezs, symbols)
> sel2
probes entrezs symbols
238900_at "238900_at" "100133484" "LOC100133484"
232583_at "232583_at" NA NA
236307_at "236307_at" NA NA
223620_at "223620_at" "2857" "GPR34"
219759_at "219759_at" "64167" "ERAP2"
201702_s_at "201702_s_at" "5514" "PPP1R10"
232882_at "232882_at" NA NA
213446_s_at "213446_s_at" "8826" "IQGAP1"
234033_at "234033_at" NA NA
243006_at "243006_at" NA NA
244648_at "244648_at" NA NA
243691_at "243691_at" NA NA
239264_at "239264_at" NA NA
243546_at "243546_at" NA NA
205239_at "205239_at" "374" "AREG"
1565703_at "1565703_at" "4089" "SMAD4"
244061_at "244061_at" NA NA
230505_at "230505_at" "145474" "LOC145474"
242688_at "242688_at" NA NA
1556474_a_at "1556474_a_at" "285097" "FLJ38379"
232614_at "232614_at" NA NA
1565689_at "1565689_at" NA NA
236685_at "236685_at" NA NA
225173_at "225173_at" "93663" "ARHGAP18"
241893_at "241893_at" NA NA
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] annotate_1.22.0 hgu133plus2.db_2.2.11 RSQLite_0.7-1 DBI_0.2-4 AnnotationDbi_1.6.0 Biobase_2.4.1
loaded via a namespace (and not attached):
[1] xtable_1.5-5
#############################################
Many probesets seem to have changed.
Can someone explain what is happening (or what I may be doing
wrong)?
The same code does not work with R 2.4, but if I replace hgu133plus2.db
with hgu133plus2 and getEG with getLL, I obtain the original results:
###############################################
### Review of annotations with R 2.4 and BioC 1.9
###############################################
### This code is executed in a clean new session with R 2.4 and BioC 1.9
> probes <- c("238900_at", "232583_at", "236307_at", "223620_at", "219759_at",
+             "201702_s_at", "232882_at", "213446_s_at", "234033_at", "243006_at",
+             "244648_at", "243691_at", "239264_at", "243546_at", "205239_at",
+             "1565703_at", "244061_at", "230505_at", "242688_at", "1556474_a_at",
+             "232614_at", "1565689_at", "236685_at", "225173_at", "241893_at")
> library(hgu133plus2)
> library(annotate)
>
> LLs <- getLL(probes, "hgu133plus2")
> symbols <- getSYMBOL(probes, "hgu133plus2")
> sel1 <- cbind(probes, LLs, symbols)
> sel1
probes LLs symbols
238900_at "238900_at" "3123" "HLA-DRB1"
232583_at "232583_at" "8440" "NCK2"
236307_at "236307_at" "60468" "BACH2"
223620_at "223620_at" "2857" "GPR34"
219759_at "219759_at" "64167" "ERAP2"
201702_s_at "201702_s_at" "5514" "PPP1R10"
232882_at "232882_at" "2308" "FOXO1"
213446_s_at "213446_s_at" "8826" "IQGAP1"
234033_at "234033_at" "9693" "RAPGEF2"
243006_at "243006_at" "2534" "FYN"
244648_at "244648_at" "54520" "CCDC93"
243691_at "243691_at" "23142" "DCUN1D4"
239264_at "239264_at" "60412" "EXOC4"
243546_at "243546_at" "143686" "SESN3"
205239_at "205239_at" "374" "AREG"
1565703_at "1565703_at" "4089" "SMAD4"
244061_at "244061_at" "55843" "ARHGAP15"
230505_at "230505_at" "145474" "LOC145474"
242688_at "242688_at" "9320" "TRIP12"
1556474_a_at "1556474_a_at" "285097" "FLJ38379"
232614_at "232614_at" "596" "BCL2"
1565689_at "1565689_at" "3839" "KPNA3"
236685_at "236685_at" NA NA
225173_at "225173_at" "93663" "ARHGAP18"
241893_at "241893_at" "4249" "MGAT5"
> sessionInfo()
R version 2.4.1 (2006-12-18)
i386-pc-mingw32
locale:
LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252
attached base packages:
[1] "tools" "stats" "graphics" "grDevices"
[5] "utils" "datasets" "methods" "base"
other attached packages:
annotate Biobase hgu133plus2
"1.12.1" "1.12.2" "1.14.0"
########################################################
In summary: if I use R 2.4/BioC 1.9 I obtain the same results I
obtained two years ago, but if I follow the same steps using R 2.9/BioC 2.4
the results change dramatically.
I have repeated the analysis using BioC 2.01 in R 2.7 and BioC 2.2 in
R 2.8 (results not shown here). BioC 2.0 yields the same results as 1.9, and
BioC 2.2 the same as 2.4.
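To make the differences easier to see, the two outputs can be compared probeset by probeset. What follows is only a minimal base-R sketch, not part of the original analysis: the data frames `old` and `new` and the `status` column are hypothetical names chosen for illustration, and they hold just five rows copied from the R 2.4 and R 2.9 outputs shown above.

```r
# Hypothetical illustration: five probesets copied from the two outputs above.
# 'old' = Entrez/LocusLink IDs from R 2.4 / BioC 1.9,
# 'new' = Entrez IDs from R 2.9 / BioC 2.4.
old <- data.frame(
  probe  = c("238900_at", "232583_at", "223620_at", "230505_at", "236685_at"),
  entrez = c("3123", "8440", "2857", "145474", NA),
  stringsAsFactors = FALSE
)
new <- data.frame(
  probe  = c("238900_at", "232583_at", "223620_at", "230505_at", "236685_at"),
  entrez = c("100133484", NA, "2857", "145474", NA),
  stringsAsFactors = FALSE
)

# Join the two tables on the probeset ID.
cmp <- merge(old, new, by = "probe", suffixes = c(".old", ".new"))

# Classify each probeset by how its annotation changed between releases.
cmp$status <- ifelse(
  is.na(cmp$entrez.old) & is.na(cmp$entrez.new), "NA in both",
  ifelse(is.na(cmp$entrez.new), "lost",
  ifelse(is.na(cmp$entrez.old), "gained",
  ifelse(cmp$entrez.old == cmp$entrez.new, "unchanged", "changed")))
)

print(cmp)
print(table(cmp$status))
```

With the full 25-probeset vectors in place of these five rows, the same table would give a count of how many annotations were lost, changed, or kept between the two releases.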
Any help in understanding what is happening would be appreciated.
Alex Sanchez
----------------------------------------------------------------------
Dr. Alex Sánchez. Statistics Department. University of Barcelona.
Facultat de Biologia UB. Avda Diagonal 645. 08028 Barcelona. Spain
asanchez_at_ub.edu
Statistics and Bioinformatics Unit
Institut de Recerca. Hospital Universitari Vall d'Hebron
Passeig Vall d'Hebron 112-119. 08034 Barcelona
asanchez_at_ir.vhebron.net
----------------------------------------------------------------------