Empty values when extracting beta values using GEOquery
Carolin • 0

Hi all,

I am not very experienced with R and am running into the following problem:

I am trying to extract the table of beta values from the Illumina 450k data set GSE61380 on the Gene Expression Omnibus using the code below. It worked well for other data sets, but in this one I have five samples (GSM1503509, GSM1503517, GSM1503522, GSM1503524, GSM1503525) that only return NAs. Interestingly, those are all the female samples of the data set, so I guess that is not a coincidence, but I don't see any difference in how the data are organized for male and female samples. When I look into the data table within the GSM list, the beta values are there, so I am "losing" them in the process of creating a data frame from the $VALUE column of the data table. I hope someone can help me further.

Thanks, Carolin

library(BiocManager)
library(GEOquery)
library(dplyr)

# GSEMatrix had to be set to FALSE for the following steps to work
gse <- getGEO("GSE61380", GSEMatrix = FALSE)

# Make sure that all of the GSMs are from the same platform
gsmplatforms <- lapply(GSMList(gse), function(x) Meta(x)$platform_id)
head(gsmplatforms)

# If they are, get the list of all GSMs
gsmlist <- GSMList(gse)

# Get the probeset ordering from the platform table
probesets <- Table(GPLList(gse)[[1]])$ID

# Make the data matrix from the VALUE column of each GSM,
# being careful to match the order of the probesets in the platform
# with those in the GSMs
data.matrix <- do.call(cbind, lapply(gsmlist, function(x) {
  tab <- Table(x)
  mymatch <- match(probesets, tab$ID_REF)
  tab$VALUE[mymatch]
}))

Results:

session info:
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base  

other attached packages:
 [1] lumi_2.50.0                                       
 [2] IlluminaHumanMethylation450kmanifest_0.4.0        
 [3] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
 [4] minfi_1.42.0                                      
 [5] bumphunter_1.38.0                                 
 [6] locfit_1.5-9.6                                    
 [7] iterators_1.0.14                                  
 [8] foreach_1.5.2                                     
 [9] Biostrings_2.64.0                                 
[10] XVector_0.36.0                                    
[11] SummarizedExperiment_1.26.0                       
[12] MatrixGenerics_1.8.0  
[13] matrixStats_0.62.0                                
[14] GenomicRanges_1.48.0                              
[15] GenomeInfoDb_1.34.2                               
[16] IRanges_2.30.0                                    
[17] S4Vectors_0.34.0                                  
[18] dplyr_1.0.10                                      
[19] GEOquery_2.66.0                                   
[20] Biobase_2.56.0                                    
[21] BiocGenerics_0.44.0                               
[22] BiocManager_1.30.19 

> head(data.matrix)
     GSM1503499 GSM1503500 GSM1503501 GSM1503502 GSM1503503 GSM1503504 GSM1503505 GSM1503506 GSM1503507
[1,] 0.41355043 0.56418932 0.54025471 0.40691577 0.46972631 0.45607333  0.4643009  0.4931707 0.35833699
[2,] 0.84803068 0.81087147 0.85702235 0.82328461 0.81902802 0.86374916  0.8314105  0.8864755 0.82498740
[3,] 0.50388175 0.42545290 0.37409633 0.39843678 0.51132252 0.46674150  0.4676011  0.5228090 0.44926351
[4,] 0.86577685 0.83432248 0.85594714 0.88428414 0.84440467 0.86997741  0.8616151  0.8819407 0.84325473
[5,] 0.37064915 0.32756148 0.40436536 0.38826454 0.40114029 0.38934984  0.4060475  0.3258828 0.28145007
[6,] 0.09154394 0.06799166 0.09573691 0.05955089 0.06101147 0.08034284  0.1089532  0.0780580 0.08195993
     GSM1503508 GSM1503509 GSM1503510 GSM1503511 GSM1503512 GSM1503513 GSM1503514 GSM1503515 GSM1503516
[1,] 0.44322170         NA 0.41800920  0.3976947 0.40166818 0.41311807 0.41223355 0.46943568 0.43233183
[2,] 0.86218374         NA 0.79261189  0.7885919 0.86425598 0.87240860 0.85175773 0.84363113 0.86930969
[3,] 0.43261736         NA 0.47306780  0.4288925 0.37944154 0.47241740 0.44480832 0.46066579 0.40216191
[4,] 0.80441305         NA 0.90317678  0.8395202 0.84382313 0.84923787 0.86118794 0.81881974 0.89227705
[5,] 0.27235286         NA 0.40867223  0.4015340 0.34978985 0.37300237 0.43426375 0.31597128 0.35896960
[6,] 0.07556178         NA 0.08236327  0.1906769 0.06753515 0.08765025 0.05392182 0.07266497 0.07478928
     GSM1503517 GSM1503518 GSM1503519 GSM1503520 GSM1503521 GSM1503522 GSM1503523 GSM1503524 GSM1503525
[1,]         NA 0.47431720 0.45687939 0.46646230 0.44409902         NA  0.4894551         NA         NA
[2,]         NA 0.89878732 0.84939845 0.87269076 0.87357208         NA  0.8854879         NA         NA
[3,]         NA 0.40335692 0.43550064 0.47369855 0.44548559         NA  0.4082833         NA         NA
[4,]         NA 0.86456979 0.87486284 0.84615118 0.85770545         NA  0.8189913         NA         NA
[5,]         NA 0.46743924 0.48145035 0.28816066 0.47723922         NA  0.3614606         NA         NA
[6,]         NA 0.06609402 0.05622663 0.07099919 0.08569245         NA  0.1019801         NA         NA
     GSM1503526 GSM1503527 GSM1503528 GSM1503529 GSM1503530 GSM1503531
[1,]  0.4508956 0.38191045 0.40482146 0.37511230 0.35584502  0.3536273
[2,]  0.8127178 0.83116632 0.88256683 0.80537954 0.88485828  0.8126514
[3,]  0.4346078 0.37101632 0.52398558 0.42709756 0.42487552  0.4842869
[4,]  0.8435249 0.83192070 0.90559256 0.85480326 0.88494896  0.9127295
[5,]  0.4263578 0.31819248 0.49592397 0.34929792 0.31813783  0.2563915
[6,]  0.2193310 0.08670669 0.04609017 0.07623453 0.07488773  0.1145267
Basti ▴ 780

If you look further down your data.matrix, you will see that the NAs only concern a few CpGs, so I suspect these are CpGs located on the Y chromosome, which female samples do not have. You can verify this with an annotation of the 450k probes.
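
One way to run that check is to count the NAs per chromosome for each sample. The snippet below is only a minimal sketch on made-up toy data: the helper `na_by_chromosome()`, the probe names and the beta values are all hypothetical and are not taken from GSE61380. It assumes a probes-by-samples matrix with probe IDs as row names and a named vector that maps each probe ID to its chromosome.

```r
# Hypothetical helper: count NA beta values per chromosome for each sample.
# `beta`      : probes x samples matrix with probe IDs as row names
# `probe_chr` : named character vector mapping probe ID -> chromosome
na_by_chromosome <- function(beta, probe_chr) {
  chr <- unname(probe_chr[rownames(beta)])
  chr[is.na(chr)] <- "unknown"  # probes missing from the annotation
  # rowsum() adds up the 0/1 NA indicator within each chromosome group
  rowsum(is.na(beta) * 1L, group = chr)
}

# Toy data (made up): two Y-chromosome probes and one autosomal probe
beta <- matrix(
  c(0.41,   NA, 0.50,
    0.84,   NA, 0.86,
    0.09, 0.07, 0.08),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probeA", "probeB", "probeC"),
                  c("male1", "female1", "male2"))
)
probe_chr <- c(probeA = "chrY", probeB = "chrY", probeC = "chr1")

na_counts <- na_by_chromosome(beta, probe_chr)
print(na_counts)

# Possible use with the objects from the question (not run here), assuming
# the minfi annotation package listed in the session info is installed:
# library(minfi)
# library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
# anno <- getAnnotation(IlluminaHumanMethylation450kanno.ilmn12.hg19)
# rownames(data.matrix) <- as.character(probesets)
# na_by_chromosome(data.matrix,
#                  setNames(as.character(anno$chr), rownames(anno)))
```

If the suspicion is right, the NA counts for the five female samples should be concentrated in the chrY row, with the other chromosomes at or near zero.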
