RMA normalization using microarray data
Charmy • 0
@978988de

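I am trying to RMA-normalize the CEL files from GSE28829, but rma() fails with a "Could not obtain CDF environment" error (full output below). Can someone explain what I should do to solve this issue?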


# Packages used for download, reading and normalisation
library(tibble)
library(GEOquery)
library(limma)
library(umap)
library(tidyverse)
library(affy)

# Install BiocManager if missing, then the HG-U133 Plus 2 annotation packages
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(version = "3.14")
BiocManager::install("hgu133plus2.db")
BiocManager::install("hgu133plus2cdf")
BiocManager::install("hgu133plus2probe", force = TRUE)

# Load the annotation packages
library("hgu133plus2probe")
library("hgu133plus2cdf")
library("hgu133plus2.db")






# Notes on getGEO() (not called in this script):
# gset would hold the object returned by getGEO().
# GSEMatrix = TRUE makes getGEO() download the series matrix file(s)
# instead of parsing the full SOFT file.
# A GDS SOFT file contains DataSet information, experiment variable subsets,
# expression value measurements and gene symbols.
# AnnotGPL is a boolean, defaulting to FALSE, as to whether or not to use the
# Annotation GPL information. These files are nice to use because they contain
# up-to-date information remapped from Entrez Gene on a regular basis. However,
# they do not exist for all GPLs; in general, they are only available for GPLs
# referenced by a GDS.

# download the supplementary files (the raw CEL archive) for the series from GEO
getGEOSuppFiles("GSE28829")



# Untar the archive, since the files are in a compressed format;
# they are extracted into a directory called data/
untar("GSE28829/GSE28829_RAW.tar", exdir = "data/")

# Read in the CEL files with the affy package, providing the path,
# and store the result in a variable, the raw data object
raw.data <- ReadAffy(celfile.path = "data/")

# RMA normalisation
normalized.data <- rma(raw.data)


> library(GEOquery)
> library(limma)
> library(umap)
> library(tidyverse)
> library(affy)
> if (!require("BiocManager", quietly = TRUE))
+   install.packages("BiocManager")
> BiocManager::install(version = "3.14")
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

Bioconductor version 3.14 (BiocManager 1.30.19), R 4.1.1 (2021-08-10)
Old packages: 'boot', 'class', 'cluster', 'digest', 'foreign', 'ggpp', 'htmltools', 'jpeg', 'jsonlite', 'lattice', 'MASS',
  'Matrix', 'mgcv', 'nlme', 'nnet', 'openssl', 'png', 'rpart', 'sass', 'spatial', 'stringr', 'survival', 'testthat', 'XML'
Update all/some/none? [a/s/n]: 
n
> BiocManager::install(version = "3.14")
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

Bioconductor version 3.14 (BiocManager 1.30.19), R 4.1.1 (2021-08-10)
Old packages: 'boot', 'class', 'cluster', 'digest', 'foreign', 'ggpp', 'htmltools', 'jpeg', 'jsonlite', 'lattice', 'MASS',
  'Matrix', 'mgcv', 'nlme', 'nnet', 'openssl', 'png', 'rpart', 'sass', 'spatial', 'stringr', 'survival', 'testthat', 'XML'
Update all/some/none? [a/s/n]: BiocManager::install("hgu133plus2.db")
Update all/some/none? [a/s/n]: BiocManager::install("hgu133plus2cdf")
Update all/some/none? [a/s/n]: 
n
> BiocManager::install("hgu133plus2probe",force=TRUE)
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

Bioconductor version 3.14 (BiocManager 1.30.19), R 4.1.1 (2021-08-10)
Installing package(s) 'hgu133plus2probe'
installing the source package ‘hgu133plus2probe’

trying URL 'https://bioconductor.org/packages/3.14/data/annotation/src/contrib/hgu133plus2probe_2.18.0.tar.gz'
Content type 'application/octet-stream' length 8505171 bytes (8.1 MB)
==================================================
downloaded 8.1 MB

* installing *source* package ‘hgu133plus2probe’ ...
** using staged installation
** R
** data
** byte-compile and prepare package for lazy loading
Warning messages:
1: package ‘AnnotationDbi’ was built under R version 4.1.2 
2: package ‘S4Vectors’ was built under R version 4.1.3 
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Warning: package ‘AnnotationDbi’ was built under R version 4.1.2
Warning: package ‘S4Vectors’ was built under R version 4.1.3
** testing if installed package can be loaded from final location
Warning: package ‘AnnotationDbi’ was built under R version 4.1.2
Warning: package ‘S4Vectors’ was built under R version 4.1.3
** testing if installed package keeps a record of temporary installation path
* DONE (hgu133plus2probe)

The downloaded source packages are in
    ‘/private/var/folders/lm/38qkbylx6rzbmkpdcywfhxcw0000gn/T/RtmpLLvcTY/downloaded_packages’
Old packages: 'boot', 'class', 'cluster', 'digest', 'foreign', 'ggpp', 'htmltools', 'jpeg', 'jsonlite', 'lattice', 'MASS',
  'Matrix', 'mgcv', 'nlme', 'nnet', 'openssl', 'png', 'rpart', 'sass', 'spatial', 'stringr', 'survival', 'testthat', 'XML'
Update all/some/none? [a/s/n]: library("hgu133plus2probe")
Update all/some/none? [a/s/n]: library("hgu133plus2cdf")
Update all/some/none? [a/s/n]: 
n
> # load series and platform data from GEO
> getGEOSuppFiles("GSE28829")
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE28nnn/GSE28829/suppl//GSE28829_RAW.tar?tool=geoquery'
Content type 'application/x-tar' length 157736960 bytes (150.4 MB)
==================================================
downloaded 150.4 MB

                                                 size isdir mode               mtime               ctime
/Users/charmyshah/GSE28829/GSE28829_RAW.tar 157736960 FALSE  644 2022-12-12 10:50:20 2022-12-12 10:50:20
                                                          atime uid gid      uname grname
/Users/charmyshah/GSE28829/GSE28829_RAW.tar 2022-12-12 10:50:17 501  20 charmyshah  staff
> untar("GSE28829/GSE28829_RAW.tar",exdir='data/')
> # Read in the CEL files with the affy package, providing the path, and store the result in a variable, the raw data object
> raw.data <-ReadAffy(celfile.path = "data/")
> #rma normalisation 
> normalized.data<- rma(raw.data)
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

Error in getCdfInfo(object) : 
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain 2.0
Library - package 2.0cdf not installed
Bioconductor - 2.0cdf not available

sessionInfo()
R version 4.1.1
@james-w-macdonald-5106

Those CEL files appear to have incorrect headers:

> library(affyio)
> read.celfile.header(dir()[2])
$cdfName
[1] "2.0"

$`CEL dimensions`
Cols Rows 
1164 1164

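The cdfName should say HG-U133_Plus_2, not "2.0". That is why rma() went looking for a package called 2.0cdf. Since the files don't have the correct header, you have to specify the cdfname yourself when reading them: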

> dat <- ReadAffy(filenames = dir()[2:24], cdfname = "hgu133plus2")
> eset <- rma(dat)
Background correcting
Normalizing
Calculating Expression
>
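
To check whether every file in the directory has the same problem, the header inspection above can be applied to all CEL files at once. This is a minimal sketch, not part of the original answer: the helper name cel_cdf_names is hypothetical, and it assumes the archive was extracted to data/ as in the question, that the directory holds only CEL files from a single platform, and that affyio can read the gzipped files directly.

```r
# Hypothetical helper (not from the thread): return the cdfName stored in
# the header of every CEL file found in a directory.
cel_cdf_names <- function(cel_dir = "data/") {
  # Match .CEL and .CEL.gz file names, case-insensitively
  files <- list.files(cel_dir, pattern = "\\.cel(\\.gz)?$",
                      ignore.case = TRUE, full.names = TRUE)
  # read.celfile.header() returns a list; $cdfName is the chip type
  # recorded in the file header (the field shown in the answer above)
  vapply(files,
         function(f) affyio::read.celfile.header(f)$cdfName,
         character(1))
}

# Usage, assuming the layout from the question:
# cdf_names <- cel_cdf_names("data/")
# table(cdf_names)   # a value such as "2.0" points to a bad header
#
# If the headers are wrong, set the CDF explicitly, as in the answer:
# dat  <- affy::ReadAffy(celfile.path = "data/", cdfname = "hgu133plus2")
# eset <- affy::rma(dat)
```

If table() shows a single unexpected value for all files, passing cdfname once to ReadAffy() should cover the whole set; a mix of values would suggest files from more than one platform, which would need to be read separately.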