Question: Using Limma to normalize data sets
12 months ago by
mahm20
mahm20 wrote:

I want to read data from the CEL files of GSE53454 and GSE76896, both from the same platform. Could someone suggest the steps to follow?

I have looked at the documentation from limmaUsersGuide(), but I could only find examples for array types other than affy. Could someone point me to a tutorial on how to process affy data?

affy limma • 571 views
modified 12 months ago by thokall160 • written 12 months ago by mahm20
Answer: Using Limma to normalize data sets
12 months ago by
thokall160
Swedish Museum of Natural History
thokall160 wrote:

The limma manual has information on importing and analysing Affymetrix data. The second example in section 3.2 contains some basic info. If you are struggling with something more specific please include the code that you have tried so far, as that makes it easier to help out.

Many thanks for the response. I have gone through the code in section 3.2, which is illustrated for two-color arrays. I want to first load the CEL files of a one-color array. I'm sorry, I don't have any code yet; I have only worked with the GEOquery parsing package. For instance, in GEOquery there is the getGEO('GSExxxx') command to automatically fetch the GSE file from the database. How do we get started here? I couldn't really understand what the "targets.txt" file (in the example in section 3.2) is.

I'm a beginner. Excuse me for the naive questions.
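On the "targets.txt" question: as far as the limma guide describes it, a targets file is a tab-delimited table with one row per array, and its FileName column names the data file for each array. A minimal sketch in base R, assuming two CEL file names from GSE53454 and a hypothetical Treatment column (that column name and its values are illustrative only):

```r
# Build a small targets table: one row per array.
# FileName is the column referred to in the thread (targets$FileName);
# Treatment is a hypothetical annotation column added for illustration.
targets <- data.frame(
  FileName  = c("GSM1293805_10_4_Control_0h.CEL.gz",
                "GSM1293818_10_4_Cytok_04h.CEL.gz"),
  Treatment = c("Control", "Cytok"),
  stringsAsFactors = FALSE
)

# Write it as a tab-delimited text file and read it back.
targets_path <- file.path(tempdir(), "targets.txt")
write.table(targets, targets_path, sep = "\t", quote = FALSE, row.names = FALSE)
targets_read <- read.delim(targets_path, stringsAsFactors = FALSE)

# The FileName column is the character vector that would be passed on,
# e.g. ReadAffy(filenames = targets_read$FileName) with the affy package.
print(targets_read$FileName)
```

limma's readTargets() reads a file of this shape directly; the base R read.delim() call is used here only so the sketch runs without limma installed.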

The second example uses the ReadAffy function, which imports Affymetrix data. The function needs a vector of file names to read (in the example this information is stored in targets$FileName). You can hence download the files you are interested in (the .CEL files), create a vector with these file names, and then import your data with the ReadAffy function as follows:

Affydata <- ReadAffy(fileNameVector)

modified 12 months ago • written 12 months ago by thokall160

I have downloaded the RAW.tar file, which contains the .cdf.gz file and a .CEL.gz file for each sample. Can we give the GSExxxx_RAW.tar files as the names in the file name vector? There are actually more than 50 .CEL files. Secondly, I tried:

> filename <- c(".../data/GSE76896_RAW.tar",".../data/GSE53454_RAW.tar")
> Affydata <- ReadAffy(filename)
Error: file names must be specified using a character vector, not a ‘list’

But the vector is of character type.

> is.character(filename)
[1] TRUE

Have I missed something?

written 12 months ago by mahm20

You need to untar the downloaded archive so that you can see the files it contains. If you look at the command given in the limma manual, it supplies the character vector to the argument filenames (sorry if I confused you earlier). If your files are in "Data/GSE53454_RAW" you can list all gzipped CEL files in that directory and then import them using the ReadAffy function. Try this:

downloadedAffyFiles <- list.files(path = "Data/GSE53454_RAW", pattern = "CEL.gz$")
AffyData <- ReadAffy(filenames = downloadedAffyFiles)
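The untar-then-list step described above can be sketched in base R as follows. The helper name extract_cel_files and the archive path are hypothetical, and the ReadAffy call is left as a comment so the sketch runs without Bioconductor installed. Note that list.files() returns bare file names unless full.names = TRUE is set, so the sketch asks for full paths.

```r
# Extract a GEO *_RAW.tar archive and return the full paths of the
# gzipped CEL files inside it. Base R only; no Bioconductor needed here.
extract_cel_files <- function(tar_path, exdir) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  untar(tar_path, exdir = exdir)
  # full.names = TRUE returns paths that include exdir, so they do not
  # depend on the working directory being the extraction folder.
  # The pattern matches names ending in ".CEL.gz" (case-insensitive).
  list.files(exdir, pattern = "\\.CEL\\.gz$", full.names = TRUE,
             ignore.case = TRUE)
}

# Hypothetical location of the downloaded archive; adjust to your setup.
tar_path <- "Data/GSE53454_RAW.tar"
if (file.exists(tar_path)) {
  cel_files <- extract_cel_files(tar_path, exdir = "Data/GSE53454_RAW")
  print(length(cel_files))
  # With the affy package installed, the import step would then be:
  # AffyData <- affy::ReadAffy(filenames = cel_files)
}
```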



I tried as you suggested, but the following error appears :(

> downloadedAffyFiles <- list.files(path = "../Data/GSE53454_RAW", pattern = "CEL.gz$")
> AffyData <- ReadAffy(filenames = downloadedAffyFiles)
Error: the following are not valid files:
    GSM1293805_10_4_Control_0h.CEL.gz
    GSM1293806_10_4_Control_12h.CEL.gz
    GSM1293807_10_4_Control_1h.CEL.gz
    GSM1293808_10_4_Control_24h.CEL.gz
    GSM1293809_10_4_Control_2h.CEL.gz
    GSM1293810_10_4_Control_36h.CEL.gz
    GSM1293811_10_4_Control_48h.CEL.gz
    GSM1293812_10_4_Control_4h.CEL.gz
    GSM1293813_10_4_Control_60h.CEL.gz
    GSM1293814_10_4_Control_72h.CEL.gz
    GSM1293815_10_4_Control_84h.CEL.gz
    GSM1293816_10_4_Control_8h.CEL.gz
    GSM1293817_10_4_Control_96h.CEL.gz
    GSM1293818_10_4_Cytok_04h.CEL.gz
    GSM1293819_10_4_Cytok_12h.CEL.gz
    GSM1293820_10_4_Cytok_1h.CEL.gz
    GSM1293821_10_4_Cytok_24h.CEL.gz
    GSM1293822_10_4_Cytok_2h.CEL.gz
    GSM1293823_10_4_Cytok_36h.CEL.gz
    GSM1293824_10_4_Cytok_48h.CEL.gz
    GSM1293825_10_4_Cytok_60h.CEL.gz
    GSM1293826_10_4_Cytok_72h.CEL.gz
    GSM1293827_10_4_Cytok_84h.CEL.gz
    GSM1293828_10_4_Cytok_96h.CEL.gz
    GSM1293829_19_10_Control_0h.CEL.gz
    GSM1293830_19_10_Control_108h.CE

written 12 months ago by mahm20

On the help page of ReadAffy (found via ?ReadAffy), look for the option 'compress':

compress: are the CEL files compressed?

Thus (assuming the last file in your list, GSM1293830_19_10_Control_108h.CE, only appears to have a wrong extension because of an incomplete copy/paste):

AffyData <- ReadAffy(filenames = downloadedAffyFiles, compress=TRUE)

modified 12 months ago • written 12 months ago by Guido Hooiveld

Check that your working directory contains the files of interest, or modify your code to use the complete path:

downloadedAffyFiles <- list.files("~/Downloads/GSE53454_RAW/", pattern = "CEL.gz", full.names=TRUE)

written 12 months ago by thokall160

The files are in the current working directory. Also, GSM1293830_19_10_Control_108h.CE is just how the name was displayed in the terminal. The file in the directory has the correct extension. Now I get a status that reads "Adjusting for non-specific binding.Killed". Is something wrong?

AffyBatch object
size of arrays=1164x1164 features (56 kb)
cdf=HG-U133_Plus_2 (54675 affyids)
number of samples=90
number of genes=54675
annotation=hgu133plus2
notes=
Warning messages:
1: replacing previous import ‘AnnotationDbi::tail’ by ‘utils::tail’ when loading ‘hgu133plus2cdf’
2: replacing previous import ‘AnnotationDbi::head’ by ‘utils::head’ when loading ‘hgu133plus2cdf’
Adjusting for optical effect..........................................................................................Done.
Computing affinitiesLoading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

expand.grid

.Done.
Adjusting for non-specific binding.Killed

I'm running the following:

library(gcrma)
library(limma)
downloadedAffyFiles <- list.files(path = "../Data/GSE53454_RAW/", pattern = "CEL.gz$",full.names=TRUE)
AffyData <- ReadAffy(filenames = downloadedAffyFiles)
AffyData
eset <- gcrma(AffyData)
eset
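A bare "Killed" printed by the operating system commonly points to the process running out of memory, which would be consistent with the "Cannot allocate memory" message that shows up later in the thread. This is an interpretation, not something the thread confirms. A rough estimate from the AffyBatch summary above, assuming intensities are held as 8-byte doubles and ignoring every other object gcrma creates:

```r
# Figures taken from the AffyBatch summary in the thread.
n_rows    <- 1164   # array rows
n_cols    <- 1164   # array columns
n_samples <- 90     # number of CEL files loaded

# Assumption: one numeric (double) value of 8 bytes per feature per sample.
bytes_per_value <- 8
bytes_one_copy  <- n_rows * n_cols * n_samples * bytes_per_value
gib_one_copy    <- bytes_one_copy / 1024^3

# Size of a single copy of the intensity matrix, in GiB.
print(round(gib_one_copy, 2))
```

That is close to 1 GiB for a single copy. Background correction and normalization typically hold more than one such copy at a time, so the real peak is likely a multiple of this figure.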

I have tried your code and I do not have this issue. Below is the output from your code block on my machine. Since I cannot reproduce your problem it is hard to know what is going on, but I suggest you double-check that you have the latest version of Bioconductor and updated versions of all the packages you use.

> AffyData <- ReadAffy(filenames = downloadedAffyFiles)
> eset <- gcrma(tt)
Computing affinities[1] "Checking to see if your internet connection works..."
installing the source package 'hgu133plus2probe'

trying URL 'https://bioconductor.org/packages/3.7/data/annotation/src/contrib/hgu133plus2probe_2.18.0.tar.gz'
Content type 'application/x-gzip' length 8505171 bytes (8.1 MB)
==================================================

* installing *source* package 'hgu133plus2probe' ...
** R
** data
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (hgu133plus2probe)

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

expand.grid

.Done.
Normalizing
Calculating Expression
>

I have updated to R version 3.5.1 (2018-07-02) and Bioconductor 3.7. Now I get this error:

eset <- gcrma(AffyData)
Computing affinities[1] "Checking to see if your internet connection works..."
trying URL 'https://bioconductor.org/packages/3.7/data/annotation/src/contrib/hgu133plus2probe_2.18.0.tar.gz'
Content type 'application/x-gzip' length 8505171 bytes (8.1 MB)
==================================================

Error in (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE,  :
there is no package called ‘hgu133plus2probe’
1: In system2(cmd0, args, env = env, stdout = outfile, stderr = outfile,  :
system call failed: Cannot allocate memory
2: In system2(cmd0, args, env = env, stdout = outfile, stderr = outfile,  :
error in running command
3: In install.packages(probepackage, lib = lib, repos = biocinstallRepos(),  :
installation of package ‘hgu133plus2probe’ had non-zero exit status

Since I cannot reproduce your problems, I cannot really be of much help here, besides suggesting that you try to install the failing packages on your own and then use the gcrma vignette to see how to use this package efficiently.
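Installing the failing package by hand, as suggested above, could look like the sketch below. It assumes BiocManager is the installer in use; on the Bioconductor 3.7 setup from the thread, biocLite() played that role. The helper name ensure_package is hypothetical.

```r
# Return TRUE if `pkg` can be loaded, attempting a Bioconductor install
# first when it is missing. Assumes BiocManager is available from CRAN.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    # update = FALSE and ask = FALSE keep the call non-interactive and
    # avoid upgrading unrelated packages as a side effect.
    BiocManager::install(pkg, update = FALSE, ask = FALSE)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Packages that are already installed are only checked, never reinstalled.
print(ensure_package("stats"))

# For the error in the thread, the call would be (not run here):
# ensure_package("hgu133plus2probe")
```

Running the installation in a fresh R session, before any large objects are loaded, leaves more memory free for the install step itself.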

Good luck!

Fixed things!!

Works perfect.

Thanks a lot for the tremendous support!