Using Limma to normalize data sets
mahm ▴ 20
@mahm-16884

I want to read data from the CEL files of GSE53454 and GSE76896, both from the same platform. Could someone suggest the steps to follow?

I had a chance to look at the documentation opened by limmaUsersGuide(). I could find examples for array types other than Affymetrix. Could someone suggest a tutorial on how to process Affymetrix data?

 

 

 

limma affy
thokall ▴ 160
@thokall-14310
Swedish Museum of Natural History

 

The limma manual has information on importing and analysing Affymetrix data. The second example in section 3.2 contains some basic info. If you are struggling with something more specific, please include the code that you have tried so far, as that makes it easier to help out.

 


Many thanks for the response. I went through the code in section 3.2, which is illustrated for two-color arrays. I want to first load the CEL files of a one-color array. I'm sorry, I don't have any code yet; I have only worked with the GEOquery package. For instance, in GEOquery there is the getGEO('GSExxxx') command to automatically fetch the GSE file from the database. How do we get started here? I couldn't really understand what "targets.txt" (from the example in section 3.2) is.

I'm a beginner. Excuse me for the naive questions.


The second example uses the ReadAffy function, which imports Affymetrix data. The function needs a vector of file names to read (in the example this information is stored in targets$FileName). You can hence download the files you are interested in (the .CEL files), create a vector with these file names, and then import your data with the ReadAffy function as follows:

Affydata <- ReadAffy(fileNameVector)
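To make the "targets" idea concrete: it is just a table with one row per array and a column of file names. Below is a minimal sketch, assuming two of the file names that appear later in this thread; the Treatment column is a hypothetical label parsed from the file name, not something taken from the GEO annotation.

```r
# Build a small targets table by hand: one row per array.
# FileName is the column used in the limma guide example;
# Treatment is a hypothetical label derived from the file name.
cel_files <- c("GSM1293805_10_4_Control_0h.CEL.gz",
               "GSM1293818_10_4_Cytok_04h.CEL.gz")
targets <- data.frame(
  FileName  = cel_files,
  Treatment = ifelse(grepl("Control", cel_files), "Control", "Cytok"),
  stringsAsFactors = FALSE
)
print(targets)

# The character vector to pass on is then targets$FileName, e.g.
#   AffyData <- affy::ReadAffy(filenames = targets$FileName)
# (not run here; it needs the affy package and the files on disk).
```

The same table could equally be typed into a tab-delimited text file and read back in; the only part ReadAffy cares about is the character vector of file names.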

 


I have downloaded the RAW.tar file, which contains a .cdf.gz file and a .CEL.gz file for each sample.

Can we give the GSExxxx_RAW.tar files as the names in the filename vector? There are actually more than 50 .CEL files.

Secondly, I tried,

filename <- c(".../data/GSE76896_RAW.tar",".../data/GSE53454_RAW.tar")
> Affydata <- ReadAffy(filename)
Error: file names must be specified using a character vector, not a ‘list’

But the vector is of character type.
> is.character(filename)
[1] TRUE

Have I missed something?

 


You need to untar the downloaded archive so that you can see the files it contains. If you look at the command given in the limma manual, it supplies the character vector to the argument filenames (sorry if I confused you earlier).

If your files are in "Data/GSE53454_RAW", you can list all gzipped CEL files in that directory and then import them using the ReadAffy function.

Try this:

downloadedAffyFiles <- list.files(path = "Data/GSE53454_RAW", pattern = "CEL.gz$")
AffyData <- ReadAffy(filenames = downloadedAffyFiles)
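The untar step can also be done from within R. This is only a sketch under assumptions: extract_cel_files is a made-up helper name, and the paths in the usage comment are placeholders for wherever the archive was saved.

```r
# Unpack a GEO *_RAW.tar archive and return the paths of the
# gzipped CEL files inside it. Both arguments are placeholders.
extract_cel_files <- function(tar_path, exdir) {
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  utils::untar(tar_path, exdir = exdir)
  # full.names = TRUE keeps the directory in each returned path;
  # ignore.case also catches lower-case ".cel.gz" files.
  list.files(exdir, pattern = "\\.CEL\\.gz$",
             full.names = TRUE, ignore.case = TRUE)
}

# Hypothetical usage (not run here):
# cel_paths <- extract_cel_files("Data/GSE53454_RAW.tar", "Data/GSE53454_RAW")
# AffyData  <- affy::ReadAffy(filenames = cel_paths)
```

Returning full paths means the result can be handed to ReadAffy regardless of the current working directory.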

 

 

I tried as you suggested,

The following error appears :(

> downloadedAffyFiles <- list.files(path = "../Data/GSE53454_RAW", pattern = "CEL.gz$")
> AffyData <- ReadAffy(filenames = downloadedAffyFiles)
Error: the following are not valid files:
    GSM1293805_10_4_Control_0h.CEL.gz
   GSM1293806_10_4_Control_12h.CEL.gz
   GSM1293807_10_4_Control_1h.CEL.gz
   GSM1293808_10_4_Control_24h.CEL.gz
   GSM1293809_10_4_Control_2h.CEL.gz
   GSM1293810_10_4_Control_36h.CEL.gz
   GSM1293811_10_4_Control_48h.CEL.gz
   GSM1293812_10_4_Control_4h.CEL.gz
   GSM1293813_10_4_Control_60h.CEL.gz
   GSM1293814_10_4_Control_72h.CEL.gz
   GSM1293815_10_4_Control_84h.CEL.gz
   GSM1293816_10_4_Control_8h.CEL.gz
   GSM1293817_10_4_Control_96h.CEL.gz
   GSM1293818_10_4_Cytok_04h.CEL.gz
   GSM1293819_10_4_Cytok_12h.CEL.gz
   GSM1293820_10_4_Cytok_1h.CEL.gz
   GSM1293821_10_4_Cytok_24h.CEL.gz
   GSM1293822_10_4_Cytok_2h.CEL.gz
   GSM1293823_10_4_Cytok_36h.CEL.gz
   GSM1293824_10_4_Cytok_48h.CEL.gz
   GSM1293825_10_4_Cytok_60h.CEL.gz
   GSM1293826_10_4_Cytok_72h.CEL.gz
   GSM1293827_10_4_Cytok_84h.CEL.gz
   GSM1293828_10_4_Cytok_96h.CEL.gz
   GSM1293829_19_10_Control_0h.CEL.gz
   GSM1293830_19_10_Control_108h.CE

 


On the help page of ReadAffy (found via ?ReadAffy), look for the option 'compress':

compress: are the CEL files compressed?

 

Thus (assuming the last file in your list, GSM1293830_19_10_Control_108h.CE, only appears to have a wrong extension because of an incomplete copy/paste):

AffyData <- ReadAffy(filenames = downloadedAffyFiles, compress=TRUE)

Check that your working directory contains the files of interest, or modify your code to use the complete path:

downloadedAffyFiles <- list.files("~/Downloads/GSE53454_RAW/", pattern = "CEL.gz", full.names=TRUE)
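A quick way to see why the file names may be rejected is to test the paths directly. list.files() returns bare file names unless full.names = TRUE, so those names only resolve when the working directory is the data directory itself. Here is a small sketch using a temporary directory; the directory and file name are made up for the demonstration.

```r
# Create a throwaway directory containing one dummy file.
demo_dir <- file.path(tempdir(), "GSE_demo_RAW")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "sample1.CEL.gz")))

bare_names <- list.files(demo_dir, pattern = "CEL\\.gz$")
full_paths <- list.files(demo_dir, pattern = "CEL\\.gz$", full.names = TRUE)

# Bare names are resolved against getwd(), so they are found only
# if the working directory happens to be demo_dir.
print(file.exists(bare_names))
# Full paths include the directory and resolve from anywhere.
print(file.exists(full_paths))
```

Running file.exists() on the vector before calling ReadAffy gives a clearer picture than the "not valid files" message alone.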
 

The files are in the current working directory. Also, GSM1293830_19_10_Control_108h.CE is just how the name was displayed in the terminal; the file in the directory has the correct extension. Now I get a status that reads "Adjusting for non-specific binding.Killed". Is something wrong?

AffyBatch object
size of arrays=1164x1164 features (56 kb)
cdf=HG-U133_Plus_2 (54675 affyids)
number of samples=90
number of genes=54675
annotation=hgu133plus2
notes=
Warning messages:
1: replacing previous import ‘AnnotationDbi::tail’ by ‘utils::tail’ when loading ‘hgu133plus2cdf’
2: replacing previous import ‘AnnotationDbi::head’ by ‘utils::head’ when loading ‘hgu133plus2cdf’
Adjusting for optical effect..........................................................................................Done.
Computing affinitiesLoading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

.Done.
Adjusting for non-specific binding.Killed

I'm running the following,

library(gcrma)
library(limma)
downloadedAffyFiles <- list.files(path = "../Data/GSE53454_RAW/", pattern = "CEL.gz$",full.names=TRUE)
AffyData <- ReadAffy(filenames = downloadedAffyFiles)
AffyData
eset <- gcrma(AffyData)
eset

I have tried your code and do not have this issue. Below is the output from your code block on my machine. Since I cannot reproduce your problem, it is hard to know what is going on, but I suggest you double-check that you have the latest version of Bioconductor and updated versions of all the packages you use.

> AffyData <- ReadAffy(filenames = downloadedAffyFiles)
> eset <- gcrma(tt)
Adjusting for optical effect.........................................Done.
Computing affinities[1] "Checking to see if your internet connection works..."
installing the source package 'hgu133plus2probe'

trying URL 'https://bioconductor.org/packages/3.7/data/annotation/src/contrib/hgu133plus2probe_2.18.0.tar.gz'
Content type 'application/x-gzip' length 8505171 bytes (8.1 MB)
==================================================
downloaded 8.1 MB

* installing *source* package 'hgu133plus2probe' ...
** R
** data
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (hgu133plus2probe)

The downloaded source packages are in
    '/private/var/folders/2t/bbthtm7j4tb5xdqls3yt61_r0000gn/T/RtmpVq2fpC/downloaded_packages'
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

.Done.
Adjusting for non-specific binding..........................................................................................Done.
Normalizing
Calculating Expression
>

I have updated to R version 3.5.1 (2018-07-02) and Bioconductor 3.7. Now I get this error:

eset <- gcrma(AffyData)
Adjusting for optical effect..........................................................................................Done.
Computing affinities[1] "Checking to see if your internet connection works..."
trying URL 'https://bioconductor.org/packages/3.7/data/annotation/src/contrib/hgu133plus2probe_2.18.0.tar.gz'
Content type 'application/x-gzip' length 8505171 bytes (8.1 MB)
==================================================
downloaded 8.1 MB


The downloaded source packages are in
    ‘/tmp/RtmpnOAh1t/downloaded_packages’
Error in (function (package, help, pos = 2, lib.loc = NULL, character.only = FALSE,  :
  there is no package called ‘hgu133plus2probe’
In addition: Warning messages:
1: In system2(cmd0, args, env = env, stdout = outfile, stderr = outfile,  :
  system call failed: Cannot allocate memory
2: In system2(cmd0, args, env = env, stdout = outfile, stderr = outfile,  :
  error in running command
3: In install.packages(probepackage, lib = lib, repos = biocinstallRepos(),  :
  installation of package ‘hgu133plus2probe’ had non-zero exit status

 

Could you please help?

 


Since I cannot reproduce your problems, I cannot really be of much help here, beyond suggesting that you try to install the failing packages on your own and then use the gcrma vignette to see how to use the package efficiently.
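One way to approach the manual install is to first check which of the packages named in the output above are actually missing. This is only a sketch: missing_packages is a made-up helper, and the install line is left commented out. The "Cannot allocate memory" warning suggests the automatic install may have failed for lack of free memory, so installing in a fresh R session before loading any data might help, though that is a guess.

```r
# Return the subset of `pkgs` that cannot be loaded from this R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("affy", "gcrma", "limma", "hgu133plus2cdf", "hgu133plus2probe")
to_install <- missing_packages(needed)
print(to_install)

# Installing them by hand (not run here). BiocManager is the current
# installer; on Bioconductor 3.7, as in this thread, biocLite() was used.
# if (length(to_install) > 0) BiocManager::install(to_install)
```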

Good luck!


Fixed things!!

Works perfect.

Thanks a lot for the tremendous support!
