Question: I need AGI codes as my probeset IDs
2.6 years ago by
Angel40
Berlin
Angel40 wrote:

Hi,

I have some CEL files and processed them as shown below, but my row names look like X244901_at, while I need AGI codes, even though I used a custom CDF.

library(affy)
library(vsn)
library(limma)
library(altcdfenvs)
library(simpleaffy)
# listing the cel files
celFiles <- list.celfiles()
# assigning the cel files to affyraw variable
affyraw=ReadAffy(filenames = celFiles)
# making cdf file
tmp.env=make.cdf.env("ATH1121501_At_TAIRG.cdf")
# performing vsn normalization
vsn.data <- expresso(affyraw, normalize.method="vsn", bg.correct=F, pmcorrect.method="pmonly", summary.method="medianpolish")
# examining the normalization
boxplot(affyraw,col="red")
plot(exprs(affyraw)[,1:2], log = "xy", pch=".",
     main="all")
# writing the result
write.table(vsn.data, file = "vsn1.txt", dec = ".", sep = "\t", quote = FALSE)

head(vsn.data[,1:2])
ExpressionSet (storageMode: lockedEnvironment)
assayData: 1 features, 2 samples 
  element names: exprs, se.exprs 
protocolData
  sampleNames: Col-0 24h primed.CEL.CEL Col-0 24h unprimed.CEL.CEL
  varLabels: ScanDate
  varMetadata: labelDescription
phenoData
  sampleNames: Col-0 24h primed.CEL.CEL Col-0 24h unprimed.CEL.CEL
  varLabels: sample
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: ath1121501 

What is wrong with the code above? I am also worried that I may have written an incomplete normalized file.

 

2.4 years ago by
Angel40
Berlin
Angel40 wrote:

library (affy)

library (vsn)

Data<-ReadAffy()

eset <- expresso(Data, normalize.method="vsn", bg.correct=F, pmcorrect.method="pmonly", summary.method="medianpolish")

norm.data<-exprs(eset)

# The norm.data R object contains the normalized expression for every probeset on the ATH1 microarrays used in this example.
# To convert the probeset IDs to Arabidopsis gene identifiers, download the file
# ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2010-12-20.txt
# from the TAIR database and place it in the folder with the microarray data.
# To avoid ambiguous probeset associations (i.e. probesets that match multiple genes), we keep only probesets that match exactly one gene in the Arabidopsis genome.
affy_names<-read.delim("affy_ATH1_array_elements-2010-12-20.txt",header=T)

# Select the columns that contain the probeset ID and corresponding AGI number. Please note that the positions used to index the matrix depend on the input format of the array elements file. You can change these numbers to index the corresponding columns if you are using a different format:
probe_agi<-as.matrix(affy_names[,c(1,5)])

# To associate the probeset with the corresponding AGI locus:
normalized.names<-merge(probe_agi,norm.data,by.x=1,by.y=0)[,-1]

# To remove probesets that do not match the Arabidopsis genome:
normalized.arabidopsis <-normalized.names[grep("AT",normalized.names[,1]),]

# To remove ambiguous probes:
normalized.arabidopsis.unambiguous<-normalized.arabidopsis[grep(pattern=";",normalized.arabidopsis[,1], invert=T),]

# In some cases, multiple probes match the same gene, due to updates in the annotation of the genome. To remove duplicated genes in the matrix:
normalized.agi.final<-normalized.arabidopsis.unambiguous[!duplicated(normalized.arabidopsis.unambiguous[,1]),]

# To assign the AGI number as row name:
rownames(normalized.agi.final)<-normalized.agi.final[,1]
normalized.agi.final<-normalized.agi.final[,-1]

# The resulting gene expression dataset has unique row identifiers (i.e. AGI loci), with the expression values from each experiment in a separate column.

# To export this data matrix from R to a tab-delimited file, use the following command. The file will be written to the folder that you set as your working directory in R with the setwd() command:
write.table (normalized.agi.final,"vsn.txt", sep="\t",col.names=NA,quote=F)
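
For anyone who wants to try the mapping logic without CEL files, here is a minimal base-R sketch of the same steps on toy data. The probeset IDs, loci, expression values, and the names `probe_agi`, `norm_data`, and `agi_mat` are made up for illustration. It uses `match()` instead of `merge()`, and a stricter AGI pattern than `grep("AT", ...)`; that pattern is my own assumption about what a valid locus looks like, not something taken from the TAIR file.

```r
# Toy stand-ins (hypothetical values) for the TAIR array-elements table
# and for the matrix returned by exprs(eset).
probe_agi <- data.frame(
  array_element_name = c("244901_at", "244902_at", "244903_at",
                         "244904_at", "244905_at"),
  locus = c("ATMG00640", "ATMG00650;ATMG00651", "no_match",
            "ATMG00640", "ATMG00670"),
  stringsAsFactors = FALSE
)
norm_data <- matrix(
  c(7.1, 8.2, 6.3, 7.4, 9.5,   # first sample
    7.0, 8.1, 6.5, 7.2, 9.9),  # second sample
  ncol = 2,
  dimnames = list(probe_agi$array_element_name, c("primed", "unprimed"))
)

# Look up the locus of every probeset in the expression matrix.
agi <- probe_agi$locus[match(rownames(norm_data),
                             probe_agi$array_element_name)]

# Keep probesets that map to exactly one locus. The pattern is an assumed
# AGI format (ATxGnnnnn), so entries with ";" or no match are dropped.
single <- !is.na(agi) & grepl("^AT[1-5CM]G[0-9]{5}$", agi)
agi_mat <- norm_data[single, , drop = FALSE]
rownames(agi_mat) <- agi[single]

# When several probesets hit the same locus, keep the first one.
agi_mat <- agi_mat[!duplicated(rownames(agi_mat)), , drop = FALSE]

print(agi_mat)
```

Keeping the first probeset per locus is the same arbitrary choice as in the code above; averaging the duplicates would be another reasonable option.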

2.6 years ago by
United States
James W. MacDonald46k wrote:

Well, you are sort of doing random things here. First off, you don't need to generate your own cdfenv - you can just get that from MBNI. We used to provide a way to do this via biocLite(), but I guess that went away.

> download.file("http://mbni.org/customcdf/20.0.0/tairg.download/ath1121501attairgcdf_20.0.0.tar.gz", "ath1121501attairgcdf_20.0.0.tar.gz")
trying URL 'http://mbni.org/customcdf/20.0.0/tairg.download/ath1121501attairgcdf_20.0.0.tar.gz'
Content type 'application/x-gzip' length 1573285 bytes (1.5 MB)
==================================================
downloaded 1.5 MB

> install.packages("ath1121501attairgcdf_20.0.0.tar.gz", repos=NULL, type="source")
* installing *source* package  ath1121501attairgcdf  ...
** R
** data
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (ath1121501attairgcdf)

then you can do

abatch <- ReadAffy(cdfname = "ath1121501attairgcdf")

eset <- justvsn(abatch)

And if you really think you should export the data you can do

write.exprs(eset, "vsn1.txt")

But I would instead recommend you continue with your analysis inside of R rather than whatever you were planning to do with that file.
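
If the file really is needed for an external tool, a rough sketch of the export step could look like the following. It runs on a toy matrix standing in for `exprs(eset)`, and it assumes that probeset names from the TAIRG custom CDF are AGI codes with a trailing `_at` (e.g. `AT1G01010_at`) plus the usual `AFFX` controls; please check `featureNames(eset)` on the real data before relying on that. The names `expr`, `agi_expr`, and `out_file` are only for illustration.

```r
# Hypothetical stand-in for exprs(eset) after normalization with a TAIRG CDF.
expr <- matrix(
  c(8.1, 6.4, 10.2,   # first sample
    8.3, 6.0, 10.5),  # second sample
  ncol = 2,
  dimnames = list(c("AT1G01010_at", "AT1G01020_at", "AFFX-BioB-3_at"),
                  c("primed", "unprimed"))
)

# Drop Affymetrix control probesets, then strip the trailing "_at"
# so that the row names are plain AGI codes.
is_control <- grepl("^AFFX", rownames(expr))
agi_expr <- expr[!is_control, , drop = FALSE]
rownames(agi_expr) <- sub("_at$", "", rownames(agi_expr))

# Write a tab-delimited file; col.names = NA adds an empty header cell
# above the row names so the columns line up when the file is read back.
out_file <- tempfile(fileext = ".txt")
write.table(agi_expr, file = out_file, sep = "\t",
            quote = FALSE, col.names = NA)

# Read the file back to confirm the layout.
round_trip <- read.delim(out_file, row.names = 1, check.names = FALSE)
print(round_trip)
```

With a real ExpressionSet the first step would be `expr <- exprs(eset)`; the rest should carry over unchanged.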


Thank you.

I insist on writing the data to a text file because I need the normalized data as input for another tool for GRN inference. Anyway, I did the following:

> setwd("/usr/data/nfs6/izadi/Fereshteh thesis2/Data/Microarray/CEL files")

> download.file("http://mbni.org/customcdf/20.0.0/tairg.download/ath1121501attairgcdf_20.0.0.tar.gz", "ath1121501attairgcdf_20.0.0.tar.gz")
trying URL 'http://mbni.org/customcdf/20.0.0/tairg.download/ath1121501attairgcdf_20.0.0.tar.gz'
Content type 'application/x-gzip' length 1573285 bytes (1.5 MB)
==================================================
downloaded 1.5 MB

> install.packages("ath1121501attairgcdf_20.0.0.tar.gz", repos=NULL, type="source")
Installing package into ‘/usr/people/home/izadi/R/x86_64-redhat-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
* installing *source* package ‘ath1121501attairgcdf’ ...
** R
** data
** inst
** preparing package for lazy loading
Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
** help
*** installing help indices
  converting help for package ‘ath1121501attairgcdf’
    finding HTML links ... done
    ath1121501attairgcdf                    html  
    ath1121501attairgdim                    html  
    geometry                                html  
** building package indices
** testing if installed package can be loaded
Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
* DONE (ath1121501attairgcdf)



> library(affy)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

    xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames, do.call, duplicated, eval, evalq, Filter, Find, get,
    intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table, tapply, union, unique, unlist, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")',
    and for packages 'citation("pkgname")'.

> library(vsn)


> celFiles <- list.celfiles()
> abatch <- ReadAffy(filenames = celFiles, cdfname = "ath1121501attaircdf")
> eset <- justvsn(abatch)
vsn2: 506944 x 164 matrix (1 stratum). Please use 'meanSdPlot' to verify the fit.
> boxplot(eset,col="red")
Error in getCdfInfo(object) : 
  Could not obtain CDF environment, problems encountered:
Specified environment does not contain ath1121501attaircdf
Library - package ath1121501attaircdf not installed
Bioconductor - ath1121501attaircdf not available

> write.table(eset, file = "eset.txt", dec = ".", sep = "\t", quote = FALSE)
Error in as.data.frame.default(x[[i]], optional = TRUE) : 
  cannot coerce class "structure("AffyBatch", package = "affy")" to a data.frame

I only need a VSN-normalized file whose row names are AGI codes, not _at probeset IDs.
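
Reading the two logs side by side, the first error looks like a naming mismatch: the package that was installed is `ath1121501attairgcdf`, while the name passed to `ReadAffy()` is `ath1121501attaircdf` (no "g" before "cdf"). A small sketch of how one might check this before reading the CEL files is below. The helper `is_installed()` is a hypothetical convenience function, not part of affy. The second error suggests that `eset` is still an AffyBatch at that point, so it holds probe-level values that would need summarizing (I believe vsn's `vsnrma()` does both steps) before `exprs()` can be written out.

```r
# Names taken from the session above.
installed_pkg <- "ath1121501attairgcdf"  # reported by install.packages()
requested_cdf <- "ath1121501attaircdf"   # passed to ReadAffy(cdfname = ...)

# Hypothetical helper: TRUE if a package of that name can be found
# in the current library paths.
is_installed <- function(pkg) nzchar(system.file(package = pkg))

# The two names differ by one letter.
same_name <- identical(installed_pkg, requested_cdf)
print(same_name)

# Report whether each name resolves to an installed package.
print(c(installed = is_installed(installed_pkg),
        requested = is_installed(requested_cdf)))
```

If the second value is `FALSE` and the first is `TRUE`, changing the `cdfname` argument to the installed name should be enough to get past the CDF error.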

 

Reply written 2.6 years ago by Angel40