Question: Creating an Expression Set with a csv file
1
gravatar for Vani
4.3 years ago by
Vani20
United States
Vani20 wrote:

Hi,

Is it possible to create an ExpressionSet from (1) a text file containing the results of an RMA normalization and (2) a csv file containing the annotation for the specific Affy chip pertaining to the normalized data? If yes, please advise.

Thanks.

annotation eset • 5.4k views
ADD COMMENTlink modified 4.3 years ago by Diego Diez730 • written 4.3 years ago by Vani20
Answer: Creating an Expression Set with a csv file
4.3 years ago by
Diego Diez730
Japan
Diego Diez730 wrote:

It is possible to construct an ExpressionSet from separate files. Imagine you have three files: one with the expression data (e.g. from RMA, as you mentioned), one with probe annotations (like your csv file), and one with sample annotations. For example:

#===========================
# phenoData - sample annotations.
# pdata.txt (comma separated)
# 
# id,treatment
# sample1,1
# sample2,1
# sample3,0
# sample4,0

#===========================
# featureData - probe annotations.
# fdata.txt (comma separated)
# 
# id,symbol
# probe1,gene1
# probe2,gene2
# probe3,gene3
# probe4,gene4

#===========================
# expression data
# 
# exprs.txt (tab delimited)
# sample1 sample2 sample3 sample4
# probe1 10 9 11 8
# probe2 10 11 2 1
# probe3 2 3 12 10
# probe4 1 3 2 1

I assumed the annotation files to be comma separated and the expression data to be tab delimited. This does not matter, though; it only changes which R function you use to read each file. You can then do something like this:

library(Biobase)

# phenoData:
tmp <- read.csv("pdata.txt", row.names = 1)
pdata <- AnnotatedDataFrame(tmp)

# featureData:
tmp <- read.csv("fdata.txt", row.names = 1)
fdata <- AnnotatedDataFrame(tmp)

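# expression data (read.table infers the header and the row names here,
# because the header line has one fewer field than the data rows):
tmp <- read.table("exprs.txt")
m <- as.matrix(tmp)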

## create ExpressionSet object:
eset <- new("ExpressionSet", exprs = m, phenoData = pdata, featureData = fdata)

pData(eset)
fData(eset)
eset$treatment

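The only requirement (I think) is that the sample names and feature names agree between the different files: the row names of pdata must match the column names of the expression matrix, and the row names of fdata must match its row names.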
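A quick way to check that requirement before building the object is to compare the names directly. The sketch below uses only base R and inline copies of the toy files from above, with the sample annotation rows deliberately shuffled. The object names pdata_df and fdata_df are made up for illustration.

```r
# Toy copies of the three files; the pdata rows are deliberately out of order.
pdata_df <- read.csv(text = "id,treatment
sample3,0
sample1,1
sample4,0
sample2,1", row.names = 1)

fdata_df <- read.csv(text = "id,symbol
probe1,gene1
probe2,gene2
probe3,gene3
probe4,gene4", row.names = 1)

m <- as.matrix(read.table(text = "sample1\tsample2\tsample3\tsample4
probe1\t10\t9\t11\t8
probe2\t10\t11\t2\t1
probe3\t2\t3\t12\t10
probe4\t1\t3\t2\t1", header = TRUE, sep = "\t"))

# Keep the samples and probes present in every table, in the matrix's order.
samples  <- intersect(colnames(m), rownames(pdata_df))
probes   <- intersect(rownames(m), rownames(fdata_df))
m        <- m[probes, samples, drop = FALSE]
pdata_df <- pdata_df[samples, , drop = FALSE]
fdata_df <- fdata_df[probes, , drop = FALSE]

# Stop early if the names still disagree.
stopifnot(identical(colnames(m), rownames(pdata_df)),
          identical(rownames(m), rownames(fdata_df)))
```

The aligned pdata_df and fdata_df can then be wrapped in AnnotatedDataFrame() and passed to the ExpressionSet together with m, as in the code above.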

Answer: Creating an Expression Set with a csv file
4.3 years ago by
svlachavas740
Greece/Athens/National Hellenic Research Foundation
svlachavas740 wrote:

Dear Vani,

Suppose, for the text file, that the header row holds the sample names and each row is a probeset.

With the file in your current working directory, you can then use:

rma.file <- read.table("name of your file.txt", header=TRUE, sep="\t") # the last argument is not strictly necessary

eset <- new("ExpressionSet", exprs=as.matrix(rma.file))

In a similar way, you can use read.csv for the csv file. You can also check the documentation of these functions, including read.delim, for more detail.
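One thing to watch for, as a hedged aside: if the header row also has a label for the probe ID column, read.table keeps the IDs as an ordinary column, and as.matrix() then returns a character matrix. A minimal sketch, using a made-up temporary file and a made-up column name ID, that moves the IDs into the row names instead:

```r
# Hypothetical RMA output whose header also labels the probe ID column.
rma_path <- tempfile(fileext = ".txt")
writeLines(c("ID\tsample1\tsample2",
             "probe1\t10.1\t9.8",
             "probe2\t2.3\t3.1"), rma_path)

# row.names = 1 turns the first column into row names, so only numbers remain.
rma.file <- read.table(rma_path, header = TRUE, sep = "\t", row.names = 1)
m <- as.matrix(rma.file)

# Guard against a stray text column before building the ExpressionSet.
stopifnot(is.numeric(m))
```

If the check fails on a real file, str(rma.file) usually shows which column was read as text.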

 

 


Thank you.

So if I wanted to add the csv file as a parameter in the expression eset <- new("ExpressionSet", exprs=as.matrix(rma.file)), would I read in the csv file using read.csv and then add it like this: eset <- new("ExpressionSet", exprs=as.matrix(rma.file), annotation = csv.file)?

written 4.3 years ago by Vani20

Dear Vani,

Please excuse me, I misread the second part of your question. By the annotation of the specific Affy chip, do you mean the pheno data object, that is, the phenotype of your data? If so, you could use:

read.csv to read the csv file (it already returns a data.frame, so a separate as.data.frame call is not needed). Then, if the object is called, for instance, dat2:

phenoData(eset) <- new("AnnotatedDataFrame", data=dat2)

written 4.3 years ago by svlachavas740

The csv file contains the annotation of the HuGene-1_0-st-v1 Affymetrix chip. So basically it has all the info like Entrez ID, gene symbol, etc. Would it still be considered a pheno data object?

modified 4.3 years ago • written 4.3 years ago by Vani20

No. In my opinion there is no need to load it into R, as this is your annotation file, which you could use after your statistical analysis to annotate your results. By "pheno data" object I meant the description of your samples, i.e. disease, healthy, cancer, control, etc. Moreover, although I have never used this specific Affymetrix platform, you might find the package for the HuGene platform useful (http://bioconductor.org/packages/release/data/annotation/html/pd.hugene.1.0.st.v1.html).
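As a rough sketch of that last step, annotating a results table after the analysis can be a plain merge on the probe ID. Everything below is made up for illustration: the column names ProbeID, EntrezID and GeneSymbol, the IDs, and the result values. The real column names depend on your csv file.

```r
# Hypothetical chip annotation, as it might look after read.csv().
annot <- read.csv(text = "ProbeID,EntrezID,GeneSymbol
ps001,1001,GENE_A
ps002,1002,GENE_B
ps003,1003,GENE_C")

# Hypothetical analysis results, keyed by the same probe IDs.
results <- data.frame(ProbeID = c("ps003", "ps001", "ps009"),
                      logFC   = c(1.8, -0.7, 0.2))

# Left join: keep every result row and fill NA where no annotation exists.
annotated <- merge(results, annot, by = "ProbeID", all.x = TRUE)
```

Using all.x = TRUE keeps probes that have no entry in the annotation file, so nothing silently drops out of the results.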

written 4.3 years ago by svlachavas740

Cool. Thanks.

written 4.3 years ago by Vani20