Question

Find differentially expressed values

Safana • 0

Hello,

I have two files. In each, the first column holds the gene names and the remaining columns hold the expression values for each case; in other words, rows are genes and columns are cases, which serve as replicates. I am trying to find the genes that are differentially expressed between the two files. The first file contains 300 replicates and the second contains more than 400. I read the "SAGE profiles of normal and tumor tissue" section of the user guide and the edgeR tutorial, but they seem to say that I must have only two columns. Is there any way to work with the whole files instead of reducing them to two columns?

Thank you,

Safana

differential gene expression differential expression edgeR • 735 views

updated 8.9 years ago by Gordon Smyth 50k • written 8.9 years ago by Safana • 0

score 0 · Answer 1 · 2015-05-13

There are several examples in the edgeR User's Guide of reading the counts in from one big file. I've never known anyone before to have different groups in different files, but putting them together is pretty easy.

Do both files contain read counts? Do both files have the same number of rows, with the same gene names in the same order? Were both files produced by the same read alignment and read counting pipeline? If no, then where are the files actually from and are you sure that they are comparable? If yes, then you can simply proceed like this:

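library(edgeR)
# Read each table; column 1 holds gene names, the rest are samples
y1 <- read.delim("file1.txt")
y2 <- read.delim("file2.txt")
# Combine the count columns, dropping the gene-name column from each
counts <- cbind(y1[,-1],y2[,-1])
genes <- y1[,1,drop=FALSE]
dge <- DGEList(counts=counts,genes=genes)
# Samples per file, excluding the gene-name column
n1 <- ncol(y1)-1
n2 <- ncol(y2)-1
dge$samples$group <- rep(c(1,2), c(n1,n2))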

Basically, reading data relies more on the read facilities of R itself than on functions in the edgeR package.
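
If you are not sure that both files list the same genes in the same order, it is worth checking before calling cbind. The following is only a minimal base-R sketch: the two small data frames and the column name "gene" are made up to stand in for the real files, and it assumes gene names are unique within each table.

```r
# Toy tables standing in for the two files (hypothetical values and names).
y1 <- data.frame(gene = c("A", "B", "C"),
                 s1 = c(10, 0, 5), s2 = c(12, 1, 7),
                 stringsAsFactors = FALSE)
y2 <- data.frame(gene = c("C", "A", "B"),
                 t1 = c(3, 20, 2), t2 = c(4, 18, 0), t3 = c(6, 25, 1),
                 stringsAsFactors = FALSE)

# Keep only genes present in both tables, in the order of the first table.
common <- intersect(y1$gene, y2$gene)
y1 <- y1[match(common, y1$gene), ]
y2 <- y2[match(common, y2$gene), ]

# Stop early if the rows still do not line up.
stopifnot(identical(y1$gene, y2$gene))

# Combine the count columns and label each sample with its file of origin.
counts <- as.matrix(cbind(y1[, -1], y2[, -1]))
rownames(counts) <- common
group <- factor(rep(c(1, 2), c(ncol(y1) - 1, ncol(y2) - 1)))
```

With the real data, y1 and y2 would come from read.delim as above, and counts and group could then be passed to DGEList.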