Question: Find differentially expressed values
4.6 years ago by
Safana0
United States
Safana0 wrote:

Hello,

I have 2 files. In each file, the first column holds the gene names and the other columns are the cases, with an expression value for each gene in each case. In other words, the rows are the genes and the columns are the cases, which represent the replicates. I am trying to find the genes that are differentially expressed between the two files. The first file contains 300 replicates and the second file contains more than 400 replicates. I read the "SAGE profiles of normal and tumor tissue" section of the user guide and the edgeR tutorial, but they said I have to have only 2 columns. I would like to ask if there is any way to work with the whole files instead of dividing them into 2 columns.

Thank you,

Safana

modified 4.6 years ago by Gordon Smyth39k • written 4.6 years ago by Safana0
4.6 years ago by
Gordon Smyth39k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth39k wrote:

There are several examples in the edgeR User's Guide of reading the counts in from one big file. I've never known anyone before to have different groups in different files, but putting them together is pretty easy.

Do both files contain read counts? Do both files have the same number of rows, with the same gene names in the same order? Were both files produced by the same read alignment and read counting pipeline? If no, then where are the files actually from and are you sure that they are comparable? If yes, then you can simply proceed like this:

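library(edgeR)
y1 <- read.delim("file1.txt")
y2 <- read.delim("file2.txt")
# Drop the gene-name column from each file and join the count columns
counts <- cbind(y1[,-1],y2[,-1])
genes <- y1[,1,drop=FALSE]
dge <- DGEList(counts=counts,genes=genes)
# Number of samples in each file, excluding the gene-name column
n1 <- ncol(y1)-1
n2 <- ncol(y2)-1
dge$samples$group <- factor(rep(c(1,2), c(n1,n2)))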

Basically, reading data relies more on the read facilities of R itself than on functions in the edgeR package.
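
The questions above about row counts, gene names and gene order can be checked in base R before the two tables are combined. The following is only a rough sketch under stated assumptions: the two small data frames are made-up stand-ins for the real files, and the column names (gene, s1, t1, ...) are hypothetical. It reorders the second table to match the first when the genes agree but the order differs, then builds the group labels.

```r
# Toy stand-ins for the two files (hypothetical data; first column = gene names)
y1 <- data.frame(gene = c("A", "B", "C"), s1 = c(10, 0, 5), s2 = c(12, 1, 7))
y2 <- data.frame(gene = c("C", "A", "B"), t1 = c(6, 11, 2),
                 t2 = c(4, 9, 0), t3 = c(8, 13, 1))

# Same number of rows, no repeated gene names, and the same set of genes?
same_genes <- nrow(y1) == nrow(y2) &&
  !anyDuplicated(y1[, 1]) && !anyDuplicated(y2[, 1]) &&
  setequal(y1[, 1], y2[, 1])

# Same genes in the same order?
same_order <- same_genes && all(y1[, 1] == y2[, 1])

# If the genes match but the order differs, reorder y2 to follow y1
if (same_genes && !same_order) {
  y2 <- y2[match(y1[, 1], y2[, 1]), ]
}

# Stop here if the gene columns still do not agree row by row
stopifnot(same_genes, all(y1[, 1] == y2[, 1]))

# Sample counts exclude the gene-name column
n1 <- ncol(y1) - 1
n2 <- ncol(y2) - 1
group <- factor(rep(c(1, 2), c(n1, n2)))
```

If the stopifnot() line fails on the real files, the gene lists differ and the files should not simply be bound together column by column.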