Question: How to add row data to DESeqDataSetFromMatrix
rattray56 wrote (4 weeks ago):

I have a raw-count RNA-seq Illumina data set and a metadata table, but when I try to build a dataset with DESeqDataSetFromMatrix, it tells me I have one extra column in my count data. I do: it is the gene names. How can I get around this? I have tried adding nrow= gene_ID and then just listing the genes, but I don't like the idea of separating the gene names from the count table. It just seems too risky. Any suggestions are really appreciated. Alison

You need to post the top few lines of your files for anyone to be able to help you. Have you looked at the file formats used in datasets from tutorials, to see how they differ from what you have?

written 29 days ago by swbarnes2
Answer: How to add row data to DESeqDataSetFromMatrix
Michael Love wrote (29 days ago):

When you read the file into R, you should specify that the first column contains the row names.

written 29 days ago by Michael Love
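
One way to follow the advice above, as a minimal sketch rather than the exact call the answer had in mind: read.table() accepts row.names = 1, which turns the first column into row names instead of keeping it as a data column. The inline text here is a hypothetical three-gene excerpt with made-up sample names (sampleA, sampleB), standing in for the real tab-separated count file.

```r
# Hypothetical excerpt standing in for a tab-separated count file;
# the sample names sampleA/sampleB are placeholders, not from the thread.
counts_text <- "geneid\tsampleA\tsampleB
ENSMUSG00000000001.4_Gnai3\t10634\t6954
ENSMUSG00000000003.15_Pbsn\t0\t0
ENSMUSG00000000028.14_Cdc45\t559\t1570"

# row.names = 1 tells read.table() to use the first column as row names,
# so the gene IDs stay attached to their counts but are not a count column.
cts <- read.table(text = counts_text, header = TRUE, sep = "\t", row.names = 1)

dim(cts)       # only the sample columns are counted now
rownames(cts)  # the gene IDs
```

With a file on disk, the same argument would be passed alongside the file name in place of text = counts_text.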

I am logged in and have tried to post my code twice now... is there some trick?

written 29 days ago by rattray56

I'm not sure how to do that. Help? Here is what I am attempting to do. I also removed column 1 from the cts data, but I don't think that is the real issue (though advice is appreciated!).

Step 1: import the counts and column data, delete unwanted columns.

Import raw count data:

cts <- read.table("RawCountFile_rsemgenes.txt", header = TRUE, sep = "\t")
dim(cts)

[1] 47643 26

head(cts)

                       geneid clone57RNA clone43RNA2 clone67_RNA
1  ENSMUSG00000000001.4_Gnai3      10634        6954        6835
2  ENSMUSG00000000003.15_Pbsn          0           0           0
3 ENSMUSG00000000028.14_Cdc45        559        1570         807
4   ENSMUSG00000000031.15_H19       5748         174        4103
5 ENSMUSG00000000037.16_Scml2         37         194          49
6  ENSMUSG00000000049.11_Apoh          0           3           1
  clone55RNA clone7RNA clone45RNA clone88RNA clone26RNA clone25RNA
1       6510     11463       7221       6256       7530       7268
2          0         0          0          0          0          0
3       1171      1089       1069        800       1088       1071
4        146     23529        435       1318      16302        101
5         96        52        147         45         97         84
6          0         0          0          0          0          0

(cut this off to save space)

Import column data:

coldat <- read.csv("brca2metadata.csv", header = TRUE, sep = ",")
head(coldat)

     clone_ID condition
1 clone57_RNA   control
2 clone43RNA2   treated
3 clone67_RNA   treated
4 clone55_RNA   treated
5  clone7_RNA   treated
6 clone45_RNA   treated

dim(coldat)

[1] 25 2

Notice that there is one more column in the cts data (presumably gene names, but let's find out):

colnames(cts)

 [1] "geneid"       "clone57RNA"   "clone43RNA2"  "clone67_RNA"
 [5] "clone55RNA"   "clone7RNA"    "clone45RNA"   "clone88RNA"
 [9] "clone26RNA"   "clone25RNA"   "clone11RNA"   "clone35RNA_2"
[13] "clone91RNA"   "clone83RNA"   "clone3RNA"    "clone53RNA"
[17] "clone6RNA"    "clone12RNA"   "clone69RNA"   "clone94RNA"
[21] "clone95RNA"   "clone70RNA"   "clone36RNA"   "clone29RNA"
[25] "clone58RNA"   "clone54RNA_2"

rownames(coldat)

[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14"

[15] "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25"

rownames(coldat[1:25, 1])

NULL

So how do I get it to use the actual clone names as the row names of coldat? It is clearly not counting them as a column... very frustrating! Until I can get those names to be the same, it will not be possible to construct a DESeq2 dataset! I can move the gene_ID names to a rownames column, but I cannot see how to remove the numbers on the coldata!

written 28 days ago by rattray56
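
The situation described in the post above can be sketched as follows. This is an illustration under stated assumptions, not the fix the poster eventually used (the thread does not say what that was). The two small tables are hypothetical stand-ins for cts and coldat, with placeholder gene and sample names. The idea is to move the ID column of each table into its row names, then check that the sample names agree before building the dataset. Note that rownames(coldat[1:25, 1]) returns NULL because selecting a single column yields a plain vector, which has no row names. The final step assumes the DESeq2 package is installed and is skipped otherwise.

```r
# Hypothetical miniature versions of the two tables from the post;
# gene and sample names are placeholders.
cts <- data.frame(
  geneid  = c("geneA", "geneB", "geneC"),
  sample1 = c(10L, 0L, 559L),
  sample2 = c(7L, 0L, 1570L),
  sample3 = c(6L, 0L, 807L)
)
coldat <- data.frame(
  clone_ID  = c("sample1", "sample2", "sample3"),
  condition = c("control", "treated", "treated")
)

# A single extracted column is a plain vector, so it carries no row names.
is.null(rownames(coldat[1:3, 1]))

# Move the gene IDs into the row names and drop the ID column,
# leaving a purely numeric count matrix.
rownames(cts) <- cts$geneid
cts$geneid <- NULL
cts <- as.matrix(cts)

# Replace the default "1", "2", ... row names with the sample IDs.
rownames(coldat) <- as.character(coldat$clone_ID)
coldat$condition <- factor(coldat$condition)

# The count columns and the metadata rows must name the same samples,
# in the same order.
stopifnot(all(rownames(coldat) %in% colnames(cts)))
cts <- cts[, rownames(coldat)]
stopifnot(identical(colnames(cts), rownames(coldat)))

# Build the dataset only if DESeq2 is available (assumption: installed).
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = cts,
    colData   = coldat,
    design    = ~ condition
  )
}
```

Keeping the gene IDs as row names addresses the worry in the original question: the names travel with the counts through subsetting and reordering, without being treated as a sample column.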

Thanks... I finally figured it out with some local help. Why was my question removed?

written 27 days ago by rattray56