undefined columns selected
@405f7487
Last seen 15 days ago
United States

Hi,

I am running a differential expression analysis with DESeq2. The experimental design is simple: I treated a cancer cell line with a drug and performed total RNA-seq on both control and drug-treated cells, with three replicates each. At the step "Convert sample variable mappings to an appropriate form that DESeq2 can read", I got an error: "Error in [.data.frame(sampleInfo, , keep) : undefined columns selected". I tried many of the answers I found, but it still fails. I really need help!

The detailed code is below:


              gene_id sample1 sample2 sample3 sample4 sample5 sample6
1     MSTRG.1|DDX11L1      12       8      13      20      28      39
2             MSTRG.2    8647    8341    8044   18022   17711   20429
3 MSTRG.3|MIR1302-2HG       0       0       0       0       0       0
4   MSTRG.3|MIR1302-2       0       0       0       2       0       0
5     MSTRG.4|FAM138A       0       0       0       0       0       0
6      MSTRG.5|OR4G4P       0       0       0       0       0       0
> geneID <- myData$gene_id
> sampleIndex <- grepl("sample\\d+",colnames(myData))
> myData <- as.matrix(myData[,sampleIndex])
> rownames(myData) <- geneID
> head(myData)
                    sample1  sample2  sample3  sample4  sample5  sample6
MSTRG.1|DDX11L1     " 12"    " 8"     " 13"    " 20"    " 28"    " 39"
MSTRG.2             " 8647"  " 8341"  " 8044"  " 18022" " 17711" " 20429"
MSTRG.3|MIR1302-2HG " 0"     " 0"     " 0"     " 0"     " 0"     " 0"
MSTRG.3|MIR1302-2   " 0"     " 0"     " 0"     " 2"     " 0"     " 0"
MSTRG.4|FAM138A     " 0"     " 0"     " 0"     " 0"     " 0"     " 0"
MSTRG.5|OR4G4P      " 0"     " 0"     " 0"     " 0"     " 0"     " 0"
> sampleInfo <- read.csv("PHENO_DATA.csv")
> head(sampleInfo)
  ids.........groups
1   sample1 control1
2   sample2 control2
3   sample3 control3
4  sample4 lycorine1
5  sample5 lycorine2
6  sample6 lycorine3
> rownames(sampleInfo) <- sampleInfo$ids
> keep <- c("ids", "groups")
> sampleInfo <- sampleInfo[,keep]
Error in [.data.frame(sampleInfo, , keep) : undefined columns selected

DESeq2 • 158 views
swbarnes2 ▴ 800
@swbarnes2-14086
Last seen 7 hours ago
San Diego

Clearly, all that stuff you did after importing the count data made things worse. Why can't you set the rownames as you import?

Hi, do you mean I should just leave PHENO_DATA.csv as it was? But then how do I delete the first column, that is, the numbers "1, 2, 3, 4, 5, 6"? I checked a lot of examples, and they set up the sampleData (sampleInfo in my case) as below:

ids     groups
sample1 control1
sample2 control2
sample3 control3
sample4 lycorine1
sample5 lycorine2
sample6 lycorine3

You just need to do something like:

myData <- read.csv('gene_count_matrix.csv', row.names = 1, header = TRUE)


Then myData should represent numerical data, and the other issues should vanish.
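As a rough illustration of what that import does, here is a minimal sketch that uses a small in-memory stand-in for gene_count_matrix.csv. The inline text, the assumption that the file is comma-delimited, and the names csv_text, all_numeric and countMatrix are hypothetical and only for demonstration; the three rows are copied from the output in the question.

```r
# Hypothetical stand-in for gene_count_matrix.csv (assumed comma-delimited);
# the rows are the first three genes shown in the question.
csv_text <- "gene_id,sample1,sample2,sample3,sample4,sample5,sample6
MSTRG.1|DDX11L1,12,8,13,20,28,39
MSTRG.2,8647,8341,8044,18022,17711,20429
MSTRG.3|MIR1302-2HG,0,0,0,0,0,0"

# row.names = 1 uses the first column (gene_id) as row names at import time,
# so no separate geneID / grepl() / rownames() steps are needed afterwards.
myData <- read.csv(text = csv_text, row.names = 1, header = TRUE)

# Check that every remaining column was parsed as a number.
all_numeric <- all(vapply(myData, is.numeric, logical(1)))
print(all_numeric)

# The data frame can then be converted to a matrix of counts.
countMatrix <- as.matrix(myData)
print(head(countMatrix))
```

With a real file, replace `text = csv_text` with the file path. If `all_numeric` comes back FALSE, the file itself probably contains something other than plain numbers in the count columns.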

By the way, your question is unrelated to DESeq2 and should probably have been asked on a more generic bioinformatics website.

Kevin Blighe, thank you for your help! Do you mean I got this error because I loaded my count data the wrong way? But the error "Error in [.data.frame(sampleInfo, , keep) : undefined columns selected" comes from my sampleInfo data.

Yes, my suggestion will help to solve the ultimate error that you receive. Please check the input and output of every command that you are running; however, please start by first using:

myData <- read.csv('gene_count_matrix.csv', row.names = 1, header = TRUE)


There are other general issues. For example, when you run this, head(myData), one can clearly see how your object, myData, is non-numeric - all numbers are wrapped in quotation marks and have leading whitespace - why is this? How was gene_count_matrix.csv produced? Please show your screen to the person who produced this file (gene_count_matrix.csv).
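One way to check for this kind of problem is sketched below. The data frame `bad` is a hypothetical example whose count columns were read as text; it is not the actual file, and the names charMatrix, col_classes and fixed are made up for illustration.

```r
# Hypothetical data frame in which the counts were read as character strings
# with leading whitespace, similar to what head(myData) shows in the question.
bad <- data.frame(
  sample1 = c("   12", " 8647"),
  sample2 = c("    8", " 8341"),
  row.names = c("MSTRG.1|DDX11L1", "MSTRG.2"),
  stringsAsFactors = FALSE
)

# as.matrix() on character columns gives a character matrix,
# which prints with quotation marks around every value.
charMatrix <- as.matrix(bad)
print(is.numeric(charMatrix))

# The class of each column shows which ones are not numeric.
col_classes <- vapply(bad, class, character(1))
print(col_classes)

# A possible workaround: coerce the matrix to numeric in place.
# as.double() tolerates the leading whitespace; dimnames are kept.
fixed <- charMatrix
storage.mode(fixed) <- "double"
print(fixed)
```

Coercing afterwards only treats the symptom. If the file can be regenerated or re-read so that the columns are numeric from the start, that is usually the cleaner fix.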

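Later, when you run sampleInfo <- read.csv("PHENO_DATA.csv"), you can see that it is not detecting the delimiter: head(sampleInfo) prints a single column named ids.........groups, so there are no columns called "ids" or "groups" to select. Please use, with read.csv(), the correct value for sep, which is usually sep = ',' or sep = '\t'.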
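To make the delimiter issue concrete, here is a small sketch. It assumes, purely for illustration, that PHENO_DATA.csv is tab-delimited; the inline text and the names pheno_text, wrong, right and err are hypothetical. If the real file uses another separator, adjust sep accordingly.

```r
# Hypothetical stand-in for PHENO_DATA.csv, assumed here to be tab-delimited.
pheno_text <- "ids\tgroups\nsample1\tcontrol1\nsample2\tcontrol2\nsample4\tlycorine1"

# With the default sep = ",", no commas are found, so each line
# is read as one single column.
wrong <- read.csv(text = pheno_text)
print(ncol(wrong))
print(colnames(wrong))

# Selecting columns that do not exist reproduces the error from the question.
keep <- c("ids", "groups")
err <- tryCatch(wrong[, keep], error = function(e) conditionMessage(e))
print(err)

# With the matching separator, both columns are detected.
right <- read.csv(text = pheno_text, sep = "\t")
rownames(right) <- right$ids
sampleInfo <- right[, keep]
print(sampleInfo)
```

Checking `colnames()` immediately after each import is a cheap way to catch this before any subsetting step.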