htseq counts raw data preparation for DESeq2 input file
1
0
Entering edit mode
Pedram • 0
@Pedram-24588
Last seen 3.2 years ago

Hello.

I have the clinical data from the PAAD patients (n=182). However, when I am trying to do DE, I cannot prepare the data for dds command to kick off the DESeq2 commands.

I don't know what the problem is, but I get some errors while I looked for them on the internet and even went over the DESeq2 tutorial and other examples. It might be noteworthy to declare that I do not have the raw htseq count data and I tried to download it from TCGABiolinks package. I would be happy if I can know your worthy comments. I also attached my colData to this email. Here are the codes I run:

library(TCGAbiolinks) library(DESeq2)

query1 <- GDCquery(project = "TCGA-PAAD", data.category = "Transcriptome Profiling", data.type = "Gene Expression Quantification", file.type = "htseq.counts.gz", experimental.strategy = "RNA-Seq", legacy = F)

GDCdownload(query1, method = "api", files.per.chunk = 10)

maindata <- GDCprepare(query1)

tumorsamplesonly <- maindata[,maindata$sample_type_id == "01"]

coldata01 <- read.csv(file = "colData.csv", row.names = 1) class(coldata01)

row.names(coldata01) <- coldata01$bcr_patient_barcode

coldata01$tumor_status <- factor(coldata01$tumor_status, levels = c("with_tumor", "tumor_free", ordered = F))

(Now, in this level, I stuck in the following codes (codes are in order)

coldata01 <- coldata01[,!is.na(coldata01$tumor_status)]

(THE ERROR: Error in [.data.frame(coldata01, , !is.na(coldata01$tumor_status)) : undefined columns selected)

coldata01$tumor_status <- relevel(coldata01$tumor_status, ref = "low")

(THE ERROR: Error in relevel.factor(coldata01$tumor_status, ref = "low") : 'ref' must be an existing level)

colData(tumorsamplesonly) <- cbind(colData(tumorsamplesonly), coldata01)

(THE ERROR: Error in DataFrame(..., check.names = FALSE) : different row counts implied by arguments)

dds <- DESeqDataSet(maindata,coldata01, ~tumor_status)

Tip: I hope that the dollar sign between words in commands can be seen clearly. For example, in this command: coldata01$tumor_status

DESeq2 • 962 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 6 minutes ago
United States

The error is intended to be pretty clear. And it seems that's true?

You did

coldata01 <- coldata01[,!is.na(coldata01$tumor_status)]

and then you got

 Error in [.data.frame(coldata01, , !is.na(coldata01$tumor_status)) : undefined columns selected

Which says 'undefined columns selected'. Which is the same thing as saying that none of the columns are called 'tumor_status'.

ADD COMMENT
0
Entering edit mode

No, I have this column in my colData

ADD REPLY
0
Entering edit mode

Step through what you are doing:

DF <- DF[ , rows of DF where column X is not NA ]

Does this make any sense? You are subsetting columns based on rows. I don't see why you would want to do that.

Furthermore, go slow, and when you encounter an error, take a look at the individual pieces. You'll be able to figure things out on your own computer:

Before the line that gives an error, examine the pieces of the command:

coldata01$tumor_status
is.na(coldata01$tumor_status)
ADD REPLY
0
Entering edit mode

[1] tumor_free tumor_free with_tumor tumor_free with_tumor with_tumor tumor_free with_tumor [9] with_tumor with_tumor tumor_free with_tumor tumor_free with_tumor <NA> with_tumor [17] with_tumor with_tumor with_tumor with_tumor with_tumor with_tumor tumor_free tumor_free [25] with_tumor with_tumor <NA> with_tumor with_tumor tumor_free tumor_free tumor_free [33] tumor_free tumor_free tumor_free with_tumor tumor_free tumor_free with_tumor with_tumor [41] <NA> with_tumor with_tumor tumor_free with_tumor with_tumor with_tumor tumor_free [49] with_tumor with_tumor with_tumor tumor_free tumor_free with_tumor <NA> tumor_free [57] with_tumor <NA> <NA> with_tumor tumor_free tumor_free tumor_free with_tumor [65] with_tumor tumor_free with_tumor with_tumor <NA> with_tumor with_tumor with_tumor [73] with_tumor with_tumor with_tumor with_tumor with_tumor tumor_free <NA> <NA>
[81] with_tumor <NA> <NA> <NA> with_tumor with_tumor with_tumor with_tumor [89] tumor_free tumor_free with_tumor with_tumor with_tumor tumor_free with_tumor with_tumor [97] with_tumor with_tumor with_tumor tumor_free with_tumor tumor_free tumor_free with_tumor [105] tumor_free with_tumor with_tumor with_tumor with_tumor tumor_free tumor_free with_tumor [113] with_tumor with_tumor with_tumor with_tumor <NA> with_tumor tumor_free with_tumor [121] tumor_free tumor_free <NA> with_tumor with_tumor with_tumor with_tumor with_tumor [129] with_tumor with_tumor with_tumor with_tumor with_tumor with_tumor with_tumor tumor_free [137] with_tumor with_tumor with_tumor tumor_free <NA> with_tumor tumor_free with_tumor [145] with_tumor with_tumor tumor_free tumor_free tumor_free with_tumor with_tumor <NA>
[153] tumor_free with_tumor with_tumor <NA> with_tumor <NA> tumor_free tumor_free [161] with_tumor tumor_free with_tumor <NA> with_tumor tumor_free with_tumor with_tumor [169] with_tumor tumor_free <NA> with_tumor tumor_free with_tumor with_tumor tumor_free [177] <NA> tumor_free tumor_free Levels: with_tumor tumor_free FALSE

ADD REPLY
0
Entering edit mode

This is what I get from coldata01$tumor_status

ADD REPLY
0
Entering edit mode

I'm not necessarily asking to look over your code, so much as pointing out why what you did 1) didn't make sense and 2) lead to error.

You may want to work with someone with R experience to look over your code with you.

ADD REPLY
0
Entering edit mode

Thank you professor. Is there any forum to work with? because I put my questions on Biostar and stackoverflow and other related web forums and could not get the answer

ADD REPLY
0
Entering edit mode

I did give you an answer, but you have to work on it. I think I'm done replying to this thread now.

ADD REPLY

Login before adding your answer.

Traffic: 965 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6