How to use DESeqDataSetFromHTSeqCount
biok0423 ▴ 20
@biok0423-23341
Last seen 4.1 years ago

I am analyzing FPKM RNA-seq data downloaded from the DGC database. I am trying to identify genes that are differentially expressed between samples using DESeqDataSetFromHTSeqCount, but I got this error:

Error in DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, : Gene IDs (first column) differ between files.

I don't know how to resolve it and would appreciate suggestions. Here are the commands I ran.

```r
setwd("~/Download/GC/FPKMs")
directory <- "~/Download/GC/FPKMs"
library(DESeq2)
sampletable <- data.frame(sampleName = samplesheet$path,
                          fileName = samplesheet$path,
                          condition = samplesheet$label)
sampletable

     sampleName      fileName condition
1 ECsample1.txt ECsample1.txt         1
2 ECsample2.txt ECsample2.txt         0
3 ECsample3.txt ECsample3.txt         1
 [ reached 'max' / getOption("max.print") -- omitted 249 rows ]

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampletable,
                                       directory = directory,
                                       design = ~ condition)

Error in DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, : Gene IDs (first column) differ between files.
```

The content of each sample file looks like this:

    ENSG00000242268.2   0.0
    ENSG00000270112.3   0.00258202876781
    ENSG00000167578.15  3.30893315419
    ENSG00000273842.1   0.0
    ENSG00000078237.5   8.05933601781
    ENSG00000146083.10  15.0446810186
    ENSG00000225275.4   0.0
    ENSG00000158486.12  0.221675972087

Could you advise what kind of data and format I need? Thank you!

deseq2 • 2.8k views
swbarnes2 ★ 1.3k
@swbarnes2-14086
Last seen 1 day ago
San Diego

I am analyzing FPKM RNA-seq data

This never ends well. FPKM is unsuitable for DESeq. It wants raw counts only. If you lie and pretend that's what you have when it's not, you can't trust your results.
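A quick way to see the difference is to test whether the values are non-negative whole numbers, which is what raw counts look like. The sketch below is only an illustration in base R: `looks_like_raw_counts` is a hypothetical helper, not a DESeq2 function, and the `counts` vector is made up for contrast. The `fpkm` values are copied from the sample file in the question.

```r
# Hypothetical helper (not part of DESeq2): TRUE only if every value is a
# non-negative whole number, which is what raw read counts look like.
looks_like_raw_counts <- function(x) {
  is.numeric(x) && !anyNA(x) && all(x >= 0) && all(x == round(x))
}

# FPKM values copied from the sample file shown in the question
fpkm <- c(0.0, 0.00258202876781, 3.30893315419, 0.0, 8.05933601781)

# Made-up integer counts, for contrast only
counts <- c(0, 3, 152, 0, 871)

looks_like_raw_counts(fpkm)    # FALSE: fractional values, so not raw counts
looks_like_raw_counts(counts)  # TRUE
```

If the database also offers per-sample raw count files alongside the FPKM files, those are probably the ones to download instead.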


Thank you for the answer. I understand that raw counts should be provided to DESeq. But I still wonder whether that is really the reason for the error.

What does DESeq2 use as the reference for gene IDs? In my case, I have a GFF file that contains the gene IDs and their transcripts. I also have a CSV file that combines the read data of all samples with gene IDs, in addition to the per-sample read txt files.
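Going by the error message, the check is on the first column of each sample file, which has to match across all files; as far as I know, no GFF is consulted. A minimal base R sketch for finding the offending files follows. It assumes headerless, whitespace-delimited two-column files like the one in the question. `find_mismatched_ids` is a hypothetical helper, and the temporary files and their values are made up for the demonstration.

```r
# Hypothetical helper: report which files have a first column that differs
# from the first file's (different IDs, different order, or different length).
find_mismatched_ids <- function(files) {
  ids <- lapply(files, function(f) {
    read.table(f, header = FALSE, stringsAsFactors = FALSE)[[1]]
  })
  reference <- ids[[1]]
  same <- vapply(ids, function(x) identical(x, reference), logical(1))
  basename(files[!same])
}

# Toy demonstration with temporary files (made-up values)
tmp <- tempfile("ids_demo_")
dir.create(tmp)
writeLines(c("ENSG00000242268.2\t0", "ENSG00000270112.3\t5"),
           file.path(tmp, "a.txt"))
writeLines(c("ENSG00000242268.2\t1", "ENSG00000270112.3\t7"),
           file.path(tmp, "b.txt"))
# Same IDs as a.txt but in a different order, so it counts as a mismatch
writeLines(c("ENSG00000270112.3\t2", "ENSG00000242268.2\t4"),
           file.path(tmp, "c.txt"))

files <- file.path(tmp, c("a.txt", "b.txt", "c.txt"))
mismatched <- find_mismatched_ids(files)
mismatched  # "c.txt"
```

On the real data, `files` would be something like `file.path(directory, sampletable$fileName)`. If the combined CSV holds raw counts rather than FPKM, `DESeqDataSetFromMatrix` may be a simpler entry point than the per-file route.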
