Matteo Carrara
Hello,
I have recently been learning how to perform a differential expression analysis of RNA-seq data using the DEXSeq package, and I encountered an unexpected behaviour in the function read.HTSeqCount: it fails to load the file produced by the Python script "dexseq_count.py", with the following error message:
Error in strsplit(rownames(dcounts), ":") : non-character argument
I would really appreciate any pointers that might help me correct my code or my input files.
Here is what I have done:
- downloaded the mm9 GTF gene set from www.ensembl.org and ran the script "dexseq_prepare_annotation.py"
- mapped my raw RNA-seq reads to the mm9 genome using tophat, converting the output to sorted SAM format
- ran the script "dexseq_count.py" using the "flattened" GTF and the SAM file obtained in the previous step
- loaded the dataset in R using the function read.HTSeqCount() as follows:
--------------------------------------
> library(DEXSeq)
> wt <- read.HTSeqCount("./wt_mapped.counts", "WT", flattenedfile="./flattened_mm9.gtf")
Error in strsplit(rownames(dcounts), ":") : non-character argument
--------------------------------------
As far as I understand, the "pasilla" package, used for the examples in the vignette, provides a counts file named "pasilla_gene_counts.tsv". Loading that file, however, results in the same error message.
All I could do was pinpoint the source of the error in the code of the function, although that did not help me find a solution or a workaround:
After creating the data frame "dcounts", which stores the counts, and setting its row names, the function subsets that same data frame:
dcounts <- dcounts[substr(rownames(dcounts), 1, 1) != "_", ]
This code, however, changes the object dcounts in such a way that the "rownames()" function returns NULL. The next statement is then bound to fail, since it requires rownames(dcounts) to be a character vector:
genesrle <- sapply(strsplit(rownames(dcounts), ":"), "[[", 1)
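For reference, the behaviour I describe can be reproduced outside DEXSeq with a minimal sketch. It assumes that dcounts is a data frame with a single count column (my guess for the one-file case, not something I have confirmed in the full source of the function); the row names and counts below are made up for illustration. Under that assumption, row-subsetting with `[rows, ]` drops the data frame to a plain vector, which has no row names, whereas adding `drop = FALSE` keeps it a data frame:

```r
# Hypothetical one-column count table; the IDs and values are invented.
dcounts <- data.frame(
  WT = c(10L, 0L, 5L),
  row.names = c("geneA:001", "geneA:002", "_ambiguous")
)

keep <- substr(rownames(dcounts), 1, 1) != "_"

# Same subsetting form as in the function: with a single column,
# the result is dropped to a plain integer vector.
dropped <- dcounts[keep, ]
dropped_rownames <- rownames(dropped)   # NULL for a plain vector

# strsplit() on NULL raises the error reported above; capture its message.
err_msg <- tryCatch(
  { strsplit(dropped_rownames, ":"); NA_character_ },
  error = function(e) conditionMessage(e)
)

# With drop = FALSE the result stays a data frame and keeps its row names.
kept <- dcounts[keep, , drop = FALSE]
genesrle <- sapply(strsplit(rownames(kept), ":"), "[[", 1)
```

If this assumption is right, it would explain why the error appears when only one count file is passed, but I cannot tell whether that is the intended usage of the function.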
I am running R 2.15.2 and DEXSeq 1.4.0 from Bioconductor version 2.11, but I was able to reproduce this on the devel version of R (2013-01-22 r61734) using DEXSeq_1.5.6 from Bioconductor version 2.12.
---------------------------------
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BiocInstaller_1.8.3 DEXSeq_1.4.0 Biobase_2.18.0
[4] BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] biomaRt_2.14.0 hwriter_1.3 RCurl_1.95-3 statmod_1.4.16 stringr_0.6.2
[6] tools_2.15.2 XML_3.95-0.1
--------------------------------
Thank you in advance for any help you can provide.
Best Regards,
--
Matteo Carrara
PhD Student in Complex Systems for Life Sciences
Department of Biotechnology and Health Science
MBC - Molecular Biotechnology Center
via Nizza, 52 Torino
ITALY