Hello all, I am currently having issues importing quantification data for DESeq2 using the tximport package
and its tximport() function. These are the commands I use to take my Salmon results through this process:
txdb <- makeTxDbFromGFF("GCF_000002285.3_CanFam3.1_genomic.gff", format = "gff")
saveDb(x=txdb, file = "gencode.v28.annotation.TxDb")
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- select(txdb, k, "GENEID", "TXNAME")
dim(tx2gene)
length(k)
write.table(tx2gene, "tx2gene.gencode.v28.csv", sep = "\t", row.names = FALSE)
files <- file.path(dir,"salmon_quant", samples$sample, "quant.sf")
names(files) <- samples$sample
tx2gene <- read_csv(file.path(dir, "tx2gene.gencode.v28.csv"))
txi <- tximport(files, type="salmon", tx2gene=tx2gene)
And the error I am getting is:
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13
Error in attr(x, "names") <- as.character(value) :
'names' attribute [2] must be the same length as the vector [1]
Any input would be fantastic! I have not been able to find any remedies as of yet. Thank you!
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
Matrix products: default
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.24.0 SummarizedExperiment_1.14.1 DelayedArray_0.10.0 BiocParallel_1.18.0
[5] matrixStats_0.54.0 GenomicFeatures_1.36.4 AnnotationDbi_1.46.0 Biobase_2.44.0
[9] rtracklayer_1.44.2 GenomicRanges_1.36.0 GenomeInfoDb_1.20.0 IRanges_2.18.1
[13] S4Vectors_0.22.0 BiocGenerics_0.30.0 tximportData_1.12.0 readr_1.3.1
[17] tximport_1.12.3 DBI_1.0.0 RSQLite_2.1.2
Thank you for the reply! The names(files) is the name of each of my samples and looks like:
Thanks!
Hmm, I'm not sure what's going on here. You get the error only after the files have been read in.
Can you give the code for how you run Salmon? I'm wondering if it's happening because of importing inferential replicates?
Sure thing! The Salmon script I used was:
I'm still not sure. What happens if you try to import a subset of the files? E.g. the first 3 or the last 3?
Hmm, I tried running with only the first 3 samples and get the same error.
I wonder if maybe my samples.txt file could have anything to do with it? My samples file looks like:
Thank you for taking the time to help me sort this out!
Could you email me some of the data so I can try? You can email to:
maintainer("tximport")
I took a shot and don't have an error:
Thank you so much Michael! For some reason it appears R did not like the way I was calling up my .sf files. I have tried pulling them in the way you had success with and it worked for me as well!
Thanks again for all of your time and help. Cheers!
Thanks! I've emailed the data to your Gmail address; please take a look when you get a chance. Best ~C
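For anyone who lands here with the same error, one thing worth checking (this is a guess, not something confirmed in the thread) is the shape of tx2gene. The commands in the question write the table with sep = "\t" but read it back with read_csv, which splits on commas. tximport expects a two-column table (transcript ID, then gene ID), and a length 2 versus length 1 mismatch would be consistent with a table that came back as a single column. The sketch below uses made-up IDs and base R readers in place of readr, just to show the effect:

```r
# Hypothetical two-row mapping standing in for the real tx2gene table.
tx2gene <- data.frame(
  TXNAME = c("tx1", "tx2"),
  GENEID = c("geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Written tab-separated, as in the question, despite the .csv extension.
path <- tempfile(fileext = ".csv")
write.table(tx2gene, path, sep = "\t", row.names = FALSE)

# A comma-based reader finds no commas, so each row becomes one field.
as_csv <- read.csv(path, stringsAsFactors = FALSE)
ncol(as_csv)

# A tab-based reader recovers both columns.
as_tsv <- read.delim(path, stringsAsFactors = FALSE)
ncol(as_tsv)
colnames(as_tsv)
```

If ncol(tx2gene) is 1 in your own session, reading the file with a tab-aware function (read.delim, or readr::read_tsv) or writing it with commas in the first place should give tximport the two columns it needs. Also note that the question writes the file to the working directory but reads it from file.path(dir, ...), so it is worth confirming that both point at the same file.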