tximport Error in attr(x, "names") <- as.character(value) : 'names' attribute [2] must be the same length as the vector [1]
ScafioRuo • 0
@caranlove-21533

Hello all, I am having trouble importing quantification data for DESeq2 with the tximport() function from the tximport package. These are the commands I use to take my Salmon quantifications through this process:

txdb <- makeTxDbFromGFF("GCF_000002285.3_CanFam3.1_genomic.gff", format = "gff")

saveDb(x=txdb, file = "gencode.v28.annotation.TxDb")
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- select(txdb, k, "GENEID", "TXNAME")
dim(tx2gene)
length(k)
write.table(tx2gene, "tx2gene.gencode.v28.csv", sep = "\t", row.names = FALSE)

files <- file.path(dir,"salmon_quant", samples$sample, "quant.sf")
names(files) <- samples$sample
tx2gene <- read_csv(file.path(dir, "tx2gene.gencode.v28.csv"))

txi <- tximport(files, type="salmon", tx2gene=tx2gene)

And the error I am getting is:

reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 
Error in attr(x, "names") <- as.character(value) : 
  'names' attribute [2] must be the same length as the vector [1]

Any input would be fantastic! I have not been able to find any remedies as of yet. Thank you!

R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.24.0               SummarizedExperiment_1.14.1 DelayedArray_0.10.0         BiocParallel_1.18.0        
 [5] matrixStats_0.54.0          GenomicFeatures_1.36.4      AnnotationDbi_1.46.0        Biobase_2.44.0             
 [9] rtracklayer_1.44.2          GenomicRanges_1.36.0        GenomeInfoDb_1.20.0         IRanges_2.18.1             
[13] S4Vectors_0.22.0            BiocGenerics_0.30.0         tximportData_1.12.0         readr_1.3.1                
[17] tximport_1.12.3             DBI_1.0.0                   RSQLite_2.1.2              

software error RNAseq tximport R • 2.2k views
@mikelove

Can you show what names(files) looks like? It seems like this is causing a problem.
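
For anyone following along, a minimal sketch of the kind of check being asked for here. The sample IDs and the directory are placeholders rather than the poster's actual data, so the existence check is expected to fail on these toy paths:

```r
# Hypothetical sample sheet and base directory, for illustration only.
samples <- data.frame(sample = c("G1", "G2", "G3"), stringsAsFactors = FALSE)
dir <- tempdir()

# Build one quant.sf path per sample and name each path by its sample ID.
files <- file.path(dir, "salmon_quant", samples$sample, "quant.sf")
names(files) <- samples$sample

names(files)                       # should list one name per sample
length(files) == nrow(samples)     # one path per sample
anyDuplicated(names(files)) == 0   # sample names should be unique
all(file.exists(files))            # FALSE here: the toy paths do not exist
```

With real data, the last check should return TRUE before the paths are handed to tximport().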


Thank you for the reply! names(files) holds the name of each of my samples and looks like:

[1] "G1"   "G2"   "G3"   "G4"   "G5"   "G6"   "G7"   "G110" "G111" "G117" "G118" "G119" "G130"

Thanks!


Hmm, I'm not sure what's going on here. You get the error only after the files have been read in.

Can you give the code for how you run Salmon? I'm wondering if it's happening because of importing inferential replicates?


Sure thing! The Salmon script I used was:

salmon index -t GCF_000002285.3_CanFam3.1_rna.fa -i GW_salmon_index2

salmon quant -i /scratch/clove/canids/Reference/Salmon/GW_salmon_index2 -l A \
  -1 G1_L001_R1.fastq G1_L002_R1.fastq G1_L003_R1.fastq G1_L004_R1.fastq \
  -2 G1_L001_R2.fastq G1_L002_R2.fastq G1_L003_R2.fastq G1_L004_R2.fastq \
  -p 8 --validateMappings -o quants/G1_quant

I'm still not sure. What happens if you try to import a subset of the files? E.g. the first 3 or the last 3?
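
A small sketch of how such subsets could be taken from a named vector of paths. The sample names are the ones posted earlier in the thread; the relative paths are placeholders, not the poster's real directory layout:

```r
# Sample names as posted above; the paths are hypothetical.
ids <- c("G1", "G2", "G3", "G4", "G5", "G6", "G7",
         "G110", "G111", "G117", "G118", "G119", "G130")
files <- setNames(file.path("salmon_quant", ids, "quant.sf"), ids)

# head() and tail() keep the names attached to each path.
first3 <- head(files, 3)
last3  <- tail(files, 3)

names(first3)
names(last3)
```

Either subset can then be passed to tximport() in place of the full vector, which helps show whether the error depends on particular samples.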


Hmm, I tried running with only the first 3 samples and got the same error.

> txi <- tximport(files, type="salmon", tx2gene=tx2gene)
reading in files with read_tsv
1 2 3 
Error in attr(x, "names") <- as.character(value) : 
  'names' attribute [2] must be the same length as the vector [1]

I wonder if maybe my samples.txt file could have anything to do with it? My samples file looks like:

  sample pop center  run
1     G1 CEZ  UNIGE L001
2     G2 CEZ  UNIGE L002
3     G3 CEZ  UNIGE L003

Thank you for taking the time to help me sort this out!


Could you email me some of the data so I can try? You can email to:

maintainer("tximport")


I gave it a try and don't get an error:

txdb <- makeTxDbFromGFF("GCF_000002285.3_CanFam3.1_genomic.gff.gz", 
                        format = "gff")
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- select(txdb, k, "GENEID", "TXNAME")
files <- c("~/Downloads/G110_quant.sf","~/Downloads/G111_quant.sf")
txi <- tximport(files, type="salmon", tx2gene=tx2gene)
reading in files with read_tsv
1 2
summarizing abundance
summarizing counts
summarizing length
> head(txi$counts)
          [,1]  [,2]
A1BG     0.000 1.000
A1CF     2.000 2.000
A2ML1    0.000 1.000
A3GALT2  2.000 0.000
A4GALT  16.098 6.538
A4GNT    0.000 0.000

Thank you so much Michael! For some reason it appears R did not like the way I was reading in my .sf files. I tried importing them the way you did and it worked for me as well!

Thanks again for all of your time and help. Cheers!
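
One difference between the two runs may be worth checking. The original post writes tx2gene with sep = "\t" and reads it back with read_csv(), whereas the working run passes the data frame from select() straight to tximport(). The sketch below uses toy IDs, with base read.csv() standing in for readr::read_csv(), to show how a delimiter mismatch can leave a one-column table that cannot take two column names. Whether this was the actual cause in this thread is an assumption, not something the thread confirms.

```r
# Toy transcript-to-gene table; the IDs are made up for illustration.
tx2gene <- data.frame(TXNAME = c("tx1", "tx2", "tx3"),
                      GENEID = c("geneA", "geneA", "geneB"),
                      stringsAsFactors = FALSE)

# Write it tab-separated under a .csv name (quote = FALSE keeps the toy file simple).
path <- tempfile(fileext = ".csv")
write.table(tx2gene, path, sep = "\t", row.names = FALSE, quote = FALSE)

# A comma parser finds no commas, so each row collapses into a single column.
bad <- read.csv(path, stringsAsFactors = FALSE)
ncol(bad)

# A tab parser recovers both columns.
good <- read.delim(path, stringsAsFactors = FALSE)
ncol(good)

# Assigning two column names to the one-column table raises an error.
renamed <- tryCatch({ names(bad) <- c("tx", "gene"); TRUE },
                    error = function(e) FALSE)
renamed
```

Checking dim(tx2gene) or str(tx2gene) right before calling tximport() is a cheap way to rule this out.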


Thanks! I've emailed your Gmail address; please take a look when you get a chance. Best ~C
