"lexical error: invalid char in json text" when using tximport
alallo • 0
@alallo-21363

Hi, I quantified my RNA-seq data with salmon v0.14.1. When I tried to import the results with tximport, for use in DESeq2, I got this error:

> txi <- tximport(quant_files, type="salmon", tx2gene = tx2gene, ignoreTxVersion = FALSE)
reading in files with read_tsv
1 2 Error: lexical error: invalid char in json text.
                                      /Users/pcanion/Desktop/CD24_M
                     (right here) ------^

Here is the code I used to get to that point:

library(biomaRt)
library(tximport)

# Reuse the cached transcript-to-gene table if it exists, otherwise build it from Ensembl
if (file.exists("/Users/pcanion/Desktop/CD24_Maria/R/meta/tx2gene.csv")) {
    tx2gene <- read.delim("/Users/pcanion/Desktop/CD24_Maria/R/meta/tx2gene.csv", sep = ",")
} else {
    ensembl <- useEnsembl(biomart = "ensembl",
                          dataset = "hsapiens_gene_ensembl",
                          host = 'www.ensembl.org',
                          mirror = "uswest")

    t2g <- biomaRt::getBM(attributes = c('ensembl_gene_id',
                                         'ensembl_transcript_id',
                                         'external_gene_name'),
                          mart = ensembl)
    t2g <- dplyr::rename(t2g,
                         trans_id = ensembl_transcript_id,
                         ens_gene = ensembl_gene_id,
                         ext_gene = external_gene_name)

    # tximport expects the transcript ID in the first column and the gene ID in the second
    tx2gene <- dplyr::select(t2g, trans_id, ens_gene)

    write.csv(tx2gene, file = "/Users/pcanion/Desktop/CD24_Maria/R/meta/tx2gene.csv", row.names = FALSE, quote = FALSE)
}
# locate salmon outputs
quant_files <- list.files("/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs", 
                          pattern="quant.sf", 
                          recursive = TRUE, full.names = TRUE)
dirs <- list.files("/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/")
names(quant_files) <- dirs
quant_files

# Load "transcript to gene" file
tx2gene <- read.delim("/Users/pcanion/Desktop/CD24_Maria/R/meta/tx2gene.csv", sep = ",")

# tximport
txi <- tximport(quant_files, type="salmon", tx2gene = tx2gene, ignoreTxVersion = TRUE)

My list of quantification results looks like this:

> quant_files
                                                                    CD24_01_346-15-1_S1.salmon 
       "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_01_346-15-1_S1.salmon/quant.sf" 
                                                                  CD24_02_402-6-1_sc_S2.salmon 
     "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_02_402-6-1_sc_S2.salmon/quant.sf" 
                                                                  CD24_03_402-5-2_sc_S3.salmon 
     "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_03_402-5-2_sc_S3.salmon/quant.sf" 
                                                                    CD24_04_346-15-5_S4.salmon 
       "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_04_346-15-5_S4.salmon/quant.sf" 
                                                              CD24_05_402-5-2_Br_Met_S5.salmon 
 "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_05_402-5-2_Br_Met_S5.salmon/quant.sf" 
                                                                 CD24_06_346-25-5_sc_S6.salmon 
    "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_06_346-25-5_sc_S6.salmon/quant.sf" 
                                                                    CD24_07_346-15-4_S7.salmon 
       "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_07_346-15-4_S7.salmon/quant.sf" 
                                                              CD24_08_402-6-1_Br_Met_S8.salmon 
 "/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_08_402-6-1_Br_Met_S8.salmon/quant.sf" 
                                                             CD24_09_346-25-5_Br_met_S9.salmon 
"/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs/CD24_09_346-25-5_Br_met_S9.salmon/quant.sf" 
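The names above come from a separate listing of the output directory, so they only line up with the files as long as that directory contains nothing except the sample folders. A minimal sketch of deriving each name from the path of its own quant.sf instead; `sample_names` is a hypothetical helper, not part of tximport:

```r
# Hypothetical helper: name each quant.sf after the directory that contains it,
# so names and files cannot drift out of order.
sample_names <- function(files) {
  basename(dirname(files))
}

# Possible usage with the objects from the post:
# quant_files <- list.files("/Users/pcanion/Desktop/CD24_Maria/Phoenix/outs",
#                           pattern = "^quant\\.sf$",   # anchored, so only exact matches
#                           recursive = TRUE, full.names = TRUE)
# names(quant_files) <- sample_names(quant_files)
```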

And the tx2gene dataframe looks like this:

> head(tx2gene,5)
         trans_id        ens_gene
1 ENST00000387314 ENSG00000210049
2 ENST00000389680 ENSG00000211459
3 ENST00000387342 ENSG00000210077
4 ENST00000387347 ENSG00000210082
5 ENST00000386347 ENSG00000209082

Would anyone be able to help me with this error? Thanks!

tximport salmon • 2.5k views
@mikelove

Can you see if this previous post helps fix the issue:

https://support.bioconductor.org/p/126276/


Thanks for the quick reply. I realised that, in the error generated by tximport, the numbers 1 2 before the word Error refer to the first and second files in my quant_files list. If I remove the second sample (CD24_02_402-6-1_sc_S2.salmon) from the analysis, tximport works. I checked salmon_e.log and found this error after the bootstrapping:

/var/spool/torque/mom_priv/jobs/461964.SC: line 22: 32545 Segmentation fault      (core dumped) salmon quant -i /scratch/wsspaces/alallo-CD24-0/02_transcriptome/human_index -l A -1 /mnt/lustre/scratch/wsspaces/alallo-CD24-0/01_merge_lanes/fastq_merged/${samp}_R1_001.fastq.gz -2 /mnt/lustre/scratch/wsspaces/alallo-CD24-0/01_merge_lanes/fastq_merged/${samp}_R2_001.fastq.gz -p 32 -o outs/${samp}.salmon --validateMappings --seqBias --gcBias --numBootstraps 100

Something went wrong during the analysis of this sample. I reran salmon for this sample, and that fixed the problem. tximport now works fine.
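For anyone hitting the same error, one way to pin down the offending sample without reading the progress counter is to import the files one at a time and collect the error messages. A rough sketch; `find_failing` is a hypothetical helper written for illustration, and the importer is passed in as a function so the helper itself depends only on base R:

```r
# Hypothetical helper: call `importer` on each file separately and return
# the error message for every file that fails (named by sample).
find_failing <- function(files, importer) {
  errors <- vapply(files, function(f) {
    tryCatch({
      importer(f)
      NA_character_                      # no error for this file
    }, error = function(e) conditionMessage(e))
  }, character(1))
  errors[!is.na(errors)]
}

# Possible usage with the objects from the post (requires tximport):
# find_failing(quant_files, function(f)
#   tximport::tximport(f, type = "salmon", txOut = TRUE))
```

An empty result means every file imported on its own. Any names that come back are the samples worth checking in the salmon logs.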

Sorry, the answer was in the error itself. I should have noticed before! Thanks for your help!
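A crashed salmon job can leave an output directory that looks fine at a glance but is missing some of its files. A minimal sketch of a pre-flight check to run before tximport, assuming a finished salmon run writes quant.sf, cmd_info.json and aux_info/meta_info.json into each sample directory; `check_salmon_dirs` and the `expected` vector are illustrative names, and the file list may need adjusting for other salmon versions or options:

```r
# Files a finished salmon run is assumed to leave in each output directory.
expected <- c("quant.sf",
              "cmd_info.json",
              file.path("aux_info", "meta_info.json"))

# Hypothetical helper: for each sample directory, list the expected files
# that are missing or empty. Complete directories are dropped from the result.
check_salmon_dirs <- function(sample_dirs, expected_files = expected) {
  missing <- lapply(sample_dirs, function(d) {
    p <- file.path(d, expected_files)
    ok <- file.exists(p) & file.size(p) > 0   # missing files give FALSE & NA, i.e. FALSE
    expected_files[!ok]
  })
  names(missing) <- basename(sample_dirs)
  missing[lengths(missing) > 0]
}

# Possible usage with the objects from the post:
# check_salmon_dirs(dirname(quant_files))
```

An empty list suggests every directory has the files listed. It does not prove the contents are valid, so a sample that still fails is worth rerunning.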


Great, thanks for the follow-up.
