cqn package: Error in if (any(lengths <= 0)) stop("argument 'lengths' need to be greater than zero") : missing value where TRUE/FALSE needed
Sally




I am new to R and to programming in general. I am currently working on mouse RNA-seq data. I quantified my dataset with Salmon, used tximport to create a count matrix, and then ran DESeq2. I am trying to run the cqn package but I keep getting errors. Here is my code:

library(tximport)
library(DESeq2)
txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE)
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = ~ condition)
dds <- DESeq(ddsTxi)

class: DESeqDataSet 
dim: 6 31 
metadata(1): version
assays(8): counts avgTxLength ... replaceCounts replaceCooks
rownames(6): ENSMUSG00000000001 ENSMUSG00000000003 ... ENSMUSG00000000037
rowData names(23): baseMean baseVar ... maxCooks replace
colnames: NULL
colData names(3): SampleID condition replaceable

dds2[is.na(dds2)] <- 0  # I first tried to replace the NAs in dds2, then fetched the GC content
countsdds2 <- counts(dds2)
GC_content <- getGeneLengthAndGCContent(rownames(countsdds2), "mm10", mode="org.db")
                   length        gc
ENSMUSG00000000001   3262 0.4421179
ENSMUSG00000000028   2252 0.5006543
ENSMUSG00000000031   2460 0.5560708
ENSMUSG00000000037   6397 0.4864495
ENSMUSG00000000049   1594 0.5017579
ENSMUSG00000000056   4806 0.4936730

mcols(dds2)$gc <-  GC_content[,2]
mcols(dds2)$len <-  GC_content[,1]
fit <- cqn(countsdds2, mcols(dds2)$gc, mcols(dds2)$len)

Error in if (any(lengths <= 0)) stop("argument 'lengths' need to be greater than zero") : 
  missing value where TRUE/FALSE needed

If I remove the NA values from gc and len, I get a different error, because lengths and x then no longer have as many entries as counts has rows.

If anyone can help me, I will be very grateful.

Thank you,

Normalization R cqn



If you have Ensembl IDs, you should use mode = "biomart" for getGeneLengthAndGCContent. The OrgDb packages are all natively based on NCBI Gene IDs, so you are (under the hood) mapping from Ensembl to NCBI Gene IDs, which is not one-to-one, and there are any number of genes that don't map at all. Genes that fail to map come back with NA for length and GC content, and any(lengths <= 0) then evaluates to NA, which is what triggers the "missing value where TRUE/FALSE needed" error in cqn. Using biomaRt, which queries an Ensembl-based database, avoids that mapping issue.
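
For what it's worth, here is a minimal sketch of that suggestion, not a definitive recipe. The helper names fetch_len_gc and align_for_cqn are hypothetical, the toy matrices are made up purely for illustration, and org = "mmu" (the three-letter organism code for mouse) is an assumption that should be checked against ?getGeneLengthAndGCContent. The point of the second helper is that any genes still lacking length or GC values get dropped from the counts and the covariates together, so the inputs to cqn stay the same size.

```r
# Length and GC content via Ensembl BioMart instead of an OrgDb package.
# Needs EDASeq, biomaRt and network access, so it is only defined here, not run.
# Assumption: org = "mmu" is the right organism code for mouse in biomart mode.
fetch_len_gc <- function(ensembl_ids) {
  EDASeq::getGeneLengthAndGCContent(ensembl_ids, org = "mmu", mode = "biomart")
}

# Hypothetical helper: keep only genes with usable covariates and apply the
# same row filter to the count matrix, so everything stays aligned for cqn().
align_for_cqn <- function(counts, len_gc) {
  # Match rows by gene ID rather than relying on row order
  len_gc <- len_gc[match(rownames(counts), rownames(len_gc)), , drop = FALSE]
  keep <- !is.na(len_gc[, "length"]) & !is.na(len_gc[, "gc"]) &
    len_gc[, "length"] > 0
  list(counts = counts[keep, , drop = FALSE],
       gc     = len_gc[keep, "gc"],
       len    = len_gc[keep, "length"])
}

# Toy data (invented): gene "g2" has no annotation
toy_counts <- matrix(c(10L, 0L, 5L, 20L, 3L, 7L), nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
toy_len_gc <- cbind(length = c(3262, NA, 2460), gc = c(0.44, NA, 0.56))
rownames(toy_len_gc) <- c("g1", "g2", "g3")

aligned <- align_for_cqn(toy_counts, toy_len_gc)
# Counts and covariates now cover the same genes
stopifnot(nrow(aligned$counts) == length(aligned$gc),
          nrow(aligned$counts) == length(aligned$len))

# With the real data from the question (not run here):
# len_gc <- fetch_len_gc(rownames(countsdds2))
# a      <- align_for_cqn(countsdds2, len_gc)
# fit    <- cqn(a$counts, x = a$gc, lengths = a$len)
```

Filtering with one logical index applied to all three objects is what avoids the "not the same number of rows" error mentioned in the question, since removing NAs from gc and len alone leaves counts with extra rows.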

Thank you! It actually worked out!!


