Question

error: NA counts not permitted

0

Entering edit mode

clarisbaby • 0

@clarisbaby-14317

Last seen 6.0 years ago

hey everyone,
I am trying to run the following code for C. elegans paired end data using the following code:


library("biomartr", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
library(BSgenome.Celegans.UCSC.ce11)
export(Celegans, "dataset/ce11.fasta", compress=FALSE)
getGFF(db = "ensembl", organism = "Caenorhabditis elegans", reference = TRUE, path = file.path("dataset"))
system("gunzip dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz")
library(systemPipeR) 
args <- systemArgs(sysma="./param/tophat.param", mytargets="targetsPE.txt") 
system("bowtie2-build ./dataset/ce11.fasta ./dataset/ce11.fasta")
bampaths <- runCommandline(args=args)
read_statsDF <- alignStats(args)
write.table(read_statsDF, "results/alignStats.xls", row.names=FALSE, quote=FALSE, sep="\t")
txdb <- makeTxDbFromGFF(file="dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz", format="gff")
saveDb(txdb, file="dataset/database.sqlite")
txdb <- loadDb("./dataset/database.sqlite")

eByg <- exonsBy(txdb, by=c("tx"))
countDFeByg <- assays(counteByg)$counts
countDFeByg <- as.data.frame(countDFeByg)
countDFeByg[is.na(countDFeByg)] <- 0
write.table(countDFeByg, "results/countDFeByg.xls", col.names=NA, quote=FALSE, sep="\t")
countDF <- read.table("results/countDFeByg.xls")
cmp <- readComp(args, format="matrix", delim="-")
edgeDF <- run_edgeR(countDF=countDF, targets=targetsin(args), cmp=cmp[[1]], independent=FALSE, mdsplot="")
write.table(edgeDF, "results/edgeRcomp.xls", quote=FALSE, sep="\t", col.names = NA)
edgeDF <- read.table("./results/edgeRcomp.xls")
DEG_list <- filterDEGs(degDF=edgeDF, filter, plot = FALSE)

while running the following code i am facing the error as below
     Error in calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,  : 
     NA counts not permitted
     Called from: calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,
     method = method, refColumn = refColumn, logratioTrim = logratioTrim, sumTrim = sumTrim, 
     doWeighting = doWeighting, Acutoff = Acutoff, p = p)

I have already tried doing

countDFeByg[is.na(countDFeByg)] <- 0

which is said to remove the NA.

It would be very kind if someone may help me to solve the error or point out where I am going wrong

Thanks in advance.

Claris

systempiper edger rna-seq • 1.7k views

ADD COMMENT • link updated 6.0 years ago by Aaron Lun ★ 28k • written 6.0 years ago by clarisbaby • 0

score 0 · Answer 1 · 2018-04-05

0

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 39 minutes ago

The city by the bay

Obviously, whatever matrix is being passed to calcNormFactors has NA counts. Removing NAs in countDFeByg won't help if the NAs are being introduced afterwards.

Are the NAs present in countDF? What does anyis.na(countDF)) say?
Do any of the libraries have zero counts? This will result in NAs if run_edgeR() computes CPMs.

Otherwise, it's something happening inside run_edgeR() (I assume this is where the error is being raised), and you'll have to do debug(run_edgeR) to see how the NAs are being introduced.

ADD COMMENT • link 6.0 years ago Aaron Lun ★ 28k

0

Entering edit mode

anyNA(countDF) says false...

countDFeByg has all the values as zero. When i started looking around i found that it may be due to BAM and GFF files that use different chromosome names.

how can i solve that..

my gff file looks as below

##gff-version 3
##sequence-region I 1 15072434
##sequence-region II 1 15279421
##sequence-region III 1 13783801
##sequence-region IV 1 17493829
##sequence-region MtDNA 1 13794
##sequence-region V 1 20924180
##sequence-region X 1 17718942
#!genome-build WormBase WBcel235

where as the alignment file looks as

@HD VN:1.0 SO:coordinate

@SQ SN:chrI LN:15072434

@SQ SN:chrII LN:15279421

@SQ SN:chrIII LN:13783801

@SQ SN:chrIV LN:17493829

@SQ SN:chrM LN:13794

@SQ SN:chrV LN:20924180

@SQ SN:chrX LN:17718942

@PG ID:TopHat VN:2.1.0

Thanking you in advance.

ADD REPLY • link 6.0 years ago clarisbaby • 0

0

Entering edit mode

Well, obviously, that's going to be a problem. The solution is simple. Fix your GFF file so that the chromosome names have chr in front of them (or in the case of MtDNA, replaced with chrM). rtracklayer has a few commands for importing GFF files; you should be able to edit the sequence names, and then export it back to another file.

ADD REPLY • link 6.0 years ago Aaron Lun ★ 28k