Search
Question: error: NA counts not permitted
0
gravatar for clarisbaby
6 months ago by
clarisbaby0
clarisbaby0 wrote:
hey everyone,
I am trying to run the following code for C. elegans paired end data using the following code:


library("biomartr", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4")
library(BSgenome.Celegans.UCSC.ce11)
export(Celegans, "dataset/ce11.fasta", compress=FALSE)
getGFF(db = "ensembl", organism = "Caenorhabditis elegans", reference = TRUE, path = file.path("dataset"))
system("gunzip dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz")
library(systemPipeR) 
args <- systemArgs(sysma="./param/tophat.param", mytargets="targetsPE.txt") 
system("bowtie2-build ./dataset/ce11.fasta ./dataset/ce11.fasta")
bampaths <- runCommandline(args=args)
read_statsDF <- alignStats(args)
write.table(read_statsDF, "results/alignStats.xls", row.names=FALSE, quote=FALSE, sep="\t")
txdb <- makeTxDbFromGFF(file="dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz", format="gff")
saveDb(txdb, file="dataset/database.sqlite")
txdb <- loadDb("./dataset/database.sqlite")
eByg <- exonsBy(txdb, by=c("tx"))
countDFeByg <- assays(counteByg)$counts
countDFeByg <- as.data.frame(countDFeByg)
countDFeByg[is.na(countDFeByg)] <- 0
write.table(countDFeByg, "results/countDFeByg.xls", col.names=NA, quote=FALSE, sep="\t")
countDF <- read.table("results/countDFeByg.xls")
cmp <- readComp(args, format="matrix", delim="-")
edgeDF <- run_edgeR(countDF=countDF, targets=targetsin(args), cmp=cmp[[1]], independent=FALSE, mdsplot="")
write.table(edgeDF, "results/edgeRcomp.xls", quote=FALSE, sep="\t", col.names = NA)
edgeDF <- read.table("./results/edgeRcomp.xls")
DEG_list <- filterDEGs(degDF=edgeDF, filter, plot = FALSE)
while running the following code i am facing the error as below
     Error in calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,  : 
     NA counts not permitted
     Called from: calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size,
     method = method, refColumn = refColumn, logratioTrim = logratioTrim, sumTrim = sumTrim, 
     doWeighting = doWeighting, Acutoff = Acutoff, p = p)

I have already tried doing 

countDFeByg[is.na(countDFeByg)] <- 0

which is said to remove the NA.

It would be very kind if someone may help me to solve the error or point out where I am going wrong

Thanks in advance.

Claris

 

ADD COMMENTlink modified 6 months ago by Aaron Lun21k • written 6 months ago by clarisbaby0
0
gravatar for Aaron Lun
6 months ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

Obviously, whatever matrix is being passed to calcNormFactors has NA counts. Removing NAs in countDFeByg won't help if the NAs are being introduced afterwards.

  • Are the NAs present in countDF? What does anyis.na(countDF)) say?
  • Do any of the libraries have zero counts? This will result in NAs if run_edgeR() computes CPMs.

Otherwise, it's something happening inside run_edgeR() (I assume this is where the error is being raised), and you'll have to do debug(run_edgeR) to see how the NAs are being introduced.

ADD COMMENTlink modified 6 months ago • written 6 months ago by Aaron Lun21k

anyNA(countDF) says false...

countDFeByg has all the values as zero. When i started looking around i found that it may be due to BAM and GFF files that use different chromosome names.

how can i solve that..

my gff file looks as below

##gff-version 3
##sequence-region   I 1 15072434
##sequence-region   II 1 15279421
##sequence-region   III 1 13783801
##sequence-region   IV 1 17493829
##sequence-region   MtDNA 1 13794
##sequence-region   V 1 20924180
##sequence-region   X 1 17718942
#!genome-build WormBase WBcel235

where as the alignment file looks as

@HD VN:1.0 SO:coordinate

@SQ SN:chrI LN:15072434

@SQ SN:chrII LN:15279421

@SQ SN:chrIII LN:13783801

@SQ SN:chrIV LN:17493829

@SQ SN:chrM LN:13794

@SQ SN:chrV LN:20924180

@SQ SN:chrX LN:17718942

@PG ID:TopHat VN:2.1.0

 

Thanking you in advance.

ADD REPLYlink written 6 months ago by clarisbaby0

Well, obviously, that's going to be a problem. The solution is simple. Fix your GFF file so that the chromosome names have chr in front of them (or in the case of MtDNA, replaced with chrM). rtracklayer has a few commands for importing GFF files; you should be able to edit the sequence names, and then export it back to another file.

ADD REPLYlink written 6 months ago by Aaron Lun21k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 186 users visited in the last hour