Entering edit mode
clarisbaby
•
0
@clarisbaby-14317
Last seen 6.7 years ago
hey everyone, I am trying to run the following code for C. elegans paired end data using the following code: library("biomartr", lib.loc="~/R/x86_64-pc-linux-gnu-library/3.4") library(BSgenome.Celegans.UCSC.ce11) export(Celegans, "dataset/ce11.fasta", compress=FALSE) getGFF(db = "ensembl", organism = "Caenorhabditis elegans", reference = TRUE, path = file.path("dataset")) system("gunzip dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz") library(systemPipeR) args <- systemArgs(sysma="./param/tophat.param", mytargets="targetsPE.txt") system("bowtie2-build ./dataset/ce11.fasta ./dataset/ce11.fasta") bampaths <- runCommandline(args=args) read_statsDF <- alignStats(args) write.table(read_statsDF, "results/alignStats.xls", row.names=FALSE, quote=FALSE, sep="\t") txdb <- makeTxDbFromGFF(file="dataset/Caenorhabditis_elegans.WBcel235.91_ensembl.gff3.gz", format="gff") saveDb(txdb, file="dataset/database.sqlite") txdb <- loadDb("./dataset/database.sqlite")
eByg <- exonsBy(txdb, by=c("tx")) countDFeByg <- assays(counteByg)$counts countDFeByg <- as.data.frame(countDFeByg) countDFeByg[is.na(countDFeByg)] <- 0 write.table(countDFeByg, "results/countDFeByg.xls", col.names=NA, quote=FALSE, sep="\t") countDF <- read.table("results/countDFeByg.xls") cmp <- readComp(args, format="matrix", delim="-") edgeDF <- run_edgeR(countDF=countDF, targets=targetsin(args), cmp=cmp[[1]], independent=FALSE, mdsplot="") write.table(edgeDF, "results/edgeRcomp.xls", quote=FALSE, sep="\t", col.names = NA) edgeDF <- read.table("./results/edgeRcomp.xls") DEG_list <- filterDEGs(degDF=edgeDF, filter, plot = FALSE)
while running the following code i am facing the error as below Error in calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size, : NA counts not permitted Called from: calcNormFactors.default(object = object$counts, lib.size = object$samples$lib.size, method = method, refColumn = refColumn, logratioTrim = logratioTrim, sumTrim = sumTrim, doWeighting = doWeighting, Acutoff = Acutoff, p = p)
I have already tried doing
countDFeByg[is.na(countDFeByg)] <- 0
which is said to remove the NA.
It would be very kind if someone may help me to solve the error or point out where I am going wrong
Thanks in advance.
Claris
anyNA(countDF) says false...
countDFeByg has all the values as zero. When i started looking around i found that it may be du
e to BAM and GFF files that use different chromosome names.
how can i solve that..
my gff file looks as below
##gff-version 3
##sequence-region I 1 15072434
##sequence-region II 1 15279421
##sequence-region III 1 13783801
##sequence-region IV 1 17493829
##sequence-region MtDNA 1 13794
##sequence-region V 1 20924180
##sequence-region X 1 17718942
#!genome-build WormBase WBcel235
where as the alignment file looks as
@HD VN:1.0 SO:coordinate
@SQ SN:chrI LN:15072434
@SQ SN:chrII LN:15279421
@SQ SN:chrIII LN:13783801
@SQ SN:chrIV LN:17493829
@SQ SN:chrM LN:13794
@SQ SN:chrV LN:20924180
@SQ SN:chrX LN:17718942
@PG ID:TopHat VN:2.1.0
Thanking you in advance.
Well, obviously, that's going to be a problem. The solution is simple. Fix your GFF file so that the chromosome names have
chr
in front of them (or in the case ofMtDNA
, replaced withchrM
). rtracklayer has a few commands forimport
ing GFF files; you should be able to edit the sequence names, and thenexport
it back to another file.