Question: p-values, q-values, duplicated entries in output - ballgown
9 months ago by
mkhasin0 wrote:

Hi all,

For reference, I'm a relative newcomer to R and followed this tutorial to analyze my dataset of 2 conditions x 2 replicates in ballgown. I noticed that for many p-values far below 0.5, the q-values appeared disproportionately high (I did see that the OP of that tutorial posted the same question on BC several months ago, heh). When I inspected the output manually and ranked it by q-value, I found that every gene with an existing name (e.g. GENE1) had a duplicate entry labeled only with its accession (e.g. ChrN_00000). Normally that would just be annoying, but some fold changes differ ever so slightly between the duplicates (in the hundredths or thousandths place), which changes the p-value, which in turn changes the q-value...
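To make the p-value versus q-value gap concrete, here is a minimal base-R sketch. It assumes the q-values come from a Benjamini-Hochberg style adjustment (an assumption, not something verified against the ballgown source), and the p-values below are made up for illustration, not taken from the dataset.

```r
# Hypothetical p-values: four small ones among 996 unremarkable ones.
pval <- c(1e-4, 0.001, 0.004, 0.01, seq(0.05, 1, length.out = 996))

# Benjamini-Hochberg adjustment: each p-value is scaled by
# (number of tests / rank), then made monotone and capped at 1.
qval <- p.adjust(pval, method = "BH")

# Show the four smallest p-values next to their adjusted values.
print(head(data.frame(pval, qval)[order(pval), ], 4))
```

With around a thousand tests and only a handful of small p-values, the adjusted values end up much larger than the raw ones, so high q-values next to low p-values are not necessarily a sign of a bug.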

I'm not sure what I did to get this strange output, and I wonder if it might be affecting the q-values. Here's what happened, from creating the bg object:

bg = ballgown(samples=as.vector(sample_full_path),pData=pheno_data)
bg_filt = subset(bg,"rowVars(texpr(bg)) >1",genomesubset=TRUE)
results_genes = stattest(bg_filt, feature="gene", covariate="condition", getFC=TRUE, meas="FPKM")
results_genes2 = merge(results_genes,bg_gene_names,by.x=c("id"),by.y=c("gene_id"))

followed by export as a tab-delimited text file. I was wondering if anybody has any insight? Thank you for your help!
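One thing that can be checked without ballgown is whether the merge step itself multiplies rows. The sketch below uses small made-up stand-ins for results_genes and bg_gene_names (the ids, names, and numbers are hypothetical), on the assumption that the name lookup table may hold more than one row per gene_id. Note that this would only explain exact duplicates; it would not by itself account for the slightly different fold changes.

```r
# Toy stand-in for the stattest() output (hypothetical values).
results_genes <- data.frame(
  id   = c("ChrN_00001", "ChrN_00002", "ChrN_00003"),
  fc   = c(1.52, 0.48, 1.01),
  pval = c(0.004, 0.020, 0.900),
  stringsAsFactors = FALSE
)

# Toy lookup table in which one gene_id maps to two names:
# a real name and the bare accession (assumed scenario).
bg_gene_names <- data.frame(
  gene_id   = c("ChrN_00001", "ChrN_00001", "ChrN_00002", "ChrN_00003"),
  gene_name = c("GENE1", "ChrN_00001", "GENE2", "GENE3"),
  stringsAsFactors = FALSE
)

# Same merge call shape as in the post.
merged <- merge(results_genes, bg_gene_names, by.x = "id", by.y = "gene_id")

# Compare row counts and list the ids that now appear more than once.
n_before <- nrow(results_genes)
n_after  <- nrow(merged)
dup_ids  <- unique(merged$id[duplicated(merged$id)])
print(c(before = n_before, after = n_after))
print(dup_ids)
```

If the row count grows after the merge on the real data, deduplicating the lookup table on gene_id before merging would be one thing to try.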
