Question: p-values, q-values, duplicated entries in output - ballgown
15 months ago, mkhasin wrote:

Hi all,

For reference, I'm a relative newcomer to R and followed this tutorial to analyze my dataset of 2 conditions x 2 replicates in ballgown. I noticed that for many p-values far below 0.5, the q-values appeared disproportionately high (I did see that the OP of that tutorial posted the same question on BC several months ago, heh). When I inspected the output manually and ranked it by q-value, I found that every gene with an existing name (GENE1) had a duplicate row labeled only with its accession (e.g. ChrN_00000). Normally that would just be annoying. However, some fold changes differ ever so slightly between the two rows (in the hundredths or thousandths place), which then changes the p-value, which in turn changes the q-value...
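As a rough illustration of how small p-values can still come with large q-values, here is a minimal sketch that assumes the q-values are Benjamini-Hochberg adjusted p-values (an assumption; the post does not say how they were computed). The p-values are made up for illustration and are not taken from this dataset.

```r
# Toy p-values (hypothetical, for illustration only): a few small ones
# among many unremarkable ones, as in a test across many genes.
pvals <- c(0.001, 0.01, 0.02, 0.03, 0.04, rep(0.5, 95))

# Benjamini-Hochberg adjustment with base R's stats::p.adjust().
# Assumption: this mirrors how the q-values in the output were derived.
qvals <- p.adjust(pvals, method = "BH")

# Compare the five smallest p-values with their adjusted values.
print(data.frame(pval = pvals[1:5], qval = qvals[1:5]))
```

Under this kind of adjustment a q-value is never smaller than its p-value, and it depends on how many tests were run and on where the p-value ranks among them.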

I'm not sure what I did to get this strange output, and I wonder whether the duplication might be affecting the q-values. Here's what I ran, starting from creating the bg object:

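library(ballgown)
library(genefilter)  # provides rowVars(), used in the filter below
bg = ballgown(samples=as.vector(sample_full_path), pData=pheno_data)
bg_filt = subset(bg, "rowVars(texpr(bg)) > 1", genomesubset=TRUE)
results_genes = stattest(bg_filt, feature="gene", covariate="condition", getFC=TRUE, meas="FPKM")
# bg_gene_names (creation not shown) has a gene_id column used as the merge key
results_genes2 = merge(results_genes, bg_gene_names, by.x=c("id"), by.y=c("gene_id"))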

followed by export as a tab-delimited text file. I was wondering if anybody has any insight? Thank you for your help!
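One way to check whether the duplicate rows come from the merge step is sketched below with toy data. The data frames `results_toy` and `names_toy` are hypothetical stand-ins for `results_genes` and `bg_gene_names`, and the assumption that the lookup table holds more than one row per gene id is not confirmed by anything in this post.

```r
# Hypothetical stand-in for the stattest() output: one row per gene id.
results_toy <- data.frame(id = c("g1", "g2"),
                          fc = c(1.5, 0.8),
                          pval = c(0.01, 0.20),
                          stringsAsFactors = FALSE)

# Hypothetical lookup table in which g1 appears twice,
# once with a name and once without.
names_toy <- data.frame(gene_id = c("g1", "g1", "g2"),
                        gene_name = c("GENE1", ".", "GENE2"),
                        stringsAsFactors = FALSE)

# merge() returns one output row per matching pair of rows,
# so a repeated key in the lookup table repeats the result row.
merged_toy <- merge(results_toy, names_toy, by.x = "id", by.y = "gene_id")

# Row counts before and after the merge, and the ids that repeat.
print(c(before = nrow(results_toy), after = nrow(merged_toy)))
dup_ids <- unique(names_toy$gene_id[duplicated(names_toy$gene_id)])
print(dup_ids)
```

Running the same two checks on the real objects (comparing `nrow(results_genes)` with `nrow(results_genes2)`, and looking for repeated values in `bg_gene_names$gene_id`) would show whether the merge is adding rows. Note that a merge only copies rows, so this sketch would not by itself explain fold changes that differ slightly between the two copies.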
