p-values, q-values, duplicated entries in output - ballgown
mkhasin ▴ 20
@mkhasin-13981

Hi all,

For reference, I'm a relative newcomer to R and followed this tutorial to analyze my dataset of 2 conditions x 2 replicates in ballgown. I noticed that for many p values <<< 0.5, the q values appeared disproportionately high (I did see that the OP of that tutorial posted the same question on BC several months ago, heh). When I inspected the output manually and ranked it by q value, I found that every gene with an existing name (GENE1) had a duplicate entry listed only under its accession (e.g. ChrN_00000). Normally that would just be annoying. However, some fold changes differ very slightly between the duplicates (in the hundredths or thousandths place), which in turn changes the p value and therefore the q value.
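To illustrate how small p values can still end up with large q values, here is a minimal sketch in base R. It assumes the q values are Benjamini-Hochberg adjusted p values (what `p.adjust(..., method = "BH")` computes); the p values themselves are made up for illustration and are not from my dataset.

```r
# Hypothetical p values: 5 small ones among 95 clearly non-significant tests.
p <- c(0.001, 0.01, 0.02, 0.03, 0.04, rep(0.9, 95))

# Benjamini-Hochberg adjustment (assumed here to match how the q values are derived).
q <- p.adjust(p, method = "BH")

# Compare the smallest p values with their adjusted q values.
comparison <- data.frame(pval = p[1:5], qval = q[1:5])
print(comparison)
```

With many tests and only a handful of small p values, the adjustment scales each p value by roughly (number of tests / rank), so a q value can sit far above its p value.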

I'm not sure what I did to get this strange output, and I wonder if that might be impacting the q scores. Here's what happened, from creating the bg object:

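# Build the ballgown object from the per-sample directories and phenotype table
bg = ballgown(samples=as.vector(sample_full_path), pData=pheno_data)
# Keep only transcripts whose expression variance across samples is > 1
bg_filt = subset(bg, "rowVars(texpr(bg)) > 1", genomesubset=TRUE)
# Gene-level differential expression test on FPKM, with fold change
results_genes = stattest(bg_filt, feature="gene", covariate="condition", getFC=TRUE, meas="FPKM")
# Attach gene names (bg_gene_names was created earlier; not shown here)
results_genes2 = merge(results_genes, bg_gene_names, by.x=c("id"), by.y=c("gene_id"))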

followed by export as a tab-delimited text file. I was wondering if anybody has any insight? Thank you for your help!
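One check that might help narrow this down is whether the merge step itself creates extra rows. The sketch below uses toy data frames (the ids, names, and p values are hypothetical stand-ins for `results_genes` and `bg_gene_names`) to show how a lookup table with more than one row per `gene_id` duplicates rows in the merged output. It is only an illustration of that mechanism, not a confirmed explanation of my output.

```r
# Toy stand-in for results_genes (hypothetical values).
results <- data.frame(id = c("ChrN_00001", "ChrN_00002"),
                      pval = c(0.01, 0.20),
                      stringsAsFactors = FALSE)

# Toy stand-in for bg_gene_names, with two rows for the same gene_id.
gene_names <- data.frame(gene_id = c("ChrN_00001", "ChrN_00001", "ChrN_00002"),
                         gene_name = c("GENE1", ".", "GENE2"),
                         stringsAsFactors = FALSE)

# Same merge pattern as above: one output row per matching pair of rows.
merged <- merge(results, gene_names, by.x = "id", by.y = "gene_id")

# Ids that appear more than once after the merge.
dup_ids <- unique(merged$id[duplicated(merged$id)])
print(nrow(merged))
print(dup_ids)
```

Running the same `duplicated()` check on the `id` column of the real tables before and after the merge would show at which step the extra entries first appear.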

ballgown rna-seq q value r