Question

ArrayExpress differential gene expression analysis

jadepinket • 0

I have a dataset that looks like below:

Gene ID	Gene Name	p-value1	log2foldchange1	p-value2	log2foldchange2	p-value3
ACEGIKM00000000001	AABR07013255.1	NA	0	NA	0	NA
ACEGIKM00000000007	Gad1	NA	0	NA	0	NA
ACEGIKM00000000008	Alx4	NA	0	NA	0	NA
ACEGIKM00000000009	Tmco5b	NA	0	NA	0	NA
ACEGIKM00000000010	Cbln1	NA	0	NA	0	NA
ACEGIKM00000000012	Tcf15	NA	0	NA	0	NA
ACEGIKM00000000017	Steap1	0.293657	0.2	0.176462	0.2	0.08213
ACEGIKM00000000021	AABR07061902.1	0.058899	-0.3	0.919169	0	0.95051
ACEGIKM00000000024	Hebp1	0.904233	0	0.589132	0.1	0.637529
ACEGIKM00000000033	Tmcc2	NA	0	NA	-0.1	NA
ACEGIKM00000000034	Nuak2	0.580938	-0.1	0.882088	0	0.800909

I want to identify differentially expressed genes (for example, using p-value, fold change, or both) for each treatment, including the direction of change, and then identify the most important pathways impacted by the treatment. I also want to make visualizations that show the changes in gene expression as a function of treatment.

I also have a counts file, but since I already have this file with fold changes and p-values, I was hoping I could get my answers from it.

Any suggestions are appreciated.

deseq2 limma bioconductor edger microarray • 661 views

updated 7.0 years ago by Aaron Lun ★ 28k • written 7.0 years ago by jadepinket • 0

score 0 · Answer 1 · 2017-04-20

If the p-values are already BH-adjusted, you can simply define DE genes as those with adjusted p-values below a desired threshold, e.g., 5%. If not, you'll have to adjust them yourself using p.adjust with method="BH". The sign of the log-fold change then gives the direction of change for each DE gene. Once you have identified the DE genes, you can look for enriched pathways/processes using goana or kegga from the limma package. Check whether your organism is supported, though; the IDs aren't familiar to me.
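As a rough sketch of the adjust-then-threshold step, the snippet below uses base R only. The numbers are copied from a few rows of the table in the question (first p-value/log-fold change pair), and it assumes those p-values are raw rather than already adjusted, which the question does not say. The object and column names (res, pvalue, log2fc, padj, direction) and the 5% cutoff are illustrative choices.

```r
# Toy subset of the question's table (first p-value / log2 fold change pair).
# Assumption: these are unadjusted p-values.
res <- data.frame(
  gene   = c("Steap1", "AABR07061902.1", "Hebp1", "Tmcc2", "Nuak2"),
  pvalue = c(0.293657, 0.058899, 0.904233, NA, 0.580938),
  log2fc = c(0.2, -0.3, 0, 0, -0.1)
)

# Benjamini-Hochberg adjustment; NA p-values stay NA and are not
# counted among the tests.
res$padj <- p.adjust(res$pvalue, method = "BH")

# Call a gene DE if its adjusted p-value is below the chosen threshold,
# and use the sign of the log-fold change for the direction.
alpha <- 0.05
is_de <- !is.na(res$padj) & res$padj < alpha
res$direction <- ifelse(!is_de, "not DE",
                        ifelse(res$log2fc > 0, "up", "down"))

print(res)
print(table(res$direction))
```

The same steps would be repeated for each p-value/log-fold change pair in the full table. Note that goana and kegga expect Entrez Gene IDs, so the IDs of the "up" and "down" genes would need to be mapped to Entrez IDs before testing for enrichment.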

What you mean by "visualizations" is vague. If the log-fold changes describe the effect of treatment, you can just plot them. They look a bit odd, though: log-fold changes are usually reported to more than one decimal place.
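One simple way to plot the log-fold changes, sketched with base R graphics, is a grouped bar plot with one group per gene and one bar per comparison. The values are copied from four rows of the table in the question; the matrix name lfc and the choice of a bar plot are only illustrative, and a heatmap or scatter plot would do just as well for more genes.

```r
# log2 fold changes for a few genes from the question's table,
# one column per comparison.
lfc <- matrix(
  c( 0.2, 0.2,
    -0.3, 0.0,
     0.0, 0.1,
    -0.1, 0.0),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Steap1", "AABR07061902.1", "Hebp1", "Nuak2"),
    c("log2foldchange1", "log2foldchange2")
  )
)

# Grouped bars: transpose so that each gene forms a group and each
# comparison is one bar within the group.
mids <- barplot(t(lfc), beside = TRUE, las = 2,
                ylab = "log2 fold change", legend.text = TRUE)
abline(h = 0)  # reference line at no change
```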