Question: Correct for Gene Length bias in GO analysis of retrotransposition
mujupas wrote, 2.7 years ago:

Hi,

In my lab we have captured and sequenced L1 and Alu retrotransposons from many tissue samples from different donors and conditions.

We're now running GOstats on the list of detected somatic insertions within RefSeq genes ± 1 kb, looking for tissue-specific and condition-specific patterns of somatic retrotransposition events.

A known issue in the field is that, because neuronal genes are on average longer than other annotated protein-coding genes, neuronal GO terms show up as the most enriched regardless of the tissue examined, unless the background is properly corrected. This is easy to verify: generate a list of random intervals with bedtools to simulate the insertions from a real experiment, intersect them with RefSeq gene coordinates, and run a GO analysis on the intersection, as explained and illustrated in a nice review by Thomas C.A. et al. (http://www.ncbi.nlm.nih.gov/pubmed/23057747, Fig.1).
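
To make the random-interval check concrete, here is a minimal Python sketch of the same idea without bedtools or real annotation. Everything in it is an assumption for illustration: the gene names, the lengths (200 "neuronal" genes of 100 kb, 1800 other genes of 20 kb), and the simplification that insertions fall only in genic sequence. It only shows that uniformly placed insertions over-represent long genes among the hits.

```python
import bisect
import random

def simulate_hit_genes(gene_lengths, n_insertions, rng):
    """Return the set of genes hit by uniformly placed random insertions.

    Genes are laid end to end, so intergenic sequence is ignored; each
    insertion lands in a gene with probability proportional to its length.
    """
    genes = list(gene_lengths)
    ends, total = [], 0
    for gene in genes:
        total += gene_lengths[gene]
        ends.append(total)          # cumulative end coordinate of each gene
    hit = set()
    for _ in range(n_insertions):
        pos = rng.randrange(total)  # uniform position in the toy genic space
        hit.add(genes[bisect.bisect_right(ends, pos)])
    return hit

# Hypothetical toy annotation: 200 long "neuronal" genes, 1800 shorter ones.
gene_lengths = {f"neuronal{i}": 100_000 for i in range(200)}
gene_lengths.update({f"other{i}": 20_000 for i in range(1800)})

rng = random.Random(0)
hit = simulate_hit_genes(gene_lengths, 500, rng)

background_fraction = 200 / len(gene_lengths)
hit_fraction = sum(g.startswith("neuronal") for g in hit) / len(hit)
print(f"neuronal genes in annotation: {background_fraction:.2f}")
print(f"neuronal genes among hits:    {hit_fraction:.2f}")
```

With real data the equivalent check would use the actual RefSeq intervals and a random interval set matched in number and size to the experiment.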

In your opinion, what is the best way to correct for this bias in this kind of analysis?

Thank you in advance.

Answer: Correct for Gene Length bias in GO analysis of retrotransposition
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 2.7 years ago:

The goseq package is specifically designed to do a GO analysis adjusting for gene length.
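
For readers who want to see what "adjusting for gene length" can mean in practice, below is a rough Python sketch of a length-weighted resampling test. It is not goseq and does not use its API; goseq estimates the length dependence from the data, whereas this sketch simply assumes a gene's chance of being hit is proportional to its length. The gene names, lengths, and hit list are hypothetical toy values, and the function names are made up for the example.

```python
import heapq
import random

def weighted_sample(weights, k, rng):
    """Draw k distinct genes with probability proportional to weight.

    Uses exponential keys: each gene gets Exp(1)/weight and the k smallest
    keys are kept, which is weighted sampling without replacement.
    """
    keyed = ((rng.expovariate(1.0) / w, g) for g, w in weights.items())
    return {g for _, g in heapq.nsmallest(k, keyed)}

def empirical_pvalue(hits, category, weights, n_perm, rng):
    """One-sided resampling p-value for over-representation of a category."""
    observed = len(hits & category)
    at_least = sum(
        len(weighted_sample(weights, len(hits), rng) & category) >= observed
        for _ in range(n_perm)
    )
    return (1 + at_least) / (1 + n_perm)

# Hypothetical toy data: the category consists of the 200 long genes.
lengths = {f"long{i}": 100_000 for i in range(200)}
lengths.update({f"short{i}": 20_000 for i in range(1800)})
category = {g for g in lengths if g.startswith("long")}

# Hypothetical hit list: 110 long and 290 short genes, roughly what
# length-proportional insertion alone would produce.
hits = {f"long{i}" for i in range(110)} | {f"short{i}" for i in range(290)}

rng = random.Random(0)
p_uniform = empirical_pvalue(hits, category, dict.fromkeys(lengths, 1), 200, rng)
p_length = empirical_pvalue(hits, category, lengths, 200, rng)
print("p-value, every gene equally likely:", p_uniform)
print("p-value, length-weighted null:     ", p_length)
```

The same hit list is compared against two nulls: one in which every gene is equally likely to be hit, and one in which longer genes are proportionally more likely. A category made up of long genes should look far less surprising under the second.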
