Question

Preferential sequencing of longer genes on Illumina?

simplyphage • 0

@simplyphage-22396

Last seen 18 months ago

Italy

I was reading this paper from 2010: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-94#Sec11

It states:

One inherent bias of the illumina platform is the preferential sequencing of longer genes. Hence, longer genes are more likely declared as DE.

Is this still true for current Illumina platforms, and is it the reason we observe low counts for some genes? For example, I am looking at the gene PYCR1. After running DESeq2 I get a good log2 fold change (3.63) and adjusted p-value (1.95E-06), but the baseMean is only about 23. This gene is known to be upregulated in cancer, yet the counts are not convincing. I am not sure how to interpret this result.

DESeq2 sequencing • 874 views

updated 3.1 years ago by victorbradford4 • written 3.1 years ago by simplyphage

score 1 · Answer 1 · 2021-03-27

Michael Love 41k

@mikelove

Last seen 2 hours ago

United States

Yes, longer genes typically have higher counts. This is taken into account for gene set testing in goseq, which can be run downstream of DESeq2 for gene set results.
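To illustrate why longer genes tend to collect more reads, here is a minimal Python sketch. It assumes a toy model in which fragments are sampled in proportion to transcript abundance times transcript length. The function `expected_counts`, the gene names, and all numbers are hypothetical and chosen only for illustration; this is not how DESeq2 or goseq work internally.

```python
# Toy model (an assumption for illustration only): the chance that a read
# comes from a gene is proportional to abundance * transcript length.

def expected_counts(abundance, length_bp, total_reads):
    """Return the expected read count per gene under the toy model."""
    weights = {gene: abundance[gene] * length_bp[gene] for gene in abundance}
    total_weight = sum(weights.values())
    return {gene: total_reads * w / total_weight for gene, w in weights.items()}

# Two hypothetical genes with identical abundance but different lengths.
abundance = {"short_gene": 10.0, "long_gene": 10.0}
length_bp = {"short_gene": 1_000, "long_gene": 4_000}

counts = expected_counts(abundance, length_bp, total_reads=1_000_000)
for gene, n in counts.items():
    print(f"{gene}: {n:.0f} expected reads")
```

In this toy model the 4x longer gene receives 4x the reads at the same abundance. Because the length factor is the same in every sample, it cancels out of a gene's between-condition ratio, so under these assumptions length affects how many counts support a result rather than the fold change itself.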

For per-gene results, you don’t need to modify the standard pipeline. DESeq2 has been extensively benchmarked.

written 3.1 years ago by Michael Love