CpGs with genomic positions greater than the chromosome size in EPICv2?
@4e93be39
Spain

Hi community! I am working with EPICv2 methylation data and converting it into GRanges objects for analysis. I have encountered four CpGs whose genomic position exceeds the size of the corresponding chromosome (in base pairs, bp). The positional information is extracted from the annotation of the IlluminaHumanMethylationEPICv2anno.20a1.hg38 package.

The CpGs in question are:

cg11930929_TC21 -> position: 152,640,955 on chr14, but chr14 size is 107,349,540 bp.

cg01734724_TC21 -> position: 189,373,481 on chr6, but chr6 size is 171,115,067 bp.

cg07146279_TC21 -> position: 140,288,268 on chr11, but chr11 size is 135,006,516 bp.

cg18295427_TC21 -> position: 125,359,061 on chr18, but chr18 size is 78,684,590 bp.

Has anyone else noticed this phenomenon, or am I making a mistake?

EPICv2 EPICv2manifest • 251 views
@james-w-macdonald-5106
United States

All of the annotation packages provided by Bioconductor simply repackage existing data from one or more sources. In this case, the Locations DataFrame comes from the EPIC-8v2-0_A2.csv file that Illumina provides; Zuguang Gu's code for this is in the package's inst/scripts directory. We can parse that file directly to get the chr and pos that Illumina provides:

$ grep cg11930929_TC21 EPIC-8v2-0_A2.csv | cut -d, -f 1,16,17
cg11930929_TC21,14,152640955

Why Illumina is specifying that one or more of the CpGs are off the end of the chromosome they are meant to be on is a mystery that only they can clear up.
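To list every such probe rather than checking one at a time, the same three columns can be compared against a table of chromosome sizes. What follows is only a rough sketch under stated assumptions: `off_chrom_end`, `chrom_sizes.txt` and `probes.csv` are made-up names, the sizes are the ones quoted in the question, and the last probe row is a hypothetical in-range control that does not exist in the manifest. For the real file, `probes.csv` would be the output of the `cut -d, -f 1,16,17` command above.

```shell
# Hypothetical helper: print probes whose position exceeds the chromosome size.
# $1 = sizes file ("chr size", whitespace separated)
# $2 = probe file ("name,chr,pos", as produced by cut -d, -f 1,16,17)
off_chrom_end() {
    awk 'NR == FNR { size[$1] = $2; next }   # first file: load sizes
         {
             split($0, f, ",")               # second file: name, chr, pos
             if ((f[2] in size) && f[3] + 0 > size[f[2]] + 0)
                 print f[1], f[2], f[3], size[f[2]]
         }' "$1" "$2"
}

# Chromosome sizes as quoted in the question (assumed file layout).
cat > chrom_sizes.txt <<'EOF'
6 171115067
11 135006516
14 107349540
18 78684590
EOF

# The four probes reported in this thread, plus one hypothetical control row.
cat > probes.csv <<'EOF'
cg11930929_TC21,14,152640955
cg01734724_TC21,6,189373481
cg07146279_TC21,11,140288268
cg18295427_TC21,18,125359061
hypothetical_control,6,1000
EOF

off_chrom_end chrom_sizes.txt probes.csv
```

Each output line gives the probe name, chromosome, reported position and chromosome size. Rows whose chromosome is missing from the sizes file (for example the CSV header lines) are skipped rather than flagged.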
