DNA methylation analysis without raw data IDAT files


I'm interested in working with DNA methylation data. I have downloaded TCGA data from the GDC harmonized archive, which does not include IDAT files.

IDAT files are available only in the GDC legacy archive.

The data frame "data" has 485,577 probes as rows and 439 columns. The columns include Chromosome, Start position, End position and Gene, followed by one column per sample holding the value for each probe.

For example, it looks like this:

        Chr    Start    End    Gene    GeneType    TranscriptID    TCGA-DD-A3A3-01A    TCGA-G3-AAV1-01A    TCGA-DD-AACX-01A    TCGA-DD-A4NI-01A    TCGA-G3-AAV4-01A    TCGA-DD-A1EG-11A
    cg00000029    chr16    53434200    53434201    RBL2    protein_coding    ENST00000262133.9    0.550913627    0.390846294    0.210664637    0.329930064    0.193362596    0.309831311
    cg00000108    chr3    37417715    37417716    C3orf35    lincRNA    ENST00000328376.8    NA    NA    NA    NA    NA    NA
    cg00000109    chr3    172198247    172198248    FNDC3B    protein_coding    ENST00000336824.7    NA    NA    NA    NA    NA    NA
    cg00000165    chr1    90729117    90729118    .    .    .    0.570880538    0.074518375    0.174949392    0.136944673    0.064590585    0.151404705
    cg00000236    chr8    42405776    42405777    VDAC3    protein_coding    ENST00000022615.7    0.914067333    0.845768766    0.901394742    0.922730081    0.910097231    0.887756996

I have seen many R packages, such as minfi, ChAMP and missMethyl, but they all seem to require IDAT files, and I don't know how to run a methylation analysis on data that is already in a data frame.

Any help is appreciated. 


?readTCGA in minfi may be helpful here.


Hi James,

I tried reading the methylation data with readTCGA, but it is not working. Maybe I'm doing something wrong.

The methylation data is in a data frame "df", with probes as rows and columns such as Chromosome, Start position, End position, Gene and the samples, as described in my question.


The readTCGA call gave the following error:

Error in readLines(filename, n = 2) : 'con' is not a connection

Could you please show me an example? Thank you.
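The error message itself points at the likely cause: it comes from readLines(filename, n = 2), which expects a file path (or an open connection) rather than an object already in memory. Below is a minimal base-R sketch of that behaviour. It does not call readTCGA; the tiny data frame and the temporary file are made up purely for illustration.

```r
# Hypothetical two-row data frame standing in for the real "df".
df <- data.frame(beta = c(0.55, 0.39))

# Passing the data frame itself fails, because readLines() needs a
# file path or a connection; the error message is captured as text.
msg <- tryCatch(
  readLines(df, n = 2),
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing a character file path works. The file contents are invented
# and are NOT in the format that readTCGA expects.
path <- tempfile(fileext = ".txt")
writeLines(c("header line 1", "header line 2", "cg00000029\t0.55"), path)
first_two <- readLines(path, n = 2)
print(first_two)
```

So a function that reads its input this way needs to be given the path of a file on disk, not a data frame.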




Most help pages have an example, and readTCGA is no different. You can't just use a function without reading the help page, and if you had read the help page you wouldn't have tried to do what you did.

If you are planning to get anywhere with R and/or Bioconductor, you will need to become more self-sufficient. You either need to learn how to figure things out for yourself, by reading the help pages and vignettes and googling for answers, or you need to find somebody local who has the skills to do the work for you. Just trying something random and then asking for help on this support site isn't a good long-term strategy.


Actually, ChAMP can run its analysis from a plain beta matrix and a phenotype vector; it does not rely on IDAT files. So if you can extract the beta values from your file into a proper matrix, you can then use ChAMP for all downstream analysis. However, ChAMP does not provide a function for loading your file format into an R session.


Yuan Tian
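As a rough illustration of the "beta matrix plus pheno vector" idea, here is a minimal base-R sketch. It assumes the sample columns can be recognised by names starting with "TCGA-", and that tumor versus normal can be read from the sample-type code at characters 14 and 15 of the TCGA barcode (01 to 09 tumor, 10 to 19 normal). The toy data frame reuses a few values from the question; the object names (df, beta, pheno) are illustrative only.

```r
# Toy data frame shaped like the one in the question (subset of its values).
df <- data.frame(
  Chr   = c("chr16", "chr3", "chr1", "chr8"),
  Start = c(53434200, 37417715, 90729117, 42405776),
  End   = c(53434201, 37417716, 90729118, 42405777),
  Gene  = c("RBL2", "C3orf35", ".", "VDAC3"),
  `TCGA-DD-A3A3-01A` = c(0.550913627, NA, 0.570880538, 0.914067333),
  `TCGA-G3-AAV1-01A` = c(0.390846294, NA, 0.074518375, 0.845768766),
  `TCGA-DD-A1EG-11A` = c(0.309831311, NA, 0.151404705, 0.887756996),
  row.names = c("cg00000029", "cg00000108", "cg00000165", "cg00000236"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumption: every sample column name starts with "TCGA-".
sample_cols <- grep("^TCGA-", colnames(df), value = TRUE)

# Keep only the sample columns; probe IDs stay as row names.
beta <- as.matrix(df[, sample_cols])

# Drop probes that are NA in every sample.
beta <- beta[rowSums(!is.na(beta)) > 0, , drop = FALSE]

# Derive a phenotype vector from the barcode's sample-type code.
type_code <- as.integer(substr(colnames(beta), 14, 15))
pheno <- ifelse(type_code < 10, "Tumor",
                ifelse(type_code < 20, "Normal", "Control"))

print(dim(beta))
print(pheno)
```

The resulting beta and pheno objects are the kind of input that ChAMP functions with `beta` and `pheno` arguments (for example champ.DMP) are meant to take. Probes with NA in only some samples are kept here, so they would still need filtering or imputation before most downstream steps.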


Yes, I didn't find anything in ChAMP for reading methylation data from a matrix. Do you know of any other functions?
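One possible route, offered as a sketch rather than a tested recipe, is minfi's makeGenomicRatioSetFromMatrix, which builds a GenomicRatioSet from a matrix whose row names are probe IDs. The example assumes that minfi and the 450k annotation package IlluminaHumanMethylation450kanno.ilmn12.hg19 are installed, and it skips the conversion if they are not. The small matrix is built by hand from values in the question.

```r
# Small beta matrix: rows are 450k probe IDs, columns are samples.
beta <- matrix(
  c(0.550913627, 0.570880538, 0.914067333,
    0.390846294, 0.074518375, 0.845768766),
  nrow = 3,
  dimnames = list(
    c("cg00000029", "cg00000165", "cg00000236"),
    c("TCGA-DD-A3A3-01A", "TCGA-G3-AAV1-01A")
  )
)

# Assumption: these Bioconductor packages are installed; otherwise skip.
needed <- c("minfi", "IlluminaHumanMethylation450kanno.ilmn12.hg19")
have_all <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))

grset <- NULL
if (have_all) {
  # Probes are matched to the annotation by row name.
  grset <- minfi::makeGenomicRatioSetFromMatrix(
    beta,
    array = "IlluminaHumanMethylation450k",
    annotation = "ilmn12.hg19",
    what = "Beta"
  )
  print(grset)
} else {
  message("minfi or the 450k annotation package is not installed; skipping.")
}
```

If this works on your installation, the resulting object can go into the minfi functions that accept a GenomicRatioSet. Check ?makeGenomicRatioSetFromMatrix for the arguments in your minfi version.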

