Hi,
Thanks for reading. I am trying to replicate this pipeline (COHCAP: City of Hope CpG Island Analysis Pipeline, Charles Warden, November 4, 2016) and I am running into some issues.
I have generated 3 .txt files:
One for the **samples** (37 in total), with the following structure:
200053740020_R02C02 Group1
200053740019_R02C01 Group1
200053740019_R05C01 Group1
200053740019_R02C02 Group1
200053740019_R03C02 Group1
200053740095_R02C02 Group2
200053740095_R03C02 Group2
200053740034_R04C02 Group2
...
One for the **Beta** values, with the following structure:
200053740034_R03C02 200053740134_R01C02...
cg13869341 0.807718263 0.811597173...
cg14008030 0.630174852 0.664826422...
cg12045430 0.24139956 0.303547345...
...
Another one with the **RNA-seq** intensities, with the following structure:
Gene Symbol 200053740034_R02C01 200053740034_R03C01
5S_rRNA -0.894804008 -1.514165631
7SK 4.559935916 5.869538661
A1BG -3.70215893 -0.777200037
A1BG-AS1 -2.117196429 -0.777200037
A1CF -3.70215893 -1.514165631
When I run the following line with the 450k-UCSC annotation, the resulting table has 0 rows:
> beta.table = COHCAP.annotate(beta.file, project.name, project.folder,platform="450k-UCSC")
[1] 483835 38
[1] 0 5
[1] 0 42
The table looks like this (column names only, no rows):
> beta.table
[1] SiteID Chr Loc Gene
[5] Island X200053740034_R03C02 X200053740134_R01C02 X200053740134_R05C02
[9] X200053740095_R01C01 X200053740095_R03C01 X200053740020_R03C02 X200053740019_R01C01
[13] X200053740006_R01C01 X200053740006_R04C01 X200053740006_R06C01 X200053740006_R02C02
[17] X200053740006_R03C02 X200053740006_R05C02 X200053740095_R02C02 X200053740095_R03C02
[21] X200053740034_R04C01 X200053740034_R04C02 X200053740134_R04C01 X200053740095_R06C01
[25] X200053740100_R04C01 X200053740100_R05C01 X200053740100_R03C02 X200053740020_R01C01
[29] X200053740020_R06C01 X200053740020_R01C02 X200053740020_R02C02 X200053740020_R05C02
[33] X200053740019_R02C01 X200053740019_R05C01 X200053740019_R02C02 X200053740019_R03C02
[37] X200053740006_R02C01 X200053740059_R01C01 X200053740059_R03C01 X200053740059_R01C02
[41] X200053740059_R02C02 X200053740059_R05C02
<0 rows> (or 0-length row.names)
When I run it with my own annotation file, I get a similar result:
> beta.table = COHCAP.annotate(beta.file, project.name, project.folder,platform="custom",annotation.file = "annotation.txt")
[1] 483835 38
[1] "Using custom island/gene annotations from : annotation.txt"
[1] 485512 33
[1] 0 33
[1] 0 70
The table again has column names but 0 rows:
> beta.table
[1] chr pos strand
[4] Name AddressA AddressB
[7] ProbeSeqA ProbeSeqB Type
[10] NextBase Color Probe_rs
[13] Probe_maf CpG_rs CpG_maf
[16] SBE_rs SBE_maf Islands_Name
[19] Relation_to_Island Forward_Sequence SourceSeq
[22] Random_Loci Methyl27_Loci UCSC_RefGene_Name
[25] UCSC_RefGene_Accession UCSC_RefGene_Group Phantom
[28] DMR Enhancer HMM_Island
[31] Regulatory_Feature_Name Regulatory_Feature_Group DHS
[34] X200053740034_R03C02 X200053740134_R01C02 X200053740134_R05C02
[37] X200053740095_R01C01 X200053740095_R03C01 X200053740020_R03C02
[40] X200053740019_R01C01 X200053740006_R01C01 X200053740006_R04C01
[43] X200053740006_R06C01 X200053740006_R02C02 X200053740006_R03C02
[46] X200053740006_R05C02 X200053740095_R02C02 X200053740095_R03C02
[49] X200053740034_R04C01 X200053740034_R04C02 X200053740134_R04C01
[52] X200053740095_R06C01 X200053740100_R04C01 X200053740100_R05C01
[55] X200053740100_R03C02 X200053740020_R01C01 X200053740020_R06C01
[58] X200053740020_R01C02 X200053740020_R02C02 X200053740020_R05C02
[61] X200053740019_R02C01 X200053740019_R05C01 X200053740019_R02C02
[64] X200053740019_R03C02 X200053740006_R02C01 X200053740059_R01C01
[67] X200053740059_R03C01 X200053740059_R01C02 X200053740059_R02C02
[70] X200053740059_R05C02
<0 rows> (or 0-length row.names)
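In case it helps with diagnosing this, here is a check I can run on my side. It is only a minimal sketch with toy data, and it assumes (I have not confirmed this in the COHCAP source) that the annotation step joins the beta table to the annotation by probe ID, so that 0 rows would mean no IDs matched. The helpers `clean_ids` and `count_shared`, the sample column `S1`, and the `Note` column are hypothetical names for illustration, not part of COHCAP.

```r
# Toy stand-in for the beta table: two IDs carry stray characters
# (a trailing space and literal quotes), a possible cause of failed matches.
beta <- data.frame(
  SiteID = c("cg13869341", "cg14008030 ", "\"cg12045430\""),
  S1 = c(0.81, 0.63, 0.24),
  stringsAsFactors = FALSE
)

# Toy stand-in for the annotation table (placeholder column only).
annotation <- data.frame(
  SiteID = c("cg13869341", "cg14008030", "cg12045430"),
  Note = c("toy", "toy", "toy"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip literal quotes and surrounding whitespace.
clean_ids <- function(x) trimws(gsub("\"", "", x, fixed = TRUE))

# Hypothetical helper: number of IDs present in both vectors.
count_shared <- function(a, b) length(intersect(a, b))

# Overlap before and after cleaning the beta IDs.
raw_overlap <- count_shared(beta$SiteID, annotation$SiteID)
clean_overlap <- count_shared(clean_ids(beta$SiteID), annotation$SiteID)

# A plain base-R join on the cleaned IDs, to see how many rows survive.
merged <- merge(annotation,
                transform(beta, SiteID = clean_ids(SiteID)),
                by = "SiteID")

print(c(raw = raw_overlap, clean = clean_overlap, merged_rows = nrow(merged)))
```

On my real files I would replace the toy data frames with `read.table(beta.file, header = TRUE, sep = "\t", stringsAsFactors = FALSE)` and the corresponding read of `annotation.txt`, then compare the first column of the beta table against the probe ID column of the annotation (`Name` in my custom file).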
Any idea why this could be happening? This is the first time I am using COHCAP, so I am probably missing something. Thank you very much for your time and attention, and Merry Christmas!
IOM