Question

MEDIPS - Annotation output explanation

vjain • 0

@vjain-15283

Hi,

I ran a full MEDIPS script and got output like this:

	chr	start	stop	CF	SampleC_result.bam.counts	SampleD_result.bam.counts	SampleC_result.bam.rpkm	SampleD_result.bam.rpkm	SampleC_result.bam.rms	SampleD_result.bam.rms	MSets1.counts.mean	MSets1.rpkm.mean	MSets1.rms.mean	MSets2.counts.mean	MSets2.rpkm.mean	MSets2.rms.mean	edgeR.logFC	edgeR.logCPM	edgeR.p.value	edgeR.adj.p.value	1_id	2_id
869	chr1	86801	86900	3	21	66	28.91	100.77	0.66	0.83	21	28.91	0.66	66	100.77	0.83	-1.7920893719	2.7474174303	8.33654332196427E-08	0.0172985775	NA	NA
2000066	chr1	200006501	200006600	4	37	0	50.94	0	0.73	0	37	50.94	0.73	0	0	0	8.1432140516	1.5378794034	3.00335172395078E-11	0.000006232	NA	NA
2000067	chr1	200006601	200006700	6	48	0	66.08	0	0.73	0	48	66.08	0.73	0	0	0	8.5175532823	1.8761919171	2.99277412121177E-14	6.21009608473806E-09	NA	NA
2000068	chr1	200006701	200006800	9	42	1	57.82	1.53	0.67	0.12	42	57.82	0.67	1	1.53	0.12	5.0886311128	1.7339671371	3.96394127476866E-11	8.22529706338322E-06	NA	NA
2000069	chr1	200006801	200006900	5	27	1	37.17	1.53	0.66	0.18	27	37.17	0.66	1	1.53	0.18	4.4536975675	1.1871136508	4.23547920710291E-07	0.0878874642	NA	NA
2097234	chr1	209723301	209723400	1	0	28	0	42.75	0	0.77	0	0	0	28	42.75	0.77	-7.8885206568	1.242042544	3.80157670010712E-09	0.0007888386	NA	NA
2105738	chr1	210573701	210573800	1	27	1	37.17	1.53	0.75	0.28	27	37.17	0.75	1	1.53	0.28	4.4536975675	1.1871136508	4.23547920710291E-07	0.0878874642	NA	NA
2163247	chr1	216324601	216324700	2	0	22	0	33.59	0	0.7	0	0	0	22	33.59	0.7	-7.5422568331	0.9382190482	2.41450299325845E-07	0.0501016615	ENSSSCG00000005207	NA
2271159	chr1	227115801	227115900	1	0	23	0	35.12	0	0.74	0	0	0	23	35.12	0.74	-7.6060506188	0.9934313723	1.20863824109425E-07	0.0250796061	NA	NA
3129473	chr1	312947201	312947300	1	38	4	52.31	6.11	0.8	0.48	38	52.31	0.8	4	6.11	0.48	3.0643134186	1.7084900082	4.54579647734759E-07	0.0943266406	NA	NA
3130975	chr1	313097401	313097500	1	68	15	93.61	22.9	0.89	0.68	68	93.61	0.89	15	22.9	0.68	2.0255720997	2.6267801872	3.8638156315668E-08	0.0080175333	NA	NA
3309724	chr2	15650901	15651000	1	27	1	37.17	1.53	0.75	0.28	27	37.17	0.75	1	1.53	0.28	4.4536975675	1.1871136508	4.23547920710291E-07	0.0878874642	ENSSSCG00000013248	NA
3309725	chr2	15651001	15651100	2	28	1	38.55	1.53	0.73	0.25	28	38.55	0.73	1	1.53	0.25	4.5059155442	1.230895305	2.19611172180891E-07	0.0455699771	ENSSSCG00000013248	NA
3309726	chr2	15651101	15651200	1	27	0	37.17	0	0.75	0	27	37.17	0.75	0	0	0	7.6905369394	1.1398815626	3.02899397204007E-08	0.0062852534	ENSSSCG00000013248	NA
3515807	chr2	36259201	36259300	1	0	23	0	35.12	0	0.74	0	0	0	23	35.12	0.74	-7.6060506188	0.9934313723	1.20863824109425E-07	0.0250796061	NA	NA
3640700	chr2	48748501	48748600	4	0	24	0	36.65	0	0.66	0	0	0	24	36.65	0.66	-7.6671425888	1.0466404624	6.05043434315258E-08	0.0125548328	NA	NA
3835270	chr2	68205501	68205600	1	24	0	33.04	0	0.73	0	24	33.04	0.73	0	0	0	7.5214846498	0.9949480389	2.41450299325845E-07	0.0501016615	NA	NA
4070870	chr2	91765501	91765600	2	26	0	35.79	0	0.72	0	26	35.79	0.72	0	0	0	7.6363577383	1.0931939799	6.05043434315258E-08	0.0125548328	ENSSSCG00000014136	NA
5001207	chr3	22229801	22229900	3	31	0	42.68	0	0.72	0	31	42.68	0.72	0	0	0	7.8889443296	1.3126573776	1.90354050254007E-09	0.0003949904	ENSSSCG00000033674	NA

My questions are:

1. How should I analyse this result?

2. How do I get the exact position on the chromosome?

3. After annotation, the result shows very few gene names. Why? Also, there are two columns at the end of the table, 1_id and 2_id. What are these? Are these columns the results for SampleC and SampleD?

4. I can't understand the information in the start and stop columns. Any explanation? The first (unnamed) column also shows some values; what are these?

Code:

library(MEDIPS)
library(biomaRt)

# BSgenome, extend, shift, uniq and ws are assumed to be defined earlier in the script (not shown in the post)
Set1 = MEDIPS.createSet(file = "SampleC_result.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws)
Set2 = MEDIPS.createSet(file = "SampleD_result.bam", BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws)

# CpG coupling vector for the reference genome
CS = MEDIPS.couplingVector(pattern = "CG", refObj = Set1)

# Coverage and methylation values for SampleC alone
mr.edgeR_Set1 = MEDIPS.meth(MSet1 = Set1, CSet = CS, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = TRUE, CNV = FALSE, minRowSum = 10)

# Differential coverage between SampleC (MSet1) and SampleD (MSet2); this is the table shown above
mr.edgeR = MEDIPS.meth(MSet1 = Set1, MSet2 = Set2, CSet = CS, p.adj = "bonferroni", diff.method = "edgeR", MeDIP = TRUE, CNV = FALSE, minRowSum = 10)
View(mr.edgeR)
write.table(mr.edgeR, file = "mredgeR_result.csv", row.names = FALSE, col.names = TRUE, sep = '\t')

# Keep windows with an adjusted p-value below 0.1
mr.edgeR.s = MEDIPS.selectSig(results = mr.edgeR, p.value = 0.1, adj = TRUE, ratio = NULL, bg.counts = NULL, CNV = FALSE)

# biomaRt connection to the pig (Sus scrofa) Ensembl dataset
ensembl = useDataset("sscrofa_gene_ensembl", mart = useMart("ensembl"))
anno.mart.gene = MEDIPS.getAnnotation(dataset = c("sscrofa_gene_ensembl"), annotation = c("GENE"))
mr.edgeR.s.d = MEDIPS.setAnnotation(regions = mr.edgeR.s, annotation = anno.mart.gene)
write.table(mr.edgeR.s.d, file = "gene_ann_file.csv", row.names = FALSE, col.names = TRUE, sep = '\t')

MEDIPS Output • 1.3k views

score 0 · Answer 1 · 2018-05-01

Hi Vjain,

> 1. How to analyse this result?

Your table contains a list of potentially differentially enriched regions. What exactly this is, and how you want to analyse it further, depends on your research question.

> 2. How to get exact position of chromosome?

Please see the columns chr | start | stop.

> 3. After annotation, the result shows very few gene names. Why? And there are two columns at the end of the table, 1_id and 2_id. What are these? Are these columns the results for SampleC and SampleD?

Not all DERs can be associated with a gene given the default parameters for the distance to a TSS in MEDIPS.getAnnotation. There are plenty of other tools available to annotate genomic regions in various ways.

> 4. Can't understand the start and stop column information. Any explanation? The first column also shows some values; what are these?

These are the genomic locations of a potential differentially enriched region (DER). The first column is likely the row number (before extraction of significant windows).

Hope that helps,
Lukas
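To make the points about start/stop and the row number concrete, here is a minimal base R sketch using a few rows copied from the posted table. It assumes the window size equals stop - start + 1 and that the row number is a genome-wide window index that keeps counting across chromosomes; the variable names (der, start_from_id, windows_before_chr2) are made up for this illustration and are not part of MEDIPS.

```r
# Values copied from the posted table: row name, chromosome, start, stop.
der <- data.frame(
  row_id = c(869, 2000066, 2097234, 3309724),
  chr    = c("chr1", "chr1", "chr1", "chr2"),
  start  = c(86801, 200006501, 209723301, 15650901),
  stop   = c(86900, 200006600, 209723400, 15651000)
)

# Window size implied by the coordinates (assumed to match the 'ws' argument).
ws <- unique(der$stop - der$start + 1)

# On the first chromosome the row name acts as the window index,
# so the start coordinate can be recomputed from it.
chr1 <- der[der$chr == "chr1", ]
chr1$start_from_id <- (chr1$row_id - 1) * ws + 1

# On later chromosomes the index keeps counting, so the difference is the
# number of windows that precede the chromosome.
chr2 <- der[der$chr == "chr2", ]
windows_before_chr2 <- chr2$row_id - ((chr2$start - 1) / ws + 1)

print(ws)
print(chr1[, c("row_id", "start", "start_from_id")])
print(windows_before_chr2)
```

If these assumptions hold, each row is one fixed-size window, chr/start/stop already give its exact genomic position, and the row number is only bookkeeping left over from the full window table.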

score 0 · Answer 2 · 2018-05-02
