FourC object from scratch
6
0
Entering edit mode
@nicolas-servant-1466
Last seen 22 months ago
France

Hi Felix,

You're rigth, I move to the BioC list.

I tried your exemple, it works well.

In the exptData, do I really need all these fields ? or can I use an empty string for projectPath, fragmentDir, reSequence1 ... ?

Then, just to clarify some points ;

fc@colData is your experimental design, so in your exemple, you have 6 samples from the same viewpoint "testdata".

What about rowData ? if I'm correct, these are the measures (counts) per viewpoint ? so here you only have interacting loci per viewpoint ? Which make sense with the dim of the counts object (6 samples x 2 measures)

I'm just wondering if you can have multiple viewpoints in the same FourC object ? with potentially different number of measures ...

Thank you again,

Best wishes

N.

--------------------------------

Hi Nicolas,

this might be interesting to other users as well. So I would recommend we move the discussion to the Biocondoctor Support Forum if this is fine with you.

Have a look at the script below. This should explain all the steps you need to do to add a custom fragment reference, viewpoint information and counts.

Best regards,
Felix

--------------------------------

library(FourCSeq)

exptData <- SimpleList(projectPath=tempdir(),
                       fragmentDir="re_fragments",
referenceGenomeFile=system.file("extdata/dm3_chr2L_1-6900.fa",
package="FourCSeq"),
                       reSequence1="GATC",
                       reSequence2="CATG",
                       primerFile="",
                       bamFilePath="")

colData <- DataFrame(viewpoint = "testdata",
                     condition = factor(rep(c("WE_68h", "MESO_68h", "WE_34h"),
                                            each=2),
                                        levels = c("WE_68h", "MESO_68h", "WE_34h")),
                     replicate = rep(c(1, 2),
                                     3),
                     bamFile = "",
                     sequencingPrimer="")

fc <- FourC(colData, exptData)
fc


## this step can also be skipped by adding your fragment reference (GRanges)
##rowData(fc) <- your Fragment GRanges
fc <- addFragments(fc)
fc

## counts would by your HiC counts, here I take a random sample
cnts <- matrix(sample(1:10, prod(dim(fc)), replace = TRUE),
               nrow=nrow(fc))
cnts

## make sure they are integers otherwise you will get an error
typeof(cnts)

counts(fc) <- cnts
fc

## manually add viewpoint information because there is no primerfile
colData(fc)$chr = "chr2L"
colData(fc)$start = 6027
colData(fc)$end = 6878

## now you should be able to continue with the workflow

 


On 10/09/2015 06:34 PM, Nicolas Servant wrote:
Dear Felix,

I'm developing the HiTC package on Bioconductor to analyse Hi-C data.
I would like to make a test to see whether the method implemented in FourCSeq can be used on some specific Hi-C viewpoints (so 4C like information).
So actually, my question is quite simple, let's say that I have a GRanges objects with a count information extracted from HiTC, and that I would like to enter into FourCSeq to detect valid interactions.
I saw that the FourC object require many information about the pre-processing such as the BAM file, etc.
Could give any advices to create such object from a count table ?

Thank you very much
Best wishes
Nicolas

hi-c HiTC fourcseq • 2.1k views
ADD COMMENT
1
Entering edit mode
felix.klein ▴ 150
@felixklein-6900
Last seen 5.8 years ago
Germany

Hi Nicolas,

for exptData you can use empty strings if you do not want to provide the information. However the fields are required by the class validity check so that everything works in the normal process. Same is true for colData.

colData(fc) (better accessed this way) returns the experimental design and you have to provide the fields given in the example.

rowData is a GRanges object that contains the genomic positions of the restriction fragments. During the analysis more data is added to the mcols by FourCSeq and calls to DESeq2. Here you should be able to add "resFrag", "cutSites", y_intervals(), and all other GRanges objects generated by HiTC.

The counts are stored in "assays" (= SimpleList of matrices). "counts(fc)" is a special accessor that returns the counts "assay" of a FourC object. All viewpoints must have the same number of fragments (= nrow). So it wont work if you have different fragment references, e.g. cut by different combinations of restriction enzymes.

The subsequent analysis also assumes that all data is from the same viewpoint (replicates and different conditions). So putting two different viewpoints into the FourC object will cause problems in the downstream analysis.

Best regards,

Felix

ADD COMMENT
0
Entering edit mode
@nicolas-servant-1466
Last seen 22 months ago
France

Hi Felix,

Still trying to adapt FourCSeq to other 'C' data. I build my FourC object from scratch, with many empty fields following our previous discussion. Then when I used the getZScores, it crashed because leftSize, rightSize, leftValid, rightValid are not defined.

The line

fragData$blind = fragData$leftSize < 0 & fragData$rightSize <  0

from getDistAroundVp() crashed.

I'm not sure to really understant what these fields mean ? Do I have to fix them to a constant ?

Thank you

Nicolas

 

ADD COMMENT
0
Entering edit mode
felix.klein ▴ 150
@felixklein-6900
Last seen 5.8 years ago
Germany

Hi Nicolas,

these settings are used for 4C libraries to filter out fragments that have do not contain a site of the second cutter at all. In this case the size is set to -1.

If you want to use all fragments then you can just set leftSize and rightSize to 1 and leftValid and rightValid to TRUE in the fragment data. This should to the trick.

Best regards,

Felix

 

ADD COMMENT
0
Entering edit mode
@nicolas-servant-1466
Last seen 22 months ago
France

ok. Thank you Felix. I will try it.

ADD COMMENT
0
Entering edit mode
@nicolas-servant-1466
Last seen 22 months ago
France

Still an issue with getDistAroundVp from getZScores

Error in mcols(mcols(fragData))[colnames(mcols(fragData)) %in% newCols,  :
  type / longueur incorrect (S4 / 0)

It seems that mcols(mcols(fragData)) is NULL in my case ?

Did I miss something ?

Thank you again

Nicolas

 

ADD COMMENT
0
Entering edit mode

might be useful ...

> fragData
GRanges object with 3999 ranges and 11 metadata columns:
         seqnames                 ranges strand   |        name  leftSize
            <Rle>              <IRanges>  <Rle>   | <character> <numeric>
     [1]     chrX     [6062965, 6064347]      *   |           1         1
     [2]     chrX     [6575454, 6575816]      *   |           3         1
     [3]     chrX     [6654254, 6654532]      *   |           2         1
     [4]     chrX     [6797051, 6797749]      *   |           2         1
     [5]     chrX     [6840652, 6841408]      *   |           1         1
     ...      ...                    ...    ... ...         ...       ...
  [3995]     chrX [166228487, 166228621]      *   |           3         1
  [3996]     chrX [166406018, 166406225]      *   |           1         1
  [3997]     chrX [166416551, 166417709]      *   |           1         1
  [3998]     chrX [166425067, 166425507]      *   |           1         1
  [3999]     chrX [166445050, 166445420]      *   |           4         1
         rightSize leftValid rightValid       mid      dist   posLeft  posRight
         <numeric> <logical>  <logical> <numeric> <numeric> <logical> <logical>
     [1]         1      TRUE       TRUE   6063656 -94613561      TRUE     FALSE
     [2]         1      TRUE       TRUE   6575635 -94101582      TRUE     FALSE
     [3]         1      TRUE       TRUE   6654393 -94022824      TRUE     FALSE
     [4]         1      TRUE       TRUE   6797400 -93879817      TRUE     FALSE
     [5]         1      TRUE       TRUE   6841030 -93836187      TRUE     FALSE
     ...       ...       ...        ...       ...       ...       ...       ...
  [3995]         1      TRUE       TRUE 166228554  65551337     FALSE      TRUE
  [3996]         1      TRUE       TRUE 166406121  65728904     FALSE      TRUE
  [3997]         1      TRUE       TRUE 166417130  65739913     FALSE      TRUE
  [3998]         1      TRUE       TRUE 166425287  65748070     FALSE      TRUE
  [3999]         1      TRUE       TRUE 166445235  65768018     FALSE      TRUE
             blind     vpChr
         <logical> <logical>
     [1]     FALSE      TRUE
     [2]     FALSE      TRUE
     [3]     FALSE      TRUE
     [4]     FALSE      TRUE
     [5]     FALSE      TRUE
     ...       ...       ...
  [3995]     FALSE      TRUE
  [3996]     FALSE      TRUE
  [3997]     FALSE      TRUE
  [3998]     FALSE      TRUE
  [3999]     FALSE      TRUE
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

 

ADD REPLY
0
Entering edit mode
Hello Nicolas, this is something I forgot. You also have to annoate the mcols(mcols(fc)). The following is done by the FourC package pipeline: > data(fc) > names(mcols(fc)) [1] "leftSize" "rightSize" "leftValid" "rightValid" > mcols(mcols(fc)) DataFrame with 4 rows and 2 columns type description <character> <character> 1 referenceFragments 2 referenceFragments 3 referenceFragments 4 referenceFragments > data(fcf) > names(mcols(fcf)) [1] "leftSize" "rightSize" "leftValid" "rightValid" [5] "mid" "dist" "posLeft" "posRight" [9] "blind" "vpChr" "tooClose" "lowCounts" [13] "selectedForFit" "baseMean" "baseVar" "allZero" [17] "dispGeneEst" "dispFit" "dispersion" "dispIter" [21] "dispOutlier" "dispMAP" > as.data.frame(mcols(mcols(fcf))) type description 1 referenceFragments 2 referenceFragments 3 referenceFragments 4 referenceFragments 5 viewpointInfo 6 viewpointInfo 7 viewpointInfo 8 viewpointInfo 9 viewpointInfo 10 viewpointInfo 11 fragmentSelection 12 fragmentSelection 13 fragmentSelection 14 intermediate the base mean over all rows 15 intermediate the base variance over all rows 16 intermediate all counts in a row are zero 17 intermediate gene-wise estimates of dispersion 18 intermediate fitted values of dispersion 19 intermediate final estimate of dispersion 20 intermediate number of iterations 21 intermediate dispersion flagged as outlier 22 intermediate maximum a posteriori estimate Best regards, Felix
ADD REPLY
0
Entering edit mode
@nicolas-servant-1466
Last seen 22 months ago
France

Thank you Felix. It works. If useful, I can send you the function I wrote to build a FourC object from a BED file.

Last question, when I apply the code from the vignette, I have no significant interaction, although I can see on the viewpoint plot some points higher than the dotted blue line (in both replicates). Would you have any idea ?

Thank you. Nicolas

ADD COMMENT
0
Entering edit mode
Hello Nicolas, additional to the z-score we calculate associated p-values which are adjusted for multiple testing. It might be that the threshold for those is not met by these points. Try to increase fdrThresh in the addPeaks function before calling plotZScores and see if the points light up. Best regards, Felix On 12/08/2015 10:55 AM, Nicolas Servant [bioc] wrote: > Activity on a post you are following on support.bioconductor.org > <https: support.bioconductor.org=""> > > User Nicolas Servant <https: support.bioconductor.org="" u="" 1466=""/> wrote > Answer: FourC object from scratch > <https: support.bioconductor.org="" p="" 73222="" #75712="">: > > Thank you Felix. It works. If useful, I can send you the function I > wrote to build a FourC object from a BED file. > > Last question, when I apply the code from the vignette, I have no > significant interaction, although I can see on the viewpoint plot some > points higher than the dotted blue line (in both replicates). Would > you have any idea ? > > Thank you. Nicolas > > ------------------------------------------------------------------------ > > Post tags: hi-c, HiTC, fourcseq > > You may reply via email or visit > A: FourC object from scratch >
ADD REPLY
0
Entering edit mode

Hi Nicholas,

I'm using HiCPro & HiTC, would it be possible for you to post your function to build a FourC object from a BED file?

Thanks,

Jack

ADD REPLY

Login before adding your answer.

Traffic: 972 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6