Hi Rory,
Thanks for advising me to post the question to the main Bioconductor forum.
When trying to read in the peaksets with a command like this:
factorname = dba(sampleSheet="filename.csv")
I get this error message:
Error in Summary.factor(c(59L, 15L, 56L, 12L, 21L, 23L, 43L, 52L, 28L, :
‘max’ not meaningful for factors
I don’t have a PeakFormat column in the filename.csv file, and I am using BED files generated by Homer2. The BED files are plain tab-delimited text files with 4 columns; the 4th column holds the peak scores in the format chr#-value.
I found that deleting the 4th column solves the problem, and the peaksets are then read in.
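For anyone hitting the same error, here is a minimal sketch of that workaround in Python. It assumes the input is tab-delimited with the chromosome, start and end in the first three columns; the function name drop_name_column is only illustrative and is not part of DiffBind or HOMER.

```python
def drop_name_column(lines):
    """Keep only chrom, start, end from tab-delimited BED lines.

    Assumes the first three columns are chrom, start, end; everything
    after them (here the non-numeric chr#-value column) is discarded.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        out.append("\t".join(fields[:3]))
    return out


if __name__ == "__main__":
    example = ["chr1\t100\t200\tchr1-59\n", "chr2\t300\t450\tchr2-15\n"]
    for row in drop_name_column(example):
        print(row)
```

The result is a 3-column BED file with no score column at all.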
My question is whether DiffBind considers the peak scores at all, and whether the problem that arises later (I cannot get beyond factorname = dba.analyze(factorname)) is a result of how the peaksets were read in.
Thanks in advance,
Iliya Lefterov
The reason the fifth column values are all set to 1 is, I think, a bug in the Perl script pos2bed.pl. No matter whether you use the "-5" option or not (-5 : Set 5th column to the value 1 instead of value in 6th column of pos file), the fifth column will always be 1.
Hi Gord, I think it should be $3-1 for the start position.
My understanding is that BED format is 0-based for the second column (chromStart), while the peak txt file from HOMER is 1-based. I double-checked the pos2bed.pl script, which uses "my $start = $line[2]-1;" to get the start position. If I am wrong, please correct me, because I am currently using $3-1. Thanks a lot!
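To make the coordinate shift concrete, here is a small Python sketch of the conversion as I understand it. The column layout is an assumption on my part (peak ID, chromosome, start, end, strand, then a value in the 6th column), and homer_peak_to_bed is a hypothetical helper, not something shipped with HOMER.

```python
def homer_peak_to_bed(line, score_column=6):
    """Convert one tab-delimited HOMER peak line to a 6-column BED line.

    Assumed input layout (1-based columns): 1 = peak ID, 2 = chr,
    3 = start, 4 = end, 5 = strand, score_column = value used as score.
    The start is shifted by -1 because BED chromStart is 0-based.
    """
    fields = line.rstrip("\n").split("\t")
    peak_id, chrom, start, end, strand = fields[:5]
    score = fields[score_column - 1]
    bed_start = int(start) - 1  # same as $3-1 in awk or $line[2]-1 in Perl
    return "\t".join([chrom, str(bed_start), end, peak_id, score, strand])


if __name__ == "__main__":
    print(homer_peak_to_bed("chr1-1\tchr1\t1001\t1200\t+\t59.5\n"))
```

The end coordinate is left unchanged, since chromEnd in BED is exclusive and therefore equals the 1-based inclusive end.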
Also, could I ask a question about HOMER? Do you know how HOMER calculates the score for a given peak? To be honest, I don't know what the "peak score" means. Here is what I got from the HOMER manual, as below. Actually, I added the 6th column to the converted BED file. What is the difference between them as far as DiffBind is concerned? Thank you so much!