Question: Ringo problem
10.9 years ago, Jianping Jin wrote:
Dear list,

I ran "runExonerate.sh" and had the per-chromosome output files condensed into one single file (see below, I read in the file just for viewing):

> str(allchranno)
'data.frame': 390388 obs. of 6 variables:
 $ SEQ_ID    : Factor w/ 388250 levels "chr1:10000013-10000070",..: 31687 28404 33240 34681 29011 26915 ...
 $ PROBE_ID  : Factor w/ 373478 levels "5313_0001_0001",..: 138251 325230 15265 268671 45500 270116 ...
 $ CHROMOSOME: Factor w/ 21 levels "1","10","11",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ POSITION  : int 75573476 4540877 79390517 80647222 5734395 30338873 82085749 7386228 61247293 ...
 $ LENGTH    : int 50 50 50 60 50 60 50 50 50 52 ...
 $ MISMATCHES: int 0 0 0 0 0 0 0 0 0 0 ...

But when I tried to map probes to the genome I got an error and warnings:

> probeAnno <- posToProbeAnno("C:/from_DriveD/Chip- chip/Bultman/allChromExonerateOut_scott.txt")
Creating probeAnno mapping for chromosome 1 10 11 12 13 14 15 16 17 18 19 2 3 4 5 6 7 8 9 X Y Done.
Error in validObject(.Object) : invalid class "probeAnno" object: FALSE
In addition: Warning message:
In validityMethod(object) :
  Some match positions end before they actually start.
  Please check elements 1.start and 1.end .

Appreciate it if you can help!

Jianping

FYI:

> sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines tools stats graphics grDevices utils datasets methods base

other attached packages:
 [1] Ringo_1.6.0 SparseM_0.78 RColorBrewer_1.0-2 vsn_3.8.0 affy_1.20.0
 [6] limma_2.16.3 geneplotter_1.20.0 annotate_1.20.1 xtable_1.5-4 AnnotationDbi_1.4.2
[11] lattice_0.17-15 genefilter_1.22.0 survival_2.34-1 Biobase_2.2.1

loaded via a namespace (and not attached):
[1] affyio_1.10.1 DBI_0.2-4 grid_2.8.0 KernSmooth_2.22-22 preprocessCore_1.4.0
[6] RSQLite_0.7-1

##################################
Jianping Jin Ph.D.
Bioinformatics scientist
Center for Bioinformatics
Room 3133 Bioinformatics building
CB# 7104
University of Chapel Hill
Chapel Hill, NC 27599
Phone: (919)843-6105
FAX: (919)843-3103
E-Mail: jjin at email.unc.edu
modified 10.9 years ago by Joern Toedling • written 10.9 years ago by Jianping Jin
10.9 years ago, Joern Toedling wrote:
Hi Jianping,

I am not completely sure what the source of the error is yet, so bear with me as I am trying to find out.

First, something could be wrong with the merged Exonerate output file. Which version of Exonerate are you using?

Can you also please tell me what the output of

summary(allchranno$LENGTH)

and

summary(allchranno$POSITION)

are?

Another issue could be the use of factors for probe and chromosome identifiers. When reading in the output file using read.table, read.delim etc., please try the argument "as.is=TRUE", which will prevent the conversion of character vectors into factors. You can then directly supply the data.frame allchranno to the function posToProbeAnno.

probeAnno <- posToProbeAnno(allchranno)

Please tell me whether the error message still persists. If so, could you please provide me with the file allChromExonerateOut_scott.txt or an excerpt thereof (please do not attach it to the mail but provide it for download on some server) such that I can further look into the issue.

Best regards,
Joern

--
Joern Toedling
EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom
Phone +44(0)1223 492566
Email toedling at ebi.ac.uk
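A minimal sketch of the reading step suggested in this reply, assuming a tab-separated file with the six columns shown in the question. The toy table, its values, and the temporary file are invented for illustration; the posToProbeAnno call is left as a comment because it needs the Ringo package to be installed.

```r
# Toy stand-in for the merged Exonerate output; all values are invented.
toy <- data.frame(
  SEQ_ID     = c("chr1:100-150", "chr1:200-250", "chrX:300-360"),
  PROBE_ID   = c("P_0001", "P_0002", "P_0003"),
  CHROMOSOME = c("1", "1", "X"),
  POSITION   = c(100L, 200L, 300L),
  LENGTH     = c(50L, -50L, 60L),
  MISMATCHES = c(0L, 0L, 0L),
  stringsAsFactors = FALSE
)
toyFile <- tempfile(fileext = ".txt")
write.table(toy, toyFile, sep = "\t", quote = FALSE, row.names = FALSE)

# as.is = TRUE keeps identifier columns as character vectors
# (before R 4.0.0, character columns were converted to factors by default).
allchranno <- read.delim(toyFile, as.is = TRUE)
str(allchranno)

# The two diagnostics requested in the reply.
summary(allchranno$LENGTH)
summary(allchranno$POSITION)

# With Ringo loaded, the data.frame could then be passed directly:
# probeAnno <- posToProbeAnno(allchranno)
```

A negative minimum in the LENGTH summary is the kind of anomaly these diagnostics are meant to surface.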
Hi Joern,

Please see below,

--On Monday, January 12, 2009 5:44 PM +0000 Joern Toedling <toedling at ebi.ac.uk> wrote:

> First, something could be wrong with the merged Exonerate output file.
> Which version of Exonerate are you using?

exonerate-2.2.0-x86_64

> Can you also please tell me what the output of
> summary(allchranno$LENGTH)

> summary(allchranno$LENGTH)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -72.00   50.00   50.00   50.13   51.00   75.00

> and
> summary(allchranno$POSITION)

> summary(allchranno$POSITION)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
   133400  37680000  75400000  77490000 112700000 197100000

> Please tell me whether the error message still persists. If so could you
> please provide me with the file allChromExonerateOut_scott.txt
> or an excerpt thereof (please do not attach it to the mail but provide
> it for download on some server) such that I can further look into the
> issue.

Yes. The problem is the same. I put up the data file on <http://seattle.med.unc.edu/jjin/>. You can check that out.

Thanks,
Jianping
Hello,

well, the culprits are the matches with a negative entry in LENGTH, as these are not supposed to happen. I am not sure how these came about, but it might have to do with changes in Exonerate (the scripts were written for Exonerate version 2.0.0). I shall investigate this further and get back to you. For the moment, I am afraid you will have to either discard the lines with negative-length matches before calling posToProbeAnno (how many are there?) or find a way to correct them in the Exonerate output file.

Regards,
Joern
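The interim workaround described in this reply (discarding negative-length matches before calling posToProbeAnno) might look like the sketch below. The toy data frame is invented, and the end-position formula (start + LENGTH - 1) is an assumption used only to illustrate why a negative LENGTH would make a match end before it starts; it is not taken from the Ringo source.

```r
# Toy data with two negative-length matches; all values are invented.
allchranno <- data.frame(
  PROBE_ID   = c("P_0001", "P_0002", "P_0003", "P_0004"),
  CHROMOSOME = c("1", "1", "2", "X"),
  POSITION   = c(100L, 200L, 300L, 400L),
  LENGTH     = c(50L, -50L, 60L, -72L),
  stringsAsFactors = FALSE
)

# Assumed convention: match end = start + LENGTH - 1.
# Under that assumption a negative LENGTH puts the end before the start.
matchEnd <- allchranno$POSITION + allchranno$LENGTH - 1L
endsBeforeStart <- matchEnd < allchranno$POSITION
sum(endsBeforeStart)

# Keep only rows with a positive match length.
cleaned <- allchranno[allchranno$LENGTH > 0, , drop = FALSE]
nrow(cleaned)

# With Ringo loaded, the filtered table could then be mapped:
# probeAnno <- posToProbeAnno(cleaned)
```

Filtering loses the affected probes entirely, so it is only a stopgap until the cause of the negative values is understood.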
Hi Joern,

There are 6247 probes with LENGTH <= 0 (actually <= -50). The minus sign may refer to the genome strand. I checked the demo data set, which also contains negative values for LENGTH. The only difference I can tell is the format of PROBE_ID: in the demo file it is something beginning with MM, e.g. MM5000P01479955, while in my file it is something like 5313_0514_0052.

Jianping
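For reference, a count like the one reported here can be obtained along the following lines. This is a sketch on an invented toy table (only the PROBE_ID format mimics the one in the thread), and the variable names are hypothetical.

```r
# Toy data; values are invented, PROBE_ID format follows the thread.
allchranno <- data.frame(
  PROBE_ID   = c("5313_0001_0001", "5313_0001_0002",
                 "5313_0514_0052", "5313_0514_0053"),
  CHROMOSOME = c("1", "1", "2", "X"),
  LENGTH     = c(50L, -50L, 60L, -72L),
  stringsAsFactors = FALSE
)

# How many matches have a non-positive length?
nonPositive <- allchranno$LENGTH <= 0
nNonPositive <- sum(nonPositive)

# Largest of the non-positive lengths: shows whether all are <= -50.
maxNonPositive <- max(allchranno$LENGTH[nonPositive])

# Distribution of the affected matches over chromosomes.
perChrom <- table(allchranno$CHROMOSOME[nonPositive])
perChrom
```

If the sign does encode the strand, as suggested in this message, the per-chromosome table would be expected to show affected matches spread across all chromosomes rather than concentrated in one file.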