An ESR essay about software design, and how it applies to Bioconductor
@alex-f-bokov-306
Last seen 9.6 years ago
So here goes. I am about to risk getting myself blacklisted by the very people I can least afford to be blacklisted by, and at the very start of my career no less. Why am I taking this risk? Because I love Bioconductor, it's the most useful thing currently installed on my PC, and I'm deeply grateful to the developer and user community for making such a wonderful tool. The following constructive criticism is how I hope to make it better.

Here is an essay by Eric S. Raymond describing the difficulties he had configuring a software package on Linux. He is obviously the last person you'd think of as "computer illiterate", "lazy", or "clueless".

http://www.catb.org/~esr/writings/cups-horror.html

Once you wade through the technical minutiae of his specific software struggle, the main message appears to be that software is often written by individuals who are so knowledgeable in their particular field that their idea of "obvious", "self-explanatory", "intuitive", "user friendly", and even "adequately documented" may be completely different from the rest of humanity! I immediately thought of certain BioC packages I've recently bashed my head over (and over and over).

At the end of the essay ESR presents a checklist for telling whether your software suffers from problems similar to the ones he describes. For the benefit of any package developers/maintainers who may still be reading this, here's my version of that checklist as revised specifically for Bioconductor:

1. What does the package look like to a computer person who isn't a statistician or a statistician who isn't a computer person? What would be the most obvious thing someone unfamiliar with your package would try to use it for... and if they did, would they succeed after having done nothing more than read the manpage?

2. Is there any dialogue in the Tcl widgets which is a dead end, without giving guidance on what the choices actually do? (Although if you read ESR's essay you might conclude that there's no point to even having widgets, since a GUI does not automatically translate into user friendliness.)

3. The requirement that end-users read documentation is NOT a sign of failure for a program such as R which mostly lacks a UI... but...

   * Is every argument, method, and slot of every non-private object documented in the manpage *for that object* (rather than referring to some other manpage which in turn refers to another manpage, ad nauseam)?
   * Are the usage examples you give in the manpage simple, general, and comprehensible both to statisticians who aren't computer people and computer people who aren't statisticians? Hint: gratuitous use of functions that aren't from the package you're documenting reduces comprehensibility.
   * Does the documentation rely on references to hardcopy publications to explain crucial portions of the object's functionality instead of using external references as supplementary/background material?
   * If there is a significant number of usage scenarios where the default argument values will be inappropriate, is the user warned?
   * Are the manpages in sync with the current package version?

4. Do you ever find yourself using any phrase resembling "The syntax is just like it is for the S-Plus version"?

5. Does your project welcome and respond to usability feedback from non-expert users?

6. Do error messages give enough information to be able to distinguish between malformed input/arguments, platform limitations (memory, drive space, access permissions), problems in R itself, and other ("other" presumably being the real bugs)?

Thank you for your patience in reading this. I don't pretend to understand the technical complexity of your work, nor your motivations for doing it. However, if you do write open source software such as Bioconductor packages, it would be logical to at least assume that you want other people to use your software. Hopefully the above considerations will assist in making that happen.
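To make item 6 of the checklist concrete, here is a minimal sketch in R of errors that carry a category, so that a caller can tell malformed input apart from a platform limit or an unexpected bug. It is only an illustration under stated assumptions: `typed_error`, `normalize_matrix`, `describe_failure`, the class names `malformedInputError` and `resourceLimitError`, and the `max_cells` limit are all hypothetical names invented for this example, not part of R or of any Bioconductor package. Only `structure()`, `stop()`, `tryCatch()`, and `conditionMessage()` are base R.

```r
# Hypothetical helper: build an error condition tagged with a category class.
typed_error <- function(category, message) {
  structure(
    class = c(category, "error", "condition"),
    list(message = message, call = NULL)
  )
}

# Hypothetical package function that reports *why* it refuses to run.
normalize_matrix <- function(x, max_cells = 1e6) {
  if (!is.matrix(x) || !is.numeric(x)) {
    stop(typed_error(
      "malformedInputError",
      paste0("'x' must be a numeric matrix, not an object of class '",
             class(x)[1], "'")
    ))
  }
  if (length(x) > max_cells) {
    stop(typed_error(
      "resourceLimitError",
      paste0("'x' has ", length(x), " cells; this sketch refuses more than ",
             max_cells)
    ))
  }
  # Centre each column on its mean.
  sweep(x, 2, colMeans(x))
}

# The caller dispatches on the condition class rather than parsing text.
describe_failure <- function(expr) {
  tryCatch(
    { expr; "ok" },
    malformedInputError = function(e) paste("bad input:", conditionMessage(e)),
    resourceLimitError  = function(e) paste("platform limit:", conditionMessage(e)),
    error               = function(e) paste("possible bug:", conditionMessage(e))
  )
}

print(describe_failure(normalize_matrix("not a matrix")))
print(describe_failure(normalize_matrix(matrix(1:4, 2), max_cells = 2)))
print(describe_failure(normalize_matrix(matrix(c(1, 2, 3, 4), 2))))
```

The handlers are listed from most to least specific because `tryCatch()` uses the first handler whose class matches; the generic `error` handler at the end catches everything the package author did not anticipate.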
@vincent-j-carey-jr-4
Last seen 6 weeks ago
United States
> friendly", and even "adequately documented" may be completely different
> from the rest of humanity! I immediately thought of certain BioC
> packages I've recently bashed my head over (and over and over).

the developers are fairly responsive to questions

> At the end of the essay ESR presents a checklist for telling whether
> your software suffers from problems similar to the ones he describes.
> For the benefit of any package developers/maintainers who may still be
> reading this, here's my version of that checklist as revised
> specifically for Bioconductor:
>
> 1. What does the package look like to a computer person who isn't a
>    statistician or a statistician who isn't a computer person? What
>    would be the most obvious thing someone unfamiliar with your
>    package would try to use it for... and if they did, would they
>    succeed after having done nothing more than read the manpage?

we've taken care to develop a "vignette" protocol in addition
to man pages so that the user may get a holistic view of a software
component's roles. all bioc packages have vignettes. admittedly
these are not perfect but they help to illustrate and test
interoperability.

> 2. Is there any dialogue in the Tcl widgets which is a dead end,
>    without giving guidance on what the choices actually do? (although
>    if you read ESR's essay you might conclude that there's no point
>    to even having widgets, since a GUI does not automatically
>    translate into user friendliness)

some widgets are extremely useful. no essay would convince
me to eliminate them. there is clearly scope for improvement
with some of them. we have taken care to provide widget-building
tools so that user/developers dissatisfied with the behavior
of a given widget can try to design one that is more effective.

> 3. The requirement that end-users read documentation is NOT a sign of
>    failure for a program such as R which mostly lacks a UI... but...
>      * Is every argument, method, and slot of every non-private
>        object documented in the manpage
>        *for that object* (rather than referring to some other
>        manpage which in turn refers to another manpage, ad nauseam)?

that is the intention of the documentation validation protocol
of R CMD check. it can be subverted, and when it is, we try
to remedy it.

>      * Are the usage examples you give in the manpage simple,
>        general, and comprehensible both to statisticians who aren't
>        computer people and computer people who aren't
>        statisticians? Hint: gratuitous use of functions that aren't
>        from the package you're documenting reduces comprehensibility.

perhaps not. perhaps you have a better example to contribute.
again the vignettes help to provide context. there is also
a browser for vignettes called vExplorer

>      * Does the documentation rely on references to hardcopy
>        publications to explain crucial portions of the object's
>        functionality instead of using external references as
>        supplementary/background material?

perhaps. we have limited resources for what we are doing and
sometimes a demand must be made on the user or reader to
obtain an explanatory resource.

>      * If there is a significant number of usage scenarios where
>        the default argument values will be inappropriate, is the
>        user warned?
>      * Are the manpages in sync with the current package version?

they should be, and there are mechanisms for verifying this.

> 4. Do you ever find yourself using any phrase resembling "The syntax
>    is just like it is for the S-Plus version"?

no.

> 5. Does your project welcome and respond to usability feedback from
>    non-expert users?

yes.

> 6. Do error messages give enough information to be able to
>    distinguish between malformed input/arguments, platform
>    limitations (memory, drive space, access permissions), problems in
>    R itself, and other ("other" presumably being the real bugs)?

in many cases, yes. in other cases, no. provide resources
so that we can add programming effort to exception-handling
features and this situation will improve.

> Thank you for your patience in reading this. I don't pretend to
> understand the technical complexity of your work, nor your motivations
> for doing it. However, if you do write open source software such as
> Bioconductor packages, it would be logical to at least assume that you
> want other people to use your software. Hopefully the above
> considerations will assist in making that happen.

it is happening.
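For readers who have not met vignettes before, the following is a minimal sketch of how to find them from the R prompt. `vignette()` lives in the `utils` package that ships with R. The `grid` package is used here only as an assumption of convenience, because it is part of a standard R installation; the name of any installed Bioconductor package could be substituted.

```r
# List the vignettes provided by an installed package.
# "grid" is a stand-in chosen because it ships with R.
found <- vignette(package = "grid")

# The listing is held in a character matrix, one row per vignette.
print(found$results[, c("Item", "Title"), drop = FALSE])

# Open a single vignette by topic name (uncomment to launch a viewer).
# vignette("grid", package = "grid")
```

If the listing is empty, the package was installed without its vignettes, which is itself worth knowing before searching the manpages.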
I would say Bioconductor is not extremely friendly, but neither is it extremely unfriendly to statisticians. I have had a number of statistics graduate students and others at the same level get programs working for me with very little of my input. None of them had much prior experience with R.

I do know R/Splus well enough to write primitive functions, so I can look at code when the programs do not work as expected. However, I am not very comfortable working with the objects and "slots" created by the Bioconductor software. Nevertheless, by using the vignettes and online help, I have managed to write my own software that uses these objects. I have had some problems, but by ploughing through this e-mail list I have solved most by myself.

Most of my biologist collaborators are struggling more than I am, but still getting things to work - sometimes with my help and often without. All of us are finding the responsiveness of people on this list to be extremely helpful.

Yes, we could probably do some things faster with commercial software. But then we would not get the most up-to-date methods, easy access to the functionality of R, and the benefit of interaction on this list. I have heard of vendors charging in the 30K range for small parts of the functionality that is already built into Bioconductor.

In conclusion - my thanks to the developers and to people who answer questions posted to this list. Of course, if anyone wants to improve the documentation, I am all for it.

--Naomi

At 04:00 PM 2/27/2004, Vincent Carey 525-2265 wrote:
> > friendly", and even "adequately documented" may be completely different
> > from the rest of humanity! I immediately thought of certain BioC
> > packages I've recently bashed my head over (and over and over).
>
>the developers are fairly responsive to questions

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                            814-863-7114 (fax)
Penn State University                          814-865-1348 (Statistics)
University Park, PA  16802-2111