package specifically dedicated to a manuscript
petyuk ▴ 70
@petyuk-7261
United States

I am performing an analysis of targeted proteomics data for a study with a rather sophisticated design. Because of this and other issues (such as the novelty of the data itself), I found myself writing a lot of custom code (visualization, normalization, etc.). The code just begs to be organized into a package. However, there would be no use for this package except reproducing the data analysis for the corresponding manuscript and perhaps exploring the data further. Nonetheless, I really want my data analysis to be reproducible by others, and I would like to refer to the real code in the manuscript instead of referring to method names (e.g., "this was done by a random forest approach [ref]").

So this tentative package doesn't fit the "software" category, because it is not generic enough and is tied to a specific study. It doesn't fit the "data" category because it is not data only. The "annotation" type is out for obvious reasons too. I guess there is always the option to forget about packaging the R code altogether and provide one very long script in the supplement.

What would you suggest?

Thanks,

Vlad

 

 

@wolfgang-huber-3550
EMBL European Molecular Biology Laborat…

Dear Vlad

It is a great idea to keep and maintain your code and a reproducible transcript of the analyses for a paper as a package. For instance, Hiiragi2013 and HD2013SGI are recent cases where we did this for two of our papers, in case you want to see how we did it. The vignettes reproduce all major tables, figures and findings of the papers. Having that helps us a lot, e.g. with addressing other scientists' questions about them.

Best wishes

Wolfgang
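To look at how the vignettes in such a package are organized, something like the following could be used. This is only a sketch: it assumes the package has already been installed from Bioconductor, and it simply lists whatever vignettes the installed package ships.

```r
# Name of the experiment package mentioned above.
pkg <- "Hiiragi2013"

if (requireNamespace(pkg, quietly = TRUE)) {
  # vignette(package = ...) returns an object whose $results matrix
  # has one row per vignette (columns include "Item" and "Title").
  vig <- vignette(package = pkg)$results[, c("Item", "Title"), drop = FALSE]
  print(vig)
} else {
  # Assumption: the package is installed via BiocManager.
  message("Package not installed; try BiocManager::install(\"", pkg, "\")")
}
```

Calling `browseVignettes("Hiiragi2013")` afterwards opens the same list in a browser, with links to the vignette sources.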

 

@martin-morgan-1513
United States

Definitely organize your code as a package, however you end up distributing it! It might be an experiment data package, or simply a package that you make available to others through non-Bioconductor means (e.g., GitHub).
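As a rough illustration of turning custom analysis functions into a package, base R's `package.skeleton()` can generate the initial layout. The package name `MyStudy2014` and the two helper functions below are hypothetical placeholders, not code from the study discussed here.

```r
# Hypothetical stand-ins for the custom analysis code.
normalize_intensities <- function(x) {
  # Median-center each column (sample) of a numeric matrix.
  sweep(x, 2, apply(x, 2, median, na.rm = TRUE))
}
plot_intensities <- function(x) {
  # One box per sample.
  boxplot(x, las = 2)
}

# Create the skeleton inside a scratch directory.
parent_dir <- file.path(tempdir(), "pkg-sketch")
dir.create(parent_dir, showWarnings = FALSE, recursive = TRUE)
package.skeleton(
  name = "MyStudy2014",  # hypothetical package name
  list = c("normalize_intensities", "plot_intensities"),
  environment = environment(),
  path = parent_dir,
  force = TRUE
)

# Add a directory for the vignette that reproduces the manuscript analysis.
pkg_dir <- file.path(parent_dir, "MyStudy2014")
dir.create(file.path(pkg_dir, "vignettes"), showWarnings = FALSE)
list.files(pkg_dir, recursive = TRUE)
```

The generated `DESCRIPTION` and `man/` files are templates that still need to be filled in by hand, and the data would go into `data/` or `inst/extdata/` before the package is built.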

Keith Hughitt ▴ 180
@keith-hughitt-6740
United States

I would suggest checking out knitr if you haven't already. It makes it easy to combine text and figures with the code used to generate them.

You can even use it to write your entire manuscript in something like LaTeX or Markdown, with the code used to generate the figures embedded in the same document. This way, you can distribute a single document, along with any data it requires, and someone else could regenerate your entire manuscript, figures included.

Another possibility would be to have a separate knitr document which generates all the figures as a supplementary file, and write the main manuscript text separately using whatever approach you are comfortable with.

A quick search for "knitr reproducible research" should give you some pretty good ideas of where to start, e.g.

https://github.com/umd-byob/presentations/tree/master/2013/0903-knitr_reproducible_research

http://yihui.name/en/2012/06/enjoyable-reproducible-research/
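The second option described above, a separate knitr document that generates the figures, might look roughly like the script below. It is a minimal sketch written in the format understood by `knitr::spin()`: lines starting with `#'` become report text and lines starting with `#+` set chunk options. The simulated matrix is an assumption standing in for the real data, and the file name used afterwards is hypothetical.

```r
#' # Supplementary figures (hypothetical example)
#'
#' Simulated intensities stand in for the real measurements.

#+ simulate-data
set.seed(1)
x <- matrix(
  rnorm(60, mean = 20),
  ncol = 6,
  dimnames = list(NULL, paste0("sample", 1:6))
)

#' Distribution of intensities per sample before normalization.

#+ raw-boxplot, fig.width=6, fig.height=4
boxplot(x, las = 2, ylab = "log2 intensity")

#' The same data after median-centering each sample.

#+ centered-boxplot, fig.width=6, fig.height=4
x_centered <- sweep(x, 2, apply(x, 2, median))
boxplot(x_centered, las = 2, ylab = "centered log2 intensity")
```

Saved as, say, `supplement.R`, the script still runs as ordinary R, and `knitr::spin("supplement.R")` should turn it into a report with the text, code, and figures interleaved.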

@herve-pages-1542
Seattle, WA, United States

Hi Vlad,

I would second Martin: definitely put your code in a package, even if it doesn't become a Bioconductor package (maybe it doesn't need to). Note that if you want more advice from the BioC developers, you should really ask on the Bioc-devel mailing list. This support site is for Bioconductor users.

Cheers,

H.

 

petyuk ▴ 70
@petyuk-7261
United States

Thanks for suggesting an experiment type of package. The ones I had looked at before contained no R code, just data. The examples of experiment packages with code, Hiiragi2013 and HD2013SGI, are right on target. This is what I was looking for.

My guess is that the original intent of experiment packages was to serve as a supplement to software packages rather than to become a data repository. But perhaps once in a while it is OK and helpful to come up with an experiment package that does not directly support any of the software packages, but rather demonstrates a complicated analysis workflow.

Thank you all for suggestions!

Vlad


It's a matter of resources. If every biological paper that comes out were suddenly accompanied by a package like Hiiragi2013, the Bioconductor infrastructure (storage, bandwidth, but mostly person-power) could be overwhelmed, and doing this well could easily dilute effort that is needed even more elsewhere in the project. On the other hand, every paper that does provide such a package is a better paper. And there are journals like GigaScience and G3, ..., where positive cross-talk seems possible.
