We need your support to future-proof Bio-Formats
Fresh on the heels of another Bio-Formats release, it’s funding time again (when isn’t it?).
As many of you know, OME’s Bio-Formats library is heavily used by the imaging community, and is a great example of what can be achieved when scientists, engineers and software developers work together. The project has become a critical component of many open and commercial software tools. As just one metric of success, Bio-Formats was started more than 36 million times in 2015. That’s an average of roughly 100,000 times each day.
Short version: We’re submitting a proposal for future-proofing Bio-Formats and extending the range of metadata OME supports. We need letters of support from the community! Please send them directly to Jason before Feb 26!
Longer version: Despite the success to date, it is worth considering the limitations of Bio-Formats, especially relative to the demands imaging will face over the next 5-10 years. Bio-Formats’ core architecture was originally designed and implemented in 2002-2004. It has undergone several updates since (one example is the adoption of code generation to adapt to changes in the core OME Data Model), but several aspects of its design and capabilities are aging: some are proving to be limitations, and others are entirely unsuited to the large, complex imaging datasets now being generated by the scientific community. A few examples:
There are others. In considering these issues, we are well aware that Bio-Formats isn’t broken, but the demand for interface solutions for imaging metadata is growing. Many of the features above are useful for existing applications, but are major obstacles for others.
To address these issues, we are submitting a funding proposal to the Wellcome Trust’s Biomedical Resource scheme. Our goal is to expand the types and sources of metadata that OME supports, and to ensure that the project’s technology can be used for evolving HCS, SRM and several 3D imaging modalities, and for object-based data sources (e.g., AWS and other clouds). In particular, tools for accessing experimental and analytic metadata from standard formats (CSV, HDF5, etc.) should be supported.
In fact, we have already built prototypes of this work as part of our development for the IDR project we’ve discussed here previously. Examples are in the URLs listed below. Getting data into the IDR has driven several updates to Bio-Formats and to the scripts for reading experimental and analytic metadata.
To help fund this, we’re asking the community to (again!!) send letters of support for future work on Bio-Formats and metadata by OME. These letters are hugely important for demonstrating community interest in, and need for, the proposed work. Note that we will also be asking for continuing support for formats, etc. to keep the core functions of Bio-Formats up to date.
If you can provide a letter of support, please send it directly to me.
Thanks in advance for your help and support, from the whole OME team.
We have built and deployed a demo version of the IDR at Dundee. This resource holds 37 TB of image data in 29M images, and includes all associated experimental (e.g., genes, RNAi, chemistry, geographic location), analytic (e.g., submitter-calculated regions and features), and functional annotations. Wherever possible, metadata in IDR links to the external resources that are authoritative for that metadata (Ensembl, NCBI, PubChem, etc.). Datasets in human cells (e.g., Plate8_Actinome1), Drosophila, and fungi (e.g., JL_120731_S6A; P105) are included. The full Mitocheck dataset and a comprehensive chemical screen in human cells are included. Finally, imaging from Tara Oceans, a global survey of plankton and other marine organisms, is also included.
— February 18, 2016