Dear all,
With the rapid increase in use of digital pathology and other large XY-extent microscopy images, we see the need for a properly specified and open file format that accommodates the multi-resolution pyramids required to conveniently exchange and rapidly render such images. What follows is a draft that proposes an extension to the TIFF and OME-TIFF specifications for that purpose. As with all file formats, success is only possible if a large enough part of the community supports it and starts using it. Your input and changes are therefore essential to yield something that meets the needs of as many users as possible.
Thanks,
- Damir
---------------------------
PyrTIFF: a TIFF specification extension for storing multi-resolution pyramids
TIFF (and its derivatives) is one of the most heavily used image file formats for scientific image data. Other formats such as HDF5, arguably better suited to scientific image data, have not yet achieved adoption rates anywhere near that of TIFF.
Unlike other popular image formats (JPEG, PNG, GIF, etc.), the flexibility of TIFF enables storage of complex image data types such as high bit-depth data, multi-channel data, Z-stacks, time series, etc. Furthermore, some TIFF derivatives (such as OME-TIFF) have codified a standardized way to store such data beyond the official TIFF specification.
The TIFF specification (v6.0 from 1992: https://www.awaresystems.be/imaging/tiff/specification/TIFF6.pdf) is officially owned by Adobe Systems Inc., but they have not touched the specification since 2002 (https://www.awaresystems.be/imaging/tiff/specification/TIFFphotoshop.pdf). An open source-minded group of volunteers has effectively managed a de facto evolved TIFF specification (e.g. incorporating the BigTIFF extension). This activity is loosely organized around the libtiff library (http://www.simplesystems.org/libtiff/). Thus we can pretend that TIFF is an open standard with libtiff as the open source reference implementation.
One critical capability, namely a way to store the multi-resolution pyramids needed for large XY-extent images such as digital pathology images, has not been standardized. A number of ad hoc implementations exist in open source (ImageMagick and VIPS), as well as commercially (Aperio’s SVS, Hamamatsu’s NDPI, Leica’s SCN, Ventana’s BIF/TIF, etc.). However, none of these implementations are compatible with each other, most are not well-specified or documented, and the commercial ones tend to be proprietary. Furthermore, these ad hoc approaches all impose significant and different restrictions on TIFF’s flexibility. We need an open but standardized specification, and implementations thereof, that support the flexibility needed for scientific image data.
The following is an attempt to define an extension to the TIFF specification that combines most of the desirable features of a number of existing extensions and derivations while keeping maximum compatibility with the existing TIFF 6.0 specification and with existing implementations and derivations. We’ll refer to this specification extension as “pyrTIFF”.
1)	While both TIFF and BigTIFF encodings are allowed, large XY-extent image data will likely require use of the BigTIFF extension and the method defined in libtiff 4.x is required for that. See: https://www.awaresystems.be/imaging/tiff/bigtiff.html
2)	For any data that includes multi-channel, Z stack, time series, or other dimensionality extension, the OME-TIFF specification is highly recommended. See: https://docs.openmicroscopy.org/ome-model/5.5.7/ome-tiff/specification.html
3)	For any image data that requires rich metadata, the OME-TIFF method and schema should be strongly considered if at all appropriate for the image data type. Alternative metadata options include EXIF for photography data or GeoTIFF metadata definitions for geo-referencing data. 
4)	Compression is supported on a per-tile basis and multiple compression methods are optionally allowed, incl. JPEG, LZW, Deflate (i.e. the zlib algorithm also used by PNG), JPEG2000, JPEG-XR, etc. Also allowed is compression/encoding on a per-plane basis (i.e. per IFD), specifically for inherently multi-resolution compression/encoding methods such as JPEG2000. Note: if compression is per tile, it is allowed, and arguably desirable for scientific image data, that the full-resolution image is stored using lossless or no compression while the sub-resolution tiles are stored using lossy compression for improved reading and rendering speed.
5)	A specific question is where to store the pyramid of sub-resolution representations per image plane. The 2 possibilities are: option 1) follow the implied intent of the use of SubIFDs (https://www.awaresystems.be/imaging/tiff/specification/TIFFPM6.pdf) and store the sub-resolution representations in a sequence of SubIFDs of the image plane’s IFD, and option 2) stay in line with every implementation of pyramid storage so far and store the reduced resolution images as a sequence of top-level IFDs. While option 1 is more elegant, in order to optimize compatibility with existing approaches, we propose to initially allow only option 2. A future revision may also accept option 1.
6)	In either case the reduced resolution sequence should follow a dyadic reduction in both X and Y until one of the dimensions reaches 256. Other reduction schemes are allowed and can be encoded in a higher level specification such as OME-TIFF or alternatively the reader code can deduce it from the sizes of the reduced resolution images in the sequence.
7)	As already formalized in the TIFF 6.0 specification and implemented in libtiff, each full-resolution IFD should have bit 0 of the NewSubfileType tag unset while each reduced resolution IFD should have bit 0 of the NewSubfileType tag set. For maximum compatibility, it is recommended that the deprecated SubfileType tag is also set, to 1 for full-resolution IFDs and to 2 for reduced resolution IFDs.
8)	Each IFD not containing a reduced resolution image can optionally store a thumbnail in its SubIFD 1. The thumbnail must be JPEG or PNG, strip or raster (no tiles), and no larger than 4096x4096; the recommended size is 1024 pixels on the largest side. In addition, to support digital pathology applications, the first IFD in the file can optionally store a slide label image in its SubIFD 2 with the same specifications as the thumbnail image. Optionally, a slide overview image with the same specifications can be stored as SubIFD 3.
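To make points 6 and 7 concrete, here is a minimal sketch in Python of how a writer might plan the levels of one pyramid. The names (Level, dyadic_levels, min_size) are hypothetical and not part of any existing library. Two details are assumptions rather than part of the proposal: the sequence stops before either side would drop below 256, and odd sizes are rounded up when halved.

```python
from typing import List, NamedTuple

REDUCED_IMAGE_BIT = 0x1  # bit 0 of NewSubfileType (TIFF tag 254)


class Level(NamedTuple):
    width: int
    height: int
    new_subfile_type: int  # NewSubfileType (tag 254)
    subfile_type: int      # deprecated SubfileType (tag 255)


def dyadic_levels(width: int, height: int, min_size: int = 256) -> List[Level]:
    """Return the full-resolution level followed by its dyadic reductions."""
    # Full resolution: bit 0 unset, deprecated SubfileType = 1.
    levels = [Level(width, height, 0, 1)]
    w, h = width, height
    # Assumption: stop before either side would fall below min_size.
    while w // 2 >= min_size and h // 2 >= min_size:
        # Assumption: round odd sizes up so no pixel row/column is dropped.
        w, h = (w + 1) // 2, (h + 1) // 2
        # Reduced resolution: bit 0 set, deprecated SubfileType = 2.
        levels.append(Level(w, h, REDUCED_IMAGE_BIT, 2))
    return levels


if __name__ == "__main__":
    for level in dyadic_levels(4096, 2048):
        print(level)
```

For a 4096x2048 plane this plans four IFDs, ending at 512x256. A writer following option 2 in point 5 would emit them as consecutive top-level IFDs.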
Implementations:
Current TIFF readers/writers: When presented with a pyrTIFF-compliant TIFF file, most existing TIFF readers will at a minimum be able to read the initial full resolution image from the file and either ignore the sub-resolution versions or treat them as additional separate images in the file. This is considered acceptable behavior. A few readers such as the IIPImage server (http://iipimage.sourceforge.net/) and the OpenSlide library (http://openslide.org/) can, with some restrictions, use the full pyramid of sub-resolutions. A number of widely used software applications (ImageMagick, GraphicsMagick, VIPS, some versions of Adobe Photoshop) can already generate pyramidal TIFF files that are compatible with the specifications proposed.
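As an illustration of what a pyrTIFF-aware reader would do differently, the following Python sketch regroups a flat sequence of top-level IFDs into one pyramid per image plane and deduces the reduction factors from the image sizes. It works on plain (width, height, NewSubfileType) tuples rather than on a real TIFF library, and the function names are hypothetical. It assumes that the reduced resolution IFDs directly follow the full-resolution IFD they belong to, which the draft does not yet state explicitly.

```python
from typing import List, Tuple

Ifd = Tuple[int, int, int]  # (width, height, NewSubfileType)


def group_pyramids(ifds: List[Ifd]) -> List[List[Ifd]]:
    """Split a flat top-level IFD sequence into one pyramid per image plane."""
    pyramids: List[List[Ifd]] = []
    for ifd in ifds:
        is_reduced = bool(ifd[2] & 0x1)  # bit 0 set: reduced resolution image
        if is_reduced and pyramids:
            # Assumption: a reduced IFD belongs to the preceding full-res IFD.
            pyramids[-1].append(ifd)
        else:
            # A full-resolution IFD (or a leading reduced IFD with no parent)
            # starts a new pyramid.
            pyramids.append([ifd])
    return pyramids


def scale_factors(pyramid: List[Ifd]) -> List[float]:
    """Deduce each level's downsampling factor from its width."""
    full_width = pyramid[0][0]
    return [full_width / width for width, _, _ in pyramid]


if __name__ == "__main__":
    flat = [(4096, 2048, 0), (2048, 1024, 1), (1024, 512, 1),
            (4096, 2048, 0), (2048, 1024, 1)]
    for pyramid in group_pyramids(flat):
        print(scale_factors(pyramid))
```

A reader that ignores the NewSubfileType flag would instead see the same input as five unrelated images, which is the fallback behavior described above.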
OME-TIFF: Currently the OME-TIFF specification does not cover storing pyramidal large images. With this write-up we propose that the OME-TIFF specification be extended to support such images. In doing so, the pyrTIFF extension would become an integral part of the OME-TIFF specification. One approach could be as follows: Use the Pixels and TiffData elements to define the reduced resolution IFDs analogous to the specification of Z, T, and C by adding an “R” dimension for all per-plane encoding/compression schemes. For non-dyadic reduction schemes, the TiffData element can specify the sub-resolution on a per-IFD basis (e.g.: <TiffData IFD="2" Resolution%="50"/> ). For encoding/compression schemes that incorporate pyramids natively such as JPEG2000, use the TiffData element to flag that (e.g.: <TiffData IFD="1" Encoding="JP2000"/> ).
Bio-Formats, OMERO, and OME-Files: These products/projects from OME already have some support for pyramidal large image types as an input and/or for internal storage. As part of this write-up, we propose the following developments:
- Bio-Formats: a) extend (or subclass) the existing TIFF reader to accept and handle pyrTIFF files and treat them similarly to how it handles Aperio SVS files. b) extend (or subclass) the OME-TIFF writer (and possibly also the regular TIFF writer) to optionally generate pyrTIFF-compliant OME-TIFF files (see above), where the pyramid can be calculated at time of writing, taken from the internal OMERO Pyramid format, or, possibly in the case of bfconvert, taken from an input file that already contains a pyramid.
- OMERO: accept pyrTIFF-compliant files as just another pyramidal file format providing such image files with fast import, no pyramid calculation, and no additional storage needs beyond the original file. It appears from the documentation that the internal OMERO Pyramid format v1.0.0 may be compatible with pyrTIFF and could thus be stored in the proposed extended OME-TIFF format for external use.
- OME Files: this C++ reference library for the OME-TIFF file format needs to be extended to implement the OME-TIFF w/ pyrTIFF extension specification for both reading and writing. Ideally the regular TIFF reading and writing would also support pyrTIFF-compliant files.
Libtiff: this general-purpose reference C/C++ library for reading and writing TIFF files has, as mentioned above, become the vehicle for the de facto extended specification of the TIFF format. We propose that the libtiff community consider accepting the pyrTIFF extension in a similar way to how it incorporated BigTIFF. In practice this would mean: a) updating the libtiff documentation to incorporate the pyrTIFF extension, b) developing a few wrapper functions as part of libtiff that implement: high-level read functions to access the desired resolution level or SubIFD-stored specialty images, high-level write functions which optionally generate the sub-resolution pyramid, and high-level write functions for the SubIFD specialty images. Even without b) the current libtiff is already fully capable of writing pyrTIFF-compliant TIFF files and it can read all such files that don’t rely on features not (yet) supported by libtiff.
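The core of a high-level read function that accesses "the desired resolution level" could be as small as the following Python sketch. The function best_level is a hypothetical name for illustration and not an existing libtiff call. It assumes the level widths are listed from full resolution downwards, and it picks the most reduced level whose downsampling factor does not exceed what the viewer asked for, so the result never has to be upscaled.

```python
from typing import Sequence


def best_level(level_widths: Sequence[int], requested_downsample: float) -> int:
    """Return the index of the smallest level that still has enough detail.

    level_widths is assumed to be ordered from full resolution (index 0)
    to the most reduced level.
    """
    full_width = level_widths[0]
    best = 0  # fall back to full resolution
    for index, width in enumerate(level_widths):
        # Keep the last level whose factor is within the requested one.
        if full_width / width <= requested_downsample:
            best = index
    return best


if __name__ == "__main__":
    widths = [4096, 2048, 1024, 512]
    print(best_level(widths, 3.0))
```

With the widths above, a request for a 3x reduced view selects the 2x level (index 1), which the caller then scales down slightly for display.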
ZIF: Another desirable feature that has been identified is the ability to view such large XY-extent images directly via web browsers to enable remote viewing of such image data without an image server application. The support of such a maximally web-friendly option is provided by the ZIF format specified on http://zif.photo/. ZIF has stricter requirements to enable direct web delivery of panning and zooming of extremely large images using standard web browsers. Effectively ZIF is a more restricted subset of the pyrTIFF extension. A ZIF-compliant file should be readable by a regular TIFF reader to access the initial full resolution image while a pyrTIFF-compliant reader can also interpret and use the full pyramid of sub-resolution versions of the ZIF image.
			
		 
    


