Whether a format suits the data depends on the type of data, how the data was generated, and the equipment used. When preparing data for publication in data repositories or data journals, also check which formats the repository supports and the publisher recommends, as these vary. Where possible, the choice of data format should take several key criteria into account:
- Is the data format widely used?
- Is the format suitable for long-term storage?
- Is the format open, so that it can be used without licensed software?
- How complex is the format? Simpler formats are recommended.
- Can compression (archiving) be applied to the format without degrading data quality?
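The compression criterion above can be checked directly. The following is a minimal sketch, not a prescribed procedure: it writes a small table to CSV, compresses it with gzip, and compares checksums before and after the round trip. The helper names (`table_to_csv_bytes`, `compression_is_lossless`) and the sample values are hypothetical and exist only for illustration.

```python
import csv
import gzip
import hashlib
import io


def sha256(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def table_to_csv_bytes(header, rows) -> bytes:
    """Serialize a table to UTF-8 encoded CSV with '\n' line endings."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def compression_is_lossless(raw: bytes) -> bool:
    """Compress and decompress with gzip, then compare checksums."""
    compressed = gzip.compress(raw)
    restored = gzip.decompress(compressed)
    return sha256(raw) == sha256(restored)


# Hypothetical measurements, invented for illustration only.
header = ["sample_id", "temperature_c"]
rows = [["s1", 21.5], ["s2", 22.0]]

raw = table_to_csv_bytes(header, rows)
print(compression_is_lossless(raw))  # True: gzip is a lossless compressor
```

The same checksum comparison works for any archive format that claims to be lossless. Lossy compression, such as that used by many image and audio codecs, would fail this check by design.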
Recommended data formats:
| Data type | Recommended formats |
| --- | --- |
| Text | PDF (preferably PDF/A); without formatting: TXT; editable: ODT, RTF, HTML; text with formulas: LaTeX (TEX) |
| Tables | CSV / TSV; numerical data: HDF5 |
| Graphics | Raster: PNG, TIFF; vector: SVG, EPS |
| Multimedia | Containers: MKV, WebM; video: AV1, VP9; audio: FLAC, WAV, Vorbis, Opus |
| Linked and/or structured data | SIARD, database dump, XML, CSV / TSV, HDF5, JSON, YAML |
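
As an illustration of how the table above might be applied in practice, the sketch below looks up a file's extension and reports which data types list it as recommended. The extension mapping is an assumption made for this example: the table names formats, not file extensions, and codecs such as AV1 and VP9 have no extension of their own. The function name `recommended_categories` is likewise hypothetical.

```python
from pathlib import Path
from typing import List

# Assumed mapping from table rows to common file extensions.
# The source table lists formats only; these extensions are illustrative.
RECOMMENDED_EXTENSIONS = {
    "text": {".pdf", ".txt", ".odt", ".rtf", ".html", ".tex"},
    "tables": {".csv", ".tsv", ".h5", ".hdf5"},
    "graphics": {".png", ".tif", ".tiff", ".svg", ".eps"},
    "multimedia": {".mkv", ".webm", ".flac", ".wav", ".ogg", ".opus"},
    "structured": {".siard", ".xml", ".csv", ".tsv", ".h5", ".hdf5",
                   ".json", ".yaml", ".yml"},
}


def recommended_categories(filename: str) -> List[str]:
    """Return the data types whose recommended formats match the file's extension."""
    suffix = Path(filename).suffix.lower()
    return sorted(
        category
        for category, extensions in RECOMMENDED_EXTENSIONS.items()
        if suffix in extensions
    )


print(recommended_categories("results.CSV"))  # ['structured', 'tables']
print(recommended_categories("figure.bmp"))   # []
```

An empty result only means the extension is not in this assumed mapping. It is a prompt to check the repository's and publisher's format requirements, not a verdict on the file.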
