Raw proteomic data collected from LC-MS experiments can be difficult to access and process, particularly with common data tools such as R and Python. We provide services to convert your instrument-specific data into a universal format, enabling open access and efficient data processing across nearly all programming languages.

Product Summary

Converts raw mass spectrometry data from proprietary vendor formats, or from mzML, into a simplified HDF5 file.


Our goal is to establish a big data format within the mass spectrometry community that is accessible from all major programming languages. HDF5 is supported by a vibrant community of data scientists. Read more here.



Proprietary-format files from seven of the top instrument vendors are first translated to mzML by ProteoWizard.

mzML

The mzML file, a universally accessible XML format, is translated into a simplified data object containing tables, arrays, and matrices.
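As a rough illustration of what this translation involves, the sketch below decodes the peak arrays of a single mzML spectrum into plain rows using only the Python standard library. It is not the conversion code used by this service. The embedded snippet is a hand-made, heavily trimmed stand-in for a real file, and the helper names (`encode_array`, `decode_binary_array`, `spectrum_to_rows`) are hypothetical. The accession numbers are the PSI-MS controlled vocabulary terms that mzML uses to label array type, precision, and compression.

```python
import base64
import struct
import xml.etree.ElementTree as ET
import zlib

# PSI-MS controlled vocabulary accessions used inside <binaryDataArray>.
MZ_ARRAY = "MS:1000514"
INTENSITY_ARRAY = "MS:1000515"
FLOAT32 = "MS:1000521"
ZLIB = "MS:1000574"


def local(tag):
    """Strip an XML namespace, so real (namespaced) mzML tags also match."""
    return tag.rsplit("}", 1)[-1]


def encode_array(values):
    """Helper for building the demo snippet: 64-bit little-endian, base64."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(packed).decode("ascii")


def decode_binary_array(element):
    """Return (kind, values) for one <binaryDataArray> element."""
    accessions = set()
    text = ""
    for child in element.iter():
        if local(child.tag) == "cvParam":
            accessions.add(child.get("accession"))
        elif local(child.tag) == "binary":
            text = child.text or ""
    raw = base64.b64decode(text)
    if ZLIB in accessions:
        raw = zlib.decompress(raw)
    code, width = ("f", 4) if FLOAT32 in accessions else ("d", 8)
    values = list(struct.unpack(f"<{len(raw) // width}{code}", raw))
    if MZ_ARRAY in accessions:
        kind = "mz"
    elif INTENSITY_ARRAY in accessions:
        kind = "intensity"
    else:
        kind = "other"
    return kind, values


def spectrum_to_rows(spectrum):
    """Pair the m/z and intensity arrays into (mz, intensity) rows."""
    arrays = dict(
        decode_binary_array(el)
        for el in spectrum.iter()
        if local(el.tag) == "binaryDataArray"
    )
    return list(zip(arrays["mz"], arrays["intensity"]))


# Made-up spectrum with three peaks; real files carry far more metadata.
SNIPPET = f"""
<spectrum id="scan=1" defaultArrayLength="3">
  <binaryDataArrayList count="2">
    <binaryDataArray>
      <cvParam accession="MS:1000523" name="64-bit float"/>
      <cvParam accession="MS:1000576" name="no compression"/>
      <cvParam accession="MS:1000514" name="m/z array"/>
      <binary>{encode_array([100.0, 200.5, 300.25])}</binary>
    </binaryDataArray>
    <binaryDataArray>
      <cvParam accession="MS:1000523" name="64-bit float"/>
      <cvParam accession="MS:1000576" name="no compression"/>
      <cvParam accession="MS:1000515" name="intensity array"/>
      <binary>{encode_array([10.0, 250.0, 42.0])}</binary>
    </binaryDataArray>
  </binaryDataArrayList>
</spectrum>
"""

rows = spectrum_to_rows(ET.fromstring(SNIPPET.strip()))
print(rows)
```

Each row pairs one m/z value with its intensity, which is the kind of flat, tabular shape that is easy to store in HDF5 or export as CSV.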

h5

The final data format, HDF5, is made available for download.

Access the data with these tools:

Programming Language    Package
R                       rhdf5
Python                  h5py
Java                    HDF Object Package
Visualizer              HDFView
More tools
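For Python users, a minimal sketch of opening an HDF5 file with h5py might look like the following. The group and dataset names (`scans/mz`, `scans/intensity`) and the `ms_level` attribute are assumptions made up for this example, and the script writes its own small file so that it runs on its own. The layout of the delivered files may differ, so list the contents first, as `list_datasets` (a hypothetical helper) does here.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a tiny stand-in file. This layout is hypothetical, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    scans = f.create_group("scans")
    scans.create_dataset("mz", data=np.array([100.0, 200.5, 300.25]))
    scans.create_dataset("intensity", data=np.array([10.0, 250.0, 42.0]))
    scans.attrs["ms_level"] = 1


def list_datasets(filename):
    """Return {dataset path: shape} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems also walks groups; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found[name] = obj.shape

    with h5py.File(filename, "r") as f:
        f.visititems(visit)
    return found


datasets = list_datasets(path)
print(datasets)

# Read two arrays into NumPy and find the m/z of the most intense peak.
with h5py.File(path, "r") as f:
    mz = f["scans/mz"][:]
    intensity = f["scans/intensity"][:]
    base_peak_mz = float(mz[np.argmax(intensity)])

print(base_peak_mz)
```

Slicing a dataset with `[:]` loads it fully into memory. For large files, slicing a range such as `[0:1000]` reads only that part from disk.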

Optional: download selected tables as CSV files.


Optional: download selected tables as SQL files for importing directly into MySQL or MariaDB.


Product Deliverable

A password-protected account for downloading HDF5 files and/or CSV table extracts. All data is deleted after 7 days. Tools for further data extraction and manipulation, supporting both R and Python, are available on GitHub.