Technical Note

Collect Your Own Proteome: Demonstrating the Feasibility of Personal Proteome Data Collection and Analysis

Jeff Jones, Ryan Benz


Consumer-level personalized health monitoring is advancing rapidly as technology and public interest in this area continue to grow. To date, most consumer-level health monitoring is limited to either well-regulated, easy-to-measure metrics such as heart rate and blood pressure, or highly complex sources such as genomic data, which direct-to-consumer companies have made more accessible while under regulatory pressure to limit their analytical conclusions. While regulation has limited the average consumer's access to personal analytics, it has not stopped the hobby enthusiast or scientific researcher. However, few options exist for the collection of personal proteomic data. Here, we demonstrate that it is feasible for individuals to measure their own proteome using existing sample collection methods, contract research laboratories, and open source software.

SoCal Bioinformatics Inc.

While admittedly the utility of this approach is limited at this time, this proof-of-concept study demonstrates that personal proteomic data can be collected by individuals without the need for extensive infrastructure, resources or cost. It is our estimate that the one-off hobby enthusiast could obtain rich plasma proteomic data for less than 0 per sample. Furthermore, establishment of a high throughput service could bring down those costs significantly, possibly driving the cost to well below $100 per sample.


Ten dried plasma spot (DPS) samples were collected from each of two individuals over the course of approximately one day, using Novilytic Noviplex Dried Plasma Prep cards and following the supplied collection protocol. The cards were then stored in foil pouches with desiccant packs and shipped via standard mail to an LCMS contract research organization (CRO) for further sample preparation and LCMS data collection. Using a highly customized 30-minute LC data collection protocol, the 20 DPS samples were analyzed on a Sciex TripleTOF 5600, with one MS1 scan collected every second and three MS2 scans collected in between. The data were converted to mzML format and then processed and analyzed with open source tools including OpenMS, OMSSA and R.
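As a rough illustration of the acquisition scheme described above, the following Python sketch estimates how many scans one run would produce. It assumes the one-second cycle (one MS1 scan plus three MS2 scans) holds for the entire 30-minute gradient with no dead time, which is an idealization; the function name and constants are hypothetical and introduced only for this example.

```python
# Back-of-the-envelope scan counts for the acquisition scheme in the text.
GRADIENT_MINUTES = 30    # 30-minute LC protocol (from the text)
MS1_PERIOD_SECONDS = 1   # one MS1 scan every second (from the text)
MS2_PER_CYCLE = 3        # three MS2 scans between MS1 scans (from the text)
N_SAMPLES = 20           # 10 samples from each of 2 individuals (from the text)


def scan_counts(gradient_minutes, ms1_period_s, ms2_per_cycle):
    """Return (ms1_scans, ms2_scans) for one run.

    Assumes the duty cycle is constant across the whole gradient,
    which real instruments only approximate.
    """
    cycles = (gradient_minutes * 60) // ms1_period_s
    return cycles, cycles * ms2_per_cycle


ms1_per_run, ms2_per_run = scan_counts(
    GRADIENT_MINUTES, MS1_PERIOD_SECONDS, MS2_PER_CYCLE
)
total_scans = N_SAMPLES * (ms1_per_run + ms2_per_run)

print("MS1 scans per run:", ms1_per_run)
print("MS2 scans per run:", ms2_per_run)
print("Scans across all samples:", total_scans)
```

Under these assumptions each run yields on the order of a few thousand spectra, a size that fits comfortably in mzML files processed on a personal computer.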


Approximately 3,500 molecular (peptide) features, each present in at least 50% of the samples, were measured across the 20 samples. Within each individual, ~4,000 features were measured in at least half of the samples for the first individual, and ~3,600 for the second. Nearly 3,000 of these features were common to both individuals, highlighting both similarities and differences in their proteomic profiles. The associated peptide IDs yielded several hundred confidently identified proteins, compiled from thousands of peptides with a multitude of biologically relevant modifications. In all, there is potential for several thousand single-marker analytes to track, and a nearly unlimited number of combinations that could be used to discern anything from the mundane, such as age and gender, to the more practical, such as early-onset disease. Our expectation is not necessarily that the personal proteome would be used to detect specific diseases, but rather that it could reveal trends that inform lifestyle changes. Explore more at SoCal Bioinformatics Inc.
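The presence filter and the overlap between individuals described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: it assumes features have already been aligned across runs so that the same feature carries the same ID in both individuals, and the function name, data layout and toy intensities are all hypothetical.

```python
def presence_filter(feature_table, min_fraction=0.5):
    """Return the IDs of features detected in at least min_fraction of samples.

    feature_table maps feature_id -> list of per-sample intensities,
    with None marking a sample where the feature was not detected.
    """
    kept = set()
    for feature_id, intensities in feature_table.items():
        if not intensities:
            continue
        detected = sum(1 for value in intensities if value is not None)
        if detected / len(intensities) >= min_fraction:
            kept.add(feature_id)
    return kept


# Toy data: four samples per individual (the study used ten).
individual_a = {
    "f1": [1.0e5, 1.2e5, None, 9.0e4],  # detected in 3 of 4 samples
    "f2": [None, None, None, 2.0e3],    # detected in 1 of 4 samples
    "f3": [5.0e6, 4.8e6, 5.1e6, None],  # detected in 3 of 4 samples
}
individual_b = {
    "f1": [8.0e4, None, 9.5e4, 1.1e5],  # detected in 3 of 4 samples
    "f3": [None, None, None, 4.0e6],    # detected in 1 of 4 samples
    "f4": [3.0e3, 2.5e3, None, None],   # detected in 2 of 4 samples
}

kept_a = presence_filter(individual_a)
kept_b = presence_filter(individual_b)
shared = kept_a & kept_b  # features passing the filter in both individuals

print("Individual A:", sorted(kept_a))
print("Individual B:", sorted(kept_b))
print("Shared:", sorted(shared))
```

Applying the filter per individual first and intersecting afterwards mirrors the way the counts are reported here: one count per individual, then the number found in common.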



20 Samples
2 Individuals
12-hour collection period

COLLECTION: Dried plasma spots used for self-administered sample collection.

LCMS: HPLC-QTOF with a 30-minute gradient, alternating MS1 and MS2 scans in data-dependent acquisition (DDA).


REDUCTION: Features extracted from MS1 data, with LC retention time, m/z and charge (z) determined. Peptides identified by OMSSA against the current UniProt all-organism database, with known natural variants included.

ANALYSIS: Implemented CPTAC-suggested quality metrics. MS1 features aggregated, normalized and associated with peptide IDs. Proteins inferred according to published methods.
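The normalization method is not specified in this note. As one common possibility, the sketch below applies median scaling, in which each sample is rescaled so that its median feature intensity matches the median of all sample medians. The choice of median scaling, the function name and the toy values are assumptions made for illustration only.

```python
from statistics import median


def median_normalize(samples):
    """Scale each sample so its median intensity equals a shared target.

    samples maps sample_name -> {feature_id: intensity}.
    The target is the median of the per-sample medians.
    Median scaling is an assumed method, not one stated in the note.
    """
    sample_medians = {
        name: median(features.values()) for name, features in samples.items()
    }
    target = median(sample_medians.values())
    return {
        name: {
            feature_id: value * target / sample_medians[name]
            for feature_id, value in features.items()
        }
        for name, features in samples.items()
    }


# Toy data: the second sample was loaded at twice the amount of the first.
samples = {
    "s1": {"f1": 100.0, "f2": 200.0, "f3": 300.0},
    "s2": {"f1": 200.0, "f2": 400.0, "f3": 600.0},
}

normalized = median_normalize(samples)
for name, features in normalized.items():
    print(name, features)
```

After scaling, the two toy samples have identical intensities, as expected when the only difference between them is the total amount of material loaded.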


3500 MS1 features
561 MS2 peptides
281 possible proteins
6 orders dynamic range
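Dynamic range in orders of magnitude is conventionally the base-10 logarithm of the ratio between the largest and smallest measured intensities. The short Python sketch below shows that calculation on made-up values; the function name and numbers are hypothetical and are not taken from the study data.

```python
import math


def dynamic_range_orders(intensities):
    """Return log10(max / min) over the positive, detected intensities."""
    positive = [v for v in intensities if v is not None and v > 0]
    if not positive:
        raise ValueError("no positive intensities to compare")
    return math.log10(max(positive) / min(positive))


# Toy intensities spanning a million-fold range.
toy_intensities = [2.0e2, 5.0e4, None, 3.0e6, 2.0e8]
orders = dynamic_range_orders(toy_intensities)
print("Orders of magnitude:", round(orders, 2))
```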