From Concept to Biomarkers: An MRM Approach to Clinical Cancer Marker Discovery

Discovering biomarkers from liquid biopsies that perform well enough to be clinically useful remains a significant challenge in diagnostics. However, as detailed in this example, researchers can leverage highly multiplexed approaches to measure a multitude of proteins simultaneously.

Predictive Modeling Pitfalls: Information Leakage Can Generate Models that Fail to Validate with New Data

Building predictive models that successfully generalize to new data is a challenging process full of potential pitfalls. During the model-building process, cross-validation is routinely used to optimize parameters and, importantly, to provide an independent assessment of trained model performance using the hold-out test set partitions.

Discovering The Hidden Signal Among the Noise: An In Silico Example

Demonstrating a method for identifying biomarkers that interact, or that represent the effects of a biological network. When combined, these biomarkers often outperform multivariate candidates identified by ANOVA analysis.

Collect Your Own Proteome: Demonstrating the Feasibility of Personal Proteome Data Collection and Analysis

Consumer-level personalized health monitoring is rapidly advancing as technology and public interest in this area continue to grow. Here, we demonstrate that it is feasible for individuals to measure their own proteome using existing sample collection methods, contract research laboratories, and open source software.

Consideration of HDF5 File Architecture for LCMS Data Acquisition Archival and Efficient Access

Open access data efforts for LCMS are not fully optimized for big data applications. Proposed here is a file architecture built on the accessible HDF5 file format.

Quickly merge a single column to a large data frame using a named vector

The `merge` function in R is convenient for merging two data frames. However, if your input data frames are large, `merge` can be very slow.
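
As a rough illustration of the idea in the title, the sketch below adds a single column to a data frame by indexing a named vector instead of calling `merge`. The data frames `big` and `lookup` and their column names are hypothetical, made up for this example, and the approach assumes the keys in the lookup table are unique.

```r
# Hypothetical example data: a "large" table and a small lookup table.
big <- data.frame(id = c("a", "b", "c", "a", "d"),
                  value = 1:5,
                  stringsAsFactors = FALSE)
lookup <- data.frame(id = c("a", "b", "c"),
                     group = c("x", "y", "z"),
                     stringsAsFactors = FALSE)

# Build a named vector: the values are the column to add,
# the names are the keys used for matching.
group_by_id <- setNames(lookup$group, lookup$id)

# Index the named vector by key. as.character() guards against factor
# keys, which would otherwise be used as integer positions.
# Keys missing from the lookup (here "d") come back as NA.
big$group <- unname(group_by_id[as.character(big$id)])

print(big)
```

Unlike `merge`, this keeps the original row order of `big` and never adds or drops rows, which is often what is wanted when attaching one annotation column.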

Efficiently De-Duplicate a Dataset

The `duplicated` function in R can handle an entire data.frame, or several columns, in a single pass, making it a very useful tool for eliminating redundancy within datasets.
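
A minimal sketch of both uses follows: de-duplicating on whole rows and on a subset of key columns. The data frame `df` and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical example data with one fully duplicated row (row 2)
# and one row that repeats only the sample/peptide key (row 3).
df <- data.frame(sample    = c("s1", "s1", "s1", "s2", "s2", "s3"),
                 peptide   = c("p1", "p1", "p1", "p1", "p2", "p1"),
                 intensity = c(10, 10, 11, 12, 12, 9),
                 stringsAsFactors = FALSE)

# Whole-row de-duplication: duplicated() flags every row that repeats
# an earlier row across all columns.
dedup_all <- df[!duplicated(df), ]

# Key-based de-duplication: pass only the columns that define identity.
# The first occurrence of each sample/peptide pair is kept.
dedup_key <- df[!duplicated(df[, c("sample", "peptide")]), ]

print(dedup_all)
print(dedup_key)
```

Because `duplicated` keeps the first occurrence, sorting the data beforehand (for example by descending intensity) is one way to control which of the redundant rows survives.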
