Configuration

Before running the quality control analyses, configure the workflow by creating the files referenced in config/config.yaml. Each section below describes one of these files and the configuration options it requires.

Configuration file
Metadata file
Sample linker file

Configuration file

The configuration file for this pipeline is config/config.yaml. It specifies the inputs and parameters the pipeline uses to generate its report.

Config parameters

Below are descriptions and usage options for the various config parameters specified in config.yaml.

Parameter	Required	Description
`name`	Y	Name for the project (prepended to results files)
`olink_data/data`	Y	CSV File with Olink proteomics data
`olink_data/checksum`	Y	File with SHA256 checksum for Olink CSV file
`olink_data/ignoreFile`	N	File with sample IDs to ignore from analysis (default: “”)
`olink_data/ignoreRegex`	N	File with regex pattern to ignore samples (default: “”)
`olink_data/sdIQR`	Y	Standard deviations for IQR-based outlier filter
`olink_data/sdMedian`	Y	Standard deviations for median-based outlier filter
`olink_data/sdPCA`	Y	Standard deviations for PC1 & PC2 based outlier filter
`olink_data/assayWarnProp`	Y	Proportion of assay warnings required to remove assay
`metadata/filename`	Y	Metadata filename (see below)
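
The `olink_data/checksum` entry points to a file holding the SHA256 checksum of the Olink CSV. The layout of that file is not described here, so the following is only a minimal sketch, assuming it begins with the hex digest (as written by `sha256sum data.csv > data.csv.sha256`). The function names `sha256_of_file` and `checksum_matches` are hypothetical helpers, not part of the workflow.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large CSV files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(data_path, checksum_path):
    """Compare the data file's digest with the first token of the checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by the filename (the `sha256sum` output layout).
    """
    tokens = Path(checksum_path).read_text().split()
    expected = tokens[0].lower() if tokens else ""
    return sha256_of_file(data_path) == expected
```

Running a check like this before launching the workflow may help catch a truncated or modified Olink export early.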

Metadata file

The metadata file can specify both quantitative and qualitative covariates to check for downstream association with axes of proteomic variation (e.g. Age, Sex, BMI).

The file can be structured as a TSV or CSV, as below:

SampleID	COV1	COV2
A1	0.0334699	0.329964
A10	0.690636	0.422487
A100	0.206265	0.250128
A101	0.636559	0.863622
A102	0.301656	0.0249239
A103	0.364993	0.765381

For a fuller example, explore our example data here. Note that the workflow checks that every individual has accompanying metadata. If you have no metadata for an individual, still include a row for that individual and fill it in with empty or NA values.
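
To see ahead of time which individuals would fail the metadata check, a small standalone script can compare sample IDs against the metadata file. This is a sketch rather than the workflow's own check: it assumes a tab-delimited file whose ID column is named `SampleID`, as in the example above, and `missing_metadata` is a hypothetical helper name.

```python
import csv


def missing_metadata(sample_ids, metadata_lines, delimiter="\t", id_column="SampleID"):
    """Return the sorted sample IDs that have no row in the metadata.

    `metadata_lines` is any iterable of text lines, e.g. an open file handle.
    Pass delimiter="," for a CSV file.
    """
    reader = csv.DictReader(metadata_lines, delimiter=delimiter)
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"metadata has no '{id_column}' column")
    # A row counts as present even if its covariates are empty or NA.
    seen = {row[id_column] for row in reader}
    return sorted(set(sample_ids) - seen)
```

For example, `missing_metadata(ids, open("metadata.tsv", newline=""))` lists the IDs that still need a row, where `ids` and `metadata.tsv` stand in for your own sample list and file.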

Sample linker file

Sample IDs in the metadata and in the Olink proteomic data can follow different naming conventions, which makes merging and analysis difficult. In these cases you can provide a simple two-column file that renames the samples in downstream files for you, for example:

curID newID
A1 B1
A2 B2

This file is mainly needed when the IDs sent to Olink differ from those in your in-house metadata/phenotypes. Keep the column headers (curID and newID) exactly as shown.
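
The effect of the linker file can be illustrated with a short sketch. It is not the workflow's implementation. It assumes a whitespace-delimited file with the `curID` and `newID` headers shown above, and it assumes that IDs absent from the file are left unchanged. `parse_linker` and `rename_samples` are hypothetical names.

```python
def parse_linker(lines):
    """Build a {curID: newID} mapping from the lines of a linker file."""
    rows = [line.split() for line in lines if line.strip()]
    if not rows or rows[0] != ["curID", "newID"]:
        raise ValueError("linker file must start with the headers 'curID newID'")
    mapping = {}
    for fields in rows[1:]:
        if len(fields) != 2:
            raise ValueError(f"expected two columns, got: {fields}")
        cur_id, new_id = fields
        if cur_id in mapping:
            raise ValueError(f"duplicate curID: {cur_id}")
        mapping[cur_id] = new_id
    return mapping


def rename_samples(sample_ids, mapping):
    """Rename IDs found in the mapping; leave all others as they are (assumption)."""
    return [mapping.get(sample_id, sample_id) for sample_id in sample_ids]
```

With the example file above, `rename_samples(["A1", "A3"], mapping)` would yield `["B1", "A3"]`, since `A3` has no entry in the linker file.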