Overview & Getting started

The Benchling Warehouse is a database solution that tracks assay data, registry entities, and inventory data. The warehouse centralizes the entire organization's research output and facilitates queries that would historically require parsing multiple data sources, such as “find all batches that had an OD > 1000”.

The warehouse supports higher-level analysis and can be securely connected to third-party visualization and analysis tools. Configurable permissions ensure appropriate data access control.

How it works

The warehouse ingests both user-generated data about entities from the registry and assay results from networked instruments. This aggregation works as follows:

  • Researchers register new samples in the registry, specifying properties of each sample. For example, a T cell receptor sample may capture information about the Alpha chain, Beta chain, and Specificity.
  • They initiate a new run, specifying the assay to be run, the sample IDs, and parameters such as the adapter used.
  • Results from the assay are uploaded as structured data and blobs to the warehouse.
  • Third-party analysis and visualization tools can connect to the warehouse to consume the data.

Warehouse architecture

Warehouse config

You can connect to the warehouse with any PostgreSQL client, using the following connection settings:

  • Host: (replace "tenant" with your tenant's name)
  • Port: 5432
  • Database Name: warehouse
  • Username and Password: These can be generated in the "Settings" section of your Benchling account
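
As a minimal sketch of how these settings fit together, the snippet below assembles them into a PostgreSQL connection URI. The host `warehouse.example.invalid`, the username, and the password are placeholders rather than real values, and the helper name `warehouse_uri` is hypothetical; substitute the host for your tenant and the credentials generated in Settings.

```python
from urllib.parse import quote


def warehouse_uri(host: str, user: str, password: str,
                  port: int = 5432, database: str = "warehouse") -> str:
    """Build a PostgreSQL connection URI from the warehouse settings."""
    # Percent-encode the credentials so characters such as '@' or '/'
    # cannot be mistaken for URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values only (assumptions, not a real host or real credentials).
uri = warehouse_uri("warehouse.example.invalid", "u$example", "p@ss/word")
print(uri)
```

A URI of this form can be passed directly to `psql` or to a PostgreSQL driver that accepts connection strings.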


Runs store parameters about the assay that will be performed, such as the Instrument ID.
Results capture the values generated by the assay, such as the Cell Count, and are associated with the samples.
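
One way to picture the relationship between runs and results is the following minimal sketch using Python dataclasses. The class names, the field names (`run_id`, `instrument_id`, `cell_count`), and all values are illustrative assumptions, not the warehouse's actual tables or columns.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    """Parameters of an assay that will be performed."""
    run_id: str
    fields: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Result:
    """Values generated by the assay, linked to a run and a sample."""
    result_id: str
    run_id: str
    sample_id: str
    fields: Dict[str, Any] = field(default_factory=dict)


# Hypothetical data: one run that produced results for two samples.
run = Run("run-1", {"instrument_id": "inst-42"})
results = [
    Result("res-1", run.run_id, "sample-A", {"cell_count": 1200}),
    Result("res-2", run.run_id, "sample-B", {"cell_count": 950}),
]

# Group results under the run that produced them.
by_run: Dict[str, List[Result]] = {}
for r in results:
    by_run.setdefault(r.run_id, []).append(r)

print({run_id: [r.sample_id for r in rs] for run_id, rs in by_run.items()})
```

Keeping the run parameters separate from the per-sample results means a parameter such as the instrument is stored once per run, while each result only carries a link back to it.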


Capturing results from a Flow Cytometry run

  • Researchers configure schemas for the Flow Cytometer assay in the Benchling UI.
  • They set up fields and specify each field's data type, such as string or float. Fields can also be links to blobs, which hold raw data or images.
FlowCytometryRun: Run

  containerId         container_link
  instrument          text
  rawData             blob_link

FlowGatingResult: Result

  flowRun             assay_run_link
  CD3+                float
  CD4+                float
  parentResult        assay_result_link
  • They then initiate the assay on the Flow Cytometer, and upload parameters about the run, such as the Instrument ID, to Benchling. An example run includes:
POST /blobs
{"blobId": ["65da6215-a889-49d3-a6da-b5cc0ac60d75"]}
POST /blobs/65da6215-a889-49d3-a6da-b5cc0ac60d75/parts
POST /blobs/65da6215-a889-49d3-a6da-b5cc0ac60d75:complete-upload

POST /assay-runs
{
  "schema": "FlowCytometryRun",
  "fields": {"instrument": "My Instrument",
             "rawData": "65da6215-a889-49d3-a6da-b5cc0ac60d75"}
}
{"assayRuns": ["9c6da62a-0a9e-4b88-b057-1adabfd31e2b"]}
  • After the run is complete, a script on the instrument uploads results to Benchling, specifying which sample and container they are associated with, along with result values such as CD3+. An example result looks like:
POST /assay-results
{"assayResults": [
    {"schemaId": "assaysch_123456",
     "fields": {"flowRun": "9c6da62a-0a9e-4b88-b057-1adabfd31e2b",
                "CD3+": 0.4, "CD4+": 0.5}}
]}
{"assayResults": ["77af3205-65af-457f-87f5-75462b85075a", ...]}
  • The run is attached directly to an ELN entry in Benchling.
  • When researchers want to analyze results across multiple runs, they query the warehouse using either third-party analytics tools or SQL queries:
$ psql -h

-- Get all batches with CD3plus > 0.5
JOIN container ON container.batch_id =
JOIN flow_cytometry_run ON flow_cytometry_run.container_id =
JOIN flow_gating_result
  ON flow_gating_result.flow_run =
WHERE flow_gating_result.CD3plus > 0.5
AND flow_gating_result.created_at > '2017-01-01';
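
The join logic of a query like this can be tried out against an in-memory SQLite database. This is a sketch under stated assumptions: the `batch` table, the `id` primary keys, the join keys, and all sample rows are invented for illustration, and the real warehouse schema (which runs on PostgreSQL) may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified table layout with two hypothetical batches.
conn.executescript("""
CREATE TABLE batch (id TEXT PRIMARY KEY);
CREATE TABLE container (id TEXT PRIMARY KEY, batch_id TEXT);
CREATE TABLE flow_cytometry_run (id TEXT PRIMARY KEY, container_id TEXT);
CREATE TABLE flow_gating_result (
    id TEXT PRIMARY KEY, flow_run TEXT, CD3plus REAL, created_at TEXT
);
INSERT INTO batch VALUES ('bat_1'), ('bat_2');
INSERT INTO container VALUES ('con_1', 'bat_1'), ('con_2', 'bat_2');
INSERT INTO flow_cytometry_run VALUES ('run_1', 'con_1'), ('run_2', 'con_2');
INSERT INTO flow_gating_result VALUES
    ('res_1', 'run_1', 0.4, '2017-03-01'),
    ('res_2', 'run_2', 0.7, '2017-03-02');
""")

# Walk batch -> container -> run -> result, then filter on the result values.
rows = conn.execute("""
SELECT DISTINCT batch.id
FROM batch
JOIN container ON container.batch_id = batch.id
JOIN flow_cytometry_run ON flow_cytometry_run.container_id = container.id
JOIN flow_gating_result
  ON flow_gating_result.flow_run = flow_cytometry_run.id
WHERE flow_gating_result.CD3plus > 0.5
  AND flow_gating_result.created_at > '2017-01-01'
""").fetchall()

print(rows)  # [('bat_2',)]
```

Only `bat_2` is returned because its single gating result has a CD3plus value of 0.7, while the result for `bat_1` is 0.4.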

Updated 10 months ago
