Referencing an External Registry

Introduction

Many Benchling customers use registry systems outside of Benchling to define and track data related to their scientific work, such as molecule data in a chemical registry or materials data in an ERP system. Being able to sync data from these third party systems into Benchling is critical for successful cross-team collaboration and communication. Using Benchling’s Registry and Benchling’s Developer Platform, Benchling customers can integrate data from their third party registries into their Benchling registry. This integration gives end users and scientists a smooth experience, allowing them to interact with critical data with the least amount of effort.

Definitions

  • External Registry - A database external to Benchling that tracks experimental data such as samples, molecules, sequences, etc.
  • Benchling App - A framework to develop custom code to extend Benchling functionality.
  • App Manifest - A tenant-independent definition of an app, including all of the app's metadata and configuration details.
  • Upsert - A database operation that will update an existing row if a specified value already exists in a table, and insert a new row if the specified value doesn't already exist.

Scope

The goal of this Technical Accelerator is to highlight how to leverage the app framework to build an app able to sync data from a third-party registry API to Benchling. The example application discussed here is triggered by either user interaction or a scheduled sync time; the application then uses Benchling entities to represent the external registry data, and App Status to represent metadata about the data sync.

Not covered

This document does not cover continuous sync integrations, but instead will focus on a scheduled sync and user-triggered sync. This document does not cover syncing from an external registry to the Benchling registry for an arbitrary set of schemas, but instead focuses on a single defined schema.

High Level Integration Diagram

Step By Step Explanation

Establish an app trigger

We’ll discuss two ways that the integration can be triggered:

  1. A user in Benchling interacts with a button in an app canvas to trigger the integration
  2. A scheduled job (e.g. cron) runs at a specific time to trigger the integration
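Both triggers can feed the same sync entry point. The following is a minimal sketch of that idea, not a prescribed implementation; the `SyncRequest` type, the `build_sync_request` helper, and the source labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical labels for the two triggers described above.
VALID_SOURCES = {"canvas_button", "schedule"}


@dataclass(frozen=True)
class SyncRequest:
    """Describes one run of the integration, regardless of how it was triggered."""
    source: str
    triggered_at: datetime


def build_sync_request(source: str, now: Optional[datetime] = None) -> SyncRequest:
    """Normalize either trigger into a single request object for the sync code."""
    if source not in VALID_SOURCES:
        raise ValueError(f"unknown trigger source: {source!r}")
    # Use timezone-aware UTC timestamps so later comparisons are unambiguous.
    return SyncRequest(source=source, triggered_at=now or datetime.now(timezone.utc))
```

A canvas button handler would call `build_sync_request("canvas_button")`, while a cron entry point would call `build_sync_request("schedule")`; everything downstream can then treat the two cases identically.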

Determine last update time

In order to facilitate syncing the most up-to-date version of your data, the integration will need to keep track of the last time it was run. When the integration is triggered, it must store a “last update” timestamp that will be used in subsequent steps. This can be done in a number of ways; the best method will vary depending on your specific use case and available infrastructure. Some common patterns include:

  • Storing the value in a Benchling result (e.g. in a results table)
  • Tracking the “last update” time locally, or in a data store external to Benchling
  • Using App Status to track when the integration syncs with Benchling

Our recommendation is to leverage App Status; by querying the last successful integration sync, it’s possible to use the timestamp at which the session was created as the “last update” time for a subsequent sync. Additionally, using App Status allows for communicating important information with Benchling users like sync status and error messages.
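As a rough sketch of this approach, the function below picks the creation time of the most recent successful session from a list of session records. It assumes the sessions have already been retrieved and are plain dictionaries with `status` and `createdAt` keys; those key names and the `"SUCCEEDED"` value are assumptions made for illustration, so check the API reference for the actual response shape.

```python
from datetime import datetime
from typing import Iterable, Optional


def last_successful_sync_time(sessions: Iterable[dict]) -> Optional[datetime]:
    """Return the creation time of the newest successful session, or None.

    Each session is assumed (hypothetically) to look like:
        {"status": "SUCCEEDED", "createdAt": "2024-01-02T00:00:00+00:00"}
    """
    created_times = [
        # Accept a trailing "Z" as UTC so older Python versions can parse it.
        datetime.fromisoformat(s["createdAt"].replace("Z", "+00:00"))
        for s in sessions
        if s.get("status") == "SUCCEEDED"
    ]
    # None signals "no previous successful sync", i.e. a full initial sync.
    return max(created_times, default=None)
```

Returning `None` when no successful session exists gives the caller a natural way to fall back to a full sync on the first run.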

Pull relevant data from third party registry

With the appropriate “last update” time, the integration can sync any and all newly created or modified entities from the third party registry. The specific details will vary from registry to registry; consult the API documentation for your specific external registry for more information.
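Because every registry API differs, the sketch below keeps the HTTP details out of the picture and only illustrates the pagination loop. The `fetch_page` callable is a hypothetical adapter you would write around your registry's API; its signature and the page-token convention are assumptions, not part of any real API.

```python
from datetime import datetime
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical adapter signature:
#   (modified_since, page_token) -> (records, next_page_token or None)
FetchPage = Callable[
    [Optional[datetime], Optional[str]],
    Tuple[List[dict], Optional[str]],
]


def fetch_changed_records(
    fetch_page: FetchPage, modified_since: Optional[datetime]
) -> Iterator[dict]:
    """Yield every record created or modified since the last sync.

    Passing modified_since=None is treated as "fetch everything".
    """
    token: Optional[str] = None
    while True:
        records, token = fetch_page(modified_since, token)
        yield from records
        if token is None:  # the adapter signals the last page with None
            break
```

Keeping the registry-specific request logic inside `fetch_page` means the rest of the integration does not change if the external API is swapped or upgraded.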

Transform data to Benchling upsert request

Once you have all of the relevant data from the third party registry, you will need to map it to your Benchling entity schema. Use the API documentation to format your request. We recommend using Benchling’s app configuration to define the mapping between the fields in the third party registry schema and the fields in the Benchling schema, and using that mapping to generate your request body. For example, the third party registry field “Manufacturer” might map to “Manufacturer Name” in the Benchling schema.

We highly recommend using a unique identifier from the third party registry as your Benchling registry ID. This results in a more intuitive mapping between the Benchling and third party registries. We also recommend configuring the entity schema permissions to be READ ONLY so that Benchling users cannot edit the entities. Both of these choices reflect a key design principle: the source of truth for this app is the third party registry.
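The mapping step might look something like the sketch below, which applies a field map (as it could be read from app configuration) and reuses the external identifier as the registry ID. The function name, its parameters, and the output key names (`entityRegistryId`, `schemaId`, `folderId`, `fields`) are illustrative assumptions; consult the API documentation for the exact request schema.

```python
from typing import Any, Dict, Mapping, Optional


def to_benchling_entity(
    record: Mapping[str, Any],
    field_map: Mapping[str, str],  # external field name -> Benchling field name
    id_field: str,                 # external field holding the unique identifier
    schema_id: str,
    folder_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Translate one external registry record into an upsert payload item."""
    fields = {
        benchling_name: {"value": record[external_name]}
        for external_name, benchling_name in field_map.items()
        if external_name in record  # skip mapped fields the record lacks
    }
    external_id = str(record[id_field])
    entity: Dict[str, Any] = {
        # Reuse the external identifier so both registries share one ID.
        "entityRegistryId": external_id,
        "name": external_id,
        "schemaId": schema_id,
        "fields": fields,
    }
    if folder_id is not None:
        entity["folderId"] = folder_id
    return entity
```

For instance, with `field_map={"Manufacturer": "Manufacturer Name"}`, a record containing `"Manufacturer": "Acme"` produces a `fields` entry of `{"Manufacturer Name": {"value": "Acme"}}`; unmapped external fields are ignored.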

Upsert data to Benchling

Once your request is formatted, use the API to upsert the data to Benchling. If the upsert comes back with failures, log them, remove the failed entities from the request, and re-attempt the upsert with the remaining entities.
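One way to sketch this log, drop, and retry loop is shown below. It assumes a hypothetical `upsert_batch` callable that wraps the actual API call and returns a dictionary of registry ID to error message for any entities that caused the batch to fail (an empty dictionary meaning the whole batch succeeded); that contract, the `entityRegistryId` key, and the attempt limit are all assumptions for illustration.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("entity_sync")

# Hypothetical wrapper around the upsert API call:
#   batch -> {registry_id: error_message} for entities that failed
UpsertBatch = Callable[[List[dict]], Dict[str, str]]


def upsert_with_retry(
    entities: List[dict], upsert_batch: UpsertBatch, max_attempts: int = 3
) -> Tuple[List[dict], Dict[str, str]]:
    """Upsert entities, dropping failed ones and retrying the remainder.

    Returns (entities that were upserted, {registry_id: error} for dropped ones).
    """
    pending = list(entities)
    failed: Dict[str, str] = {}
    for attempt in range(1, max_attempts + 1):
        if not pending:
            break
        errors = upsert_batch(pending)
        if not errors:
            return pending, failed  # the remaining batch went through
        for registry_id, message in errors.items():
            logger.warning("attempt %d: dropping %s: %s", attempt, registry_id, message)
        failed.update(errors)
        # Throw out the failed entities and try again with the rest.
        pending = [e for e in pending if e["entityRegistryId"] not in errors]
    if pending:
        raise RuntimeError(f"upsert still failing after {max_attempts} attempts")
    return [], failed
```

Capping the number of attempts avoids looping forever if the failure is systemic (for example, an expired credential) rather than tied to individual entities.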

Record results

Log the results of your Benchling update. We recommend using App Status to log successes and failures. Since App Status does not currently appear in the data warehouse, you can alternatively create a results schema to log your app's history. With a results schema, you can create dashboards and query your app's successes and failures.
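Whichever destination you choose, it can help to build one summary of the run and then write it to App Status, a results schema, or both. The sketch below shows one possible shape for that summary; the function name, the dictionary keys, and the status labels are hypothetical and would need to be adapted to your App Status messages or results schema fields.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_sync_summary(
    upserted_ids: List[str],
    failed: Dict[str, str],  # registry ID -> error message
    started_at: datetime,
    finished_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Summarize one sync run in a form that is easy to log or store."""
    finished_at = finished_at or datetime.now(timezone.utc)
    if failed and not upserted_ids:
        status = "FAILED"
    elif failed:
        status = "COMPLETED_WITH_WARNINGS"
    else:
        status = "SUCCEEDED"
    return {
        "status": status,
        "startedAt": started_at.isoformat(),
        "finishedAt": finished_at.isoformat(),
        "upsertedCount": len(upserted_ids),
        "failedCount": len(failed),
        "message": f"Synced {len(upserted_ids)} entities; {len(failed)} failed.",
        # Sorted so repeated runs produce stable, comparable output.
        "failures": [
            {"registryId": rid, "error": err} for rid, err in sorted(failed.items())
        ],
    }
```

The `message` string is suited to a user-facing status update, while the per-entity `failures` list maps naturally onto rows in a results table.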

Initial Integration Setup

The recommended solution relies on the Benchling app framework; specifically, the application uses the App Status and App Configuration features. For more information on these features, as well as the app framework more broadly, check out the Getting Started with Benchling Apps guide. Here is an example of what this might look like in the App Config UI:

Here, we can see a list of app configuration options that might be used for an integration like this:

  • API URL - Text field to specify the URL of the third party registry API
  • API Key - Secure text field for storing private credentials (like an API key) used to authenticate requests to the third party registry API
  • Synced Entity Location - Benchling folder field representing the location where the sync will upsert entities
  • Schema Field Mappings - Array config element for storing a mapping of third party field names to Benchling schema and field names

Below is an example app manifest defining this app. While your specific app manifest and configuration options may vary, the overall structure will likely be similar. For more information on App Configuration and a reference of all app manifest options, check out the App Configuration & Lifecycle guide and the App Manifest Reference, respectively.

Example Manifest

manifestVersion: 1
info:
  name: app-entity-sync
  description: Template Manifest for Entity Sync App
security:
  publicKey: |
    -----BEGIN PUBLIC KEY-----
    MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAoFGZ0xaKXIf/Ob5IdIBA
    qyPMdca6TSbHVi9/cyGjnOq2m4lJRpceyX8So5n9H1YKRvvcjGlw9TZr6CH+iRL4
    C3+4L2QnyNBGIE4ZHEXCfmuOYJsLpGr8FdhOZcpcPu0yHnyeTpcul1mg3IRd1Vx0
    SmFroOxpXje621OOY0zIwYspj58teNNYbmX6bsJbxAAzZw0dgAD9asjITKaj5gx9
    Qfs3zaMw2RHXAd00TFpZQluHkCPR6SKk8U6f6/TP7hxecUyn0j+BXwDOiJD6NQjp
    ihmhKiRWGm62SlyjBaHpgxcEHk6LjhdkNn4lEHOFBNJBYvtUpY+Je/vNwIjHKfLk
    YwIDAQAB
    -----END PUBLIC KEY-----
settings:
  lifecycleManagement: MANUAL
features:
- name: Manual Sync
  id: manual_sync
  type: APP_HOMEPAGE
configuration:
- name: API URL
  type: text
  requiredConfig: true
- name: Schema to Sync
  type: entity_schema
  requiredConfig: true
- name: Registry
  type: registry
  requiredConfig: true
- name: Field Configurations
  type: array
  elementDefinition:
    - name: Entity Schema
      type: entity_schema
      requiredConfig: true
      fieldDefinitions:
        - name: Benchling Field
    - name: External API Field Name
      type: text
      requiredConfig: true
  requiredConfig: true
  minElements: 1
  maxElements: null
- name: Synced Entity Location
  type: folder
  requiredConfig: false
- name: API Key
  type: secure_text
  requiredConfig: true