Warehouse Tables

Objects from the main Benchling database are periodically synced to the warehouse to make them available for queries. It may take time (usually a few minutes, up to 24 hours) for data to appear in the warehouse.

If you need additional help for the Benchling Insights application please visit https://help.benchling.com/hc/en-us/sections/8506377929869-Insights

AND/OR

If you need more detail about each Benchling table then please navigate to https://docs.benchling.com/docs/warehouse-columns-rows-descriptions

Tables exposed in the warehouse are shown in the image below. Because of its size, it's best to download it and open it separately, or open it in a new tab, where you can zoom in and see the part you'd like. In some cases, we have a table per user-defined schema in Benchling. In these cases, an (Example) is added to the table's name. The tables you see in your Benchling warehouse would be named after the schemas you have configured.

All of the tables in the below image are Postgres views, backed by a table of the same name followed by $raw. Generally speaking, the views are "cleaned up" versions of the raw data (e.g. archived items are hidden, only results from reviewed notebook entries are shown, etc). Deciding which table version to use depends on your specific use case:

For most users, we recommend using the "cleaned up" views, particularly if you are just getting started. These views are designed to present only the most useful data for the most common warehouse use cases, but might suffer from performance issues in some particularly complex edge cases. Users that are using particularly complex queries or who need optimal performance, we recommend using the unprocessed $raw table versions instead.

Some important considerations when using $raw tables:

  • $raw tables also include various "other" columns like _pkey and others starting with "_"; these columns exist for internal use, and shouldn't be used in any query.
  • The source_id column exists for internal use, and should not be used in any query.
  • The acl_resource_id column exists for internal use, and should not be used in any query.
  • The stability and semantics of these columns do not fall under our Stability guidelines, and may change without warning.
  • The "_pkey" column does not represent the primary key for a particular table. All tables will have an initial "id" column which represents that particular table's primary key.

📘

Why isn't a table appearing in my warehouse?

Warehouse tables only appear once there are rows to create for that table! That means if you create a new schema, it won't appear in the warehouse until you've also created an object under that schema.

Sample Queries

You can query for data in Benchling by combining tables together. For example to show all "in-progress" entries you can combine the entry, entry_auditor, principal (includes Benchling User and App accounts), and project tables. See below:

SELECT a.id as "Notebook Entry", 
        c.name as "Author", 
        coalesce(a.review_status, 'IN_PROGRESS') as "Review Status", 
        d.name as "Project",
        a.created_at as "Created At"
FROM entry$raw a
LEFT JOIN entry_auditor$raw b on a.id = b.entry_id
LEFT JOIN principal$raw c on c.id = a.creator_id
LEFT JOIN project$raw d on a.source_id = d.id

Query Output:

Notebook EntryAuthorReview StatusProjectCreated At
etr_7O2twKtERosalind FranklinIN_PROGRESSSCFA Platform2020-04-06T21:45:54.716093
etr_XnmA7Fh5Rosalind FranklinIN_PROGRESSSCFA Platform2020-04-06T21:46:13.181137
etr_gGyzeVhnJames WatsonIN_PROGRESSSCFA Platform2020-04-15T22:44:15.662984
etr_OqHUmK2DRosalind FranklinIN_PROGRESSSCFA Platform2020-04-15T23:37:33.465126
etr_4qgVERktFrancis CrickIN_PROGRESSSCFA Platform2020-03-20T22:25:33.428003
etr_zpHeAFc2James WatsonIN_PROGRESSSCFA Platform2020-04-15T22:44:24.523667