Origin Stories

Origin adds support for BDT 2.0

This week, Origin launched support for ICMA's Bond Data Taxonomy 2.0 (BDT 2.0). Clients are increasingly asking us to pull bond issuance data off the platform in a structured, digitised format. BDT 2.0 was published by ICMA on 27 April, and it joins Airbrush v3 and Clearstream's D7 as data formats we support out of the box. (We also support bespoke mappings on request.)

Data standards and taxonomies can seem like a dry topic. But at Origin, we've been structuring bond issuance data via our Documentation product for over seven years, and for much of that time we have been advocating for more standardisation in the way data is shared and communicated. So this topic is close to our hearts, and we thought it would be useful to share some background on how we got here, what makes a data standard a good one, and why a platform like Origin can now support three of them out of the box.

The plug socket problem

A useful way to think about data standards in capital markets is to think about electrical plug sockets.

There are fifteen different plug and socket types in active use around the world, labelled Type A through Type O. They exist because different countries electrified at different times, with different engineering constraints. And once the cement set, it set. The UK three-pin plug (the chunky one familiar to anyone who has stood on one in the dark) was designed in 1947, partly because Britain was short of copper after the war and needed a wiring scheme that put a fuse in every plug. That shortage is long gone, but the plug remains. It's now in fifty-odd countries.

In 1986, the International Electrotechnical Commission published IEC 60906-1: a universal global plug standard. The IEC did exactly the work an international standards body should do. They designed a safe, compact connector suitable for almost anywhere in the world. Forty years on, two countries have adopted it (South Africa, and Brazil in a modified form). The rest of the world is on its own timeline, because the cost of rewiring sockets in two hundred countries is not a five-year programme; it's a fifty-year programme, and counting.

But the standardisation impulse didn't stop at the wall socket. It shifted up the stack. USB-C is now the global charging standard for personal electronics (especially after it was mandated by the EU). Walk through any modern airport or hotel and you'll see wall plates with USB built right into them, because whoever installed them knew that no matter which country you'd flown in from, your phone almost certainly charged over USB. Wireless charging takes it one step further again.

The lesson isn't that wall socket standards failed. It's that progress at one layer was slow while progress at the layer above was fast, because new device generations are greenfield in a way installed infrastructure is not. And in the meantime, while the different layers move at different speeds, what keeps the world spinning is the travel adapter. The adapter doesn't replace the standards. It lets the world function while adoption catches up across multiple layers and decades.

That same shape is now playing out in DCM. The installed market infrastructure (CSDs, paying agents, dealer systems, treasury platforms, monopoly market data terminals) is the electrical grid. Some of these systems have published APIs (i.e. a wall socket), but many have not. Data standards such as BDT and Airbrush are attempts at creating a universal "wall socket". Entirely new platforms and modalities (such as tokenised assets) may open up the opportunity for standardisation at a new layer (à la USB-C). But in the meantime, while adoption across the various layers is "jagged," there is a strong need for an "adapter" (also known as an orchestration layer) to help institutions translate across all the standards.

The bond market's alphabet soup

Data standards in finance have a long history, pioneered by the secondary markets. The FIX protocol (Financial Information eXchange) was developed by Fidelity and Salomon Brothers in 1992, initially for equity trade communication. It is now the de facto messaging standard for trade communication across equities, fixed income, FX and derivatives, and the backbone of fixed income secondary trading. FpML (the Financial products Markup Language) was started inside JP Morgan in the late 1990s and taken over by ISDA in 2001; it is now the global standard for OTC derivatives messaging. ISO 20022, launched in 2004, is the meta-standard for financial messaging more broadly, covering payments, securities, and treasury.

The primary markets have only recently taken up the mantle.

The first attempt was GLML, the General-purpose Legal Mark-up Language, developed by the early fintech Nivaura in collaboration with Allen & Overy and Linklaters in 2017. GLML wasn't exactly a bond data taxonomy; it was an attempt to make legal documentation itself both human-readable and machine-readable, with a standard taxonomy of data "tags." The effort was migrated over to the GLML foundation in 2021, and was one of the first industry efforts aimed at standardising fixed income data.

The second was Airbrush, which we launched in early 2021 with the support of our strategic shareholders Clearstream and Luxembourg Stock Exchange. Airbrush was the first market data standard scoped specifically to bond issuance: a minimum-and-sufficient set of data points, with a standardised nomenclature, published as an open OpenAPI specification. It's now on v3, with 286 fields covering everything from syndicated benchmarks to non-deliverable currency issuances.

The third is the ICMA Bond Data Taxonomy itself. ICMA's working group published BDT v1.0 in March 2023, covering around ninety data fields defining the structure of a vanilla bond. The April 2026 release of BDT 2.0 is the biggest expansion yet: support for multi-series, multi-class and multi-tranche structures, an extension of the DLT capabilities ICMA added in 2024, and accommodations for emerging-market use cases.

That's three serious attempts in roughly six years to formalise the data layer of bond issuance, each from a different angle: legal documents, vendor platform, trade association.

Good data standards are hard to build

Given our position in the market, we are deeply familiar with the benefits, and challenges, of creating a data standard. This is why we embarked on the Airbrush project in the first place. And over the last 5 years, we've learned a lot. This is the part of the story that doesn't usually get told.

Our CTO, Rob Taylor, led our effort to build Airbrush v1 in early 2021. He has a unique vantage point because he's responsible for the data tagging of every document Origin processes, which means reading the actual legal docs and translating them into structured data. That puts him at the intersection of two groups of practitioners who almost never sit in the same room: front-office and legal experts, who know exactly what legal provisions the documents need to contain, and IT and operational experts, who think about APIs and data models, and almost nothing else.

He tells a story that captures why building (good) standards is harder than it looks. Before Airbrush v1 was even written, he would speak to our clients' IT teams, who would tell him their entire downstream system was mapped to ISO 20022, so all we needed to do was provide ISO 20022 outputs for everything. So we tried. But ISO 20022 is genuinely incomplete in places. Its list of day count fractions is missing some that appear in real bond contracts. Its list of business day centres doesn't cover all the cities that show up on real term sheets. You cannot issue a real bond against an incomplete standard.

What does a bond data model actually need to do, specifically, to be useful?

The answer comes down to four properties.

  1. Accuracy and precision: every data point captures the right thing at the right resolution. (If you store business day centres as countries, you've lost precision; the actual cities matter. If you allow five decimal places for issue price, you'll truncate trades that go to eight.)
  2. Validity: every data point is actually a legal value in its context.
  3. Typing: every field has a defined type (a date is a date, a number is a number, an enum has a known list of values), so the system reading the data knows what it's getting.
  4. Reconciliation: every data point can be tied back to the transaction document unambiguously. That last one is the hardest, and it's the one most often skipped by standards designed in the abstract.
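As a rough illustration of how the four properties might show up in a data model, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, field names, day count subset and validation rules are hypothetical and are not taken from Airbrush, BDT or D7.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum


class DayCount(Enum):
    # Illustrative subset only; real contracts use more conventions.
    ACT_360 = "ACT/360"
    ACT_365_FIXED = "ACT/365F"
    THIRTY_360 = "30/360"


@dataclass(frozen=True)
class SourceRef:
    """Reconciliation: where in the transaction document a value came from."""
    document: str
    clause: str


@dataclass(frozen=True)
class BondTerms:
    """Hypothetical bond terms record (not a real schema)."""
    issue_price: Decimal                    # Accuracy: keeps every digit supplied
    issue_date: date                        # Typing: a date is a date
    maturity_date: date
    day_count: DayCount                     # Typing: enum with a known list of values
    business_day_centres: tuple[str, ...]   # Precision: cities, not countries
    sources: dict[str, SourceRef]           # Reconciliation: field name -> location

    def validate(self) -> list[str]:
        """Validity: return a list of problems; an empty list means no issues found."""
        problems = []
        if self.maturity_date <= self.issue_date:
            problems.append("maturity_date must be after issue_date")
        if self.issue_price <= 0:
            problems.append("issue_price must be positive")
        if not self.business_day_centres:
            problems.append("at least one business day centre is required")
        # Assumed rule for this sketch: these two fields must be traceable.
        for name in ("issue_price", "day_count"):
            if name not in self.sources:
                problems.append(f"{name} has no source reference")
        return problems


terms = BondTerms(
    issue_price=Decimal("99.87654321"),     # eight decimal places survive intact
    issue_date=date(2026, 5, 1),
    maturity_date=date(2031, 5, 1),
    day_count=DayCount.ACT_360,
    business_day_centres=("London", "New York"),
    sources={"issue_price": SourceRef("Final Terms", "Part A, item 5")},
)
print(terms.validate())
```

In this sketch the record is well typed and precise, but validation still flags that the day count cannot be tied back to a clause in the document, which is the kind of gap the reconciliation property is meant to catch.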

This is also why standards continue to evolve and grow as they get tested against real issuance.

Airbrush v1 launched in February 2021 with 116 fields. It had a flat structure, and was modelled partly on what we were seeing on our Documentation platform and partly on the existing data conventions of LuxSE, Clearstream and a few paying agents. By June 2021, we had upgraded to v2: 170 fields and a nested structure to handle more sophisticated transaction types. By October 2021, we'd upgraded again to v3, the version we run today. V3 started at 189 fields and has since grown to 286, as the scope expanded from syndicated benchmarks to equity-linked structured notes, fix-to-floats, index-linked notes, accreting zero coupons, and synthetic and non-deliverable currency issuances.

Every new asset type adds new fields not for the sake of it, but because each new type tested one of the four properties: a new day count, a new business day centre, a new validation rule, a new way the data needed to tie back to the document.

This is precisely the work BDT 2.0 is now extending. Clean, standardised data is something we care deeply about. An industry body like ICMA is the right one to take this on, and we're glad to be supporting it.

The orchestration layer

Standards in deeply networked markets propagate non-linearly. Wall sockets stayed fragmented for a century. Once a new technology platform emerged, USB-C unified the device layer in a decade.

The same story is forming in DCM. Paying agents, stock exchanges, clearing systems, risk management systems - these are all deeply entrenched pieces of infrastructure with existing data models and systems of record that have been in place for decades. Each will modernise and standardise on its own timeline. One day, a completely new technology platform (tokenised securities?) may herald the arrival of a new layer, where standardisation and interoperability can be built in natively from day one.

But until then, we are happy to serve as the orchestration layer - the "travel adapter" - for the market, helping clients seamlessly translate data from one format to another.

So this week, BDT 2.0 joins Airbrush v3 and Clearstream's D7 as supported data formats on the Origin platform. Clients can consume bond issuance data in any of them, choosing whichever format their downstream systems want to receive. (We also offer bespoke data formats on request.)
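To make the "travel adapter" idea concrete, here is a minimal Python sketch of one way a translation layer could be structured: a single canonical record, plus one mapping function per output format. It is an illustration under stated assumptions, not a description of how Origin's platform works. The format names (`format_a`, `format_b`) and all field names are hypothetical placeholders and do not reflect the real BDT, Airbrush or D7 schemas.

```python
from typing import Callable, Dict

Record = Dict[str, object]
Mapper = Callable[[Record], Record]

# Registry of output formats: format name -> mapping function.
MAPPERS: Dict[str, Mapper] = {}


def register(format_name: str) -> Callable[[Mapper], Mapper]:
    """Decorator that adds a mapping function to the registry."""
    def decorator(fn: Mapper) -> Mapper:
        MAPPERS[format_name] = fn
        return fn
    return decorator


@register("format_a")
def to_format_a(rec: Record) -> Record:
    # Hypothetical flat layout.
    return {
        "IssuerName": rec["issuer"],
        "Currency": rec["currency"],
        "NominalAmount": rec["amount"],
    }


@register("format_b")
def to_format_b(rec: Record) -> Record:
    # Hypothetical nested layout.
    return {
        "issuer": {"name": rec["issuer"]},
        "denomination": {"ccy": rec["currency"], "amount": rec["amount"]},
    }


def translate(rec: Record, target: str) -> Record:
    """Translate a canonical record into the requested output format."""
    try:
        mapper = MAPPERS[target]
    except KeyError:
        raise ValueError(f"unsupported format: {target!r}") from None
    return mapper(rec)


# Made-up canonical record for illustration.
canonical = {"issuer": "Example Issuer plc", "currency": "EUR", "amount": "500000000"}

print(translate(canonical, "format_a"))
print(translate(canonical, "format_b"))
```

The design choice worth noting is that adding a new format means adding one mapper rather than changing the canonical record, so a bespoke client format would simply be one more entry in the registry.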

Different airports, different sockets, same phone.