I’d like to talk about a crucial component of any healthcare analytics system: terminologies. These are the underlying codes for diagnoses, drugs, procedures, devices, and other healthcare services that appear throughout healthcare data. At ClosedLoop, we maintain and use many terminologies within our data science platform. We’ve found Fast Healthcare Interoperability Resources (FHIR), a standard that describes data formats and elements along with an application programming interface for exchanging health data, to be both incredibly helpful and incredibly frustrating for this task. FHIR provides a consistent structure for these terminologies, which were often previously managed as files and database tables with different formats and conventions. FHIR is great at managing data, but it is not designed for analytics. In this post, I’ll talk about some of the issues we faced when trying to build a purely FHIR-based terminology service, and how we eventually migrated to an approach that uses FHIR to maintain and manage the source data, with our own terminology graph built on top of it to power analytics queries.
FHIR has revolutionized healthcare data interoperability. Applications that exist solely to facilitate the exchange of healthcare data need a common language, and FHIR is that common language. It has enabled the success of applications like Apple HealthKit, CMS BlueButton, and many more. An entire subsection of the FHIR specification deals with terminology services and describes how the myriad diagnosis, procedure, and other codes found throughout healthcare records are represented within FHIR. You might think that for those doing analytics on healthcare data, FHIR terminologies would be incredibly useful. Having all of the codes standardized and accessible from a single FHIR server could make building and maintaining analytical queries much easier. Unfortunately, FHIR was built for interoperability and has two serious shortcomings that make it very difficult to use for analytics.
Figure 1: FHIR Terminology Resources overview. Reproduced from http://hl7.org/fhir/2021Mar/terminology-module.html
To understand these shortcomings, it’s useful to have a bit of background on how terminologies work within FHIR. As shown in Figure 1, FHIR defines 3 key resources for managing terminologies: CodeSystems, ValueSets and ConceptMaps. A CodeSystem, unsurprisingly, is a set of unique codes that are part of some controlled terminology. ICD-10 diagnosis codes are a CodeSystem, as is SNOMED, and even the official list of FHIR languages. CodeSystems can be hierarchical, so that individual codes within the code system can have children underneath them. The ICD-10 code J45 for “asthma” has a child of J45.2 for “mild intermittent asthma”. The CodeSystem itself has an identifier so if you see a code in FHIR you know exactly where it came from.
Along with CodeSystems, FHIR defines 2 other resources, ValueSets and ConceptMaps. ValueSets define groups of codes that can be used in a particular place. For example, you might have a particular field that can hold either a CPT code or an ICD-10 procedure code. The ValueSet for that field would be the combination of those two code systems. ConceptMaps define relationships between codes. For example, a ConceptMap could relate an RxNorm concept code to the list of NDCs associated with that concept.
The first major issue with using FHIR for analytics is that CodeSystems, ValueSets, and ConceptMaps each define relationships between codes, but each does so in a different way, with different rules and different query functions. This makes it very difficult to perform even basic analytics queries that involve aggregating codes from multiple CodeSystems into higher-level groupings.
In a simple example, assume you want to find medical claims with diagnoses for a certain set of infectious diseases. You have a high level list of codes for these diagnoses. One of those is CCS category 1.1.1 for “Tuberculosis”. This CCS grouping includes ICD-10 code A15. A15 has several subcodes for different kinds of tuberculosis, A15.0, A15.4, etc. You’d like to be able to take your initial list of codes and determine all the ICD-10 codes that match your query. That is, you’d like to get all the child codes of CCS 1.1.1 and this needs to include ICD-10 A15.0 and ICD-10 A15.4. To capture this in FHIR, you’d have a ValueSet that defines your initial list of codes, which would include CCS 1.1.1. The mapping of the CCS to diagnosis codes would be done through a ConceptMap, and the relationship between A15 and its child codes would be stored in the hierarchy of the ICD-10 code system.
The limitations of FHIR become clear when you try to resolve your initial list to a set of diagnosis codes; you’d need to run 3 separate FHIR queries. You’d first have to expand the ValueSet, and then traverse the ConceptMap, and finally get all the child codes for the ICD-10 code. What’s worse is that this combination of FHIR queries is specific to this particular structure of the data. If the structure of the ConceptMap or CodeSystem hierarchy were different, you might have to perform a different set of queries.
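To make the three steps concrete, here is a minimal Python sketch. It does not call a real FHIR server: the dictionaries and the function names (`expand_value_set`, `translate`, `descendants`, `resolve`) are hypothetical stand-ins for a ValueSet expansion, a ConceptMap translation, and a CodeSystem hierarchy lookup, and the data contains only the codes from the tuberculosis example.

```python
# Toy in-memory stand-ins for three FHIR resources (illustrative data only).
VALUE_SET = [("CCS", "1.1.1")]                           # the initial list of codes
CONCEPT_MAP = {("CCS", "1.1.1"): [("ICD-10", "A15")]}    # CCS category -> ICD-10 code
ICD10_CHILDREN = {"A15": ["A15.0", "A15.4"]}             # CodeSystem hierarchy


def expand_value_set(value_set):
    """Step 1: list the codes contained in the ValueSet."""
    return list(value_set)


def translate(code):
    """Step 2: follow the ConceptMap from a source code to its target codes."""
    return CONCEPT_MAP.get(code, [])


def descendants(icd_code):
    """Step 3: collect every code below an ICD-10 code in the hierarchy."""
    found = []
    for child in ICD10_CHILDREN.get(icd_code, []):
        found.append(child)
        found.extend(descendants(child))
    return found


def resolve(value_set):
    """Chain the three lookups; the chain is specific to this data layout."""
    resolved = set()
    for source_code in expand_value_set(value_set):
        for _system, icd_code in translate(source_code):
            resolved.add(icd_code)
            resolved.update(descendants(icd_code))
    return sorted(resolved)


print(resolve(VALUE_SET))  # ['A15', 'A15.0', 'A15.4']
```

Note that `resolve` hard-codes the order ValueSet, then ConceptMap, then hierarchy. If the relationships were stored differently, the function would have to be rewritten, which is exactly the fragility described above.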
If you found the previous example confusing, don’t worry. The point of that example was to show how difficult it is to try to use FHIR queries for analytics. For analytics, a much more natural view of terminologies is as a directed graph, where each code is a node and the relationships between them are edges. Any sort of aggregation is then a straightforward graph traversal.
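As a rough illustration of the graph view, the sketch below stores every relationship as a parent-to-child edge, regardless of which FHIR resource it came from, and answers the tuberculosis question with a single breadth-first traversal. The node naming, the `EDGES` dictionary, and the `descendants` function are assumptions made for this example, not an actual API.

```python
from collections import deque

# Each node is a (system, code) pair; each edge points from parent to child.
# Illustrative data only: the "list" node stands for the analyst's initial list.
EDGES = {
    ("list", "infectious-diseases"): [("CCS", "1.1.1")],
    ("CCS", "1.1.1"): [("ICD-10", "A15")],
    ("ICD-10", "A15"): [("ICD-10", "A15.0"), ("ICD-10", "A15.4")],
}


def descendants(start, system=None):
    """Breadth-first traversal returning every node reachable from `start`.

    If `system` is given, only nodes from that code system are returned.
    """
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in EDGES.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    if system is not None:
        seen = {node for node in seen if node[0] == system}
    return sorted(seen)


print(descendants(("list", "infectious-diseases"), system="ICD-10"))
# [('ICD-10', 'A15'), ('ICD-10', 'A15.0'), ('ICD-10', 'A15.4')]
```

The traversal does not care whether an edge originated in a ValueSet, a ConceptMap, or a CodeSystem hierarchy, so the same function keeps working when the underlying structure changes.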
The second major issue is that the FHIR approach to versioning makes it difficult to work with historical data that may span multiple versions of prior terminologies. In analytics, we are generally looking at a data set that contains historical data, sometimes data that goes back years. A particular code may be outdated today, but it was valid when it was used several years ago and you don’t want to throw all those codes out when doing analytics. For analytics, you want a historical view of terminologies that includes both current and expired codes and works with them seamlessly. In FHIR, you either need to specify a specific version, or just choose the latest. You can’t combine a search across versions.
To address these two issues we have developed a hybrid approach to terminologies, the ClosedLoop Terminology Graph. It is a high-performance terminology graph that can quickly resolve code relationships. It sits in front of our FHIR server and provides a graph-based query API that is much faster and simpler to use for analytics than standard FHIR endpoints. The performance comes from storing the graph in memory in a highly compressed format that holds only the node and edge structure; the underlying information on each code is retrieved from the FHIR server as needed. The graph can merge multiple versions of FHIR resources so that it contains a complete historical code set. By default the graph returns the most recent version of any code, but in cases where a code has been deleted, it retrieves that code from the most recent version in which it was active.
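The version-merging rule can be sketched in a few lines of Python. This is only an illustration of the stated behavior (latest version wins, deleted codes fall back to the last version in which they appeared), not the actual implementation; the version labels, the codes `X1` and `X2`, and the `merge_versions` function are all made up for the example.

```python
# Hypothetical code system versions, listed oldest first.
# Each version maps a code to its display text. Illustrative data only.
VERSIONS = [
    ("2019", {"X1": "Condition one", "X2": "Condition two"}),
    ("2020", {"X1": "Condition one (revised)"}),  # X2 was deleted in 2020
]


def merge_versions(versions):
    """Merge versions into one view that keeps both current and expired codes.

    Later versions overwrite earlier ones, so each code ends up with the
    entry from the most recent version in which it was present.
    """
    merged = {}
    for version, codes in versions:  # oldest to newest
        for code, display in codes.items():
            merged[code] = {"display": display, "version": version}
    latest_version = versions[-1][0]
    for entry in merged.values():
        # A code is active only if it still exists in the latest version.
        entry["active"] = entry["version"] == latest_version
    return merged


merged = merge_versions(VERSIONS)
print(merged["X1"])  # {'display': 'Condition one (revised)', 'version': '2020', 'active': True}
print(merged["X2"])  # {'display': 'Condition two', 'version': '2019', 'active': False}
```

With a merged view like this, a claim from 2019 that carries `X2` still resolves to a known code instead of being dropped.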
The ClosedLoop Terminology Graph gives us the best of both worlds. FHIR is an excellent way to store and maintain CodeSystems, ConceptMaps, and ValueSets, and we use it for that purpose. On top of FHIR we provide a graph-based view of terminologies that can span multiple historical versions to enable fast and convenient queries for analytics. While it was a challenge to build our own graph and find a way to keep it synchronized with our FHIR server, the benefits in the end outweighed the cost. It allows us to remain standards-compliant while still providing an excellent analytics experience. FHIR may at some point incorporate more analytics workflows and address these issues. Until it does, we have a solution.