CRISP (Chesapeake Regional Information System for our Patients), a nonprofit health information exchange (HIE), worked with Slalom to build a data lakehouse on Databricks, initially to meet the analytics requirements of the COVID-19 pandemic. The platform has since been expanded to serve additional use cases. One significant addition is a set of streaming data pipelines that process healthcare messages, including HL7, with the goal of achieving vendor independence.
The session will focus on the enhancements CRISP has made to its data lakehouse platform to support streaming use cases, and on the resulting impact on the organization. Key topics include using Databricks Auto Loader to ingest incoming files efficiently, enforcing data quality with Delta Live Tables, and sharing data internally through a SQL warehouse. The presentation will also cover CRISP’s work on parsing and standardizing HL7 messages from numerous sources. With these changes, CRISP processes and streams over 4 million messages per day in near real time, on a platform that scales as new healthcare providers are onboarded. As a result, CRISP continues to facilitate care coordination and drive improvements in health outcomes.
In summary, CRISP’s collaboration with Slalom has significantly improved its data lakehouse platform, particularly for streaming use cases. By adopting Databricks Auto Loader and Delta Live Tables and by streamlining HL7 message processing, CRISP has achieved scalable, near-real-time data processing. These capabilities allow CRISP to onboard new healthcare providers efficiently and to strengthen care coordination, ultimately leading to improved health outcomes for patients.
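To make the HL7 parsing and standardization step more concrete, here is a minimal Python sketch of what such a step could look like. It is not CRISP's implementation: the function names (`parse_hl7`, `standardize`), the output field names, and the sample message are all hypothetical, and the sketch assumes the default HL7 v2 delimiters (`|` for fields, `^` for components) while ignoring repetitions, escape sequences, and custom delimiters.

```python
from typing import Dict, List

# Hypothetical sample ADT message, assembled from raw strings so the
# backslash in the MSH-2 encoding characters is kept literally.
SAMPLE = "\r".join([
    r"MSH|^~\&|APP|FAC|RAPP|RFAC|20230101120000||ADT^A01|MSG0001|P|2.5",
    r"PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F",
])


def parse_hl7(message: str) -> Dict[str, List[List[str]]]:
    """Split an HL7 v2 message into segments, keyed by segment ID."""
    segments: Dict[str, List[List[str]]] = {}
    # Segments end with a carriage return; tolerate newlines as well.
    normalized = message.replace("\r\n", "\r").replace("\n", "\r")
    for line in normalized.split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself; re-insert it so that
            # list indexes line up with HL7 field numbers.
            fields.insert(1, "|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def field(fields: List[str], n: int) -> str:
    """Return field n of a segment, or '' if it is absent."""
    return fields[n] if n < len(fields) else ""


def component(value: str, n: int) -> str:
    """Return the n-th (1-based) component of a field, or ''."""
    parts = value.split("^")
    return parts[n - 1] if n <= len(parts) else ""


def standardize(message: str) -> Dict[str, str]:
    """Map selected MSH and PID fields to a flat record (assumed schema)."""
    segments = parse_hl7(message)
    msh = segments["MSH"][0]
    pid = segments.get("PID", [[]])[0]
    dob = field(pid, 7)
    if len(dob) >= 8 and dob[:8].isdigit():
        # Normalize YYYYMMDD to ISO 8601 (YYYY-MM-DD).
        dob = f"{dob[:4]}-{dob[4:6]}-{dob[6:8]}"
    return {
        "message_type": component(field(msh, 9), 1),
        "trigger_event": component(field(msh, 9), 2),
        "control_id": field(msh, 10),
        "sending_facility": field(msh, 4),
        "patient_id": component(field(pid, 3), 1),
        "family_name": component(field(pid, 5), 1),
        "given_name": component(field(pid, 5), 2),
        "birth_date": dob,
    }


if __name__ == "__main__":
    print(standardize(SAMPLE))
```

In a streaming pipeline, a function like `standardize` would typically be applied to each incoming message after ingestion, so that messages from different sources end up in one common schema before data quality checks run.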