What healthcare orgs. need to know before implementing Big Data
April 2, 2014
Co-authored by Shiraz Rehmani and Sameer Attharkar
Many would argue that the healthcare system is on an unsustainable trajectory due to increasing pressure to improve outcomes and patient care, reduce inefficiencies, rein in rising costs, and improve transparency and overall collaboration.
Many organizations are placing big bets on Big Data to solve these problems, hoping to elevate their analytics to a utopia of real-time, interactive problem analysis and decision-making. This view of Big Data is often misleading and overly simplistic, neglecting to consider the true life cycle of healthcare data. Big Data is often misinterpreted as a silver-bullet solution that renders existing data management and traditional analytics practices obsolete, but this simply isn’t the case.
Is Big Data right for your healthcare organization? Before answering, it’s important to understand its inherent complexities and opportunities. We can start by examining healthcare data sources.
Healthcare data sources
Healthcare data varies widely in standards, quality, completeness, availability, frequency, and volume. Adherence to specific standards and quality levels may vary considerably from one source to another. Healthcare data comes in three flavors: structured, semi-structured, and completely unstructured. These data sets can be loosely classified as follows:
Electronic Health Records (EHR/EMR)
- Claims/Billing
- Lab tests and results
- Medications
- Clinical notes
- Other clinical data points (e.g., diagnosis, allergies, medical history, and immunizations)
Other
- Medical device output (sensor and imaging data)
- Genomic data (DNA sequencing)
- Individuals’ behavioral data (social networks and wearable mobility sensors)
- Research clinical studies and trials
- Public and government records
- Regulatory compliance (state and federal)
- Data standards and code sets: CPT, ICD, LOINC, NDC, RxNorm, SNOMED CT
It’s critical to note that many of these data sets sit in silos, are extremely fragmented, and require varying degrees of transformation, often based on complex data and compliance rules, before any consumption or use, including analytics.
The three Vs: variety, velocity, and volume
The next logical step is to identify whether any of the data categories are predisposed toward Big Data technologies by their innate design. That is, which of the data categories pass the Big Data litmus test of variety, velocity, and volume?
Medical device output (sensor and imaging), genomic (DNA sequencing), and behavioral (social networks and wearable mobility sensors) data sets immediately stand out as potential candidates. All three are heterogeneous and arrive at high velocity and in high volume. Additionally, these data sets are often used in short-term, focused analyses and other studies well suited to Big Data.
On the other hand, patient-centric clinical data has comparatively low volume and is often difficult to procure, so it lacks both velocity and volume. This type of data usually requires a more traditional, proven approach that is heavy on reference, consistency, integrity, and workflow processes, and that relies on intensive mapping rules and cleansing prior to consumption and analysis. Data sets in this category are not suited for Big Data implementation (at least not natively).
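The litmus test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataCategory` type, the `passes_three_vs` and `recommend` functions, and the yes/no flags are hypothetical names and simplifications, not part of any product or standard. The flags paraphrase the qualitative discussion above rather than any measurement, and the variety flag for clinical data is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCategory:
    """A healthcare data category with yes/no flags for each of the three Vs."""
    name: str
    high_variety: bool
    high_velocity: bool
    high_volume: bool


def passes_three_vs(category: DataCategory) -> bool:
    """Return True only when the category shows variety, velocity, and volume."""
    return (
        category.high_variety
        and category.high_velocity
        and category.high_volume
    )


def recommend(category: DataCategory) -> str:
    """Map the litmus-test result to a suggested approach."""
    if passes_three_vs(category):
        return "Big Data candidate"
    return "traditional data management"


# Illustrative flags only: they restate the qualitative assessment in the text.
# The variety flag for clinical data is an assumption made for this sketch.
CATEGORIES = [
    DataCategory("Medical device output", True, True, True),
    DataCategory("Genomic data", True, True, True),
    DataCategory("Behavioral data", True, True, True),
    DataCategory("Patient clinical data", True, False, False),
]

if __name__ == "__main__":
    for category in CATEGORIES:
        print(f"{category.name}: {recommend(category)}")
```

In practice each flag would come from profiling the actual source (record counts, arrival rates, format mix) rather than from a hard-coded table, and a real assessment would likely use thresholds instead of simple booleans.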
Additional considerations
The explosion of data liquidity in healthcare and the advent of new sources of data, including mobility devices and social media, have resulted in knee-jerk reactions from many organizations banking on Big Data to solve all their healthcare analytics problems. But a basic analysis of the data categories noted above reveals a different perspective. While some data sources are uniquely suited for Big Data implementation, others require more traditional data management processes. Make sure to consider the following before implementing Big Data:
- Avoid treating all healthcare data as a monolithic set
- Clearly understand the scope and duration of required analytics
- Understand the source of data in question to determine volume, velocity, and completeness
- Clearly understand the security and privacy requirements of the data in question
Big Data is just part of the solution
Oftentimes, the assumption that a particular technology or specific software will provide all the answers is, at best, misleading. The reality is much more complex and sobering. Most industry thought leaders agree that analytics competency is critical to addressing many of the fundamental issues of the healthcare ecosystem. The key is to understand the nature of the data in question first, and then decide upon a viable, customized approach backed by sound data management and design principles.
Data is often referred to as the raw material of information, and Big Data has a pivotal role in transforming this raw material to be consumed at a new level in the healthcare ecosystem. But this process must be aligned strategically with traditional analytics and data management practices. With a strategically aligned approach, the real potential of developing a framework of viable analytical services can be realized. That framework can truly facilitate informed, real-time decision-making on critical issues, helping reduce costs and improve efficiencies at all levels of your healthcare operation.