IEEE Big Data, Naples, Italy, 15-18 December 2023, pp. 5007-5013
Health data plays a pivotal role in modern
healthcare, guiding patient care, diagnoses, treatments, and
outcomes. This extensive data repository encompasses electronic
health records, medical imaging, test reports, and
administrative information, empowering healthcare
practitioners and researchers to make evidence-based decisions
to improve patient well-being. In the complex healthcare
landscape, handling health data presents challenges. While
relational databases have historically dominated many
industries, including healthcare, innovative alternatives like
graph databases are gaining favor. Due to its complex and
interconnected nature, healthcare data often loses semantic data
integrity when modeled in relational databases. In contrast,
graph databases have shown remarkable performance with
interconnected data. Consequently, there is a belief that
modeling health data as a whole on a graph database would
produce excellent results. This preliminary study investigates
how graph databases can efficiently manage health data by
comparing simple data modeling and query performance. The
research utilizes a dataset that is publicly available from a
hospital in the United States. The dataset covers multiple areas,
including hospital admissions, diagnoses, laboratory results, and
prescription information for patients diagnosed with diabetes.
Initially, this two-dimensional tabular dataset is modeled with an
Entity-Relationship Diagram (ERD) and implemented in a relational
database. Subsequently, the ERD is transformed into a graph
database schema and implemented in a NoSQL graph database system.
Both databases are normalized during the modeling process,
and they share identical data to ensure consistency in data entry.
Following this, queries of varying complexity are
constructed and executed using the query languages of both
database management systems. The primary results indicate
that Neo4j outperforms PostgreSQL in query performance, though
slight inconsistencies in data entry were noted. These findings highlight
the potential of graph databases in enhancing healthcare data
management for better patient care and outcomes.
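
The transformation from a relational model to a graph model described above can be illustrated with a minimal sketch. It is not the study's implementation: the table names, column names, the `HAS_ADMISSION` relationship label, and the sample rows are all hypothetical, and plain Python dictionaries stand in for both PostgreSQL and Neo4j. The idea shown is only that each row becomes a node and each foreign key becomes an explicit edge, so a lookup that needs a join-style match in the tabular form becomes an edge traversal in the graph form.

```python
from collections import defaultdict

# Hypothetical rows; table and column names are illustrative only.
patients = [
    {"patient_id": 1, "gender": "F"},
    {"patient_id": 2, "gender": "M"},
]
admissions = [
    {"admission_id": 10, "patient_id": 1, "days_in_hospital": 3},
    {"admission_id": 11, "patient_id": 1, "days_in_hospital": 5},
    {"admission_id": 12, "patient_id": 2, "days_in_hospital": 2},
]


def rows_to_graph(patient_rows, admission_rows):
    """Turn each row into a node and each foreign key into an edge."""
    nodes = {}
    edges = defaultdict(list)
    for p in patient_rows:
        nodes[("Patient", p["patient_id"])] = {"gender": p["gender"]}
    for a in admission_rows:
        key = ("Admission", a["admission_id"])
        nodes[key] = {"days_in_hospital": a["days_in_hospital"]}
        # The patient_id foreign key becomes an explicit relationship.
        edges[("Patient", a["patient_id"])].append(("HAS_ADMISSION", key))
    return nodes, dict(edges)


def total_days_relational(admission_rows, patient_id):
    """Join-style lookup: scan admissions and match on the foreign key."""
    return sum(
        a["days_in_hospital"]
        for a in admission_rows
        if a["patient_id"] == patient_id
    )


def total_days_graph(nodes, edges, patient_id):
    """Traversal-style lookup: follow edges out of the patient node."""
    return sum(
        nodes[target]["days_in_hospital"]
        for rel, target in edges.get(("Patient", patient_id), [])
        if rel == "HAS_ADMISSION"
    )


nodes, edges = rows_to_graph(patients, admissions)
```

Both lookup functions should return the same totals for the same patient, which mirrors the requirement that the two databases hold identical data before query performance is compared.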