Data Spine

Why Data Spine?

Figure 1: High-level Dataflow through the Data Spine

The EFPF ecosystem is based on a federation model. The services belonging to the different platforms are heterogeneous, and interoperability gaps exist between them at the levels of protocols, data models, data formats, data semantics, and also authentication providers. The Data Spine is the gluing mechanism that bridges these interoperability gaps and thereby enables communication across the EFPF ecosystem. For a pair of heterogeneous services in the EFPF ecosystem to communicate with each other, they are first integrated through the Data Spine; once integrated, they can exchange data. Figure 1 shows a very high-level overview of the dataflow between such heterogeneous services through the Data Spine, treating the Data Spine as a 'black box'. The Data Spine User Guide illustrates the dataflow in the EFPF ecosystem at greater levels of detail.

Data Spine Overview

The Data Spine is the interoperability backbone of the EFPF ecosystem: it interlinks the services of different platforms and establishes interoperability between them. It aims to bridge the interoperability gaps between services at three levels:

  • Protocol interoperability: The Data Spine supports two communication patterns:

    1. synchronous request-response pattern and
    2. asynchronous publish-subscribe pattern

    The Data Spine supports standard application layer protocols that are widely used in the industry (e.g., HTTP/REST, MQTT, AMQP), and it employs an easily extensible mechanism for adding support for new protocols.

  • Data Model interoperability: The Data Spine provides a platform and mechanisms to transform between the message formats, data structures, and data models of different services, thereby bridging the syntactic and semantic gaps in data exchange.

  • Security interoperability: The EFPF Security Portal (EFS) component of the Data Spine provides federated identity management and single sign-on (SSO) capability for the EFPF ecosystem.
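As a concrete illustration of the security level, a service would typically authenticate against the EFS (Keycloak) using the standard OAuth2 client-credentials grant before calling other services. The sketch below only builds such a token request with the Python standard library; the host, realm, and client names are hypothetical placeholders, not actual EFPF values.

```python
import urllib.parse
import urllib.request

# Hypothetical EFS/Keycloak token endpoint -- host and realm are
# placeholders for illustration only.
TOKEN_URL = "https://efs.example.com/auth/realms/efpf/protocol/openid-connect/token"

def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build an OAuth2 client-credentials token request (RFC 6749, section 4.4)."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# A service would send this request and attach the returned access token
# as a Bearer token on calls that pass through the API Security Gateway, e.g.:
#   resp = urllib.request.urlopen(build_token_request("my-service", "my-secret"))
#   token = json.loads(resp.read())["access_token"]
```

The request is not sent here; in practice the token endpoint, realm, and client credentials would come from the EFS configuration for the ecosystem.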

Figure 2: High-level Architecture of the Data Spine

Components

As illustrated in Figure 2, the Data Spine is a collection of the following components that work together to form an integration, interoperability and communications layer for the EFPF ecosystem:

Conceptual Component          | Technology                              | Overview
Integration Flow Engine (IFE) | Apache NiFi                             | DS NiFi
API Security Gateway (ASG)    | APISIX                                  | ASG and EFS
Service Registry (SR)         | LinkSmart Service Catalog               | DS Service Registry
Message Bus (MB)              | RabbitMQ                                | DS RabbitMQ
EFPF Security Portal (EFS)    | Keycloak (& Policy Enforcement Service) | EFS

Concept                       | Technology/Implementation
Integration Flow (iFlow)      | Dataflow in NiFi (on NiFi's canvas)
Protocol Connector            | NiFi processors such as HandleHTTPRequest, InvokeHTTP, ConsumeMQTT, PublishMQTT, etc.
Data Transformation Processor | NiFi processors such as JoltTransformJSON, TransformXml, ExecuteScript, ReplaceText, EvaluateJsonPath, etc.
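To make the role of a Data Transformation processor concrete, the sketch below restructures one JSON payload shape into another in plain Python, analogous to what a JoltTransformJSON processor would do inside an iFlow. The payload schemas and field names are invented for illustration.

```python
import json

def transform_reading(src: dict) -> dict:
    """Map a hypothetical source sensor payload onto a hypothetical
    target schema -- the kind of syntactic mapping a Data
    Transformation processor performs between two services."""
    return {
        "sensorId": src["device"]["id"],
        "measurement": {
            "value": src["temp"],
            "unit": "Celsius",
        },
        "timestamp": src["ts"],
    }

incoming = json.loads(
    '{"device": {"id": "s-42"}, "temp": 21.5, "ts": "2021-06-01T12:00:00Z"}'
)
outgoing = transform_reading(incoming)
```

In the Data Spine itself this mapping would be expressed declaratively (e.g., as a Jolt specification) and executed by NiFi on every message flowing through the iFlow.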

Figure 3: Data Spine Realisation

Data Spine Documentation

User Guide 101