The Data Story
The Client Ask:
NIAXO was asked, as part of a platform architecture build, to support automated and manual data ingest methods and establish, maintain and govern active data feeds for use by data scientists and analysts. The range of data requirements was extensive.
Delivering Successful Outcomes (NIAXO’s Solution):
Over the course of a year, NIAXO automated the ingest of more than 45 open-source data feeds of varying complexity for the client, using a combination of web scraping, API polling and file downloads. It also onboarded a series of non-open-source feeds, with transfer mechanisms established directly with third-party suppliers. All data feeds managed by the platform were governed end-to-end, with the ingest process fully integrated with the governance workflow and catalogue. NIAXO established an ingest architecture with ETL functions that automated schema checking and antivirus (AV) scanning of incoming data files. NIAXO provisioned a range of AWS services to support the data science and analytics element of the platform, including SageMaker (providing users with machine learning capabilities), Athena and CodeCommit. Access to both data and tooling was controlled via Role-Based Access Control (RBAC), with platform roles designed in line with organisational roles. The Data Governance Portal was configured with custom, gated workflows supporting both the manual registration of assets by users and the automated registration of assets acquired via the ingest pipelines. All registrations were synchronised with the Information Asset Register, ensuring the platform and the organisation maintained a single source of truth.
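To illustrate the kind of automated schema check described above, the following is a minimal sketch in Python. It is not NIAXO's implementation: the feed columns, the EXPECTED_SCHEMA mapping and the check_schema function are all hypothetical names chosen for illustration, and a CSV feed is assumed.

```python
import csv
import io

# Hypothetical schema for one feed: column name -> callable used to validate each value.
EXPECTED_SCHEMA = {"station_id": str, "reading": float, "recorded_on": str}


def check_schema(csv_text, schema):
    """Return a list of problems found; an empty list means the file passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # Compare the header against the expected columns first.
    missing = [col for col in schema if col not in header]
    unexpected = [col for col in header if col not in schema]
    if missing:
        problems.append(f"missing columns: {missing}")
    if unexpected:
        problems.append(f"unexpected columns: {unexpected}")
    if missing:
        return problems  # row-level checks need every expected column

    # Validate each value; the header is line 1, so data starts at line 2.
    for line_no, row in enumerate(reader, start=2):
        for col, parse in schema.items():
            try:
                parse(row[col])
            except (TypeError, ValueError):
                problems.append(
                    f"line {line_no}: {col}={row[col]!r} is not a valid {parse.__name__}"
                )
    return problems


if __name__ == "__main__":
    sample = "station_id,reading,recorded_on\nA1,3.5,2024-01-01\nA2,n/a,2024-01-02\n"
    for problem in check_schema(sample, EXPECTED_SCHEMA):
        print(problem)
```

In a gated pipeline of the kind described, a file would plausibly only proceed to cataloguing and registration when a check like this returns no problems; files that fail would be held for review.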
Value Added / NIAXO Differentiators:
Acknowledging the critical nature of the work to be undertaken on the platform, NIAXO built a bespoke data governance capability, integrated with the technical ingest process, featuring a customised governance workflow and front-end catalogue. NIAXO provided data management and BA expertise to design and develop the tools supporting the client’s adherence to governance standards, both internally and across the wider external landscape.
What They Said:
‘Historically, I’ve found that Data Science and Data Governance rarely marry up. Working on this project, I was confident the data I was using had passed through rigorous governance checks and was fit for use. I also had all the information I needed to use the data safely.’