Data doesn’t sit in one database, file system, data lake, or repository. Data created in a system of record must serve multiple business needs, integrate with other data sources, and then be used for analytics, customer-facing applications, or internal workflows.

Examples include:

  • Data from an e-commerce application is integrated with user analytics, customer data in a customer relationship management (CRM) system, or other master data sources to establish customer segments and tailor marketing messages.
  • Internet of Things (IoT) sensor data is linked to operational and financial data stores and used to control throughput and report on the quality of a manufacturing process.
  • An employee workflow application connects data and tools across multiple software-as-a-service (SaaS) platforms and internal data sources into one easy-to-use mobile interface.

Many organisations also have data scientists, data analysts, and innovation teams who increasingly need to integrate internal and external data sources. Data scientists developing predictive models often load multiple external data sources such as econometrics, weather, census, and other public data and then blend them with internal sources.

Innovation teams experimenting with artificial intelligence need to aggregate large and often complex data sources to train and test their algorithms. And business and data analysts who once performed their analyses in spreadsheets may now require more sophisticated tools to load, join, and process multiple data feeds.

Programming and scripting data integrations

For anyone with even basic programming skills, the most common way to move data from source to destination is to develop a short script. Code pulls data from one or more sources, performs any necessary data validations and manipulations, and pushes it to one or several destinations.

Developers can code point-to-point data integrations using many approaches, such as:

  • A database stored procedure that pushes data changes to other database systems
  • A script that runs as a scheduled job or a service
  • A webhook that alerts a service when an application’s end-user changes data
  • A microservice that connects data between systems
  • A small data-processing code snippet deployed to a serverless architecture

These coding approaches can pull data from multiple sources and then join, filter, cleanse, validate, and transform it before shipping it to destination data stores.
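As a rough illustration of such a script, the sketch below pulls rows from two sources, joins them, validates and transforms each row, and loads the result into a destination table. The sample CSV data, the `order_facts` table, and all function names are assumptions made for the example; in-memory strings and an in-memory SQLite database stand in for real source and destination systems.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for two source systems.
ORDERS_CSV = """order_id,customer_id,amount
1,c1,19.99
2,c2,not-a-number
3,c1,5.00
4,c9,12.50
"""

CUSTOMERS_CSV = """customer_id,email
c1,Ana@Example.com
c2,bob@example.com
"""


def read_rows(text):
    """Extract: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def integrate(orders, customers):
    """Join orders to customers, then validate and transform each row."""
    # Transform: normalise email addresses while building the lookup.
    emails = {c["customer_id"]: c["email"].strip().lower() for c in customers}
    clean, rejected = [], []
    for row in orders:
        try:
            amount = float(row["amount"])
        except ValueError:
            rejected.append(row)  # validation failure: amount is not numeric
            continue
        email = emails.get(row["customer_id"])
        if email is None:
            rejected.append(row)  # join failure: no matching customer
            continue
        clean.append((int(row["order_id"]), email, amount))
    return clean, rejected


def load(conn, rows):
    """Load: write the cleaned rows to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_facts "
        "(order_id INTEGER PRIMARY KEY, email TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO order_facts VALUES (?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    clean, rejected = integrate(read_rows(ORDERS_CSV), read_rows(CUSTOMERS_CSV))
    conn = sqlite3.connect(":memory:")
    load(conn, clean)
    print(len(clean), "rows loaded;", len(rejected), "rejected")
```

Keeping rejected rows in their own list, rather than silently dropping them, makes it possible to report or reprocess them later.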

Scripting might be a quick and easy approach to moving data, but a quick script is rarely a professional-grade data-processing method. A production-class data-processing script needs to automate the steps required to process and transport data, and it must also handle several operational needs.

For example, integrations that process large data volumes should be multithreaded, and jobs against many data sources require robust data validation and exception handling. If significant business logic and data transformations are required, developers should log the steps or take other measures to ensure that the integration is observable.

The script programming to support these operational needs is not trivial. It requires the developer to anticipate things that can go wrong with the data integration and program accordingly.
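To give a sense of the extra work involved, here is a minimal sketch of a script that processes several sources concurrently, retries failed fetches, logs each step, and keeps one failing source from stopping the rest. The `fetch` function, the source names, and the retry count are hypothetical placeholders, not part of any particular product.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("integration")


def fetch(source):
    """Hypothetical extractor; a real one would call a database or API."""
    if source == "broken_feed":
        raise ConnectionError("source unavailable")
    return [{"source": source, "value": n} for n in range(3)]


def process_source(source, retries=2):
    """Fetch one source with a bounded number of retries, logging each step."""
    for attempt in range(1, retries + 1):
        try:
            rows = fetch(source)
            log.info("fetched %d rows from %s", len(rows), source)
            return rows
        except ConnectionError as exc:
            log.warning("attempt %d for %s failed: %s", attempt, source, exc)
    raise RuntimeError(f"{source} failed after {retries} attempts")


def run(sources, max_workers=4):
    """Process sources concurrently; one failure does not stop the others."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_source, s): s for s in sources}
        for future in as_completed(futures):
            source = futures[future]
            try:
                results[source] = future.result()
            except RuntimeError as exc:
                failures[source] = str(exc)
                log.error("giving up on %s", source)
    return results, failures


if __name__ == "__main__":
    ok, failed = run(["crm", "web_analytics", "broken_feed"])
    print(sorted(ok), sorted(failed))
```

Threads suit this sketch because fetching from remote sources is typically I/O-bound; CPU-heavy transformations in Python would more likely call for separate processes.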

In addition, developing custom scripts may not be cost effective when working with many experimental data sources. Finally, data integration scripts are often difficult to hand over and maintain across multiple developers.

For these reasons, organisations with many data integration requirements often look beyond programming and scripting data flows.

Features of robust data integration platforms

Data integration platforms enable the development, testing, running, and updating of multiple data pipelines. Organisations select them because they recognise that data integration is a platform and capability with specific development skills, testing requirements, and operational service-level expectations.

When architects, IT leaders, CIOs, and chief data officers talk about scaling data integration competencies, they recognise that the capabilities they seek go beyond what software developers can easily accomplish with custom code.

Here is an overview of what you are likely to find in a data integration platform.

  • A tool specialised for developing and enhancing integrations, often a low-code visual designer that lets developers drag and drop processing elements, configure them, and connect them into data pipelines.
  • Out-of-the-box connectors that enable rapid integration with common enterprise systems, SaaS platforms, databases, data lakes, big data platforms, APIs, and cloud data services. For example, if you want to capture accounts and contacts from Salesforce and push the data to Amazon Relational Database Service (RDS), chances are the integration platform already has these connectors prebuilt and ready to use in a data pipeline.
  • The capability to handle multiple data structures and formats beyond relational data structures and file types. Data integration platforms typically support JSON, XML, Parquet, Avro, and ORC, and may also support industry-specific formats such as NACHA in financial services, HIPAA EDI in healthcare, and ACORD XML in insurance.
  • Advanced data quality and master data management capabilities, which may be features of the data integration platform or add-on products that developers can call from data pipelines.
  • Some data integration platforms target data science and machine learning use cases, include analytics processing elements, and interface with machine learning models. Some platforms also offer data prep tools so that data scientists and analysts can prototype and develop integrations.
  • Devops capabilities, such as support for version control, automating data pipeline deployments, standing up and tearing down test environments, processing data in staging environments, scaling production pipeline infrastructure up and down, and enabling multithreaded execution.
  • Multiple hosting options, including data centre, public cloud, and SaaS.
  • Dataops capabilities, such as maintaining test data sets, capturing data lineage, enabling pipeline reuse, and automating testing.
  • At runtime, data integration platforms can trigger data pipelines using multiple methods, such as scheduled jobs, event-driven triggers, or real-time streaming.
  • Observable production data pipelines provide reporting on performance, alert on data source issues, and have tools to diagnose data processing problems.
  • Tools that support security, compliance, and data governance requirements, such as encryption, auditing capabilities, data masking, access management, and integrations with data catalogs.
  • Data integration pipelines don’t run in isolation; top platforms integrate with IT service management (ITSM), agile development, and other IT platforms.

How to shop for a data integration platform

The list of data integration capabilities and requirements can be daunting considering the types of platforms, the number of vendors competing in each space, and the analyst terminology used to categorise the options. So, how do you choose the right mix of tools for today’s and future data integration requirements?

The simple answer is that it requires some discipline. Start by taking inventory of the integrations already in use, cataloging the use cases, and reverse engineering the requirements on data sources, formats, transformations, destination points, and triggering conditions.

Then qualify the operating requirements, including service-level objectives, security requirements, compliance needs, and data validation requirements. Finally, consider adding some new or emerging use cases of high business importance that have requirements that differ from existing data integrations.
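One lightweight way to capture this inventory is as structured records that can be tallied into a requirements summary. The sketch below is only an illustration: the field names, the example integrations, and the `summarise` helper are all assumptions, and a spreadsheet would serve the same purpose.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Integration:
    """One row of the inventory; field names are illustrative assumptions."""
    name: str
    sources: list
    destinations: list
    formats: list
    trigger: str  # e.g. "scheduled", "event", "streaming"
    transformations: list = field(default_factory=list)
    slo_hours: float = 24.0      # maximum acceptable data delay
    needs_masking: bool = False  # example compliance flag


def summarise(inventory):
    """Tally the systems, formats, and triggers a platform must support."""
    systems, formats, triggers = Counter(), Counter(), Counter()
    for item in inventory:
        systems.update(item.sources + item.destinations)
        formats.update(item.formats)
        triggers[item.trigger] += 1
    return {
        "systems": systems,
        "formats": formats,
        "triggers": triggers,
        "tightest_slo_hours": min(i.slo_hours for i in inventory),
        "masking_required": any(i.needs_masking for i in inventory),
    }


# Hypothetical inventory entries for illustration only.
INVENTORY = [
    Integration("crm_to_warehouse", ["Salesforce"], ["warehouse"],
                ["JSON"], "scheduled", needs_masking=True),
    Integration("sensor_quality", ["iot_gateway"], ["warehouse", "finance_db"],
                ["Avro"], "streaming", slo_hours=0.25),
]

if __name__ == "__main__":
    for key, value in summarise(INVENTORY).items():
        print(key, value)
```

The resulting tallies translate directly into a checklist of connectors, formats, trigger types, and service levels to verify with each vendor.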

With this due diligence in hand, you can probably find ample reasons why do-it-yourself integrations are subpar solutions and some guidance about what to look for when reviewing data integration platforms.
