Big Data for a lifetime

The increasing pervasiveness of data and computing is driving a proliferation of edge applications for timely, effective data processing and advanced analytics. However, as the volume of available data grows, new solutions are needed to integrate resources smoothly in support of dynamic, data-driven application workflows. To that end, the EU-funded DataCloud project introduces a new paradigm covering the complete life cycle of managing Big Data pipelines through discovery, design, simulation, provisioning, deployment and adaptation across the computing continuum. It will allow Big Data pipelines to interconnect end-to-end industrial operations, from the collection and preprocessing of data to the realisation of a business target. DataCloud will make Big Data advancements more accessible regardless of the underlying hardware.

Project Objective

DataCloud provides a novel paradigm covering the complete lifecycle of managing Big Data pipelines through discovery, design, simulation, provisioning, deployment, and adaptation across the Computing Continuum. Big Data pipelines in DataCloud interconnect the end-to-end industrial operations of collecting, pre-processing, and filtering data, transforming and delivering insights, training simulation models, and applying them in the cloud to achieve a business goal. DataCloud delivers a toolbox of new languages, methods, infrastructures, and prototypes for discovering, simulating, deploying, and adapting Big Data pipelines on heterogeneous and untrusted resources. DataCloud separates the design from the run-time aspects of Big Data pipeline deployment, empowering domain experts to take an active part in defining the pipelines. The main exploitation plan targets the operation and monetization of the toolbox in European markets and in the Spanish-speaking countries of Latin America. The aim is to lower the technological entry barriers to incorporating Big Data pipelines into organizations’ business processes and to make the pipelines accessible to a wider set of stakeholders regardless of the hardware infrastructure. DataCloud validates its plan through a selection of complementary business cases offered by SMEs and a large company, targeting higher mobile business revenues in smart marketing campaigns, reduced production costs of sport events, trustworthy eHealth patient data management, and reduced time to production and better analytics in Industry 4.0 manufacturing. The consortium consists of 11 partners from eight countries and is led by a research institute. It includes three university partners specialised in Big Data, distributed computing, and high-productivity languages. It also gathers six SMEs and one large company (as technology providers and stakeholders/users/early adopters) that prioritise the business focus of the project to achieve high business impact.

DATACLOUD

Deliverables