Serverless Orchestration: Exploring the Future of Workflow Automation
PyCon & PyData DE 2025
Orchestration is a common challenge in data engineering. Scheduling data transformation jobs with cron is cumbersome and error-prone, and as the number of jobs grows, the setup becomes unmanageable. Tools like Apache Airflow, Dagster, Luigi, and Prefect are known for addressing these challenges, but they often require additional resources or investment. Serverless orchestration tools mitigate many of these disadvantages and offer a more streamlined and cost-effective solution. Beyond data engineering, serverless orchestration holds substantial potential for classical software engineering, especially as organizations explore serverless approaches to improve efficiency and reduce overhead.
Microsoft Fabric: Data Engineering Game Changer or Just a Fad?
Vienna Data Engineering Meetup, Oct 2024
Microsoft Fabric is a new end-to-end analytics and data platform designed for enterprises that need a unified solution. It promises to span the entire data lifecycle, including data movement, processing, ingestion, transformation, real-time event routing, and report building. At Cloudflight, we started using Microsoft Fabric to build our internal data platform. This talk covers our experiences, the issues we encountered, the best practices we learned, and how we managed to work with Microsoft Fabric. We close with an open discussion on whether Fabric is here to stay or will fade away.
Von Chaos zu Erfolg: Datenqualität beherrschen (From Chaos to Success: Mastering Data Quality)
data2day 2023
This talk covers the challenges of working with messy and complex data during data processing. We introduce the most common sources and types of data quality problems. We then look at specific methods and techniques for identifying, analyzing, and resolving such issues, including data quarantining, data testing, and data contracts. The aim is to introduce strategies, best practices, and tools for ensuring data quality.
IIoT-Datenanbindung und -analyse leicht gemacht (IIoT Data Connectivity and Analysis Made Easy)
Building IoT 2023
Apache StreamPipes for Pythonistas - IIoT data handling made easy!
PyCon & PyData DE 2023
The industrial environment offers many interesting use cases for data enthusiasts, with a myriad of challenges that data scientists can solve. However, collecting industrial data in general, and industrial IoT (IIoT) data in particular, is cumbersome and not very appealing for anyone who just wants to work with data. Apache StreamPipes addresses this pitfall and allows anyone to extract data from IIoT data sources without messing around with (old-fashioned) protocols. In addition, StreamPipes' newly developed Python client now gives Pythonistas the ability to access and work with this data programmatically, in a Pythonic way.