Die Apache Software Foundation: Die größte Open-Source-Stiftung der Welt (The Apache Software Foundation: The World's Largest Open-Source Foundation)
IT Tage 2025
The Apache Software Foundation (ASF) plays a central role in today's software landscape: its projects are ubiquitous, and its economic contribution is estimated at over $22 billion. Yet many developers who work with Apache projects daily, or whose work builds on ASF software, know surprisingly little about the organization itself and its structures. This presentation offers an easy-to-understand introduction to how the ASF works, explains key terms and processes, and provides practical insights into the organization, supplemented by personal experiences from the perspective of a PMC (Project Management Committee) member, i.e., a so-called core contributor.

Serverless Orchestration: Exploring the Future of Workflow Automation
PyCon & PyData DE 2025
Orchestration is a typical challenge in the data engineering world. Scheduling data transformation jobs with cron is cumbersome and error-prone, and as the number of jobs grows, it becomes hard to keep track of them. Tools like Apache Airflow, Dagster, Luigi, and Prefect are known for addressing these challenges but often require additional resources or investment. Serverless orchestration tools mitigate many of these disadvantages and offer a more streamlined and cost-effective solution. Beyond data engineering, serverless orchestration holds substantial potential for classical software engineering, especially as organizations explore serverless approaches to improve efficiency and reduce overhead.

Microsoft Fabric: Data Engineering Game Changer or Just a Fad?
Vienna Data Engineering Meetup, Oct 2024
Microsoft Fabric is a new end-to-end analytics and data platform designed for enterprises that need a unified solution. It promises to span the entire data lifecycle, including data movement, processing, ingestion, transformation, real-time event routing, and report building. At Cloudflight, we started using Microsoft Fabric to build our internal data platform. This talk covers our experiences, the issues we encountered, the best practices we learned, and how we managed to work with Microsoft Fabric. At the end, we will discuss together whether Fabric is here to stay or whether it will fade away.

Von Chaos zu Erfolg: Datenqualität beherrschen (From Chaos to Success: Mastering Data Quality)
data2day 2023
This talk covers the challenges of working with messy and complex data during data processing. We introduce the most common sources and types of data quality problems. We then look at specific methods and techniques for identifying, analyzing, and resolving such issues, such as data quarantining, data testing, and data contracts. The aim is to introduce strategies, best practices, and tools for ensuring data quality.

IIoT-Datenanbindung und -analyse leicht gemacht (IIoT Data Connectivity and Analysis Made Easy)
Building IoT 2023

Apache StreamPipes for Pythonistas - IIoT data handling made easy!
PyCon & PyData DE 2023
The industrial environment offers many interesting use cases for data enthusiasts, with a myriad of challenges that data scientists can solve. However, collecting industrial data in general, and industrial IoT (IIoT) data in particular, is cumbersome and not really appealing for anyone who just wants to work with data. Apache StreamPipes addresses this pitfall and allows anyone to extract data from IIoT data sources without messing around with (old-fashioned) protocols. In addition, StreamPipes' newly developed Python client now lets Pythonistas programmatically access and work with this data in a Pythonic way.