- Phone: +46 73 036 28 78
- Email: [email protected]
- Location: Gothenburg, Sweden
- Website: reinthal.me
- LinkedIn: alexander-reinthal
- GitHub: reinthal
Senior Data Engineer, Independent Contributor & Open-Source Advocate with expertise in real-time data ingestion, data modeling, and orchestrating cloud-based data workloads. Strong background in Apache Spark, Flink & Iceberg, Databricks & Unity Catalog, and Snowflake (comparable to AWS Glue, S3, Redshift, and EMR in the Amazon product offering). Proven open-source contributions to Dagster & DLT Hub. Passionate about developer experience, mentoring engineers in distributed systems, and ML-driven analytics.
- Feb 2022 – present
- Gothenburg, Sweden
- Worked remotely leading development of an asset-centric data modeling approach for Databricks & Delta Live Tables, enhancing data reliability and developer experience.
- Led an agile R&D team to build real-time data ingestion pipelines handling massive volumes of CI-pipeline events using Apache Flink (similar to AWS Kinesis) & Spark Structured Streaming.
- Designed scalable data pipelines on Databricks & Snowflake (similar to AWS EMR & AWS Redshift), enabling advanced analytics workloads.
- Served as the technical lead, engaging with a client to design and implement a modern data platform leveraging Snowflake, Dagster, and dbt.
- Mentored junior engineers in Python, SQL, DAG-based orchestration, and modern data engineering best practices. Led workshops on aligning technical solutions with business objectives.
- Collaborated with stakeholders to refine data governance, mediate upstream data contracts, and improve data visibility using effective metadata management.
- Apr 2019 – Jan 2022
- Gothenburg, Sweden
- Developed predictive analytics tooling for detecting malicious network traffic, leveraging Python, machine learning, and distributed processing (precursor to modern real-time data pipelines).
- Automated security alert processing, optimizing real-time event analysis and reducing response times.
- Designed Python-based data ingestion and processing workflows, akin to real-time ETL pipelines in modern data engineering.
- Provided mentorship and security training, educating analysts on threat intelligence, anomaly detection, and alert triage automation.
- June 2017 – Oct 2018
- Gothenburg, Sweden
- Led a team to develop a log parsing and analytics tool that scaled into a dedicated engineering team at Ericsson.
- Automated data workflow optimizations in Python, improving the efficiency of processing machine-generated log data.
- GitHub
- Developed a modern data lakehouse solution leveraging Apache Iceberg, Flink, Spark, Nessie (similar to AWS Glue), MinIO (AWS S3 equivalent), and Dagster to analyze high-volume data.
- Prototyped real-time ingestion pipelines using Apache Flink and Iceberg tables stored on MinIO (similar to AWS S3).
- GitHub
- Designed and implemented an on-premises data lake solution for Kubernetes using Dagster, Nessie, Apache Iceberg, MinIO, Apache Flink, and Spark (similar to AWS Glue, AWS S3 Tables, and AWS S3).
- Inspired work to scale our open-source data platform to multiple customers.
- GitHub
- Dagster Contribution: PR #24188 – Improved workflow orchestration, scheduling, and monitoring of data pipelines.
- DLT Hub Contribution: PR #594 – Mentored junior engineers, contributing Python data ingestion enhancements.
- Data Architectures: Data Modeling, Data Warehousing, Data Lakehouse Solutions
- Semantic Layers: dbt Cloud, Cube.dev
- Data Catalogs: Polaris, Snowflake, Unity & Nessie (similar to AWS Glue)
- Big Data & Streaming: Apache Flink, Apache Spark, RabbitMQ
- Data Engineering Tools & Platforms: Databricks & Delta Live Tables, Snowflake (similar to AWS EMR & Redshift), dbt, dlthub, Dagster
- Languages: Python, SQL, Java, Terraform, Nix, Rust, C++, Assembly, R
- Cloud & Platforms: AWS, MinIO & Azure Blob Storage (AWS S3 equivalent), Azure, Kubernetes, Proxmox
- Infrastructure & DevOps: Docker, Podman, FluxCD, Git, CI/CD pipelines, Terraform, Helm
- Business Intelligence Tools: Power BI, Apache Superset, Lightdash, Streamlit
- Sept 2016 – June 2018
- Coursework: Statistical Physics, Neural Networks and Machine Learning
- Sept 2013 – June 2016
- Coursework: Algorithms, Testing, Debugging and Verification, Theoretical Computer Science, Operating Systems, Cryptography, Cyber Security
Data Modelling for Predicting Exploits (10.1007/978-3-030-03638-6_21)
- Nov 2018
- Reinthal, A., Filippakis, E., Almgren, M.