Andrey Sydelov 9/18/25

Real-Time CDC with Debezium and Kafka for Sharded PostgreSQL Integration

How to stream data from sharded PostgreSQL to a Data Warehouse using Debezium and Kafka. This guide covers Change Data Capture (CDC) setup with Kubernetes, handling sharded databases, and overcoming operational challenges for scalable, real-time analytics.

Andrey Sydelov 9/16/25

Sagas: Managing Transactions in Distributed Systems

Sagas revolutionize transaction management in distributed systems, offering a scalable alternative to ACID transactions. This article explores how sagas coordinate microservices through local, reversible steps, using choreography or orchestration. Learn their core concepts, implementation strategies with idempotent designs, advantages like fault tolerance, and trade-offs compared to ACID, with practical tips for building resilient applications.

Andrey Sydelov 9/11/25

ACID, Isolation Levels, and MVCC: Architecture and Execution in Relational Databases

How do databases ensure data correctness under concurrency and failure? This article breaks down ACID properties, isolation levels, MVCC, and WAL, explaining how relational systems like PostgreSQL maintain consistency and performance.

Andrey Sydelov 9/2/25

The Blueprint of a Data Team: Roles, Responsibilities, and Specializations

A data team’s success hinges on clear roles and collaboration. This article explores how roles evolve, adapt to company needs, and align through a RACI matrix to deliver reliable data with minimal friction.

Andrey Sydelov 8/14/25

You Can’t Trust COUNT and SUM: Scalable Data Validation with Merkle Trees

Merkle trees offer a scalable, SQL-friendly way to verify data integrity, and they are widely used in systems like Git, blockchains, and distributed databases.

Andrey Sydelov 8/12/25

Engineering with SOLID, DRY, KISS, YAGNI and GRASP

Design principles like SOLID, DRY, KISS, YAGNI, and GRASP aren’t rules — they’re tools for managing complexity, preserving clarity, and making software resilient to change. This deep dive explores each principle with real-world examples and refactoring patterns.

Andrey Sydelov 7/22/25

Slowly Changing Dimensions: Strategies for Maintaining History and Integrity in Analytical Systems

Slowly Changing Dimensions (SCD) are essential for maintaining historical accuracy in data systems where context evolves over time. This in-depth guide explores all SCD types, their engineering trade-offs, and practical strategies for designing dimensional data that preserves meaning — not just metrics.

Andrey Sydelov 7/17/25

Cross-Platform Multi-Channel Attribution in Marketing: Balancing Costs and Results Across Devices

Attribution across channels and devices isn’t just about tracking; it’s about understanding synergy across traffic sources like push notifications, social media, webinars, and affiliate programs. Combining data-driven attribution with marketing mix modeling (MMM) and incrementality testing enables smarter budget decisions under modern privacy constraints.

Andrey Sydelov 7/3/25

What Data Engineers Really Do: It’s Not Pipelines — It’s Guarantees, Contracts, and Cost-Aware Systems

Modern data engineering isn’t about building pipelines — it’s about building trust, reliability, and cost-aware systems. This article reframes the role and explains what experienced engineers actually do.