What I'm up to

Last updated: 8 March 2026

🔧 Working on

  • Hardening a Kubernetes-based data platform on AWS EKS, improving observability with OpenTelemetry and the Grafana stack
  • Migrating batch pipelines to an Apache Kafka streaming architecture
  • Evaluating Apache Iceberg as the table format for migrating off legacy Hive-based storage
  • Building this site, documenting the journey publicly for the first time

📚 Learning

  • Kubeflow: getting hands-on with ML pipeline orchestration on Kubernetes
  • MLflow: experiment tracking and model registry patterns in production
  • Apache Flink: stateful stream processing beyond Kafka consumer groups
  • Terraform modules: making infrastructure reusable across environments

✍️ Writing about

  • The real operational cost of running a Cloudera CDP cluster vs cloud-native alternatives
  • Dremio as a query engine for a data lakehouse: when it works, when it doesn't
  • A practical guide to Apache Iceberg table maintenance at scale

💭 Thinking about

  • The gap between "ML platform" and "platform that can run ML workloads"; they are not the same thing
  • How much platform complexity is justified before the platform becomes the product
  • Writing more publicly: the friction of perfectionism vs the value of shipping