Barun Debnath 🕹️
blog / projects / explorer
👋

Hi, I am Barun Debnath

I build full-stack products, AI systems, and the platforms that keep them reliable, observable, and scalable when real users show up.

My lane sits where product engineering meets SRE: TypeScript and Go services, agent workflows, ClickHouse/Kafka data paths, Kubernetes platforms, and the reliability work behind them.

Open To Opportunities
  • infra
  • ai
  • ml
  • agents
  • sre
  • scaling

Current Arc

Full-stack x AI

Shipping product experiences, agents, APIs, and the infra they depend on.

Main Class

Platform-minded builder

Go, TypeScript, ClickHouse, Kafka, Kubernetes, and production debugging.

Side Quest

Writing Field Notes

Turning systems, AI, and hard-earned reliability lessons into posts.

💼 What I've Been Up To
Selected work across full-stack engineering, AI systems, platform reliability, and scalable infrastructure.
Designation | Organization | Duration
Founding Engineer | LinkRunner | Nov'25 - Present
  • Rebuilt real-time attribution pipelines around Go, Kafka, ClickHouse, and Redis for high-volume event ingestion.
  • Delivered major infra cost reductions across analytics, Kubernetes workloads, logging, and cloud migration work.
  • Led platform reliability work across GKE, GitOps, ClickHouse operations, Redis recovery paths, and compliance evidence.
Go, ClickHouse, Kafka, GKE, Redis, Cost
SRE/Platform Engineer | One2N | Jun'24 - Nov'25
  • Built hybrid cloud and Kubernetes platform work with EKS, Cilium, Istio, CoreDNS, ExternalDNS, and ingress routing.
  • Reduced GenAI inference cost by moving workloads from managed APIs to self-hosted model serving on SageMaker.
  • Scaled and hardened multi-cloud infrastructure with VPCs, load balancers, WAF rules, VPNs, CI/CD, and ISO/SOC2 work.
EKS, Cilium, Istio, SageMaker, Terraform, Security
SRE/DevOps | Media.Net | Jul'23 - Jun'24
  • Built monitoring surfaces for production queueing pipelines with Django, Prometheus metrics, and Grafana dashboards.
  • Handled zero-downtime GKE node-pool upgrades, production MySQL migration work, and Redis spot-node cost optimization.
  • Improved registry reliability and cluster cost attribution with Artifact Registry migration and Stackdriver exporter work.
GKE, Prometheus, Grafana, MySQL, Redis, Django
SRE Intern | Media.Net | Jan'23 - Jun'23
  • Built internal Terraform modules to standardize resource provisioning across environments.
  • Migrated GitOps deployments from ArgoCD app-of-apps to ApplicationSet patterns for better environment scale.
  • Integrated DriftCTL with Terraform and Atlantis to detect and manage infrastructure state drift.
Terraform, ArgoCD, ApplicationSet, Atlantis, DriftCTL

From product surfaces and AI workflows to production reliability and scalable platforms, this is the operating range. Learn more about my work experience in my RESUME.

Lore

Know more about me 📜

Origin Class

Full-stack engineer with platform instincts

I build product surfaces, backend systems, AI workflows, and the infra paths that keep them reliable when traffic, data, and debugging pressure increase.

Current focus: Full-stack + AI
Reliability lens: SRE / Platform
Favorite terrain: Systems at scale

Proof of Work

Shipping where product and infrastructure meet

Recent work spans real-time attribution, ClickHouse analytics, GenAI cost optimization, Kubernetes platforms, observability, and zero-downtime cloud migrations.

Events handled: 25M+ / day
Infra cost cut: 25x
Inference cost cut: 99%
Traffic scale: 10x growth

Backpack

Skills I carry into the build

A practical stack for building apps, agents, platforms, and production systems end to end.

Languages

Go, Python, TypeScript, JavaScript, Rust, SQL, C++

Backend & APIs

Go, Node.js, FastAPI, Express, gRPC, REST, Django, Flask

AI Engineering

RAG, Agents, LangGraph, LangChain, Mem0, Temporal, MCP, OpenAI API

Data

ClickHouse, Kafka, Redis, PostgreSQL, MongoDB, Neo4j, Qdrant, Elasticsearch

Cloud & Platform

GCP, AWS, Kubernetes, Docker, Cilium, Istio, Cloudflare

Reliability

Prometheus, Grafana, Thanos, OpenTelemetry, Loki, Alertmanager, OpenCost

IaC & Delivery

Terraform, Terragrunt, Ansible, ArgoCD, GitHub Actions, GitLab CI, Helm

GitHub Realm

Open-source trail

A live-ish view of the public work and contribution trail I keep around tools, platform experiments, AI projects, and developer workflow systems.

View GitHub profile
Public repos: 74
Original repos: 55
Followers: 132
Public stars: 33
Python, Astro, Rust, Go, TypeScript, HCL
GitHub contribution activity graph for d-cryptic

Speaking

Talk Delivered

A recent speaking win on scaling real-time attribution systems with ClickHouse.

ClickHouse Bangalore User Group

25x Cheaper Infrastructure, 8x Cheaper Pricing: Real-Time Attribution With ClickHouse as the Backbone

Real-time attribution systems are analytical workloads: high-volume event ingestion, time-window queries, and heavy aggregations. This session covers how we rebuilt Linkrunner's attribution system from PostgreSQL to ClickHouse, reduced infrastructure cost by roughly 25x, and unlocked pricing around 8x cheaper than competitors.

Event: Agentic AI Meets Real-Time Data: Building the Intelligence Layer of Tomorrow

When: Saturday, Apr 18, 2026, 10:00 AM - 2:00 PM IST

Where: AWFIS Residency Square, Bengaluru

Speakers: Darshil Rathod, Co-founder & CTO @ Linkrunner, and Barun Debnath, Founding Engineer @ Linkrunner

View Meetup event

Latest posts

See all posts
  • Deploying a Scalable NATS Cluster - Part 2: Hands-On Demo

    November 22, 2025

    Building on the theoretical groundwork, this hands-on demo constructs a complete, production-grade, fully observable 4-node NATS cluster using Docker.

    External #nats
  • Hitchhiker's Guide To Make SadServers Happy - Part 1

    October 20, 2025

    In this blog, I will go through all the concepts that are required to complete almost all of the exercises in [sadservers](sadservers.com). This blog is not a compilation of solutions, but a guide to help anyone reach the appropriate solution.

    #troubleshooting
  • Python Tricks that LLMs won't teach you - Part 1

    September 08, 2025

    Python is always easy, but it is only efficient when you do it the right way. In this blog series, I will be summarizing what I learned from the book "Python Tricks - The Book".

    Series #python
  • Enough C++ To Build Anything: Part 1

    July 16, 2025

    In this blog series, I will be learning, exploring, and explaining C++ concepts. Whether it's DSA, building a database, or picking up an OSS project, this blog series will help you navigate the C++ world.

    Series #c++
  • Deploying a Scalable NATS Cluster - Part 1: Core Architecture and Considerations

    June 04, 2025

    Learn about the core architecture and important considerations when deploying a scalable NATS cluster.

    External #nats
  • Neovim Saga Part 1: Getting Started

    May 11, 2025

    In this blog series, I will be moving to Neovim completely. This part covers Neovim keybindings, tricks, and an introduction to Lua.

    Series #neovim

Recent projects

See all projects
  • CCSentinel

    Cross-platform Claude Code account, profile, and session manager for isolating project contexts and switching accounts from the terminal.

    Rust, TypeScript, Tauri
    Per-project Claude Code session isolation; OAuth, API key, Bedrock, and Vertex AI profile support
  • PaladinAI

    AI-powered monitoring and incident response platform that connects Prometheus, Grafana, Loki, and Alertmanager with graph memory and natural-language investigation.

    Python, FastAPI, LangGraph
    Internal hackathon winner; Multi-interface Web UI, CLI, and API
  • ChattGator

    Open-source chat UI kit and backend-as-a-service for adding chat capabilities to applications with reusable components and docs.

    JavaScript, React, Next.js
    Reusable chat UI components; Documentation site for setup and usage
💬

Want to Chat?

Whether it's about full-stack products, AI engineering, system design, or reliability at scale, I'm always up for a good conversation.

Grab a coffee, poke around, and don't forget to say hi! You can find me at:

  • X (formerly Twitter)
  • GitHub
  • LinkedIn
  • Medium
  • Website
  • [email protected]
© 2026 • Barun Debnath 🕹️