booting engineer.exe — session: 2026

I build systems that
think, scale, stay up.

I'm Sushant Satyam — a full-stack engineer who builds across AI, data, and product. A decade in, still obsessed with the seams between a clever idea and a system that keeps working on Monday morning.


sushant@satyam: ~/whoami

$ whoami

engineer · ai tinkerer · systems thinker

$ cat /etc/interests

LLM agents & tool-use architectures
Synthetic data, privacy, eval frameworks
Algo trading (Indian markets, Zerodha)
Full-stack product engineering
Boring infra that makes the fun stuff possible

$ ls ~/now

synthforge/ codewithcolonel/ algo-trading/ writing/

$ echo $MISSION

"ship small, honest things that compound."

uptime: 10y+

// whoami

Engineer by trade. Tinkerer by temperament.

I've spent a decade around code — building, breaking, rebuilding. My favorite problems sit at the intersection of data, AI, and domain: trading, logistics, procurement, agri-tech. I care about the boring parts: observability, evals, and the small decisions that compound.

AI that actually does something

I ship LLM systems with real constraints: tool-use, memory, evals, budgets, latency. Less demo-ware, more production surfaces.

Data as a first-class citizen

Whether synthetic or scraped, pipelines or Postgres — if the data is wrong, the product is wrong. I spend weeks here so the UI can be honest.

Full-stack, end-to-end

Next.js / NestJS / Prisma / Postgres on one side, Python ML stacks on the other. I like the seams between them — that's where products live.

// selected work

Projects I've shipped, broken, and learned from.

A mix of AI/ML systems, trading infra, logistics, and full-stack product work. Open source projects link out to GitHub; private ones are sketched briefly.

// open source

4 repos · linked

2026·public featured

SynthForge

LLM-native synthetic tabular data at scale

GitHub

Generates high-fidelity synthetic datasets from tiny production samples. Combines Gaussian Copula, CTGAN, TVAE, TabSyn, and Diffusion with LLM-powered semantic inference, PII/MNPI detection, and business-rule extraction. Scales from 1K to 10M+ rows with a built-in eval framework for fidelity, utility, and privacy.

Python · CTGAN · Diffusion · Claude · LiteLLM · Presidio · Pandas

2026·public featured

CodeWithColonel

21-day ML + AI agent curriculum, built in the open

GitHub

A public, two-track learning repo: six AI/LLM projects covering prompt engineering, agent patterns, tool-routing, memory, and RAG — plus a 21-day ML curriculum from data cleaning to deep learning and recommenders. Progress over perfection.

Python · OpenAI · RAG · Agents · Scikit-learn

2025·public

prompt_eng

Prompt engineering notebook

GitHub

A working notebook of prompt engineering patterns — structured output, role conditioning, evals, and tool-use — maintained as I iterate on LLM applications.

Prompt Engineering · LLMs

2026·public

ELBRouting-6899

Load-balancer routing exercise

GitHub

A TypeScript exercise modeling elastic load-balancer routing semantics. A small, focused repo exploring the edge cases of path-based routing.

TypeScript

// private / work

10 repos · brief only

private·2026

SmartTender

AI-assisted tender discovery and bid prep

Ingests public tender corpora, extracts requirements, and drafts bid outlines with citation-grounded rationale. Built for procurement teams buried in PDFs.

Python · LLMs · RAG · OCR

private·2025

SuperLogistica

Logistics operations platform

A logistics platform experiment — dispatch, tracking, and ops surfaces for last-mile workflows.

TypeScript · Next.js

private·2025

OptiFlowAI

Smarter logistics, seamless supply chains

AI-driven optimization for logistics routing and supply-chain flows. Focused on where planners actually spend their time.

AI · Optimization · Logistics

private·2025

DAN_AGRO

Agri-tech: farm data and decision support

Agri-tech product work combining farm data capture with decision-support surfaces for growers and ops teams.

JavaScript · Agri-tech

private·2026

BookBridge

Reading / books product (web)

A web product exploring a books-centric UX. Built on the feature/web branch with a TypeScript stack.

TypeScript · Web

private·2026

pat-task

Take-home engineering task

A compact Python solution to a take-home engineering challenge — kept private but shippable.

Python

private·2026

Report Analytics Engine

Report analytics and insight engine

An engine for parsing, analyzing, and surfacing insights from report corpora — the plumbing behind a reporting product.

Python · Analytics

private·2023

DrishtiV3

Drishti — third iteration

Third-generation iteration of a long-running personal product line focused on vision / insight tooling.

CSS · Web

private·2023

drishtiSQL

Query layer for Drishti

SQL layer and query patterns supporting the Drishti product experiments.

SQL

private·2023

trim-adv

Early Python experiments

An older Python sandbox kept for reference — part of how I got here.

Python

// private repos: details intentionally light. happy to walk through any of these over a call.

// lab

Live experiments & interactive artifacts.

Small interactive things I've built or sketched — usually as a Claude artifact. Click through to play with them live.

tool·2026

AI Research Reading Tracker

A personal tracker for 26 foundational AI/ML papers + Google's 5-Day AI Agents Intensive whitepapers — all in one place with checkboxes, topic filters, and direct links to every paper. From Attention Is All You Need → Chinchilla → LLaMA → RoPE → FlashAttention → RAG → InstructGPT → DPO → ReAct → DeepSeek-R1 → Scaling Monosemanticity, plus Google's agent series (Intro to Agents, MCP & Tool Use, Context Engineering, Agent Quality, Prototype to Production). Free, no login.

Transformers · Scaling Laws · Alignment · MoE · Agents · MCP · RAG · Reading List

$ open /artifact/ai-papers-tracker.html

visualization·2026

Agent Memory & RAG — A 9-Part Deep Dive

Soup-to-nuts walkthrough of how agent memory actually works under the hood, in nine parts:

Part 1: why agents need memory, and the RAG problem statement
Part 2: embedding models (text → vectors, InfoNCE training)
Part 3: backpropagation, or how embedding weights get learned
Part 4: vector RAG (semantic retrieval)
Part 5: Graph RAG and entity extraction
Part 6: the Leiden algorithm for graph communities
Part 7: Microsoft GraphRAG, formalised
Part 8: HNSW, or how vector DBs search fast
Part 9: a grand unified architecture connecting every algorithm into one pipeline, with a complete formula reference

RAG · GraphRAG · Embeddings · Vector Search · HNSW · Leiden · Knowledge Graphs · Backprop · Agent Memory

$ open /artifact/agentic-memory.html
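
For readers who want the semantic-retrieval step from the walkthrough in concrete form, here is a minimal sketch of exact (brute-force) cosine retrieval. It is not taken from the artifact: the names `cosine`, `top_k`, and the toy three-dimensional `index` are hypothetical, and a real system would use learned embeddings and an approximate index such as HNSW instead of scanning every vector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k vectors most similar to the query (exact scan)."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy, hand-written vectors standing in for real embeddings (assumption).
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

print(top_k([1.0, 0.0, 0.0], index, k=2))
```

The exact scan costs one similarity computation per stored vector, which is the cost that graph-based indexes like HNSW trade a little recall to avoid.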

visualization·2026

Sinusoidal Positional Encoding — Geometry & Intuition

A complete, self-contained dark editorial explainer (Instrument Serif + DM Mono) that builds positional encoding from first principles across 6 sections: metric goals, single-wave ambiguity, unit-circle intuition, sin-vs-sin+cos comparison, full formula breakdown, and an interactive pair explorer with live computed values and wave-speed shifts for i=0→255.

Transformers · Positional Encoding · Sine/Cosine · Geometry · Interactive · Math Visualisation

$ open /artifact/sine-cosine-explainer.html
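
As a rough companion to the explainer, the sketch below computes the sinusoidal encoding for a single position using the standard formulation from Attention Is All You Need: each pair index i contributes sin(pos · ω_i) and cos(pos · ω_i) with ω_i = 1 / 10000^(2i/d_model). The function names are hypothetical, and `d_model = 512` is an assumption chosen so that i runs from 0 to 255; the artifact itself may organise things differently.

```python
import math


def positional_encoding(pos: int, d_model: int = 512, base: float = 10000.0) -> list[float]:
    """Sinusoidal encoding for one position: [sin, cos, sin, cos, ...]."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even so every sine has a cosine partner")
    vec = []
    for i in range(d_model // 2):
        freq = 1.0 / (base ** (2 * i / d_model))  # wave speed falls as i grows
        vec.append(math.sin(pos * freq))
        vec.append(math.cos(pos * freq))
    return vec


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# With sin+cos pairs, the dot product of two encodings depends only on the
# offset between the positions, not on where the pair sits in the sequence.
near_start = dot(positional_encoding(3), positional_encoding(8))
further_on = dot(positional_encoding(40), positional_encoding(45))
print(math.isclose(near_start, further_on, abs_tol=1e-9))
```

The offset-only property follows from sin(a)sin(b) + cos(a)cos(b) = cos(a − b), which is one reason the pairing is preferred over sine alone.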

visualization·2026

The Transformer Architecture — From Raw Words to Output Tokens

A complete first-principles walkthrough of the full Transformer pipeline with a dark editorial aesthetic: encoder/decoder big picture, embedding geometry, positional encoding, interactive attention visualiser, Add & Norm derivation, FFN intuition, decoder generation stepper, output softmax, and training dynamics with live loss curves.

Transformers · Attention · Encoder-Decoder · Positional Encoding · Deep Learning · Interactive · Education

$ open /artifact/transformer-architecture.html

interactive·2026

XGBoost Text Comparison

An interactive browser demo that compares two text inputs using similarity features (Jaccard, fuzzy ratio, n-gram overlap, length, common words) and explains how an XGBoost-style classifier would combine them into a final match score and verdict.

XGBoost · NLP · Text Similarity · Feature Engineering · Interactive · Classification

$ open /artifact/xgboost-text-comparison.html
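
The similarity features named above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the demo's actual code: `similarity_features` and its helpers are hypothetical names, the fuzzy ratio is approximated with `difflib.SequenceMatcher`, and n-gram overlap is taken to be Jaccard similarity over character trigrams. The classifier that would combine the features into a match score is deliberately left out.

```python
from difflib import SequenceMatcher


def tokens(text: str) -> list[str]:
    """Lowercase whitespace tokenisation."""
    return text.lower().split()


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Set of character n-grams of the lowercased text."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Intersection over union; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_features(left: str, right: str) -> dict[str, float]:
    """Feature vector a downstream classifier could consume."""
    lt, rt = set(tokens(left)), set(tokens(right))
    return {
        "jaccard": jaccard(lt, rt),
        "fuzzy_ratio": SequenceMatcher(None, left.lower(), right.lower()).ratio(),
        "ngram_overlap": jaccard(char_ngrams(left), char_ngrams(right)),
        "length_ratio": min(len(left), len(right)) / max(len(left), len(right), 1),
        "common_words": len(lt & rt),
    }


features = similarity_features("the quick brown fox", "the quick red fox")
print(features)
```

In a setup like the one the demo describes, a dictionary like this would become one row of the feature matrix, with a labelled match/no-match column used to train the gradient-boosted model.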

visualization·2026

LiteLLM PyPI Supply Chain Attack — Incident Report

A forensic-style interactive report mapping the full attack chain from compromised GitHub Actions to malicious LiteLLM PyPI releases, suppression botnet behavior, indicators of compromise, and an actionable response checklist for engineering teams.

Supply Chain Security · Incident Response · PyPI · CI/CD · Threat Intelligence · LiteLLM

$ open /artifact/litellm-supply-chain-incident-report.html

visualization·2026

Data Mesh — The Architecture of Ownership

A visual deep dive into Data Mesh: why centralized data platforms bottleneck at scale, the four core principles (domain ownership, data as a product, self-serve platform, federated governance), and when this paradigm is the right fit.

Data Mesh · Data Architecture · Domain Ownership · Data Products · Federated Governance · Platform Engineering

$ open /artifact/data-mesh-architecture.html

visualization·2026

SynthForge — Synthetic Data Generation with LLM-Augmented Pipelines

A deep interactive walkthrough of SynthForge: six synthesis backends, LLM-augmented schema + privacy intelligence, and a five-layer evaluation stack for generating high-fidelity synthetic tabular data from small production samples.

Synthetic Data · LLMs · Diffusion · TabSyn · Privacy · Evaluation · Python

$ open /artifact/synthforge-synthetic-data.html

visualization·2026

Agentic SDLC — The New Operating Model

A strategic visual explainer of how software delivery shifts from human handoffs to agent-orchestrated execution: intent synthesis, parallel implementation, sentinel quality gates, MCP-enabled toolchains, and phased enterprise migration.

Agentic SDLC · MCP · Multi-Agent Systems · DevOps · Software Architecture · Enterprise AI

$ open /artifact/agentic-sdlc-operating-model.html

// activity

Shipping, measurably.

A live snapshot of the last 365 days on github.com/sushantsatyam. Reflects public commits by default — private repo contributions show up too if enabled in GitHub profile settings.

last 365d: 566
this month: 132
best day: 38 · Mar 22

$ git log --since=1.year --count-days

[contribution heatmap, last 365 days]

$ commits --by-month

[bar chart, last 12 months, May to Apr; labelled bars: Jul 103, Mar 132]

// stack

The tools I reach for.

A working set, not a museum. I'm opinionated about a few things and pragmatic about the rest.

languages

Python · TypeScript · JavaScript · SQL · Java

ai / ml

LLM agents · RAG · Prompt engineering · CTGAN / TVAE · Diffusion · Claude / OpenAI SDKs · Scikit-learn · PyTorch

backend

NestJS · FastAPI · Postgres · Prisma · Redis · REST · WebSockets

frontend

Next.js · React · Tailwind · Framer Motion

infra

Vercel · Docker · GitHub Actions · AWS

domains

Fintech / Algo Trading · Logistics · Agri-tech · Procurement · Data platforms

// contact

Let's build something useful.

I'm open to interesting problems — AI systems, data platforms, fintech, or anything where the domain is messy and the bar is high.

$ cat ./handshake.sh

email: sushant.satyam@gmail.com
github: github.com/sushantsatyam
linkedin: linkedin.com/in/sushant-satyam
medium: medium.com/@techdoctrinewithcolonel
web: portfolio-woad-theta-wtkbfncyjn.vercel.app

$ echo "best way in: a short email with the problem you're solving."
