booting engineer.exe — session: 2026

I build systems that think, scale, stay up.

I'm Sushant Satyam — a full-stack engineer who builds across AI, data, and product. A decade in, still obsessed with the seams between a clever idea and a system that keeps working on Monday morning.

$ currently
sushant@satyam: ~/whoami
$ whoami
engineer · ai tinkerer · systems thinker
$ cat /etc/interests
  • LLM agents & tool-use architectures
  • Synthetic data, privacy, eval frameworks
  • Algo trading (Indian markets, Zerodha)
  • Full-stack product engineering
  • Boring infra that makes the fun stuff possible
$ ls ~/now
synthforge/ codewithcolonel/ algo-trading/ writing/
$ echo $MISSION
"ship small, honest things that compound."
$
// whoami

Engineer by trade. Tinkerer by temperament.

I've spent a decade around code — building, breaking, rebuilding. My favorite problems sit at the intersection of data, AI, and domain: trading, logistics, procurement, agri-tech. I care about the boring parts: observability, evals, and the small decisions that compound.

AI that actually does something

I ship LLM systems with real constraints: tool-use, memory, evals, budgets, latency. Less demo-ware, more production surfaces.

Data as a first-class citizen

Whether synthetic or scraped, pipelines or Postgres — if the data is wrong, the product is wrong. I spend weeks here so the UI can be honest.

Full-stack, end-to-end

Next.js / NestJS / Prisma / Postgres on one side, Python ML stacks on the other. I like the seams between them — that's where products live.

// selected work

Projects I've shipped, broken, and learned from.

A mix of AI/ML systems, trading infra, logistics, and full-stack product work. Open source projects link out to GitHub; private ones are sketched briefly.

// open source

4 repos · linked
2026·public featured

SynthForge

LLM-native synthetic tabular data at scale
GitHub

Generates high-fidelity synthetic datasets from tiny production samples. Combines Gaussian Copula, CTGAN, TVAE, TabSyn, and Diffusion with LLM-powered semantic inference, PII/MNPI detection, and business-rule extraction. Scales from 1K to 10M+ rows with a built-in eval framework for fidelity, utility, and privacy.

Python · CTGAN · Diffusion · Claude · LiteLLM · Presidio · Pandas
2026·public featured

CodeWithColonel

21-day ML + AI agent curriculum, built in the open
GitHub

A public, two-track learning repo: six AI/LLM projects covering prompt engineering, agent patterns, tool-routing, memory, and RAG — plus a 21-day ML curriculum from data cleaning to deep learning and recommenders. Progress over perfection.

Python · OpenAI · RAG · Agents · Scikit-learn
2025·public

prompt_eng

Prompt engineering notebook
GitHub

A working notebook of prompt engineering patterns — structured output, role conditioning, evals, and tool-use — maintained as I iterate on LLM applications.

Prompt Engineering · LLMs
2026·public

ELBRouting-6899

Load-balancer routing exercise
GitHub

A TypeScript exercise modeling elastic load-balancer routing semantics. A small, focused repo exploring the edge cases of path-based routing.

TypeScript

// private / work

10 repos · brief only
private·2026

SmartTender

AI-assisted tender discovery and bid prep

Ingests public tender corpora, extracts requirements, and drafts bid outlines with citation-grounded rationale. Built for procurement teams buried in PDFs.

Python · LLMs · RAG · OCR
private·2025

SuperLogistica

Logistics operations platform

A logistics platform experiment — dispatch, tracking, and ops surfaces for last-mile workflows.

TypeScript · Next.js
private·2025

OptiFlowAI

Smarter logistics, seamless supply chains

AI-driven optimization for logistics routing and supply-chain flows. Focused on where planners actually spend their time.

AI · Optimization · Logistics
private·2025

DAN_AGRO

Agri-tech: farm data and decision support

Agri-tech product work combining farm data capture with decision-support surfaces for growers and ops teams.

JavaScript · Agri-tech
private·2026

BookBridge

Reading / books product (web)

A web product exploring a books-centric UX. Built on the feature/web branch with a TypeScript stack.

TypeScript · Web
private·2026

pat-task

Take-home engineering task

A compact Python solution to a take-home engineering challenge — kept private but shippable.

Python
private·2026

Report Analytics Engine

Report analytics and insight engine

An engine for parsing, analyzing, and surfacing insights from report corpora — the plumbing behind a reporting product.

Python · Analytics
private·2023

DrishtiV3

Drishti — third iteration

Third iteration of a long-running personal product line focused on vision and insight tooling.

CSS · Web
private·2023

drishtiSQL

Query layer for Drishti

SQL layer and query patterns supporting the Drishti product experiments.

SQL
private·2023

trim-adv

Early Python experiments

An older Python sandbox kept for reference — part of how I got here.

Python

// private repos: details intentionally light. happy to walk through any of these over a call.

// lab

Live experiments & interactive artifacts.

Small interactive things I've built or sketched — usually as a Claude artifact. Click through to play with them live.

tool·2026
AI Research Reading Tracker

A personal tracker for 26 foundational AI/ML papers + Google's 5-Day AI Agents Intensive whitepapers — all in one place with checkboxes, topic filters, and direct links to every paper. From Attention Is All You Need → Chinchilla → LLaMA → RoPE → FlashAttention → RAG → InstructGPT → DPO → ReAct → DeepSeek-R1 → Scaling Monosemanticity, plus Google's agent series (Intro to Agents, MCP & Tool Use, Context Engineering, Agent Quality, Prototype to Production). Free, no login.

Transformers · Scaling Laws · Alignment · MoE · Agents · MCP · RAG · Reading List
$ open /artifact/ai-papers-tracker.html
visualization·2026
Agent Memory & RAG — A 9-Part Deep Dive

Soup-to-nuts walkthrough of how agent memory actually works under the hood. Part 1: why agents need memory & the RAG problem statement. Part 2: embedding models (text → vectors, InfoNCE training). Part 3: backpropagation — how embedding weights get learned. Part 4: vector RAG (semantic retrieval). Part 5: Graph RAG & entity extraction. Part 6: the Leiden algorithm for graph communities. Part 7: Microsoft GraphRAG, formalised. Part 8: HNSW — how vector DBs search fast. Part 9: a grand unified architecture connecting every algorithm into one pipeline, with a complete formula reference.

RAG · GraphRAG · Embeddings · Vector Search · HNSW · Leiden · Knowledge Graphs · Backprop · Agent Memory
$ open /artifact/agentic-memory.html
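
The vector-retrieval step named in Part 4 can be illustrated with a minimal sketch. It assumes toy 3-dimensional embeddings and exact brute-force cosine search; `TOY_INDEX`, `cosine`, and `top_k` are hypothetical names, not code from the artifact. An HNSW index (Part 8) approximates the same ranking without scanning every vector.

```python
import math

# Hypothetical toy "embeddings"; real ones come from an embedding model.
TOY_INDEX = {
    "refunds": [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.9, 0.0],
    "returns": [0.8, 0.2, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Exact nearest-neighbour search: score every document, keep the best k."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]

if __name__ == "__main__":
    for doc_id, score in top_k([1.0, 0.0, 0.0], TOY_INDEX):
        print(doc_id, round(score, 3))
```

Brute force is linear in the number of documents, which is the cost that approximate indexes such as HNSW are designed to avoid.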
visualization·2026
Sinusoidal Positional Encoding — Geometry & Intuition

A complete, self-contained dark editorial explainer (Instrument Serif + DM Mono) that builds positional encoding from first principles across 6 sections: metric goals, single-wave ambiguity, unit-circle intuition, sin-vs-sin+cos comparison, full formula breakdown, and an interactive pair explorer with live computed values and wave-speed shifts for i=0→255.

Transformers · Positional Encoding · Sine/Cosine · Geometry · Interactive · Math Visualisation
$ open /artifact/sine-cosine-explainer.html
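
For reference, the standard sinusoidal formula the explainer builds toward can be sketched in a few lines. This assumes the formulation from "Attention Is All You Need" with `d_model = 512`, which matches the i=0→255 pair range mentioned above; the function names are illustrative and not taken from the artifact.

```python
import math

def positional_encoding(pos, d_model=512, base=10000.0):
    """Sinusoidal encoding for one position (assumes an even d_model).

    Pair i holds sin and cos of the same angle: pos / base**(2i / d_model).
    """
    encoding = []
    for i in range(d_model // 2):
        angle = pos / (base ** (2 * i / d_model))
        encoding.append(math.sin(angle))  # index 2i
        encoding.append(math.cos(angle))  # index 2i + 1
    return encoding

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

if __name__ == "__main__":
    # Same offset (4 positions apart), different absolute positions.
    print(dot(positional_encoding(3), positional_encoding(7)))
    print(dot(positional_encoding(10), positional_encoding(14)))
```

Because sin(a)sin(b) + cos(a)cos(b) = cos(a - b), the dot product of two encodings depends only on the offset between positions, which is why each pair needs both sine and cosine rather than sine alone.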
visualization·2026
The Transformer Architecture — From Raw Words to Output Tokens

A complete first-principles walkthrough of the full Transformer pipeline with a dark editorial aesthetic: encoder/decoder big picture, embedding geometry, positional encoding, interactive attention visualiser, Add & Norm derivation, FFN intuition, decoder generation stepper, output softmax, and training dynamics with live loss curves.

Transformers · Attention · Encoder-Decoder · Positional Encoding · Deep Learning · Interactive · Education
$ open /artifact/transformer-architecture.html
interactive·2026
XGBoost Text Comparison

An interactive browser demo that compares two text inputs using similarity features (Jaccard, fuzzy ratio, n-gram overlap, length, common words) and explains how an XGBoost-style classifier would combine them into a final match score and verdict.

XGBoost · NLP · Text Similarity · Feature Engineering · Interactive · Classification
$ open /artifact/xgboost-text-comparison.html
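
The feature step described above can be sketched as follows. The exact feature definitions in the demo are not given, so everything here is an assumption: word-level Jaccard, `difflib.SequenceMatcher` as a stand-in for the fuzzy ratio, character trigrams for n-gram overlap, and a simple length ratio. The classifier itself is not shown; a trained XGBoost model would take this feature vector as input.

```python
from difflib import SequenceMatcher

def char_ngrams(text, n=3):
    """Set of character n-grams in the text (empty if the text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def similarity_features(left, right):
    """Hypothetical feature vector for a pair of texts (case-insensitive)."""
    l, r = left.lower(), right.lower()
    left_words, right_words = set(l.split()), set(r.split())
    return {
        "jaccard_words": jaccard(left_words, right_words),
        "fuzzy_ratio": SequenceMatcher(None, l, r).ratio(),
        "trigram_overlap": jaccard(char_ngrams(l), char_ngrams(r)),
        "length_ratio": min(len(l), len(r)) / max(len(l), len(r), 1),
        "common_words": float(len(left_words & right_words)),
    }

if __name__ == "__main__":
    for name, value in similarity_features("acme corp", "acme inc").items():
        print(f"{name}: {value:.3f}")
```

Keeping each feature in the 0 to 1 range (apart from the raw common-word count) makes the individual signals easy to inspect before a model combines them into a score.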
visualization·2026
LiteLLM PyPI Supply Chain Attack — Incident Report

A forensic-style interactive report mapping the full attack chain from compromised GitHub Actions to malicious LiteLLM PyPI releases, suppression botnet behavior, indicators of compromise, and an actionable response checklist for engineering teams.

Supply Chain Security · Incident Response · PyPI · CI/CD · Threat Intelligence · LiteLLM
$ open /artifact/litellm-supply-chain-incident-report.html
visualization·2026
Data Mesh — The Architecture of Ownership

A visual deep dive into Data Mesh: why centralized data platforms bottleneck at scale, the four core principles (domain ownership, data as a product, self-serve platform, federated governance), and when this paradigm is the right fit.

Data Mesh · Data Architecture · Domain Ownership · Data Products · Federated Governance · Platform Engineering
$ open /artifact/data-mesh-architecture.html
visualization·2026
SynthForge — Synthetic Data Generation with LLM-Augmented Pipelines

A deep interactive walkthrough of SynthForge: six synthesis backends, LLM-augmented schema + privacy intelligence, and a five-layer evaluation stack for generating high-fidelity synthetic tabular data from small production samples.

Synthetic Data · LLMs · Diffusion · TabSyn · Privacy · Evaluation · Python
$ open /artifact/synthforge-synthetic-data.html
visualization·2026
Agentic SDLC — The New Operating Model

A strategic visual explainer of how software delivery shifts from human handoffs to agent-orchestrated execution: intent synthesis, parallel implementation, sentinel quality gates, MCP-enabled toolchains, and phased enterprise migration.

Agentic SDLC · MCP · Multi-Agent Systems · DevOps · Software Architecture · Enterprise AI
$ open /artifact/agentic-sdlc-operating-model.html
// activity

Shipping, measurably.

A live snapshot of the last 365 days on github.com/sushantsatyam. Reflects public commits by default — private repo contributions show up too if enabled in GitHub profile settings.

last 365d: 566
active days: 59
current streak: 9d
longest streak: 9d
this month: 132
best day: 38 · Mar 22
$ git log --since=1.year --count-days
$ commits --by-month
last 12 months
May 10 · Jun 4 · Jul 59 · Aug 103 · Sep 5 · Oct 0 · Nov 3 · Dec 55 · Jan 72 · Feb 48 · Mar 75 · Apr 132
// stack

The tools I reach for.

A working set, not a museum. I'm opinionated about a few things and pragmatic about the rest.

languages
Python · TypeScript · JavaScript · SQL · Java
ai / ml
LLM agents · RAG · Prompt engineering · CTGAN / TVAE · Diffusion · Claude / OpenAI SDKs · Scikit-learn · PyTorch
backend
NestJS · FastAPI · Postgres · Prisma · Redis · REST · WebSockets
frontend
Next.js · React · Tailwind · Framer Motion
infra
Vercel · Docker · GitHub Actions · AWS
domains
Fintech / Algo Trading · Logistics · Agri-tech · Procurement · Data platforms
// contact

Let's build something useful.

I'm open to interesting problems — AI systems, data platforms, fintech, or anything where the domain is messy and the bar is high.