Principal Engineer @ Gainsight

Building distributed systems that power customer success at scale

I architect the data infrastructure, search platforms, and AI systems behind Gainsight's enterprise SaaS, from pipeline engines processing billions of records daily, to AI-powered search with dual-LLM providers and an MCP server that lets Claude and ChatGPT query community knowledge natively.

11+
Years Engineering
10+
Production Systems
35+
SaaS Connectors
600+
Enterprise Tenants
61B+ records processed / day
14 TB data processed / day
93K+ jobs completed / day
540K+ tasks executed / day
25M+ community records indexed
35K+ rules executed / day
98.7% job success rate
Abhishekh Singh

Principal Engineer at Gainsight with 11+ years building production distributed systems across two product lines: Customer Success (CS) and Customer Communities (CC). Based in Hyderabad, India. B.Tech in Computer Science from BBDITM, Lucknow.

On the CS side, I built the Data Highway pipeline engine, the graph-based permission system used by 7+ services, and the event processing platform with cross-region SQS failover. On the CC side, I architected the AI-native search platform: three LLM-powered features, a production MCP server exposing dozens of tools, and a federated search system indexing records from 5 sources.

My work sits at the intersection of distributed systems, data engineering, and AI, building the kind of infrastructure that other teams build their products on.

Platform Architecture

10+ production services across Java (Vert.x, Spring Boot) and Python: DAG orchestration, event-driven pipelines, graph-based RBAC, and real-time search

AI & MCP Integration

Production MCP server exposing dozens of tools for Claude/ChatGPT. Three AI features: ReplyBot (OpenAI + Gemini fallback), AI Answers Summary (SSE streaming), Smart Search Summary (Llama 3 70B)

Data at Scale

61B+ records processed daily, 25M+ search records indexed, 35+ SaaS connectors, dual-warehouse analytics (Postgres + Redshift), S3 Parquet data lake

Security-First Design

Graph-based RBAC with Neo4j, JWT with Redis-backed revocation, permission-scoped Algolia tokens, process-isolated task sandboxing

Building at Every Layer of the Stack
A decade of progressive engineering ownership, from automation frameworks to architecting the data and AI backbone of an enterprise SaaS platform.
APR 2025 - PRESENT
Principal Engineer
Gainsight · Hyderabad
Architecting the Search & AI platform: production MCP server for AI assistants, three LLM-powered features with cross-provider fallback, search analytics data lake, and federated search indexing across multiple content sources. Leading the platform's AI-native evolution.
AUG 2019 - MAR 2025
Lead Software Engineer
Gainsight · Hyderabad
Led architecture of the Data Highway pipeline engine scaling to billions of records daily across hundreds of enterprise tenants. Built the Usage Data Model analytics platform, community search system, and enterprise webhook delivery with IAM SigV4 signing.
JUL 2017 - AUG 2019
Senior Software Engineer
Gainsight · Hyderabad
Built the Usage Data Model (UDM) analytics platform with dual-warehouse abstraction (Postgres + Redshift), Event Processing pub/sub system, and core Data Highway infrastructure.
JUL 2016 - JUN 2017
Software Engineer
Gainsight · Hyderabad
Developed the graph-based permission system (Neo4j), centralized Config Server, Audit Service, and contributed to the Data Highway DAG engine with 60+ custom Drill UDFs.
DEC 2015 - JUN 2016
Software Engineer
Talentica Software · Pune
Engineered the Intelligence Machine, a big data recommendation engine processing terabytes of social media data daily. Built performance benchmarking infrastructure and data-driven test automation across distributed systems.
AUG 2014 - DEC 2015
Software Analyst
Nihilent · Pune
Delivered 5 projects for MultiChoice Africa (Naspers subsidiary), a major media platform serving millions of subscribers. Contributed to open-source technology solutions spanning .NET, Java, PHP, and Oracle.
Systems I've Built & Architected
Production systems powering Gainsight's enterprise platform, from data pipelines to AI-powered search.
Across Two Product Lines

My Contributions to Gainsight's CustomerOS

Diagram: Gainsight CustomerOS (multiple repositories, cross-repo connections across two product lines). CS · Customer Success, data infrastructure with 6 core services: Analytics Data Store (91 modules, Spring), Usage Data Model (Vert.x, Redshift), Bionic Reporting (Spring, Presto), Data Highway (DAG engine, Drill, 35 connectors), Permissions (Neo4j, RBAC), and Events (SQS, pub/sub), plus the cross-cutting Config Server and Audit services and a shared SDK (commons, async framework, audit client, config client, query builder, storage, data core). Also contributed to: Data Exchange Hub, Adoption Tracker, CC-to-CS Bridge, Query Federation. CC · Customer Communities, search, AI, and MCP: Community MCP Server, Synchronized Search Service, Algolia Search with AskAI, and AI features (ReplyBot, Smart Summary), with federated sources Zendesk, Salesforce, Freshdesk, Skilljar, and Community.
AI · SEARCH · LLM

Community Search & AI Platform

CC (Customer Communities) · Three AI features, federated search, and a provider-agnostic AI orchestration layer
Diagram: Community Platform AI orchestration layer. An API gateway (authentication, OAuth2, rate limiting, request routing) fronts a service layer with a provider factory, automatic failover, and adapters for the OpenAI provider, Gemini provider, and summary engine. Features: ReplyBot (Messagebus trigger, OpenAI ↔ Gemini fallback), AI Answers Summary (SSE streaming to client, Algolia AskAI + dedup), and Smart Search Summary (blocking response, Llama 3 70B via Bedrock). Supporting systems: Synchronized Search (index builder, token issuer, analytics data lake), Algolia (multi-app, secured API keys, scoped tokens), federated sources (Zendesk, Salesforce, Freshdesk, Skilljar), SNS Messagebus (~95 events, webhooks, connectors), MySQL, Redis, and S3.
The AI brain of Gainsight Communities. A provider-agnostic AI orchestration layer powers three flagship features through a factory pattern: primary plus fallback LLM chains with automatic cross-provider retry, bidirectional payload transformation for provider compatibility, and graceful degradation on any failure. Synchronized Search Service indexes 25M+ records from 5 federated sources to Algolia with zero-downtime index swaps.
PHP (Yii/Symfony) · Algolia AskAI · OpenAI · Gemini · Llama 3 70B · AWS Bedrock · SSE Streaming · Python · Redis · S3 Parquet · Athena
  • Cross-provider LLM fallback: a dedicated adapter per provider handles payload asymmetry, with automatic retry on the other provider even on client errors, so a single-provider outage does not take a feature down
  • SSE streaming with thinking-phase deduplication: stateful stream processor filters duplicate reasoning blocks per model family while preserving final output fidelity
  • Domain-driven prompt architecture: composable prompt traits (model, version, temperature, reasoning effort) enable per-feature model versioning and A/B testing without code duplication
  • Dynamic rate limiting with zero-restart retuning: atomic Redis counters (no locks, no races), thresholds configurable at runtime so ops can retune without a deploy
  • Zero-downtime full sync: bulk-index 25M+ records into a temporary Algolia index, then atomically move it over the production index, so search stays available during a reindex
  • Analytics data lake on S3 Parquet + Athena with tag expression consolidation reducing Algolia API calls by 90%+
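
The cross-provider fallback described above can be sketched roughly as follows. This is a minimal illustration, not the production code: `Provider`, `ProviderError`, `complete_with_fallback`, and the two stub adapters are hypothetical names, and real adapters would call the OpenAI and Gemini APIs instead of returning canned text.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Raised by a provider adapter when a completion call fails."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # adapter: prompt in, text out


def complete_with_fallback(prompt: str, chain: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, text) from the first success."""
    errors = []
    for provider in chain:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderError as exc:
            # Any adapter failure, including client-side errors, moves to the next provider.
            errors.append(f"{provider.name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_primary(prompt: str) -> str:
    # Stand-in for a primary provider that is currently failing.
    raise ProviderError("HTTP 429")


def stub_fallback(prompt: str) -> str:
    # Stand-in for a healthy fallback provider.
    return f"echo: {prompt}"


chain = [Provider("primary", flaky_primary), Provider("fallback", stub_fallback)]
result = complete_with_fallback("hello", chain)
print(result)  # ('fallback', 'echo: hello')
```

Keeping the chain as plain data means a feature can declare its own primary and fallback order without touching the retry logic.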
MCP PROTOCOL · PYTHON · AI INFRASTRUCTURE

Customer Community MCP Server

CC · Production Model Context Protocol server making community knowledge accessible to AI assistants
Diagram: Customer Community MCP Server request flow. Clients (Claude, ChatGPT, Cursor, custom clients) connect via MCP over Streamable HTTP or stdio, through CloudFront + Lambda@Edge region routing, to the MCP gateway: OAuth2 proxy (PKCE, dynamic client registration), tool router (domain dispatch), per-tenant rate limiter, and cache (TTL + LRU). Tools are grouped by domain (read-only and administrative) and dispatch to Community Search (unified index), Analytics (usage metrics), Admin (moderation), Education (courses), and Community API (discussions) backends on Algolia, Redis, and DynamoDB.
Production MCP server powering AI assistant integrations for enterprise community platforms. Handles the full OAuth2.1 authorization lifecycle with PKCE and dynamic client registration. Routes tool invocations to multiple backend services covering search, analytics, administration, and education. Enforces tenant-scoped security at every layer with permission-scoped tokens and per-tenant rate limiting. Deployed across multiple regions with edge-based request routing.
Python · FastMCP · Streamable HTTP · OAuth2.1 PKCE · JWT · Redis · DynamoDB · Algolia · ECS Fargate · CloudFront · Lambda@Edge
  • Full OAuth2.1 authorization flow with PKCE and dynamic client registration, listed on major AI assistant connector registries
  • Multi-region deployment with edge-based request routing for geographic locality and failover
  • Permission-scoped search tokens with access control baked in at the token layer rather than applied as a post-filter
  • Per-tenant rate limiting, connection lifecycle management, and graceful degradation on backend failures
  • Dozens of tools spanning search, analytics, content moderation, user management, and education backends behind a single MCP interface
DISTRIBUTED SYSTEMS · DAG ENGINE

Data Highway: DAG-Based Pipeline Engine

CS (Customer Success) · The compute backbone processing 61B+ records/day
Diagram: Data Highway architecture. API interface (REST endpoints for DAG submission, monitoring, control) → Flow Manager (DAG orchestration, state machine) and Launcher (task dispatch, resource allocation), coordinated by Zookeeper (×3; leader election, locks) with PostgreSQL and Redis. Workers: DAG Workers (×N; Java, Vert.x, 60+ UDFs), Dynamic Workers (35 isolated connector JARs), Python Workers (Celery, ML tasks), and Apache Drill (×2; SQL engine). DAG consumers: Analytics Platform, Reporting Engine, Data Exchange, Platform Services. Stores and services: Analytics Data Store, S3, Redshift, SQS, connection pooler.
A distributed data pipeline platform for enterprise SaaS. Ingests data from tenant systems via 35+ connectors, runs transformations through a SQL engine, and loads results into analytics data stores and warehouses. Each pipeline is a directed acyclic graph of tasks, coordinated by leader election and workflow orchestration across a horizontally scalable worker cluster.
Java 8 · Vert.x · Apache Drill · Zookeeper · Python/Celery · PostgreSQL · Redis · AWS SQS · S3
  • Processes 61B+ records/day across 600+ enterprise tenants at 98.7% job success rate
  • Multi-worker architecture: DAG Workers (Java/Vert.x) + Dynamic Workers (35 isolated connector JARs) + Python/Celery ML Workers + Apache Drill SQL engine
  • 60+ custom Apache Drill UDFs for tenant-specific data transformations
  • Horizontally-scalable cluster coordinated by Zookeeper leader election and distributed workflow orchestration
  • Failure-injection tasks (OOM, disk exhaustion, FD leak) as production resilience probes
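
To illustrate what running a pipeline as a DAG means, here is a small sketch that schedules tasks in dependency order using Python's standard `graphlib` module (Python 3.9+). The task names and the `run_in_waves` helper are hypothetical, and the real engine is Java/Vert.x with distributed workers; this only shows the ordering idea.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract_salesforce": set(),
    "extract_zendesk": set(),
    "transform_join": {"extract_salesforce", "extract_zendesk"},
    "load_warehouse": {"transform_join"},
}


def run_in_waves(dag):
    """Group tasks into waves; every task in a wave could run in parallel."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock dependents
    return waves


waves = run_in_waves(pipeline)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In this example the two extract tasks form the first wave, the join the second, and the load the third.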
35+ Enterprise Connectors
Salesforce
HubSpot
Snowflake
BigQuery
Databricks
SAP
Zendesk
Zoom
MS Teams
Intercom
MySQL
PostgreSQL
MSSQL
GA4
+ 21 more
SECURITY · GRAPH DB

User Permissions (RBAC)

CS · Graph-native access control powering authorization across 7+ services
Diagram: User Permissions architecture. permissions-api (Vert.x, 4 verticles: Auth → CORS → Router → Policy Evaluator) backed by Neo4j (permission graph), Redis (evaluation cache), and a message queue (async sync) feeding permissions-worker (N instances: heartbeat leader, onboarding, group sync, scavenger cleanup). Permission SDK consumers (7+ services): Analytics (row-level data access control), Query Federation (query federation access control), Reporting (report-level permission checks), API Gateway (JWT filter + resource authorization), Data Store (resource-level access enforcement), Catalog (catalog access validation), CC Bridge (cross-product permission bridge).
Centralized multi-tenant RBAC service. Permissions stored as a graph: a single traversal answers "can user U do action A on resource R?" in milliseconds, with Redis-cached evaluations and expression-based policy conditions. The permission SDK is consumed by 7+ services across the platform.
Neo4j · Vert.x · Redis · Message Queue · PostgreSQL
  • Graph-native model: resources, policies, groups, and manager chains as traversable edges, so a single graph traversal resolves access in milliseconds
  • Targeted Redis invalidation: editing a sharing group invalidates only affected members × resources, not the full cache
  • Row-level data enforcement: restricted data never leaves the database, access control enforced at SQL query-build time
  • Permission SDK consumed by 7+ platform services: analytics, query federation, reporting, API gateway, data store, catalog, and cross-product bridge
  • Heartbeat-based leader election for worker coordination with automatic cleanup of orphaned graph nodes
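
The graph-native check can be pictured with a small in-memory sketch. The production system stores this graph in Neo4j; the node names, the `edges` and `grants` dictionaries, and the `can` function below are all hypothetical stand-ins that only illustrate how one traversal answers "can user U do action A on resource R?".

```python
from collections import deque

# Hypothetical permission graph: users belong to groups, groups nest and attach policies.
edges = {
    "user:alice": ["group:support"],
    "group:support": ["group:all-staff", "policy:read-reports"],
    "group:all-staff": ["policy:read-dashboards"],
    "user:bob": ["group:all-staff"],
}

# Each policy node grants a set of (action, resource) pairs.
grants = {
    "policy:read-reports": {("read", "report:q3")},
    "policy:read-dashboards": {("read", "dashboard:home")},
}


def can(user: str, action: str, resource: str) -> bool:
    """Breadth-first walk from the user; allow if any reachable policy grants the pair."""
    seen = {user}
    queue = deque([user])
    while queue:
        node = queue.popleft()
        if (action, resource) in grants.get(node, set()):
            return True
        for neighbour in edges.get(node, []):
            if neighbour not in seen:  # guard against cycles in group nesting
                seen.add(neighbour)
                queue.append(neighbour)
    return False


print(can("user:alice", "read", "report:q3"))  # True
print(can("user:bob", "read", "report:q3"))    # False
```

Alice reaches the report policy through her support group, while Bob only reaches the dashboard policy through the all-staff group.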
ANALYTICS · WAREHOUSING

Usage Data Model (UDM)

CS · Multi-tenant analytics orchestration across dual warehouses
Diagram: UDM architecture. UDM API (Vert.x: Analyzer, RunNow, Admin, Migration) → message queue (multiple queues, async dispatch) → UDM Worker (multi-pool: Scheduler, Submitter, DataLoad, Archival, QueryBuilder). Integrations: Data Highway (DAG submission via internal API), PostgreSQL (per-tenant DDL), Redshift (high-volume OLAP), Redis (connection pool), Permissions (row-level RBAC), Data Store (API callouts). Highlights: ~25 pluggable task builders, dual-warehouse abstraction, template-driven DDL, permissions injected at query time.
Full-lifecycle analytics engine: onboarding, schema definition, ingestion from 10+ sources, transformation, and warehouse storage. Powers Gainsight dashboards with per-tenant dynamic DDL and dual-warehouse abstraction (PostgreSQL for small tenants, Redshift for high-volume).
Vert.x · Redshift · PostgreSQL · Message Queue · Redis
  • ~25 pluggable task builders: adding a new data source is a single builder class
  • Dual-warehouse abstraction: PostgreSQL for small tenants, Redshift for high-volume OLAP, switched transparently
  • Permissions injected at query-build time: restricted data never leaves the database
  • Cross-system DAG submission to the pipeline engine with retry and backoff
  • Secure multi-tenant data exchange with per-tenant credential isolation
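
Injecting permissions at query-build time can be sketched as below. This is an illustration only: it uses SQLite as a stand-in for the real warehouses, and the `usage_events` table, the `build_scoped_query` function, and the list of allowed account IDs are hypothetical. The point is that the row filter is part of the SQL itself, so restricted rows are never fetched.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_scoped_query(allowed_account_ids: Iterable[int]) -> Tuple[str, List[int]]:
    """Build a query that can only return rows for accounts the caller may see."""
    ids = list(allowed_account_ids)
    if not ids:
        # No grants means no rows, never an unfiltered query.
        return "SELECT account_id, events FROM usage_events WHERE 1 = 0", []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT account_id, events FROM usage_events "
        f"WHERE account_id IN ({placeholders}) ORDER BY account_id"
    )
    return sql, ids


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_events (account_id INTEGER, events INTEGER)")
conn.executemany("INSERT INTO usage_events VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Hypothetical result of a permission lookup for the current user.
sql, params = build_scoped_query([1, 3])
rows = conn.execute(sql, params).fetchall()
print(rows)  # [(1, 10), (3, 30)]
```

Because the account IDs are bound as parameters, the filter cannot be altered by user-supplied values.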
PUB/SUB · EVENT-DRIVEN

Event Processing

CS · Four-tier async pub/sub with durable delivery and cross-region failover
Diagram: Event Processing tiers (Tier 1, Tier 2, Tiers 3-4). Receiver (×3; Vert.x, auth, rate limit, persist) publishes to SQS Primary with SQS Secondary as failover; Enricher queue (data lookup, enrichment; Redis, Postgres); Router (×N; fan-out, broadcast queues, tenant routing); Relayer (×N; delayed delivery, retry with backoff); Broadcaster (×N; HTTP delivery, status tracking). Subscribers: HTTP webhooks, internal services, external customer endpoints. Highlights: self-healing delivery recovery, cross-region queue failover, persistent routing state.
Async event fan-out with durable delivery. Publishers emit once; subscribers register for events with optional delayed delivery and data enrichment. Four-tier architecture with purpose-matched runtimes: Receiver → Router → Relayer → Broadcaster.
Vert.x · AWS SQS · Apache HttpAsync · PostgreSQL · Redis
  • Cross-region SQS failover: automatic switch to secondary queues on publish errors
  • Self-healing delivery: dedicated recovery process detects and recovers orphaned delivery rows after worker crashes
  • Blocked-tenant handling isolates noisy tenants so they cannot starve other tenants' event delivery
  • Enrichment loop: events decorated with business context from the analytics data store before delivery
  • Tiered architecture: Receiver → Router → Relayer → Broadcaster with purpose-matched runtimes
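
The publish-side failover can be sketched in a few lines. This is a simplified model, not the production code: `InMemoryQueue` stands in for an SQS queue, and `FailoverPublisher` and `QueueUnavailable` are hypothetical names.

```python
class QueueUnavailable(Exception):
    """Raised when a queue cannot accept a message."""


class InMemoryQueue:
    """Stand-in for a real queue client; `healthy` simulates a regional outage."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.messages = []

    def send(self, body: str) -> None:
        if not self.healthy:
            raise QueueUnavailable(self.name)
        self.messages.append(body)


class FailoverPublisher:
    """Publish to the primary queue; on a publish error, switch to the secondary."""

    def __init__(self, primary: InMemoryQueue, secondary: InMemoryQueue):
        self.primary = primary
        self.secondary = secondary

    def publish(self, body: str) -> str:
        try:
            self.primary.send(body)
            return self.primary.name
        except QueueUnavailable:
            # The secondary lives in another region; if it also fails, the error propagates.
            self.secondary.send(body)
            return self.secondary.name


primary = InMemoryQueue("primary", healthy=False)
secondary = InMemoryQueue("secondary")
publisher = FailoverPublisher(primary, secondary)
used = publisher.publish('{"event": "account.updated"}')
print(used)  # secondary
```

Consumers then need to drain both queues, which is the price of keeping publishers available during a regional outage.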
COMPLIANCE · OBSERVABILITY

Audit Service

Centralized event capture & compliance trail
Reactive audit service for compliance trails. Fire-and-forget writes via Vert.x event bus decouple caller latency from Postgres persistence. Per-tenant tables created lazily on first write. Used by every service in the platform via a lightweight audit client SDK.
Vert.x · PostgreSQL · Redis
  • Async write via event bus: HTTP 200 returns before DB insert completes
  • Random-tenant retention cleanup with +10 day safety buffer
  • Separate read/write DB pools to prevent analytics from starving ingestion
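
The fire-and-forget write path can be approximated with a queue and a background writer. This sketch uses Python threads in place of the Vert.x event bus; `AuditWriter` and its `persist` callback are hypothetical names, and a real implementation would write to Postgres.

```python
import queue
import threading
from typing import Callable


class AuditWriter:
    """Accept audit events immediately and persist them on a background thread."""

    def __init__(self, persist: Callable[[dict], None]):
        self._queue: "queue.Queue" = queue.Queue()
        self._persist = persist
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def record(self, event: dict) -> None:
        # Returns as soon as the event is queued; the caller never waits on the database.
        self._queue.put(event)

    def _drain(self) -> None:
        while True:
            event = self._queue.get()
            if event is None:  # sentinel: stop the writer
                break
            self._persist(event)

    def close(self) -> None:
        """Flush everything queued so far, then stop the background thread."""
        self._queue.put(None)
        self._thread.join()


stored = []
writer = AuditWriter(stored.append)  # a list stands in for the audit table
writer.record({"tenant": "t1", "action": "login"})
writer.record({"tenant": "t1", "action": "export"})
writer.close()
print(stored)
```

The trade-off is durability: events still in the queue are lost if the process dies before they are persisted.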
INFRASTRUCTURE · CONFIG

Config Server

Centralized configuration for all CS services
Spring Boot + Vert.x hybrid config service. Every platform service pulls runtime configuration from here (DB strings, feature flags, encrypted secrets) with full audit trail and periodic refresh. The config client SDK is embedded across all Java services.
Spring Boot · Vert.x · PostgreSQL · AES Encryption
  • Upsert + append-only version history table in single transaction
  • Database-enforced integrity with case-insensitive keys and FK cascades
  • Redis-cached auth lookups with fail-fast startup ping
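
The upsert plus append-only history pattern can be sketched as follows. SQLite stands in for Postgres here, and the table layout and the `open_store` and `set_config` functions are hypothetical; the sketch only shows both writes committing or rolling back together, with case-insensitive keys enforced by the database.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE config (
            name TEXT PRIMARY KEY COLLATE NOCASE,  -- case-insensitive keys
            value TEXT NOT NULL,
            version INTEGER NOT NULL
        );
        CREATE TABLE config_history (
            name TEXT NOT NULL COLLATE NOCASE,
            value TEXT NOT NULL,
            version INTEGER NOT NULL
        );
        """
    )
    return conn


def set_config(conn: sqlite3.Connection, name: str, value: str) -> int:
    """Upsert the current value and append a history row in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO config (name, value, version) VALUES (?, ?, 1) "
            "ON CONFLICT(name) DO UPDATE SET value = excluded.value, "
            "version = config.version + 1",
            (name, value),
        )
        version = conn.execute(
            "SELECT version FROM config WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO config_history (name, value, version) VALUES (?, ?, ?)",
            (name, value, version),
        )
    return version


conn = open_store()
first = set_config(conn, "db.url", "postgres://a")
second = set_config(conn, "DB.URL", "postgres://b")  # same key, different case
print(first, second)
```

The history table is never updated or deleted from, so every past value of a key remains available for audit.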
Projects I Build in the Open
Personal projects, open source contributions, and developer tools.
ANALYTICS · OPEN SOURCE

GoatCounter Dashboard

A modern alternative analytics dashboard for privacy-first web analytics
Diagram: GoatCounter Dashboard data flow. A single HTML file (React + Recharts; no server, no build step, host anywhere) reads the GoatCounter API (read-only, token auth) through a response cache with a 60s TTL and renders KPI cards (period-over-period trend), a traffic area chart, and a choropleth world map (sqrt scale). Clicking any data point drills down into referrers, versions, regions, campaigns, devices, and languages.
A single HTML file that replaces GoatCounter's default analytics UI with interactive charts, a choropleth world map, drill-down navigation, and period-over-period trend comparison. No server, no build step. Host it anywhere for free.
HTML · CSS · JavaScript · React · Recharts · GoatCounter API · GitHub Pages
  • Interactive visualizations: area charts for traffic, donut charts for browsers, operating systems, and devices, horizontal bars for countries and languages
  • Choropleth world map with square-root scaling, hover tooltips, and a gradient legend
  • Click any data point to drill down: pages show referrers, browsers show versions, countries show regions, campaigns show source URLs
  • Built-in demo mode with realistic sample data, works without an API key or account
  • Smart loading: primary KPIs render first, secondary panels lazy-fetch on scroll, and a 60-second response cache makes navigation instant
  • Dark and light theme with system preference detection, persisted across sessions
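
Square-root scaling for the choropleth can be sketched as below. The dashboard itself is JavaScript; this Python version, with the hypothetical `sqrt_scale` function and made-up visit counts, only illustrates why the scale helps: it compresses the top of the range so low-traffic countries remain visible next to a dominant one.

```python
import math


def sqrt_scale(value: float, max_value: float) -> float:
    """Map a count to a colour intensity in [0, 1] using square-root scaling."""
    if max_value <= 0:
        return 0.0
    clamped = min(max(value, 0.0), max_value)
    return math.sqrt(clamped / max_value)


# Hypothetical visit counts per country.
visits = {"US": 10000, "IN": 2500, "DE": 100}
peak = max(visits.values())
intensity = {country: sqrt_scale(count, peak) for country, count in visits.items()}
print(intensity)
```

With linear scaling, a country with 1% of the peak traffic would get 1% intensity and be nearly invisible; with square-root scaling it gets 10%.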
SECURITY · DEVELOPER TOOLS · OPEN SOURCE

mcp-halflist

CI-first conformance, security, and performance testing for MCP servers
$ pip install mcp-halflist
$ halflist check <server-url>
Diagram: mcp-halflist CI test flow. The halflist CLI (check, bench, audit, watch, report, pin) connects over stdio (local process), Streamable HTTP (remote server), or SSE fallback (legacy transport), with OAuth2 PKCE auto-discovery. It runs 5 test suites (handshake, tools, resources, prompts), a bench engine (p50, p95, p99 latency), and an offline security scanner (injection, exfiltration, rug-pull), and reports to the terminal, JUnit XML, HTML, JSON, Markdown, or an SVG badge. Offline-first, no API keys, CI-native, official GitHub Action.
A CLI tool that tests MCP servers the way pytest tests code. Five test suites cover the full protocol (handshake, tools, resources, prompts) plus offline security scanning for prompt injection, data exfiltration, and rug-pull attacks. Plugs into any CI pipeline with JUnit XML output and a ready-made GitHub Action.
Python · Typer · Pydantic · httpx · MCP SDK · OAuth2.1 PKCE · PyPI
  • Five test suites covering the full MCP protocol, plus offline security scanning for prompt injection, data exfiltration, cross-tool manipulation, and supply chain attacks
  • OAuth2 PKCE with auto-discovery: provide the server URL and the tool discovers auth endpoints, opens a browser, then handles token exchange and caching automatically
  • Debug logging with multi-layer request and response tracing (MCP operations, HTTP wire, SDK internals) and credential masking
  • JUnit XML output for CI integration with GitHub Actions, GitLab, and Jenkins
  • Config file (halflist.toml) with environment variable expansion, so CI pipelines and teams share the same testing configuration
  • GitHub Action available: three lines of YAML to add MCP server testing to any workflow
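
The p50, p95, and p99 figures a latency benchmark reports can be computed in several ways. The sketch below uses the nearest-rank method as one plausible choice; it is an assumption for illustration, not necessarily how mcp-halflist computes them, and the `percentile` function and sample latencies are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical round-trip latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 18, 17, 19]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # {'p50': 15, 'p95': 240, 'p99': 240}
```

The single outlier leaves the median untouched but dominates the tail percentiles, which is why benchmarks report all three.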
Technologies & Tools
A decade of hands-on experience across languages, frameworks, cloud services, and data systems.

Languages & Frameworks

Java 8+
Python 3.x
Vert.x
Spring Boot
FastAPI
FastMCP
PHP (Yii)

Cloud & Infrastructure

AWS ECS Fargate
AWS Lambda
AWS S3
AWS SQS / SNS
AWS DynamoDB
CloudFormation
Docker
CloudWatch

Data & Search

PostgreSQL
Redis
Neo4j
Algolia
Apache Drill
Redshift
MongoDB
Athena
MySQL
Zookeeper

AI & Protocols

MCP Protocol
SSE Transport
OpenAI / GPT
Google Gemini
Llama 3
Algolia AskAI
JWT / PyJWT

DevOps & Quality

GitHub Actions
Bitbucket Pipelines
SonarCloud
Maven
pytest
JUnit
REST Assured
Background

B.Tech, Computer Science & Engineering

Babu Banarasi Das Institute of Technology & Management, Lucknow (2010–2014) · 75.16%

Intermediate, Mathematics

Rastriya Inter College (2008–2010) · 73.6%

Languages

English (Full Professional) · Hindi (Native)

Awards & Leadership

Vice-Captain, Nihilentia. Selected to lead Nihilent's annual fest (Dec 2014). Successfully organized events and competitions.

IT-FIESTA 2011. Appreciation Certificate awarded by T. Ashok (Founder & CEO, Stag Software Pvt. Ltd.)

UTKARSH-13 & 14. Coordinated Technical Events at BBDGEI (2013 & 2014)

ENCORE-12. Honor & Excellence Certificate at IET Lucknow (Oct 2012)

Techsamagam-11. Certificate Of Merit, FORCE CIC Education (Nov 2011)

3rd Position, UP Board. Block-level topper in Jaunpur, High School Examinations (2007)

What Colleagues Say
"Abhishekh is an outstanding person, great to work with and a real self starter. He is very effective in team and testing related activities. He has strong and broad technical and automation expertise. Abhishekh is one of the top people I have met and worked with in my career."
Rajesh Chejerla
Principal Engineer at Salesforce
"Abhishekh is very passionate and has great vision for his work. His focus keeps everything moving smoothly, he makes sure all the deadlines are met, and makes sure that whatever project he is working on meets the highest standards."
Anant Agrawal
Engineering Manager | CyberSecurity | Scale Products
"Abhishekh successfully automated our application and single handedly managed all testing activities. He is extremely enthusiastic about his work which is infectious, makes it fun to learn. He always has a creative, positive outlook and he's good at organizing and bringing people together."
Shivesh Verma
Senior Software Engineer (JL4) at Maersk
"I had the privilege of working with Abhishek on a couple of academic projects during our B.Tech. He is a highly organized, goal oriented, independent and hard working perfectionist. He has very good analytical and interpersonal skills. He is the most motivated and enthusiastic person I have ever met."
Kuldeep Kr
Staff Software Engineer @ ServiceNow | Ex-Microsoft
"Abhishek has tremendous knowledge of Testing, always there to help for problems related to testing. His rational approach of making complex topics very simpler by relating it with day to day examples was amazing. That was the beauty of his approach in System Testing and Integration Testing."
Rupender Rana
Technical Lead, iOS at Nagarro
Let's Connect
Open to conversations about distributed systems, AI infrastructure, and platform engineering.