Principal Engineer @ Gainsight

Building distributed systems that power customer success at scale

I architect the data infrastructure, search platforms, and AI systems behind Gainsight's enterprise SaaS — from pipeline engines processing billions of records daily, to AI-powered search with dual-LLM providers and an MCP server that lets Claude and ChatGPT query community knowledge natively.

11+
Years Engineering
10+
Production Systems
35+
SaaS Connectors
600+
Enterprise Tenants
61B+ records processed / day
14 TB data processed / day
93K+ jobs completed / day
540K+ tasks executed / day
25M+ community records indexed
35K+ rules executed / day
98.7% job success rate
Abhishekh Singh

Principal Engineer at Gainsight with 11+ years building production distributed systems across two product lines — Customer Success (CS) and Customer Communities (CC). Based in Hyderabad, India. B.Tech in Computer Science from BBDITM, Lucknow.

On the CS side, I built the Data Highway pipeline engine, the graph-based permission system used by 7+ services, and the event processing platform with cross-region SQS failover. On the CC side, I architected the AI-native search platform — three LLM-powered features, a production MCP server with 21 tools, and a federated search system indexing records from 5 sources.

My work sits at the intersection of distributed systems, data engineering, and AI — building the kind of infrastructure that other teams build their products on.

Platform Architecture

10+ production services across Java (Vert.x, Spring Boot) and Python — DAG orchestration, event-driven pipelines, graph-based RBAC, and real-time search

AI & MCP Integration

Production MCP server with 21 tools for Claude/ChatGPT. Three AI features: ReplyBot (OpenAI + Gemini fallback), AI Answers Summary (SSE streaming), Smart Search Summary (Llama 3 70B)

Data at Scale

61B+ records processed daily, 25M+ search records indexed, 35+ SaaS connectors, dual-warehouse analytics (Postgres + Redshift), S3 Parquet data lake

Security-First Design

Graph-based RBAC with Neo4j, JWT with Redis-backed revocation, permission-scoped Algolia tokens, process-isolated task sandboxing

Building at Every Layer of the Stack
A decade of progressive engineering ownership — from automation frameworks to architecting the data and AI backbone of an enterprise SaaS platform.
APR 2025 — PRESENT
Principal Engineer
Gainsight · Hyderabad
Architecting the Search & AI platform — production MCP server for AI assistants, three LLM-powered features with cross-provider fallback, search analytics data lake, and federated search indexing across multiple content sources. Leading the platform's AI-native evolution.
AUG 2019 — MAR 2025
Lead Software Engineer
Gainsight · Hyderabad
Led architecture of the Data Highway pipeline engine scaling to billions of records daily across hundreds of enterprise tenants. Built the Usage Data Model analytics platform, community search system, and enterprise webhook delivery with IAM SigV4 signing.
JUL 2017 — AUG 2019
Senior Software Engineer
Gainsight · Hyderabad
Built the Usage Data Model (UDM) analytics platform with dual-warehouse abstraction (Postgres + Redshift), Event Processing pub/sub system, and core Data Highway infrastructure.
JUL 2016 — JUN 2017
Software Engineer
Gainsight · Hyderabad
Developed the graph-based permission system (Neo4j), centralized Config Server, Audit Service, and contributed to the Data Highway DAG engine with 60+ custom Drill UDFs.
DEC 2015 — JUN 2016
Software Engineer
Talentica Software · Pune
Engineered the Intelligence Machine — a big data recommendation engine processing terabytes of social media data daily. Built performance benchmarking infrastructure and data-driven test automation across distributed systems.
AUG 2014 — DEC 2015
Software Analyst
Nihilent · Pune
Delivered 5 projects for MultiChoice Africa (Naspers subsidiary) — a major media platform serving millions of subscribers. Contributed to open-source technology solutions spanning .NET, Java, PHP, and Oracle.
Systems I've Built & Architected
Production systems powering Gainsight's enterprise platform — from data pipelines to AI-powered search.
Across Two Product Lines

My Contributions to Gainsight's CustomerOS

Architecture diagram: Gainsight CustomerOS (20 repositories, 60+ cross-repo connections). CS (Customer Success) side: application layer (Analytics Data Store, Usage Data Model, Bionic Reporting), platform services (Data Highway, Permissions, Events), cross-cutting Config Server and Audit, and a shared SDK used in every repo; also contributed to Data Exchange Hub, Adoption Tracker, CC-to-CS Bridge, and Query Federation. CC (Customer Communities) side: DCH Search MCP Server, Synchronized Search Service, Algolia Search with AskAI, AI features (ReplyBot, Smart Summary), and federated sources (Zendesk, Salesforce, Freshdesk, Skilljar, Community).
AI · SEARCH · LLM

Community Search & AI Platform

CC (Customer Communities) · Three AI features, federated search, and a provider-agnostic AI orchestration layer
Architecture diagram: Community Platform AI orchestration layer. A service layer with a provider factory, automatic failover, and adapters (OpenAI Provider, Gemini Provider, Summary Engine) serves three features: ReplyBot (Messagebus trigger, OpenAI ↔ Gemini fallback), AI Answers Summary (SSE streaming to client, Algolia AskAI with deduplication), and Smart Search Summary (blocking response, Llama 3 70B via Bedrock). Supporting components: Synchronized Search (index builder, token issuer, analytics data lake), Algolia (multi-app, secured API keys, scoped tokens), federated sources, SNS Messagebus (~95 events, webhooks, connectors), MySQL, Redis, S3, and an API gateway handling authentication, OAuth2, rate limiting, and request routing.
The AI brain of Gainsight Communities. A provider-agnostic AI orchestration layer powers three flagship features through a factory pattern — primary + fallback LLM chains with automatic cross-provider retry, bidirectional payload transformation for provider compatibility, and graceful degradation on any failure. Synchronized Search Service indexes 25M+ records from 5 federated sources to Algolia with zero-downtime index swaps.
PHP (Yii/Symfony) · Algolia AskAI · OpenAI · Gemini · Llama 3 70B · AWS Bedrock · SSE Streaming · Python · Redis · S3 Parquet · Athena
  • Cross-provider LLM fallback — dedicated adapter per provider handles payload asymmetry; automatic retry across providers even on client errors to prevent single-provider outages
  • SSE streaming with thinking-phase deduplication — stateful stream processor filters duplicate reasoning blocks per model family while preserving final output fidelity
  • Domain-driven prompt architecture — composable prompt traits (model, version, temperature, reasoning effort) enable per-feature model versioning and A/B testing without code duplication
  • Dynamic rate limiting with zero-restart retuning — atomic Redis counters (no locks, no races), thresholds configurable at runtime so ops can retune without a deploy
  • Zero-downtime full sync — bulk-index 25M+ records into temp Algolia index, atomic rename over production. No search outage during reindex
  • Analytics data lake on S3 Parquet + Athena with tag expression consolidation reducing Algolia API calls by 90%+
MCP PROTOCOL · PYTHON · AI INFRASTRUCTURE

DCH Search MCP Server

CC · Production Model Context Protocol server — 21 tools making community knowledge accessible to AI assistants
Architecture diagram: AI assistants (Claude, ChatGPT, Cursor) connect over SSE (/{community_id}) or stdio to the DCH Search MCP Server (FastMCP, Python 3.11, ECS Fargate). Inside the server: an auth layer (JWT token validation, Redis-backed revocation, guest vs authenticated), a token manager (Algolia scoped tokens, 25-min cache, 80% proactive refresh, real-time role from Synchronized Search), middleware (token-bucket rate limiter, TTL + LRU cache, exponential-backoff retry), and 21 read-only tools: Algolia Search (2), CC REST API (13), Skilljar (3), Analytics (1). Backends: Synchronized Search token API, Algolia ({communityId}-unified), Redis, CC REST API (OAuth2 client_credentials), and Skilljar API (Basic Auth, 50-page cap). Operations: ECS Fargate auto-scaling, structured JSON logging, correlation IDs.
One of the early production MCP deployments integrating Claude, ChatGPT, and Cursor with a real enterprise search backend. 21 read-only tools expose community discussions, knowledge base articles, Skilljar courses, and search analytics — all through permission-scoped Algolia tokens that enforce the same access rules as the frontend UI.
Python 3.11 · FastMCP · SSE Transport · JWT · Redis · Algolia · ECS Fargate · asyncio · structlog · Pydantic
  • 21 read-only tools across 4 backends — unified search, community REST API, education platform, and search analytics — behind a single MCP interface
  • Proactive token refresh before expiry — request-time never pays fetch latency. Role resolved from the search service at token-time, not from cached claims (survives long-lived token lifespans)
  • Permissioned search at the token layer — access control baked into scoped search tokens, not post-filtered. MCP and frontend receive identical security semantics
  • Deliberate availability vs safety tradeoffs per endpoint — revocation checks degrade gracefully while role-based checks remain strict
  • Connection-scoped context propagated to all child tool calls — auth context and tokens travel the entire request lifecycle without threading issues
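Proactive token refresh can be sketched as a small cache with two thresholds: a soft one that a background task acts on, and a hard expiry that the request path falls back to. This is a simplified illustration under assumptions: `ProactiveTokenCache` and its methods are hypothetical names, the 25-minute lifetime and 80% threshold are one reading of the figures quoted for the token manager, and the fetch function stands in for the real token API.

```python
import time
from typing import Callable, List, Optional


class ProactiveTokenCache:
    """Caches one token and refreshes it before it expires (hypothetical sketch)."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 25 * 60,
                 refresh_fraction: float = 0.8,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._refresh_after = ttl_seconds * refresh_fraction
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def _age(self) -> float:
        return self._clock() - self._fetched_at

    def _refresh(self) -> None:
        self._token = self._fetch()
        self._fetched_at = self._clock()

    def refresh_if_due(self) -> bool:
        """Background path: refresh once the token passes the proactive threshold."""
        if self._token is None or self._age() >= self._refresh_after:
            self._refresh()
            return True
        return False

    def get(self) -> Optional[str]:
        """Request path: fetches only on a cold start or after hard expiry."""
        if self._token is None or self._age() >= self._ttl:
            self._refresh()
        return self._token


# Demo with a fake clock and a fake token issuer (both assumptions for illustration).
now = [0.0]
issued: List[str] = []


def fake_fetch() -> str:
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1]


cache = ProactiveTokenCache(fake_fetch, clock=lambda: now[0])
first = cache.get()             # cold start: one fetch
now[0] = 19 * 60                # 76% of the lifetime: not due yet
early = cache.refresh_if_due()
now[0] = 21 * 60                # 84% of the lifetime: due
late = cache.refresh_if_due()
current = cache.get()           # served from cache, no fetch on the request path
```

Injecting the clock keeps the refresh logic testable without sleeping.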
DISTRIBUTED SYSTEMS · DAG ENGINE

Data Highway — DAG-Based Pipeline Engine

CS (Customer Success) · The compute backbone processing 61B+ records/day
Architecture diagram: API interface (REST endpoints for DAG submission, monitoring, control), Flow Manager (DAG orchestration, state machine), Zookeeper ×3 (leader election, locks), and Launcher (task dispatch, resource allocation), with state in PostgreSQL and Redis. Worker pools: DAG Workers ×N (Java, Vert.x, 60+ UDFs), Dynamic Workers (35 isolated connector JARs), Python Workers (Celery, ML tasks), and Apache Drill ×2 (SQL engine). DAG consumers: Analytics Platform, Reporting Engine, Data Exchange, Platform Services. Also shown: Analytics Data Store, S3, Redshift, SQS, connection pooler.
A distributed data pipeline platform for enterprise SaaS. It ingests data from tenant systems via 35+ connectors, runs transformations through a SQL engine, and loads results into analytics data stores and warehouses. Each pipeline is a directed acyclic graph of tasks, coordinated by leader election and workflow orchestration across a horizontally scalable worker cluster.
Java 8 · Vert.x · Apache Drill · Zookeeper · Python/Celery · PostgreSQL · Redis · AWS SQS · S3
  • Processes 61B+ records/day across 600+ enterprise tenants at 98.7% job success rate
  • Multi-worker architecture: DAG Workers (Java/Vert.x) + Dynamic Workers (35 isolated connector JARs) + Python/Celery ML Workers + Apache Drill SQL engine
  • 60+ custom Apache Drill UDFs for tenant-specific data transformations
  • Horizontally-scalable cluster coordinated by Zookeeper leader election and distributed workflow orchestration
  • Failure-injection tasks (OOM, disk exhaustion, FD leak) as production resilience probes
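To illustrate what running a pipeline as a directed acyclic graph means, the sketch below orders tasks so that each one runs only after its dependencies, using Kahn's algorithm. It is a generic textbook illustration, not the Data Highway scheduler; the task names and the `execution_order` helper are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return a task order in which every task follows its dependencies (Kahn's algorithm)."""
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks |= deps
    # Unfinished dependencies per task, and the reverse mapping to dependents.
    remaining = {task: set(dependencies.get(task, ())) for task in tasks}
    dependents: Dict[str, List[str]] = {task: [] for task in tasks}
    for task, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(task)

    ready = deque(sorted(task for task, deps in remaining.items() if not deps))
    order: List[str] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(dependents[task]):
            remaining[child].discard(task)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cycle detected: the task graph is not a DAG")
    return order


# Hypothetical pipeline: two extracts feed a join, which feeds a load.
pipeline = {
    "extract_salesforce": set(),
    "extract_zendesk": set(),
    "join_accounts": {"extract_salesforce", "extract_zendesk"},
    "load_warehouse": {"join_accounts"},
}
order = execution_order(pipeline)
```

In a distributed setting the tasks in `ready` at any moment are the ones that could be dispatched to workers in parallel.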
35+ Enterprise Connectors
Salesforce
HubSpot
Snowflake
BigQuery
Databricks
SAP
Zendesk
Zoom
MS Teams
Intercom
MySQL
PostgreSQL
MSSQL
GA4
+ 21 more
SECURITY · GRAPH DB

User Permissions (RBAC)

CS · Graph-native access control powering authorization across 7+ services
Architecture diagram: permissions-api (Vert.x, 4 verticles: Auth, CORS, Router, Policy Evaluator) backed by Neo4j (permission graph), Redis (evaluation cache), and a message queue (async sync); permissions-worker (N instances: heartbeat leader, onboarding, group sync, scavenger cleanup). Permission SDK consumers (7+ services): Analytics (row-level data access control), Query Federation (query federation access control), Reporting (report-level permission checks), API Gateway (JWT filter + resource authorization), Data Store (resource-level access enforcement), Catalog (catalog access validation), and CC Bridge (cross-product permission bridge).
Centralized multi-tenant RBAC service. Permissions stored as a graph — a single traversal answers "can user U do action A on resource R?" in milliseconds, with Redis-cached evaluations and expression-based policy conditions. The permission SDK is consumed by 7+ services across the platform.
Neo4j · Vert.x · Redis · Message Queue · PostgreSQL
  • Graph-native model: resources, policies, groups, and manager chains as traversable edges — single graph traversal resolves access in milliseconds
  • Targeted Redis invalidation — editing a sharing group invalidates only affected members × resources, not the full cache
  • Row-level data enforcement — restricted data never leaves the database, access control enforced at SQL query-build time
  • Permission SDK consumed by 7+ platform services: analytics, query federation, reporting, API gateway, data store, catalog, and cross-product bridge
  • Heartbeat-based leader election for worker coordination with automatic cleanup of orphaned graph nodes
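The idea of answering "can user U do action A on resource R?" with one traversal can be sketched with an in-memory graph. This is an illustration only: the real system stores the graph in Neo4j, while `PermissionGraph`, its edge model, and the sample users and groups below are hypothetical.

```python
from collections import deque
from typing import Dict, Set, Tuple


class PermissionGraph:
    """In-memory stand-in for a permission graph (hypothetical sketch)."""

    def __init__(self) -> None:
        self._edges: Dict[str, Set[str]] = {}               # e.g. user -> group, group -> parent group
        self._grants: Dict[str, Set[Tuple[str, str]]] = {}  # node -> {(action, resource)}

    def add_edge(self, source: str, target: str) -> None:
        self._edges.setdefault(source, set()).add(target)

    def grant(self, node: str, action: str, resource: str) -> None:
        self._grants.setdefault(node, set()).add((action, resource))

    def can(self, user: str, action: str, resource: str) -> bool:
        """Breadth-first walk from the user; allow if any reachable node holds the grant."""
        seen = {user}
        queue = deque([user])
        while queue:
            node = queue.popleft()
            if (action, resource) in self._grants.get(node, set()):
                return True
            for neighbor in self._edges.get(node, set()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return False


graph = PermissionGraph()
graph.add_edge("alice", "csm-team")       # alice is a member of csm-team
graph.add_edge("csm-team", "all-staff")   # csm-team inherits from all-staff
graph.grant("all-staff", "read", "report:42")
graph.grant("csm-team", "edit", "report:42")

alice_can_read = graph.can("alice", "read", "report:42")
alice_can_edit = graph.can("alice", "edit", "report:42")
bob_can_read = graph.can("bob", "read", "report:42")
```

Caching the result per (user, action, resource) is what makes targeted invalidation possible: a group edit only needs to evict entries for that group's members.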
ANALYTICS · WAREHOUSING

Usage Data Model (UDM)

CS · Multi-tenant analytics orchestration across dual warehouses
Architecture diagram: UDM API (Vert.x: Analyzer, RunNow, Admin, Migration) dispatches asynchronously through multiple message queues to the UDM Worker (multi-pool: Scheduler, Submitter, DataLoad, Archival, QueryBuilder). Dependencies: Data Highway (DAG submission via internal API), PostgreSQL (per-tenant DDL), Redshift (high-volume OLAP), Redis, connection pooler, Permissions (row-level RBAC), and Data Store (API callouts). Highlights: ~25 pluggable task builders, dual-warehouse abstraction, template-driven DDL, permissions injected at query time.
Full-lifecycle analytics engine — onboarding, schema definition, ingestion from 10+ sources, transformation, and warehouse storage. Powers Gainsight dashboards with per-tenant dynamic DDL and dual-warehouse abstraction (PostgreSQL for small tenants, Redshift for high-volume).
Vert.x · Redshift · PostgreSQL · Message Queue · Redis
  • ~25 pluggable task builders — adding a new data source is a single builder class
  • Dual-warehouse abstraction — PostgreSQL for small tenants, Redshift for high-volume OLAP, switched transparently
  • Permissions injected at query-build time — restricted data never leaves the database
  • Cross-system DAG submission to the pipeline engine with retry and backoff
  • Secure multi-tenant data exchange with per-tenant credential isolation
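A pluggable-builder design of this kind is often implemented as a registry keyed by source type. The sketch below shows one way it could look, together with a simple warehouse selector. Everything in it is hypothetical: the decorator, the builder functions, the task dictionaries, and especially the row-count threshold, which the source does not specify.

```python
from typing import Callable, Dict

# Registry of task builders keyed by data source name (hypothetical design).
TASK_BUILDERS: Dict[str, Callable[[dict], dict]] = {}


def task_builder(source: str) -> Callable[[Callable[[dict], dict]], Callable[[dict], dict]]:
    """Decorator that registers a builder for one data source."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        TASK_BUILDERS[source] = fn
        return fn
    return register


@task_builder("salesforce")
def build_salesforce_task(config: dict) -> dict:
    return {"type": "extract", "source": "salesforce", "object": config["object"]}


@task_builder("s3_csv")
def build_s3_csv_task(config: dict) -> dict:
    return {"type": "extract", "source": "s3_csv", "path": config["path"]}


def build_task(source: str, config: dict) -> dict:
    """Look up the builder for a source; adding a source means adding one builder."""
    try:
        builder = TASK_BUILDERS[source]
    except KeyError:
        raise ValueError(f"no task builder registered for {source!r}") from None
    return builder(config)


def choose_warehouse(daily_rows: int, threshold: int = 10_000_000) -> str:
    """Pick a warehouse by volume. The threshold is an illustrative assumption."""
    return "redshift" if daily_rows >= threshold else "postgresql"


task = build_task("salesforce", {"object": "Account"})
small_tenant = choose_warehouse(50_000)
large_tenant = choose_warehouse(2_000_000_000)
```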
PUB/SUB · EVENT-DRIVEN

Event Processing

CS · Four-tier async pub/sub with durable delivery and cross-region failover
Architecture diagram (Tiers 1 to 4): Receiver ×3 (Vert.x, auth, rate limit, persist), SQS primary with SQS secondary as failover, Enricher queue (data lookup, enrichment), Router ×N (fan-out, broadcast queues, tenant routing), Relayer ×N (delayed delivery, retry with backoff), and Broadcaster ×N (HTTP delivery, status tracking), backed by Redis and Postgres. Subscribers: HTTP webhooks, internal services, external customer endpoints. Highlights: self-healing delivery recovery, cross-region queue failover, persistent routing state.
Async event fan-out with durable delivery. Publishers emit once; subscribers register for events with optional delayed delivery and data enrichment. Four-tier architecture with purpose-matched runtimes: Receiver → Router → Relayer → Broadcaster.
Vert.x · AWS SQS · Apache HttpAsync · PostgreSQL · Redis
  • Cross-region SQS failover — automatic switch to secondary queues on publish errors
  • Self-healing delivery: dedicated recovery process detects and recovers orphaned delivery rows after worker crashes
  • Blocked-tenant handling isolates noisy tenants from starving other tenants' event delivery
  • Enrichment loop — events decorated with business context from the analytics data store before delivery
  • Tiered architecture: Receiver → Router → Relayer → Broadcaster with purpose-matched runtimes
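The failover behavior on publish errors can be sketched as a publisher that tries the primary queue and falls back to the secondary. This is a simplified illustration with in-memory stand-ins: `InMemoryQueue`, `QueueUnavailable`, and `FailoverPublisher` are hypothetical names, and a real implementation would wrap an SQS client rather than a list.

```python
from typing import List


class QueueUnavailable(Exception):
    """Raised when a (simulated) queue cannot accept a message."""


class InMemoryQueue:
    """Stand-in for a regional queue; `healthy` simulates availability."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.messages: List[str] = []

    def send(self, body: str) -> None:
        if not self.healthy:
            raise QueueUnavailable(self.name)
        self.messages.append(body)


class FailoverPublisher:
    """Publishes to the primary queue and switches to the secondary on a publish error."""

    def __init__(self, primary: InMemoryQueue, secondary: InMemoryQueue) -> None:
        self.primary = primary
        self.secondary = secondary

    def publish(self, body: str) -> str:
        """Return the name of the queue that accepted the message."""
        try:
            self.primary.send(body)
            return self.primary.name
        except QueueUnavailable:
            # If the secondary also fails, the error propagates to the caller.
            self.secondary.send(body)
            return self.secondary.name


primary = InMemoryQueue("primary-region")
secondary = InMemoryQueue("secondary-region")
publisher = FailoverPublisher(primary, secondary)

normal_route = publisher.publish("event-1")
primary.healthy = False          # simulate a regional outage
failover_route = publisher.publish("event-2")
```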
COMPLIANCE · OBSERVABILITY

Audit Service

Centralized event capture & compliance trail
Reactive audit service for compliance trails. Fire-and-forget writes via Vert.x event bus decouple caller latency from Postgres persistence. Per-tenant tables created lazily on first write. Used by every service in the platform via a lightweight audit client SDK.
Vert.x · PostgreSQL · Redis
  • Async write via event bus — HTTP 200 returns before DB insert completes
  • Random-tenant retention cleanup with +10 day safety buffer
  • Separate read/write DB pools to prevent analytics from starving ingestion
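The fire-and-forget write path can be sketched with a queue and a background worker: the caller gets its response as soon as the event is enqueued, and persistence happens later. This is an illustration using Python threads, not the Vert.x event bus; `AuditWriter`, the status code return value, and the dictionary that stands in for lazily created per-tenant tables are all assumptions.

```python
import queue
import threading
from typing import Callable, Dict, List


class AuditWriter:
    """Accepts audit events immediately and persists them on a background thread."""

    def __init__(self, persist: Callable[[str, dict], None]) -> None:
        self._queue: "queue.Queue" = queue.Queue()
        self._persist = persist
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, tenant_id: str, event: dict) -> int:
        """Enqueue the event and return an HTTP-style status without waiting for the write."""
        self._queue.put((tenant_id, event))
        return 200

    def _drain(self) -> None:
        while True:
            tenant_id, event = self._queue.get()
            try:
                self._persist(tenant_id, event)
            except Exception:
                # A real service would log and possibly retry; this sketch drops the event.
                pass
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until every queued event has been handled (useful in tests)."""
        self._queue.join()


# Stand-in for per-tenant tables created lazily on first write.
tables: Dict[str, List[dict]] = {}


def persist(tenant_id: str, event: dict) -> None:
    tables.setdefault(tenant_id, []).append(event)


writer = AuditWriter(persist)
status = writer.record("tenant-a", {"action": "login"})
writer.flush()
```

The tradeoff is the usual one for fire-and-forget: a crash between enqueue and persist can lose an event that the caller already saw acknowledged.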
INFRASTRUCTURE · CONFIG

Config Server

Centralized configuration for all CS services
Spring Boot + Vert.x hybrid config service. Every platform service pulls runtime configuration from here — DB strings, feature flags, encrypted secrets — with full audit trail and periodic refresh. The config client SDK is embedded across all Java services.
Spring Boot · Vert.x · PostgreSQL · AES Encryption
  • Upsert + append-only version history table in single transaction
  • Database-enforced integrity with case-insensitive keys and FK cascades
  • Redis-cached auth lookups with fail-fast startup ping
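Writing the current value and its history row in one transaction can be sketched with SQLite from the Python standard library. This is an illustration of the pattern only: the production service uses PostgreSQL, and the table names, columns, and `put_config` helper are hypothetical. Case-insensitive keys are modeled here with SQLite's `COLLATE NOCASE`.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a current-value table and an append-only history table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE config (
            key     TEXT PRIMARY KEY COLLATE NOCASE,
            value   TEXT NOT NULL,
            version INTEGER NOT NULL
        );
        CREATE TABLE config_history (
            key     TEXT NOT NULL COLLATE NOCASE,
            version INTEGER NOT NULL,
            value   TEXT NOT NULL,
            PRIMARY KEY (key, version)
        );
        """
    )
    return conn


def put_config(conn: sqlite3.Connection, key: str, value: str) -> int:
    """Upsert the current value and append a history row atomically; return the new version."""
    with conn:  # commits on success, rolls back both writes on any error
        row = conn.execute("SELECT version FROM config WHERE key = ?", (key,)).fetchone()
        version = 1 if row is None else row[0] + 1
        if row is None:
            conn.execute(
                "INSERT INTO config (key, value, version) VALUES (?, ?, ?)",
                (key, value, version),
            )
        else:
            conn.execute(
                "UPDATE config SET value = ?, version = ? WHERE key = ?",
                (value, version, key),
            )
        conn.execute(
            "INSERT INTO config_history (key, version, value) VALUES (?, ?, ?)",
            (key, version, value),
        )
    return version


conn = open_store()
v1 = put_config(conn, "db.host", "primary-host")
v2 = put_config(conn, "DB.HOST", "replica-host")   # same key, different case
current = conn.execute("SELECT key, value, version FROM config").fetchall()
history = conn.execute(
    "SELECT version, value FROM config_history ORDER BY version"
).fetchall()
```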
Technologies & Tools
A decade of hands-on experience across languages, frameworks, cloud services, and data systems.

Languages & Frameworks

Java 8+ · Python 3.x · Vert.x · Spring Boot · FastAPI · FastMCP · PHP (Yii)

Cloud & Infrastructure

AWS ECS Fargate · AWS Lambda · AWS S3 · AWS SQS / SNS · AWS DynamoDB · CloudFormation · Docker · CloudWatch

Data & Search

PostgreSQL · Redis · Neo4j · Algolia · Apache Drill · Redshift · MongoDB · Athena · MySQL · Zookeeper

AI & Protocols

MCP Protocol · SSE Transport · OpenAI / GPT · Google Gemini · Llama 3 · Algolia AskAI · JWT / PyJWT

DevOps & Quality

GitHub Actions · Bitbucket Pipelines · SonarCloud · Maven · pytest · JUnit · REST Assured
Background

B.Tech — Computer Science & Engineering

Babu Banarasi Das Institute of Technology & Management, Lucknow (2010–2014) · 75.16%

Intermediate — Mathematics

Rastriya Inter College (2008–2010) · 73.6%

Languages

English (Full Professional) · Hindi (Native)

Awards & Leadership

Vice-Captain, Nihilentia — Selected to lead Nihilent's annual fest (Dec 2014). Successfully organized events and competitions.

IT-FIESTA 2011 — Appreciation Certificate awarded by T. Ashok (Founder & CEO, Stag Software Pvt. Ltd.)

UTKARSH-13 & 14 — Coordinated Technical Events at BBDGEI (2013 & 2014)

ENCORE-12 — Honor & Excellence Certificate at IET Lucknow (Oct 2012)

Techsamagam-11 — Certificate Of Merit, FORCE CIC Education (Nov 2011)

3rd Position, UP Board — Block-level topper in Jaunpur, High School Examinations (2007)

What Colleagues Say
"Abhishekh is an outstanding person, great to work with and a real self starter. He is very effective in team and testing related activities. He has strong and broad technical and automation expertise. Abhishekh is one of the top people I have met and worked with in my career."
Rajesh Chejerla
Principal Engineer at Salesforce
"Abhishekh is very passionate and has great vision for his work. His focus keeps everything moving smoothly, he makes sure all the deadlines are met, and makes sure that whatever project he is working on meets the highest standards."
Anant Agrawal
Engineering Manager | CyberSecurity | Scale Products
"Abhishekh successfully automated our application and single handedly managed all testing activities. He is extremely enthusiastic about his work which is infectious, makes it fun to learn. He always has a creative, positive outlook and he's good at organizing and bringing people together."
Shivesh Verma
Senior Software Engineer (JL4) at Maersk
"I had the privilege of working with Abhishek on a couple of academic projects during our B.Tech. He is a highly organized, goal oriented, independent and hard working perfectionist. He has very good analytical and interpersonal skills. He is the most motivated and enthusiastic person I have ever met."
Kuldeep Kr
Staff Software Engineer @ ServiceNow | Ex-Microsoft
"Abhishek has tremendous knowledge of Testing, always there to help for problems related to testing. His rational approach of making complex topics very simpler by relating it with day to day examples was amazing. That was the beauty of his approach in System Testing and Integration Testing."
Rupender Rana
Technical Lead, iOS at Nagarro
Let's Connect
Open to conversations about distributed systems, AI infrastructure, and platform engineering.