• Built a developer-focused webhook infrastructure platform that moves reliable event delivery (retry logic, failure recovery, and delivery guarantees) out of application code.
• Designed a two-path failure recovery system: retry delays under 2 minutes are handled in place by changing the message's SQS visibility timeout from the Lambda worker; longer delays are re-queued via EventBridge Scheduler, minimising overhead while maintaining delivery SLAs.
• Full pipeline: SDK/API ingestion → SQS queue → Lambda worker → customer endpoint, with branching for successful acknowledgement vs. failed delivery.
• Stack: Node.js, TypeScript, AWS SQS, ALB, Lambda, EventBridge, Redis, MongoDB.
• Load tested the ingestion API across 10–100 concurrent connections: sustained ~192 RPS at peak, with p50 latency of 434 ms and p99 under 1.7 s at 100 connections, and zero errors across all runs. Throughput scaled roughly linearly, from ~52 KiB/s at 10 connections to ~491 KiB/s at 100.
• Worker throughput: 133 msg/sec across 31k requests at 512 MB Lambda memory with concurrency capped at 100, leaving headroom to scale significantly higher on demand.
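The two-path recovery rule above can be illustrated with a minimal TypeScript sketch. It only shows the routing decision, not the platform's actual code: `planRetry`, `RetryPlan`, and `SHORT_DELAY_LIMIT_SECONDS` are hypothetical names, and the AWS calls themselves are left out so the snippet stays self-contained.

```typescript
// Hypothetical sketch of the two-path retry decision; all names are illustrative.

// Delays strictly under 2 minutes stay on the queue (short path).
const SHORT_DELAY_LIMIT_SECONDS = 120;

type RetryPlan =
  // Short path: the worker would set the SQS visibility timeout to this value.
  | { path: "visibility-timeout"; visibilityTimeoutSeconds: number }
  // Long path: the worker would create a one-off EventBridge Scheduler entry.
  | { path: "scheduler"; runAt: Date };

function planRetry(delaySeconds: number, now: Date = new Date()): RetryPlan {
  if (!Number.isFinite(delaySeconds) || delaySeconds < 0) {
    throw new RangeError("delaySeconds must be a non-negative finite number");
  }
  if (delaySeconds < SHORT_DELAY_LIMIT_SECONDS) {
    // Round up so the message never becomes visible earlier than requested.
    return {
      path: "visibility-timeout",
      visibilityTimeoutSeconds: Math.ceil(delaySeconds),
    };
  }
  // Longer delays are handed off to a scheduled re-queue at an absolute time.
  return {
    path: "scheduler",
    runAt: new Date(now.getTime() + delaySeconds * 1000),
  };
}
```

In a worker, the returned plan would then drive either a visibility-timeout change on the in-flight message or the creation of a schedule that re-enqueues it. Keeping the decision in a pure function makes the 2-minute boundary easy to unit test.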
RAG pipeline using LangChain to generate embeddings from academic PDFs and synthesise contextual practice questions.
IELTS 7.5