Sudesh P
AI Systems Engineer

Creator of OmniSLM. Building production-ready AI applications with Small Language Models.

Focused on RAG pipelines, local-first LLM platforms, agent architectures, and privacy-first AI infrastructure.

Chennai, India
M.Tech Computer Science, SRMIST, Chennai
Python + Go AI Engineering
RAG & Agent Architectures
Open Source Contributor
Featured Release

OmniSLM v0.5 is now available

This release adds native agent orchestration, integration with vLLM continuous batching, and improved memory providers for complex multi-turn workflows.

Latest Insights

Thoughts on building production-grade AI infrastructure and the shift towards Small Language Models.

Engineering Note: Vector Isolation

"Never trust the LLM prompt to filter tenant data. In multi-tenant RAG architectures, isolation must happen at the physical or metadata layer before the retrieved context ever reaches the inference engine."

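A minimal sketch of the idea in the note above, assuming each stored chunk carries a tenant identifier as metadata. The `Chunk` type, the `retrieve_for_tenant` helper, and the sample data are hypothetical and not taken from any project described here; a real system would apply the same filter inside the vector store query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str   # metadata used for isolation
    text: str
    score: float     # hypothetical similarity score from a prior vector search


def retrieve_for_tenant(chunks, tenant_id, top_k=3):
    """Return the top_k highest-scoring chunks that belong to tenant_id."""
    # Filter on metadata first, so chunks from other tenants are never
    # ranked and can never reach the prompt.
    allowed = [c for c in chunks if c.tenant_id == tenant_id]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]


store = [
    Chunk("acme", "Acme refund policy", 0.91),
    Chunk("globex", "Globex salary bands", 0.97),
    Chunk("acme", "Acme onboarding guide", 0.72),
]

# The Globex chunk has the best score but is excluded before ranking.
context = retrieve_for_tenant(store, "acme")
```

The point of the design is that the filter does not depend on the model following instructions: the highest-scoring chunk from another tenant is dropped before any prompt is assembled.
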
Engineering Note: Async Inference

"Synchronous HTTP requests to an LLM endpoint will eventually bring down your system. Always decouple the web tier from the inference tier using a robust message queue like RabbitMQ."

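The decoupling described in the note above can be sketched with the standard library. This is an illustration only: an in-process `queue.Queue` stands in for a broker such as RabbitMQ, a dictionary stands in for a result store, and `fake_inference`, `submit`, and `worker` are hypothetical names.

```python
import queue
import threading
import uuid

jobs = queue.Queue()            # stand-in for a message broker queue
results = {}                    # stand-in for a result store
results_lock = threading.Lock()


def fake_inference(prompt):
    # Placeholder for a slow model call (assumption, not a real client).
    return prompt.upper()


def submit(prompt):
    """Web tier: enqueue the job and return an id without waiting."""
    job_id = str(uuid.uuid4())
    jobs.put((job_id, prompt))
    return job_id


def worker():
    """Inference tier: consume jobs until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        job_id, prompt = item
        output = fake_inference(prompt)
        with results_lock:
            results[job_id] = output
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

job_id = submit("hello")   # returns immediately
jobs.join()                # demo only: wait until the job is processed
jobs.put(None)             # stop the worker
thread.join()
```

In a deployed system the web tier would return the job id to the client, which then polls or subscribes for the result instead of holding an HTTP connection open for the duration of inference.
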
Other Engineering Work

Case studies of production architectures, from academic intelligence platforms to Web3 supply chains.

SeedTracking

Blockchain-based seed supply chain platform with ML fraud detection.

Spring Boot, React, Solidity, Ethereum

Problem

Fraud and opacity in agricultural seed supply chains cost farmers billions annually. Counterfeit seeds reduce crop yields, and there is no reliable way to verify authenticity.

Outcome

Enables end-to-end traceability of seed batches. An ML model flags anomalous distribution patterns that may indicate fraud.

Architecture Highlights

Smart contracts on Ethereum handle state changes, while IPFS is used for decentralized document storage. An ML service scores fraud risk.
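
The case study does not describe the fraud model itself, so the following is only a rough illustration of what flagging an anomalous distribution pattern can look like. It uses a plain z-score over shipment quantities as a statistical stand-in for the ML service; `flag_anomalies`, the threshold, and the sample data are all assumptions.

```python
from statistics import mean, pstdev


def flag_anomalies(quantities, threshold=2.0):
    """Return indexes of quantities whose z-score exceeds the threshold."""
    mu = mean(quantities)
    sigma = pstdev(quantities)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [
        i for i, q in enumerate(quantities)
        if abs(q - mu) / sigma > threshold
    ]


# Hypothetical seed-batch shipment sizes for one distributor.
shipments = [100, 104, 98, 101, 99, 102, 400]
flagged = flag_anomalies(shipments)
```

Here only the last shipment is flagged. A production model would likely use more features than quantity alone, such as route, timing, and batch lineage.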

Local LLM Application

Privacy-first local LLM platform built with Java 21, Spring AI, and MongoDB.

Java 21, Spring Boot, Spring AI, Ollama

Problem

Java enterprise teams need LLM capabilities, but most existing tooling targets Python, which creates a skills gap.

Outcome

Bridges the Java-AI gap. Enterprise teams can integrate LLM features using familiar Spring patterns.

Architecture Highlights

A Spring Boot application using WebFlux for reactive endpoints, MongoDB for session storage, and Spring AI for model orchestration.

PaathAI

AI-driven lecture intelligence platform for transcription, summarization, and progress tracking.

Java, Spring Boot, AI/NLP, Transcription

Problem

Students miss key points in lectures, and there's no structured way to search, review, or track coverage of syllabus topics across sessions.

Outcome

Transforms passive lecture recordings into structured, searchable knowledge bases with syllabus alignment.

Architecture Highlights

An AI platform that processes lecture audio, maps content to syllabus topics, and provides searchable summaries with progress analytics.
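
How the platform maps content to syllabus topics is not specified here. As a minimal sketch of the idea, the example below assigns each transcript segment to the topic with the largest keyword overlap and derives coverage from the result. The `map_to_topic` helper, the keyword sets, and the sample segments are hypothetical; a real system would more likely use embeddings or a classifier.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def map_to_topic(segment, syllabus):
    """Return the topic with the most keyword overlap, or None if no overlap."""
    words = tokenize(segment)
    best_topic, best_overlap = None, 0
    for topic, keywords in syllabus.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_topic, best_overlap = topic, overlap
    return best_topic


syllabus = {
    "Sorting": {"sort", "merge", "quicksort", "pivot"},
    "Graphs": {"graph", "vertex", "edge", "bfs", "dfs"},
    "Trees": {"tree", "root", "leaf", "heap"},
}
segments = [
    "Today we pick a pivot and walk through quicksort.",
    "A graph has a vertex set and an edge set.",
    "Reminder: the exam is next week.",
]

mapped = [map_to_topic(s, syllabus) for s in segments]
covered = {t for t in mapped if t is not None}
progress = len(covered) / len(syllabus)  # fraction of syllabus topics covered
```

Segments with no overlap, such as the exam reminder, map to no topic and do not count toward coverage.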

RAG System for Local LLM

Privacy-preserving Retrieval-Augmented Generation pipeline using FAISS and Ollama.

Python, FAISS, Ollama, Sentence Transformers

Problem

Organizations with sensitive documents can't use cloud-based AI services due to data privacy and compliance requirements.

Outcome

Enables AI-powered document Q&A for privacy-sensitive organizations. Processes documents locally with zero data leakage.

Architecture Highlights

A pipeline that ingests documents, chunks them, embeds them locally using SentenceTransformers, and stores them in FAISS. Ollama handles LLM inference.
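
The stages above (chunk, embed, index, retrieve, prompt) can be sketched end to end with the standard library. This is a dependency-free illustration, not the project's code: a bag-of-words `Counter` stands in for SentenceTransformers embeddings, a linear cosine scan stands in for a FAISS index, and the final prompt is only built, where the real pipeline would send it to Ollama. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy bag-of-words vector; stand-in for a sentence-embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(documents):
    """Embed every chunk; a list scan stands in for a vector index."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, question, top_k=1):
    """Return the top_k chunk texts most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, context):
    """Assemble the prompt that would be sent to the local model."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question


docs = [
    "Invoices are archived for seven years.",
    "Backups run nightly at two in the morning.",
]
index = build_index(docs)
question = "How long are invoices archived?"
top = retrieve(index, question)
prompt = build_prompt(question, top)
```

Every step runs in-process, which mirrors the privacy property described above: documents, embeddings, and prompts never have to leave the machine.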