About
AI Researcher with experience in building generative AI solutions. Skilled in taking products from concept to completion and leading teams effectively. Experienced in AI, ML, and DL technologies, web development, and cloud computing. I specialize in developing scalable AI solutions, from fine-tuning models to full deployment pipelines. My current focus is on Vision-Language Models (VLMs) for Vision-Based Retrieval systems. I'm also passionate about open-source, regularly contributing to various projects to advance AI innovation.
Work Experience
CognitiveLab (On Site)
AI Researcher
TurboML (Remote)
AI Developer
NeoHumans (Remote)
AI Researcher
Indian Institute of Science (IISc) (On Site)
Research Intern
Mandelbulb Technologies (Remote)
Generative AI Developer
Enable Technologies (On Site)
Full Stack Developer
Skills
Open Source
omniparse 5094
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
VARAG 75
Vision-Augmented Retrieval and Generation (VARAG): a vision-first RAG engine
AI-Engineering.academy 156
Navigating the World of AI, One Step at a Time
indic_eval 31
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks
Indic-llm 9
An open-source framework designed to adapt pre-trained Large Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.
Research Projects
Indic Eval/Leaderboard
Developed an evaluation framework for Indic Large Language Models that supports multiple translated benchmarks, with a leaderboard built around it for comparison.
Ambari-7b
India's first bilingual Kannada LLM, built on the Llama 2/3 base models and fine-tuned across multiple stages on 1 billion Kannada tokens, improving tokenization efficiency by 85%.
YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detection and segmentation.
VARAG
Vision-Augmented Retrieval and Generation: a system that integrates textual and visual information, enhancing RAG by 35% and improving contextual precision by 60%.
Mixture of LoRA Experts
A novel architecture that dynamically serves multiple fine-tuned LLMs by swapping LoRA adapters during inference.
ViViD
A state-of-the-art Vision-Language model specialized in converting complex PDFs into markdown with high speed and efficiency.
Projects
Cognitune
All-in-one platform for LLMOps, featuring distributed data processing, multi-GPU fine-tuning, dynamic evaluation, and one-click high-throughput API deployment.
Storyblocks
Generate a story video from a prompt: transforms text prompts into dynamic story videos with script generation, synchronized audio, and a consistent visual style.
Marker API
A production-ready, easily deployable server with 400 GitHub ⭐ that converts PDFs, Word documents, etc., into markdown to aid RAG pipelines.
PyRaft
A from-scratch Python implementation of the Raft consensus algorithm using FastAPI, achieving a throughput of 50-250 transactions per second.
Tokenizer Arena
A friendly arena to easily compare different tokenizers of various LLMs simultaneously, running completely in the browser.
Topic2Dataset
Create high-quality instruction fine-tuning datasets for LLMs by providing a topic or website, allowing massive synthetic data generation.