AI Lab

Learn RAG, embeddings, and vector search — interactively

Interactive AI engineering experiments by Shriyash Sharma — from retrieval pipelines and embeddings to semantic search, reasoning, and system design.

This space contains hands-on AI engineering experiments built to make complex concepts easier to understand through visualization and interaction.

Tools & experiments

RAG Explorer

Retrieval-Augmented Generation, step by step

Learn how Retrieval-Augmented Generation works step-by-step by exploring chunking, embeddings, vector search, retrieval, prompt construction, and answer generation.

RAG, Embeddings, Vector Search, Chunking, Prompting

Open Tool

Prompt Engineering Lab (coming soon)

Compare prompts and see how outputs shift

Experiment with system prompts, few-shot examples, and constraints to understand how prompt structure changes model behavior.

Prompting, LLM, Few-shot

Embedding Visualizer

See text projected into vector space

Visualize how AI converts text into vectors and groups semantically similar concepts together.

Embeddings, Dimensionality Reduction

Open Tool

Vector Search Explorer (coming soon)

Watch similarity ranking happen live

Inspect cosine similarity, top-k retrieval, and how index parameters trade off recall against latency.

Vector Search, pgvector, ANN

Semantic Search Playground

Keyword vs. semantic retrieval

Compare traditional keyword search and AI-powered semantic search using the same dataset.

Semantic Search, Hybrid Retrieval

Open Tool

AI Agent Simulator (coming soon)

Trace tool-calling and reasoning loops

Step through an agent's plan, tool calls, and observations to understand how multi-step reasoning is orchestrated.

Agents, Tool Calling, Reasoning

Context Window Visualizer

Budget tokens across a prompt

Understand token limits, context windows, truncation, and how modern LLMs process large documents.

Context Window, Tokens, LLM

Open Tool

Concepts you'll explore

The ideas behind modern AI systems

A quick primer on the building blocks these tools make tangible. Go deeper with the RAG Explorer, Embedding Visualizer, and the other experiments above. Long-form reading lives in the blog.

What is RAG?

Retrieval-Augmented Generation (RAG) pairs a search system with a language model. Instead of answering only from memorized training data, the system first retrieves relevant passages from a knowledge source and the model answers from that evidence, which helps keep responses current, grounded, and traceable to their sources.

Run the RAG Explorer

What are embeddings?

An embedding turns text into a vector of numbers that captures meaning. Texts with similar meaning get similar vectors, which is why embeddings power semantic search: they match ideas rather than exact keywords, finding the right passage even when the wording is different.
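To make "similar meaning, similar vectors" concrete, here is a minimal sketch. The three-dimensional vectors are hand-written stand-ins for what an embedding model might return (real embeddings have hundreds or thousands of dimensions), and the sentences are made-up examples.

```python
import math

# Hand-written toy vectors standing in for a real embedding model's output.
# The values and the three dimensions are assumptions for illustration only.
TOY_EMBEDDINGS = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.2, 0.1],
    "Best pizza places in Naples": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = TOY_EMBEDDINGS["How do I reset my password?"]
for text, vector in TOY_EMBEDDINGS.items():
    print(f"{cosine_similarity(query, vector):.3f}  {text}")
```

The two account-related sentences share almost no words, yet their toy vectors point in nearly the same direction, so they score far higher against each other than against the pizza sentence.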

Explore the Embedding Visualizer

How vector search works

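Vector search ranks stored vectors by how close they are to a query vector, usually with cosine similarity. A brute-force search compares the question against every chunk; at scale, a vector database typically uses an approximate nearest neighbor (ANN) index to skip most of those comparisons and return close matches in milliseconds, trading a little recall for speed. This is the retrieval step that makes real-time RAG possible.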
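As a rough sketch of top-k ranking by cosine similarity, the snippet below scores every stored vector against the query (an exact, brute-force scan rather than an indexed search). The chunk IDs, the two-dimensional vectors, and the `top_k` helper are hypothetical names chosen for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """Return the k highest-scoring (score, chunk_id) pairs, best first.

    `stored` is a list of (chunk_id, vector) pairs. Every vector is scored,
    so the cost grows linearly with the number of stored chunks.
    """
    scored = (
        (cosine_similarity(query_vector, vector), chunk_id)
        for chunk_id, vector in stored
    )
    return heapq.nlargest(k, scored)


# Made-up 2-D vectors; a real store would hold model-generated embeddings.
STORED = [
    ("chunk-a", [1.0, 0.0]),
    ("chunk-b", [0.7, 0.7]),
    ("chunk-c", [0.0, 1.0]),
    ("chunk-d", [-1.0, 0.0]),
]

results = top_k([0.9, 0.1], STORED, k=2)
for score, chunk_id in results:
    print(f"{chunk_id}: {score:.3f}")
```

An exact scan like this is fine for a few thousand chunks. Larger collections are where an ANN index earns its keep.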

Try the Semantic Search Playground

How AI assistants use retrieval

A production assistant ingests documents ahead of time — chunking, embedding, and storing them. At query time it embeds your question, retrieves the most relevant chunks, and builds a grounded prompt so the model answers from real sources instead of guessing.
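The flow above (chunk, embed, store, then retrieve and build a prompt) can be sketched end to end in a few lines. Everything here is a simplified assumption: `embed` is a word-count stand-in that only captures word overlap, where a real system would call an embedding model; the "store" is a plain list; and the documents, function names, and prompt wording are invented for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=12):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Stand-in embedding: a sparse word-count vector (NOT a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Ingestion: chunk each document and store (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc)]


def retrieve(question, index, k=2):
    """Query time: embed the question and return the k closest chunks."""
    question_vector = embed(question)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(question_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble a grounded prompt that cites the retrieved chunks."""
    sources = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


DOCUMENTS = [
    "Refunds are issued within 14 days of purchase. "
    "Contact support with your order number to start a refund.",
    "Shipping takes 3 to 5 business days. "
    "Express shipping is available at checkout for an extra fee.",
]

index = build_index(DOCUMENTS)
question = "How long do refunds take?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The resulting prompt string is what would be sent to the language model. Swapping the word-count `embed` for a real embedding model is what lets retrieval match on meaning rather than shared words.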

See context window limits

See it run step by step in the RAG Explorer