llamaindex
Assists with building RAG pipelines, knowledge assistants, and data-augmented LLM applications using LlamaIndex. Use when ingesting documents, configuring retrieval strategies, building query engines, or creating multi-step agents. Trigger words: llamaindex, rag, retrieval augmented generation, vector index, query engine, document loader, knowledge base.
Usage
Getting Started
- Install the skill using the command above
- Open your AI coding agent (Claude Code, Codex, Gemini CLI, or Cursor)
- Reference the skill in your prompt
- The AI will use the skill's capabilities automatically
Example Prompts
- "Build a RAG pipeline over the markdown files in ./docs and answer questions with citations"
- "Create an agent that can search our internal knowledge base and our SQL database"
Documentation
Overview
LlamaIndex is a data framework for building RAG pipelines, knowledge assistants, and data-augmented LLM applications. It provides document loading from 300+ sources, flexible chunking strategies, multiple index types, hybrid retrieval with reranking, and production evaluation tools for question-answering systems.
Instructions
- When ingesting documents, use `SimpleDirectoryReader` for local files or LlamaHub connectors for SaaS platforms, and run everything through an `IngestionPipeline` with metadata extractors (title, summary) and deduplication.
- When chunking, start with `SentenceSplitter` at 1024 tokens with 200-token overlap; use `MarkdownNodeParser` for structured documents and `CodeSplitter` for code, and adjust based on evaluation results.
- When indexing, use `VectorStoreIndex` as the default for most RAG, `KnowledgeGraphIndex` for entity relationships, and `DocumentSummaryIndex` for per-document summaries.
- When retrieving, implement hybrid retrieval (vector + keyword) for production, add a reranker (`CohereRerank`) after retrieval for improved relevance, and set `similarity_top_k` based on the context window (3-5 for large models, 2-3 for smaller ones).
- When building query engines, use `RetrieverQueryEngine` for standard RAG, `CitationQueryEngine` for responses with source attribution, and `SubQuestionQueryEngine` for complex multi-part queries.
- When creating agents, use `ReActAgent` with tools wrapping query engines (`QueryEngineTool`), functions, and other agents for multi-step reasoning.
- When evaluating, use `CorrectnessEvaluator`, `FaithfulnessEvaluator`, and `RelevancyEvaluator` on a test set before deploying.
Examples
Example 1: Build a RAG pipeline over company documentation
User request: "Create a question-answering system over our internal docs"
Actions:
- Load documents with `SimpleDirectoryReader` and extract metadata (title, summary)
- Chunk with `SentenceSplitter` (1024 tokens, 200 overlap) through an `IngestionPipeline`
- Create `VectorStoreIndex` with OpenAI embeddings and configure hybrid retrieval
- Build `CitationQueryEngine` for answers with source references
Output: A RAG system that answers questions with citations from company documentation.
Example 2: Create a multi-source research agent
User request: "Build an agent that can search across our docs, database, and web"
Actions:
- Create separate query engines for each data source (vector index, SQL, web search)
- Wrap each engine as a `QueryEngineTool` with a descriptive tool description
- Build a `ReActAgent` that routes questions to the appropriate tool
- Add `SubQuestionQueryEngine` for complex queries requiring multiple sources
Output: An intelligent agent that reasons about which data source to query and synthesizes multi-source answers.
Guidelines
- Use `SentenceSplitter` with 1024-token chunks and 200-token overlap as the starting point.
- Always add metadata extractors to the ingestion pipeline; title and summary metadata improve retrieval significantly.
- Use hybrid retrieval (vector + keyword) for production; pure vector search misses exact term matches.
- Add a reranker (`CohereRerank`) after retrieval to improve result relevance at small cost.
- Evaluate with `CorrectnessEvaluator` on a test set before deploying; subjective quality assessment does not scale.
- Set `similarity_top_k` based on context window: 3-5 chunks for large models, 2-3 for smaller models.
- Use `IngestionPipeline` with deduplication for incremental data updates; do not re-embed unchanged documents.
Information
- Version
- 1.0.0
- Author
- terminal-skills
- Category
- Data & AI
- License
- Apache-2.0