
From Visual Workflows to MCP: The Evolution of Larry Llama’s Local RAG Pipeline
Remember Larry Llama? That local RAG system I built with n8n’s visual workflows, Ollama, and Qdrant? Well, like any good side project, it evolved. And by evolved, I mean I threw out the n8n approach almost entirely and rebuilt it as an MCP server specifically for my Obsidian notes. Meet the AI Note Searcher 5000—because if you’re going to abandon a perfectly good visual workflow system, you might as well give your replacement an appropriately dramatic name.
Why Ditch the Visual Workflow?
Don’t get me wrong—n8n is fantastic. The visual workflow approach got me up and running fast and taught me a lot about RAG pipelines. But after living with it for a while, I started hitting some friction points:
The Obsidian Problem: My knowledge base isn’t just random files in a folder—it’s a somewhat carefully curated Obsidian vault with frontmatter, internal links, tags, and daily notes. The generic file processing approach was missing all that semantic richness.
The Integration Gap: I found myself switching between Claude and my local RAG system constantly. I’d ask Claude something, realize I needed to check my notes, fire up the local system, then try to piece the context back together.
The Overhead: For what was essentially “search my notes intelligently,” maintaining Docker containers, webhooks, and a whole separate UI felt like overkill (even if it was very cool).
Enter MCP: Model Context Protocol
The game-changer was Anthropic’s Model Context Protocol. Instead of building a separate system that Claude talks to via API calls, MCP lets you extend Claude directly with custom tools. It’s like giving Claude native access to your data and workflows.
The beauty is its elegant simplicity: Claude gets new capabilities, but the interaction model stays exactly the same. No context switching, no separate interfaces—just an enhanced Claude (or any tool-enabled LLM) that can seamlessly search and manipulate my notes.
The New Architecture
The AI Note Searcher 5000 keeps the parts that worked from Larry Llama while streamlining everything else:
Core Components:
- Ollama (still bare-metal): Local embeddings with nomic-embed-text:latest
- Qdrant (Docker): Vector storage for semantic search
- FileWatcher: Monitors Obsidian folder, processes markdown files with proper frontmatter parsing
- MCP Server: TypeScript-based server exposing multiple tools to Claude
What’s Different:
- Dropped n8n, Apache Tika, and the web UI entirely
- Added proper Obsidian markdown parsing with gray-matter
- Built modular MCP tools for different note operations
- Integrated daily note creation and intelligent file path suggestions
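The frontmatter parsing step can be sketched roughly like this. In the real pipeline gray-matter does the heavy lifting; this is a simplified stand-in so the shape of the output is clear, and the helper and field names are mine, not the project’s:

```typescript
// Simplified sketch of note parsing: split YAML frontmatter from the body
// and collect [[wikilinks]] so internal links survive as search metadata.
interface ParsedNote {
  frontmatter: Record<string, string>;
  body: string;
  links: string[];
}

function parseNote(raw: string): ParsedNote {
  const frontmatter: Record<string, string> = {};
  let body = raw;
  // Frontmatter is the block between the leading "---" fences.
  const fm = raw.match(/^---\n([\s\S]*?)\n---\n?/);
  if (fm) {
    body = raw.slice(fm[0].length);
    for (const line of fm[1].split("\n")) {
      const idx = line.indexOf(":");
      if (idx > 0) frontmatter[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  }
  // Extract [[target]] and [[target|alias]] style Obsidian links.
  const links = [...body.matchAll(/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g)].map((m) => m[1]);
  return { frontmatter, body: body.trim(), links };
}
```

The frontmatter and links end up as Qdrant payload fields alongside each chunk, which is what makes the later metadata-aware search possible.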
The data flow is beautifully straightforward:
1. FileWatcher detects changes in the mounted Obsidian folder
2. Markdown files get parsed (frontmatter + content) and chunked intelligently
3. Ollama generates embeddings, stored in Qdrant with rich metadata
4. Claude can now search, read, create, and organize notes through MCP tools
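The chunking and embedding steps above might look roughly like this. The heading-based chunking strategy is an assumption about what “chunked intelligently” means here, and the collection name and payload fields are illustrative; the HTTP shapes follow the public Ollama and Qdrant APIs:

```typescript
// Split a note body on headings so each chunk stays a coherent unit,
// falling back to paragraph-level splits for oversized sections.
function chunkNote(body: string, maxChars = 1500): string[] {
  const sections = body.split(/\n(?=#{1,6} )/); // keep each heading with its section
  const chunks: string[] = [];
  for (const section of sections) {
    if (section.length <= maxChars) {
      if (section.trim()) chunks.push(section.trim());
      continue;
    }
    let current = "";
    for (const para of section.split(/\n\n+/)) {
      if (current.length + para.length > maxChars && current) {
        chunks.push(current.trim());
        current = "";
      }
      current += para + "\n\n";
    }
    if (current.trim()) chunks.push(current.trim());
  }
  return chunks;
}

// Embed one chunk with Ollama, then upsert it into Qdrant with metadata.
async function indexChunk(chunk: string, path: string, id: number): Promise<void> {
  const res = await fetch("http://127.0.0.1:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text:latest", prompt: chunk }),
  });
  const { embedding } = await res.json();

  await fetch("http://127.0.0.1:6333/collections/notes/points", {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      points: [{ id, vector: embedding, payload: { path, text: chunk } }],
    }),
  });
}
```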
The Magic of Hybrid Search
Here’s where things get really interesting. The vector search is great for semantic similarity—asking about “productivity systems” will surface notes about GTD, time-blocking, and workflow optimization even if they don’t use that exact phrase.
But sometimes you need literal matching. If I’m looking for that specific project I named “Phoenix Redesign,” semantic search might wander off into mythology. So the system implements hybrid search that combines:
- Semantic search: Vector similarity using embeddings
- Full-text search: Literal text matching in content, titles, and file paths
- Query expansion: Aggressive date format conversion and keyword extraction
The result? Whether you search semantically (“How do I handle difficult conversations?”) or literally (“Phoenix Redesign”), you get relevant results. It’s the best of both worlds without the complexity of maintaining separate search indexes.
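One tuning-free way to combine the two ranked result lists is reciprocal rank fusion. The post doesn’t specify the project’s exact merge strategy, so treat this as an illustrative sketch rather than the actual implementation:

```typescript
interface Hit {
  path: string;
  score: number;
}

// Reciprocal rank fusion: each list contributes 1/(k + rank) per document,
// so items ranked highly by BOTH semantic and full-text search float to the
// top without needing to normalize the two incompatible score scales.
function mergeResults(semantic: Hit[], fullText: Hit[], k = 60): Hit[] {
  const fused = new Map<string, number>();
  for (const list of [semantic, fullText]) {
    list.forEach((hit, rank) => {
      fused.set(hit.path, (fused.get(hit.path) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...fused.entries()]
    .map(([path, score]) => ({ path, score }))
    .sort((a, b) => b.score - a.score);
}
```

A note that matches “Phoenix Redesign” literally and also lands in the semantic top results wins over a note that appears in only one list, which is exactly the behavior you want from a hybrid search.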
Development Experience
Building this in TypeScript was a joy compared to wrestling with n8n’s visual node editor for complex logic. The modular tool architecture means adding new capabilities is straightforward—each MCP tool is a separate class that registers itself with the server.
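The pattern can be sketched like this. The interface and class names are illustrative, not the project’s actual types or the MCP SDK’s API:

```typescript
// Each tool is a self-contained class; the registry is what the MCP server
// consults to list capabilities and dispatch calls from Claude.
interface NoteTool {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): Promise<string>;
}

class ToolRegistry {
  private tools = new Map<string, NoteTool>();

  register(tool: NoteTool): void {
    this.tools.set(tool.name, tool);
  }

  list(): string[] {
    return [...this.tools.keys()];
  }

  async call(name: string, args: Record<string, unknown>): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.execute(args);
  }
}

class SearchNotesTool implements NoteTool {
  name = "search_notes";
  description = "Hybrid semantic + full-text search over the vault";

  async execute(args: Record<string, unknown>): Promise<string> {
    // Real implementation runs the hybrid search; stubbed for illustration.
    return `results for ${String(args.query)}`;
  }
}
```

Adding a new capability is then just a new class and one register() call, which is what makes the architecture pleasant to extend.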
Current tools include:
- search_notes: Hybrid semantic + full-text search
- get_note_content: Retrieve full note contents
- create_note: Create new notes with frontmatter support
- get_directory_structure: Understand vault organization
- suggest_file_path: Intelligent path suggestions based on content type
The Docker setup is minimal—just Qdrant and the note searcher, with the Obsidian folder mounted read-only. No more webhooks, no more UI containers, no more Apache Tika for simple markdown processing.
Integration with Claude Desktop
The killer feature is how seamlessly this integrates with Claude Desktop. After adding the MCP server to your config:
{
  "mcpServers": {
    "ai-note-searcher": {
      "command": "node",
      "args": ["/path/to/ai-note-searcher-5000/mcp-server.js"],
      "env": {
        "QDRANT_URL": "http://127.0.0.1:6333",
        "OLLAMA_URL": "http://127.0.0.1:11434",
        "NOTEBOOK_PATH": "/path/to/your/obsidian/vault"
      }
    }
  }
}
Claude gains native note-searching superpowers. I can ask “What did I write about React performance optimization?” and Claude searches my notes, finds relevant passages, and synthesizes insights—all in one conversational flow.
Living with the System
After a few months of daily use, this approach feels fundamentally different from Larry Llama. Instead of maintaining a separate RAG system, I’ve extended my primary AI assistant to understand my personal knowledge base. It’s the difference between having two tools that work well independently versus having one integrated system that’s greater than the sum of its parts.
The hybrid search consistently surprises me with its accuracy. Fuzzy semantic searches surface forgotten connections between ideas, while exact matches help me locate specific references instantly. The 0.3 similarity threshold might seem low, but for personal notes where you’re often searching for your own quirky phrasing, maximum fuzziness works pretty well.
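For reference, that 0.3 threshold would sit in the Qdrant search request roughly like this (the collection name and result limit are assumptions):

```typescript
// Build the body for a Qdrant points-search request. The deliberately low
// score_threshold keeps fuzzy matches against quirky personal phrasing.
function buildSearchRequest(vector: number[], limit = 10) {
  return {
    vector,
    limit,
    with_payload: true,
    score_threshold: 0.3,
  };
}
// POST this to http://127.0.0.1:6333/collections/notes/points/search
```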
What’s Next?
The modular architecture makes experimentation easy. I’m considering adding tools for:
- Automated daily note templates based on calendar events
- Cross-note link analysis and suggestion
- Meeting note extraction from transcripts
- Integration with task management workflows
But honestly? The current system already feels complete in a way Larry Llama never did. Sometimes the best next feature is shipping what works.
The Repository
The full (and still evolving) implementation is available at github.com/jarmentor/obsidian-notebook-mcp. It’s designed to be easily adaptable—swap out Obsidian for any markdown-based knowledge system, or extend the MCP tools for your specific workflows.
The README doesn’t yet contain setup instructions, architecture diagrams, or debugging tips, but it might in the near future.
Final Thoughts
Visual workflow tools like n8n are incredible for learning and rapid prototyping. They let you build complex systems without getting bogged down in implementation details. But sometimes the right evolution is recognizing when you’ve outgrown the scaffolding.
The MCP approach trades visual simplicity for integration depth. Instead of building a separate system that works alongside Claude, I’ve extended Claude to work naturally with my existing knowledge management workflow.
For anyone building local RAG systems: consider MCP early in your design process. The integration benefits are substantial, and the development experience is surprisingly pleasant once you embrace the TypeScript ecosystem.
Now if you’ll excuse me, I need to ask Claude to search my notes for that thing I wrote about… something. The beauty is, it’ll probably find it anyway.