
From Visual Workflows to MCP: The Evolution of Larry Llama’s Local RAG Pipeline
Remember Larry Llama? That local RAG system I built with n8n’s visual workflows, Ollama, and Qdrant? Well, like any good side project, it evolved. And by evolved, I mean I threw out the n8n approach almost entirely and rebuilt it as an MCP server specifically for my Obsidian notes. Meet the AI Note Searcher 5000—because if you’re going to abandon a perfectly good visual workflow system, you might as well give your replacement an appropriately dramatic name.
Why Ditch the Visual Workflow?
Don’t get me wrong—n8n is fantastic. The visual workflow approach got me up and running fast and taught me a lot about RAG pipelines. But after living with it for a while, I started hitting some friction points:
The Obsidian Problem: My knowledge base isn’t just random files in a folder—it’s a somewhat carefully curated Obsidian vault with frontmatter, internal links, tags, and daily notes. The generic file processing approach was missing all that semantic richness.
The Integration Gap: I found myself switching between Claude and my local RAG system constantly. I’d ask Claude something, realize I needed to check my notes, fire up the local system, then try to piece the context back together.
The Overhead: For what was essentially “search my notes intelligently,” maintaining Docker containers, webhooks, and a whole separate UI felt like overkill (even if it was very cool).
Enter MCP: Model Context Protocol
The game-changer was Anthropic’s Model Context Protocol. Instead of building a separate system that Claude talks to via API calls, MCP lets you extend Claude directly with custom tools. It’s like giving Claude native access to your data and workflows.
The beauty is its elegant simplicity: Claude gets new capabilities, but the interaction model stays exactly the same. No context switching, no separate interfaces—just an enhanced Claude (or any tool-enabled LLM) that can seamlessly search and manipulate my notes.
The New Architecture
The AI Note Searcher 5000 keeps the parts that worked from Larry Llama while streamlining everything else:
Core Components:
- Ollama (still bare-metal): Local embeddings with nomic-embed-text:latest
- Qdrant (Docker): Vector storage for semantic search
- FileWatcher: Monitors Obsidian folder, processes markdown files with proper frontmatter parsing
- MCP Server: TypeScript-based server exposing multiple tools to Claude
What’s Different:
- Dropped n8n, Apache Tika, and the web UI entirely
- Added proper Obsidian markdown parsing with gray-matter
- Built modular MCP tools for different note operations
- Integrated daily note creation and intelligent file path suggestions
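The frontmatter parsing step can be sketched roughly like this. In the real pipeline gray-matter does the heavy lifting; this is a simplified stand-in so the shape of the output is clear, and the helper and field names are mine, not the project’s:

```typescript
// Simplified sketch of note parsing: split YAML frontmatter from the body
// and collect [[wikilinks]] so internal links survive as search metadata.
interface ParsedNote {
  frontmatter: Record<string, string>;
  body: string;
  links: string[];
}

function parseNote(raw: string): ParsedNote {
  const frontmatter: Record<string, string> = {};
  let body = raw;
  // Frontmatter is the block between the leading "---" fences.
  const fm = raw.match(/^---\n([\s\S]*?)\n---\n?/);
  if (fm) {
    body = raw.slice(fm[0].length);
    for (const line of fm[1].split("\n")) {
      const idx = line.indexOf(":");
      if (idx > 0) frontmatter[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  }
  // Extract [[target]] and [[target|alias]] style Obsidian links.
  const links = [...body.matchAll(/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g)].map((m) => m[1]);
  return { frontmatter, body: body.trim(), links };
}
```

The frontmatter and links end up as Qdrant payload fields alongside each chunk, which is what makes the later metadata-aware search possible.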
The data flow is beautifully straightforward:
1. FileWatcher detects changes in the mounted Obsidian folder
2. Markdown files get parsed (frontmatter + content) and chunked intelligently
3. Ollama generates embeddings, stored in Qdrant with rich metadata
4. Claude can now search, read, create, and organize notes through MCP tools
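The chunking and embedding steps above might look roughly like this. The heading-based chunking strategy is an assumption about what “chunked intelligently” means here, and the collection name and payload fields are illustrative; the HTTP shapes follow the public Ollama and Qdrant APIs:

```typescript
// Split a note body on headings so each chunk stays a coherent unit,
// falling back to paragraph-level splits for oversized sections.
function chunkNote(body: string, maxChars = 1500): string[] {
  const sections = body.split(/\n(?=#{1,6} )/); // keep each heading with its section
  const chunks: string[] = [];
  for (const section of sections) {
    if (section.length <= maxChars) {
      if (section.trim()) chunks.push(section.trim());
      continue;
    }
    let current = "";
    for (const para of section.split(/\n\n+/)) {
      if (current.length + para.length > maxChars && current) {
        chunks.push(current.trim());
        current = "";
      }
      current += para + "\n\n";
    }
    if (current.trim()) chunks.push(current.trim());
  }
  return chunks;
}

// Embed one chunk with Ollama, then upsert it into Qdrant with metadata.
async function indexChunk(chunk: string, path: string, id: number): Promise<void> {
  const res = await fetch("http://127.0.0.1:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text:latest", prompt: chunk }),
  });
  const { embedding } = await res.json();

  await fetch("http://127.0.0.1:6333/collections/notes/points", {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      points: [{ id, vector: embedding, payload: { path, text: chunk } }],
    }),
  });
}
```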
The Magic of Hybrid Search
Here’s where things get really interesting. The vector search is great for semantic similarity—asking about “productivity systems” will surface notes about GTD, time-blocking, and workflow optimization even if they don’t use that exact phrase.
But sometimes you need literal matching. If I’m looking for that specific project I named “Phoenix Redesign,” semantic search might wander off into mythology. So the system implements hybrid search that combines:
- Semantic search: Vector similarity using embeddings
- Full-text search: Literal text matching in content, titles, and file paths
- Query expansion: Aggressive date format conversion and keyword extraction
The result? Whether you search semantically (“How do I handle difficult conversations?”) or literally (“Phoenix Redesign”), you get relevant results. It’s the best of both worlds without the complexity of maintaining separate search indexes.
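One tuning-free way to combine the two ranked result lists is reciprocal rank fusion. The post doesn’t specify the project’s exact merge strategy, so treat this as an illustrative sketch rather than the actual implementation:

```typescript
interface Hit {
  path: string;
  score: number;
}

// Reciprocal rank fusion: each list contributes 1/(k + rank) per document,
// so items ranked highly by BOTH semantic and full-text search float to the
// top without needing to normalize the two incompatible score scales.
function mergeResults(semantic: Hit[], fullText: Hit[], k = 60): Hit[] {
  const fused = new Map<string, number>();
  for (const list of [semantic, fullText]) {
    list.forEach((hit, rank) => {
      fused.set(hit.path, (fused.get(hit.path) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...fused.entries()]
    .map(([path, score]) => ({ path, score }))
    .sort((a, b) => b.score - a.score);
}
```

A note that matches “Phoenix Redesign” literally and also lands in the semantic top results wins over a note that appears in only one list, which is exactly the behavior you want from a hybrid search.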
Development Experience
Building this in TypeScript was a joy compared to wrestling with n8n’s visual node editor for complex logic. The modular tool architecture means adding new capabilities is straightforward—each MCP tool is a separate class that registers itself with the server.
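The pattern can be sketched like this. The interface and class names are illustrative, not the project’s actual types or the MCP SDK’s API:

```typescript
// Each tool is a self-contained class; the registry is what the MCP server
// consults to list capabilities and dispatch calls from Claude.
interface NoteTool {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): Promise<string>;
}

class ToolRegistry {
  private tools = new Map<string, NoteTool>();

  register(tool: NoteTool): void {
    this.tools.set(tool.name, tool);
  }

  list(): string[] {
    return [...this.tools.keys()];
  }

  async call(name: string, args: Record<string, unknown>): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.execute(args);
  }
}

class SearchNotesTool implements NoteTool {
  name = "search_notes";
  description = "Hybrid semantic + full-text search over the vault";

  async execute(args: Record<string, unknown>): Promise<string> {
    // Real implementation runs the hybrid search; stubbed for illustration.
    return `results for ${String(args.query)}`;
  }
}
```

Adding a new capability is then just a new class and one register() call, which is what makes the architecture pleasant to extend.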
Current tools include:
- search_notes: Hybrid semantic + full-text search
- get_note_content: Retrieve full note contents
- create_note: Create new notes with frontmatter support
- get_directory_structure: Understand vault organization
- suggest_file_path: Intelligent path suggestions based on content type
The Docker setup is minimal—just Qdrant and the note searcher, with the Obsidian folder mounted read-only. No more webhooks, no more UI containers, no more Apache Tika for simple markdown processing.
Integration with Claude Desktop
The killer feature is how seamlessly this integrates with Claude Desktop. After adding the MCP server to your config:
{
  "mcpServers": {
    "ai-note-searcher": {
      "command": "node",
      "args": ["/path/to/ai-note-searcher-5000/mcp-server.js"],
      "env": {
        "QDRANT_URL": "http://127.0.0.1:6333",
        "OLLAMA_URL": "http://127.0.0.1:11434",
        "NOTEBOOK_PATH": "/path/to/your/obsidian/vault"
      }
    }
  }
}
Claude gains native note-searching superpowers. I can ask “What did I write about React performance optimization?” and Claude searches my notes, finds relevant passages, and synthesizes insights—all in one conversational flow.
Living with the System
After a few months of daily use, this approach feels fundamentally different from Larry Llama. Instead of maintaining a separate RAG system, I’ve extended my primary AI assistant to understand my personal knowledge base. It’s the difference between having two tools that work well independently versus having one integrated system that’s greater than the sum of its parts.
The hybrid search consistently surprises me with its accuracy. Fuzzy semantic searches surface forgotten connections between ideas, while exact matches help me locate specific references instantly. The 0.3 similarity threshold might seem low, but for personal notes where you’re often searching for your own quirky phrasing, maximum fuzziness works pretty well.
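For reference, that 0.3 threshold would sit in the Qdrant search request roughly like this (the collection name and result limit are assumptions):

```typescript
// Build the body for a Qdrant points-search request. The deliberately low
// score_threshold keeps fuzzy matches against quirky personal phrasing.
function buildSearchRequest(vector: number[], limit = 10) {
  return {
    vector,
    limit,
    with_payload: true,
    score_threshold: 0.3,
  };
}
// POST this to http://127.0.0.1:6333/collections/notes/points/search
```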
What’s Next?
The modular architecture makes experimentation easy. I’m considering adding tools for:
- Automated daily note templates based on calendar events
- Cross-note link analysis and suggestion
- Meeting note extraction from transcripts
- Integration with task management workflows
But honestly? The current system already feels complete in a way Larry Llama never did. Sometimes the best next feature is shipping what works.
The Repository
The full (and still evolving) implementation is available at github.com/jarmentor/obsidian-notebook-mcp. It’s designed to be easily adaptable—swap out Obsidian for any markdown-based knowledge system, or extend the MCP tools for your specific workflows.
The README doesn’t yet contain setup instructions, architecture diagrams, or debugging tips, but it might in the near future.
Final Thoughts
Visual workflow tools like n8n are incredible for learning and rapid prototyping. They let you build complex systems without getting bogged down in implementation details. But sometimes the right evolution is recognizing when you’ve outgrown the scaffolding.
The MCP approach trades visual simplicity for integration depth. Instead of building a separate system that works alongside Claude, I’ve extended Claude to work naturally with my existing knowledge management workflow.
For anyone building local RAG systems: consider MCP early in your design process. The integration benefits are substantial, and the development experience is surprisingly pleasant once you embrace the TypeScript ecosystem.
Now if you’ll excuse me, I need to ask Claude to search my notes for that thing I wrote about… something. The beauty is, it’ll probably find it anyway.