
QMD

An on-device Markdown search engine: BM25 + vector search + local LLM reranking, plus an MCP server for agent integrations.
9.6k stars · TypeScript · MIT

Tags: typescript, bun, bm25, vector-search, hybrid-search, mcp-server, markdown-notes-search, agentic-retrieval, alternative-to-ripgrep, alternative-to-obsidian-search, alternative-to-notion-search, alternative-to-rag

What is it?

QMD is an on-device search engine for Markdown that turns your notes, meeting transcripts, docs, and knowledge bases into a retrieval layer agents can call directly. It combines keyword recall via SQLite FTS5 (BM25) with vector semantic recall, then uses local GGUF models through node-llama-cpp for query expansion and reranking to improve answerability without shipping data to a remote index. The CLI is designed for automation with structured outputs (--json/--files/--csv) and ships an embedded Model Context Protocol (MCP) server, exposing search/get/status as tools for agentic workflows.
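Because the CLI emits structured output, results can be consumed programmatically instead of scraping terminal text. A minimal TypeScript sketch, assuming a hypothetical record shape for `--json` output (the actual field names may differ; check the CLI's real output):

```typescript
// Sketch of consuming qmd's --json output in automation.
// The SearchHit shape below is an assumption for illustration,
// not QMD's documented schema.

type SearchHit = { file: string; score: number; snippet: string };

// e.g. captured from: qmd search "auth" --json
const raw = `[{"file":"notes/auth.md","score":2.41,"snippet":"JWT auth flow"}]`;

const hits: SearchHit[] = JSON.parse(raw);

// An agent can pass along just the file paths or snippets it needs.
const files = hits.map((h) => h.file);
```

This is the pattern the `--files` and `--csv` flags serve as well: minimal, machine-readable responses rather than full document dumps.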

Pain Points vs Innovation

Traditional Pain Points vs. Innovative Solutions

  • Pain point: With large Markdown corpora, grep/keyword search misses paraphrases and cross-section clues, which hurts agent context quality. Solution: QMD uses a hybrid pipeline (FTS5/BM25 + vectors + local LLM reranking) to optimize recall and answerability as separate stages.
  • Pain point: Agent integrations often either stuff raw text into context or rely on remote vector DBs, creating cost and privacy drift. Solution: An embedded Model Context Protocol (MCP) server plus structured outputs lets agents fetch only the needed snippets instead of scanning everything.

Architecture Deep Dive

Hybrid retrieval pipeline (FTS5 + vectors + rerank)
Search is staged: fast keyword recall with SQLite FTS5 BM25, semantic recall with vectors, then local reranking with GGUF models via node-llama-cpp to optimize for answerability rather than raw match.
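The staging described above can be sketched as follows. The scoring functions and weights here are illustrative stand-ins, not QMD's actual implementation (which uses SQLite FTS5, real embeddings, and GGUF rerankers via node-llama-cpp):

```typescript
// Illustrative staged hybrid retrieval: keyword recall + semantic
// recall blended, then a small candidate set kept for reranking.

type Doc = { id: string; text: string };

// Stage 1 stand-in for BM25: fraction of query terms present.
function keywordScore(query: string, doc: Doc): number {
  const terms = query.toLowerCase().split(/\s+/);
  const text = doc.text.toLowerCase();
  return terms.filter((t) => text.includes(t)).length / terms.length;
}

// Stage 2 stand-in for vector similarity: character-bigram overlap.
function semanticScore(query: string, doc: Doc): number {
  const grams = (s: string) =>
    new Set(Array.from({ length: Math.max(s.length - 1, 0) }, (_, i) => s.slice(i, i + 2)));
  const q = grams(query.toLowerCase());
  const d = grams(doc.text.toLowerCase());
  const overlap = [...q].filter((g) => d.has(g)).length;
  return overlap / Math.max(q.size, 1);
}

// Stage 3: keep only topK candidates; a real pipeline would hand
// these to a local LLM reranker rather than trusting the blend.
function search(query: string, corpus: Doc[], topK = 3): Doc[] {
  return corpus
    .map((doc) => ({ doc, score: 0.5 * keywordScore(query, doc) + 0.5 * semanticScore(query, doc) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((c) => c.doc);
}
```

The design point is that each stage optimizes a different objective: stage 1 for exact-term recall, stage 2 for paraphrase recall, and the final rerank for answerability.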
Query expansion and fusion ranking
The system generates query variants and retrieves in parallel, then uses fusion plus position-aware blending to preserve exact matches while benefiting from expanded recall and controlled candidate sizes.
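One common way to fuse ranked lists from several query variants is reciprocal rank fusion (RRF); QMD's exact fusion and position-aware blending may differ, but the sketch below shows the general technique:

```typescript
// Reciprocal rank fusion: each list contributes 1/(k + rank + 1)
// per document, so items ranked well across several variants beat
// items that top only a single list. k dampens rank differences.

function fuseRRF(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// A doc ranked highly by multiple variants outranks one that tops
// only the original query's list:
const fused = fuseRRF([
  ["doc1", "doc2", "doc3"], // original query
  ["doc2", "doc1", "doc4"], // expanded variant A
  ["doc2", "doc5", "doc1"], // expanded variant B
]);
```

Keeping candidate sizes controlled before fusion matters because the reranking stage is the expensive one: fusion narrows the field cheaply first.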
Agent-facing interface layer (CLI + MCP)
A dual surface: CLI for automation-friendly structured outputs, and an MCP server that exposes search/get/status as callable tools inside agent workflows.
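The tool names (search/get/status) come from the project description; the parameter and result shapes in this sketch are assumptions for illustration, not QMD's actual MCP schema:

```typescript
// Hypothetical shape of the agent-facing tool surface. A real MCP
// server would register these tools over stdio; here we just model
// the dispatch an agent's call would route through.

type ToolCall =
  | { tool: "search"; args: { query: string; limit?: number } }
  | { tool: "get"; args: { id: string } }
  | { tool: "status"; args: {} };

type Snippet = { id: string; path: string; excerpt: string; score: number };

function handle(call: ToolCall): unknown {
  switch (call.tool) {
    case "search": {
      // Would run the hybrid pipeline; returns only small snippets.
      const hits: Snippet[] = [
        { id: "n1", path: "notes/auth.md", excerpt: "auth section text", score: 0.91 },
      ];
      return hits.slice(0, call.args.limit ?? 5);
    }
    case "get":
      // Would fetch the full section for one id the agent chose.
      return { id: call.args.id, path: "notes/auth.md", excerpt: "full section text" };
    case "status":
      // Would report index health so the agent can decide to re-embed.
      return { indexed: true, collections: 1 };
  }
}
```

The search-then-get split is the key contract: agents first retrieve cheap ranked snippets, then pull full content only for the ids they actually need.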

Deployment Guide

1. Install prerequisites (requires Bun)

bash
bun --version

2. Install QMD globally

bash
bun install -g https://github.com/tobi/qmd

3. Add collections and build embeddings (first run downloads local model cache)

bash
qmd collection add ~/notes --name notes && qmd embed

4. Search using the right mode

bash
qmd search "auth"          # BM25
qmd vsearch "login flow"   # vector
qmd query "how to deploy"  # hybrid+rerank

5. Run the MCP server (tool interface for agents)

bash
qmd mcp  # local stdio MCP server

Use Cases

  • Local retrieval tool for Claude Code/desktop agents. Audience: individuals and teams using agents. Solution: index local notes and project docs and return structured JSON snippets. Outcome: fewer wasted tokens and more grounded answers.
  • Offline retrieval layer for private knowledge bases. Audience: security- and compliance-sensitive orgs. Solution: run hybrid retrieval and reranking on employee machines or internal hosts. Outcome: better search and QA without shipping content to external vector DBs.
  • Continuous indexing for meeting notes and logs. Audience: managers and engineers tracking decisions. Solution: organize transcripts and logs as collections with periodic updates. Outcome: natural-language recall of what was decided and where it was written.

Limitations & Gotchas

  • Semantic search and reranking download local models on first use; plan for disk and time, and start with small collections.
  • On macOS you may need an additional SQLite build for extension support; in constrained environments, provision dependencies and PATH early.

Frequently Asked Questions

How do I use QMD as a tool, not a fragile shell wrapper?
Run the embedded Model Context Protocol (MCP) server so agents can call it as tools and receive structured outputs; use --json/--files to keep responses minimal.
Why do search/vsearch/query return different results?
They are different pipelines: search optimizes exact keyword hits, vsearch optimizes semantic recall, and query fuses multiple signals and reranks to maximize answerable context.
How do I keep sensitive notes private?
Keep everything on-device: content is indexed in SQLite, and semantic search and reranking run on locally cached models. Also audit where you send outputs: shell history, scripts, and logs can leak more than the index.

Project Metrics

  • Stars: 9.6k
  • Language: TypeScript
  • License: MIT
  • Deploy Difficulty: Medium

Table of Contents

  1. What is it?
  2. Pain Points vs Innovation
  3. Architecture Deep Dive
  4. Deployment Guide
  5. Use Cases
  6. Limitations & Gotchas
  7. Frequently Asked Questions

Related Projects

  • Pi Monorepo (14.1k · TypeScript)
  • zvec (8.2k · C++)
  • ZeroClaw (15.6k · Rust)
  • NanoClaw (8.6k · TypeScript)