A self-hosted brain for everything you know.
Mantle turns your emails, files, notes, documents, conversations, contacts, events, and projects into one living, AI-queryable memory — owned by you, running on your hardware, with agents that genuinely remember.
You talk to it on the web or Telegram, text or voice. You connect Claude to it over MCP. You drop a PDF in chat and it’s indexed before you’ve finished your sentence. You mention “that gantry note from April” and it knows exactly which one — because it read it, summarised it, extracted the facts, linked the people and projects, and filed every receipt.
This page runs Mantle’s own theme system — 42 themes, light and dark, and every visit lands on a different one. Pin a favorite with the palette in the header, or roll the dice.
- Persona: who your assistant is (learned, standing)
- Recent turns: the live conversation, every channel
- Digests: older talk, compressed by topic
- Profile facts: durable truths, kept current
- Content index: summaries, entities, vectors, passages
- Content store: the originals (append-only, citable)
Efficiency
Each turn retrieves only what the question needs — a small, surgical prompt instead of a haystack.
Integrity
Knowledge never degrades: originals are append-only, facts are superseded rather than overwritten, and a standing audit watches for drift.
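To make "superseded rather than overwritten" concrete, here is a minimal in-memory sketch. The `Fact` shape, the `FactLog` class, and the sample values are hypothetical illustrations, not Mantle's actual schema: a new value is appended, and the older fact is kept with a pointer to its replacement.

```typescript
// Hypothetical shape of a profile fact; Mantle's real schema may differ.
interface Fact {
  id: number;
  subject: string;
  attribute: string;
  value: string;
  recordedAt: string;          // ISO timestamp
  supersededBy: number | null; // id of the newer fact, or null while current
}

class FactLog {
  private facts: Fact[] = [];
  private nextId = 1;

  // Append a new fact. The previous current fact for the same
  // subject/attribute keeps its value; only its supersededBy pointer is set.
  record(subject: string, attribute: string, value: string, recordedAt: string): Fact {
    const fact: Fact = { id: this.nextId++, subject, attribute, value, recordedAt, supersededBy: null };
    const previous = this.current(subject, attribute);
    if (previous) previous.supersededBy = fact.id;
    this.facts.push(fact);
    return fact;
  }

  // The one fact that has not been superseded, if any.
  current(subject: string, attribute: string): Fact | undefined {
    return this.facts.find(
      (f) => f.subject === subject && f.attribute === attribute && f.supersededBy === null,
    );
  }

  // Every value ever recorded, oldest first, so each answer stays citable.
  history(subject: string, attribute: string): Fact[] {
    return this.facts.filter((f) => f.subject === subject && f.attribute === attribute);
  }
}

// Invented sample data.
const factLog = new FactLog();
factLog.record("Sarah", "passport_expiry", "2027-03-01", "2025-01-10T09:00:00Z");
factLog.record("Sarah", "passport_expiry", "2037-03-01", "2026-02-02T09:00:00Z");
```

Because nothing is deleted, `history` can still show what was believed earlier and when it was replaced.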
Personality
Your assistant truly gets to know you — and never forgets. One relationship that compounds.
Affordability
Frontier models only where they matter, economy models and local embeddings everywhere else. Turns cost cents.
Quality
A genuinely well-built base — 1,349 automated tests, and a measured eval number behind every ranking knob.
MCP / API
The whole brain is accessible to any application that speaks MCP — ~30 tools for search, graph, files, and email.
The architecture
The brain is the product
Most AI assistants are a chat window with amnesia. Mantle is built the other way around: the memory system is the core, and chat is just one doorway into it. Every piece of content that enters — an email, a voice note, a spreadsheet, a journal entry — flows through one pipeline into six layers of memory.
Who works where, what banks with whom — extracted automatically as you go, traversable in milliseconds. Plain Postgres, no graph database.
When a summary isn’t enough, a specialist agent replays the actual words of any past conversation window — last Tuesday or last year.
There is no “new chat”. There is one relationship that compounds.
Who it's for
One brain per install
What that brain holds — a life, a team, a product, a robot — is up to you.
One person, one life
Your inbox, your files, your journal, your todo list, your contacts, your secrets (sealed — the AI physically cannot read them) — finally in one place that answers questions. “When does Sarah’s passport expire?” “What did the electrician quote in March?” “What did we decide about the kitchen?” It knows, and it shows the receipt.
A team's working memory
Notes, pages (Notion-style documents), typed tables, shared files — every artifact indexed and queryable, with public share links for anything worth publishing, and Mantle-to-Mantle federation for exchanging scoped data between sovereign instances.
A company's docs behind an MCP chatbot
Point a Mantle instance at your documentation, manuals, and internal know-how; it becomes a fully-indexed brain — semantic search, passage retrieval, knowledge graph — that any MCP client can query with ~30 tools. Your support bot stops hallucinating answers and starts citing your actual docs.
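For a sense of what an MCP client sends, the sketch below builds the JSON-RPC 2.0 envelope that MCP uses to invoke a tool (`tools/call`). The tool name `search_content` and its arguments are assumptions made for illustration; a real client would discover the actual tool names with MCP's `tools/list` method.

```typescript
// JSON-RPC 2.0 request shape used by MCP to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "search_content", "query", and "limit" are hypothetical names, not confirmed Mantle tools.
const toolCall = buildToolCall(1, "search_content", { query: "gantry note from April", limit: 5 });

// The serialized message a client would hand to its MCP transport.
const wireMessage = JSON.stringify(toolCall);
```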
A robot's integrated personality
A companion that resets every session is a toy. Mantle’s persona stack (the seed personality, what the reflector learns, the Life Logs identity block, facts that supersede instead of duplicate) keeps it precisely the same companion yesterday, today, and in five years. Heartbeats let it decide when to speak without being annoying, voice and vision are already routed, and the unified conversation stream means a robot is just one more channel.
Humanoid robots — and not as a stretch
A robot with onboard inference is almost the purest expression of what Mantle is built to be, because the things robots conspicuously lack are exactly the things Mantle treats as the product. The local story is complete: embeddings are already computed on-device, and the adapter framework means the conversational model can be served from the robot’s own silicon with a cloud backup route — the brain, the vectors, and the model all stay on the robot. Nothing about the architecture assumes a cloud. And on local silicon, targeted context stops being a cost feature and becomes a latency feature: small, surgical prompts are fast prompts, and a companion lives or dies on conversational latency.
Why it's different
Say what it does, mechanically
It's genuinely yours
Self-hosted, a single set of Docker services, no SaaS in the runtime path. Embeddings are computed locally — the vectors never leave your box, and they cost $0. Secrets are AES-256-GCM sealed; the extractor is structurally unable to read them. Scheduled backups are built in: point your own rsync at one folder and the whole brain is portable.
One Postgres, no zoo
Vector search, the knowledge graph, full-text search, job queues, real-time UI updates, auth — all one database. No Pinecone, no Neo4j, no Redis, no message broker. The lean stack is what’s left after deleting every moving part personal-scale data doesn’t need — which is also why it restores from one pg_dump.
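A knowledge graph without a graph database comes down to a table of edges plus a traversal. In Postgres that traversal would typically be a recursive query; the sketch below shows the same breadth-first walk in memory so the idea is easy to follow. The `Edge` shape, the relation names, and the sample rows are assumptions, not Mantle's schema.

```typescript
// One row of a hypothetical edges table.
interface Edge {
  source: string;
  relation: string;
  target: string;
}

// Breadth-first walk from `start`, following edges in either direction.
// Returns each reachable entity with its hop distance, up to maxHops.
function neighborhood(edges: Edge[], start: string, maxHops: number): Map<string, number> {
  const adjacency = new Map<string, string[]>();
  const link = (a: string, b: string): void => {
    const list = adjacency.get(a) ?? [];
    list.push(b);
    adjacency.set(a, list);
  };
  for (const e of edges) {
    link(e.source, e.target);
    link(e.target, e.source);
  }

  const distance = new Map<string, number>([[start, 0]]);
  let frontier: string[] = [start];
  for (let hop = 1; hop <= maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!distance.has(neighbor)) {
          distance.set(neighbor, hop);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return distance;
}

// Invented sample data.
const sampleEdges: Edge[] = [
  { source: "Sarah", relation: "works_at", target: "Acme" },
  { source: "Acme", relation: "banks_with", target: "First Bank" },
  { source: "Tom", relation: "works_at", target: "Acme" },
];
const nearSarah = neighborhood(sampleEdges, "Sarah", 2);
```

With two hops from "Sarah", the walk reaches her employer, the employer's bank, and a colleague; capping `maxHops` keeps the result small enough to fit in a prompt.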
It builds a personality around you — and it never forgets
While you talk, a background reflector studies the conversation and appends what it learns to your assistant’s standing persona: how you like to be answered, what you corrected, the running jokes. Tell it once that you hate bullet points, and that’s simply who it is from then on. Nothing falls off the back of the context window — and when a summary isn’t enough, a recall specialist replays the actual words of any past conversation, from last Tuesday or last year.
Context that targets the question
Mantle doesn’t dump your life into the prompt. Each turn it retrieves just what this question needs — the top facts, the right documents down to the exact passages, the graph relationships of the entities involved — ranked by relevance, recency, and salience. A newsletter can never crowd out a real letter. The model sees a small, surgical prompt instead of a haystack — which is why answers are sharp, and why turns cost cents. Every ranking knob has a measured eval number behind it, not a vibe.
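One way to picture ranking by relevance, recency, and salience is a weighted score with an exponential recency decay. This is only a sketch: the weights, the 30-day half-life, and the `Candidate` fields are illustrative assumptions, not Mantle's tuned values or its actual formula.

```typescript
interface Candidate {
  id: string;
  relevance: number; // 0..1, e.g. similarity between the item and the question
  ageDays: number;   // how old the item is
  salience: number;  // 0..1, how much the item matters in general
}

interface Weights {
  relevance: number;
  recency: number;
  salience: number;
  halfLifeDays: number;
}

// Assumed values, for illustration only.
const defaultWeights: Weights = { relevance: 0.6, recency: 0.2, salience: 0.2, halfLifeDays: 30 };

function score(c: Candidate, w: Weights = defaultWeights): number {
  // Recency is 1 for a brand-new item and halves every halfLifeDays.
  const recency = Math.pow(0.5, c.ageDays / w.halfLifeDays);
  return w.relevance * c.relevance + w.recency * recency + w.salience * c.salience;
}

// Keep only the k best candidates; the input array is not mutated.
function topK(candidates: Candidate[], k: number, w: Weights = defaultWeights): Candidate[] {
  return [...candidates].sort((a, b) => score(b, w) - score(a, w)).slice(0, k);
}

// Invented sample data: equally relevant items with very different salience.
const picked = topK(
  [
    { id: "newsletter", relevance: 0.7, ageDays: 1, salience: 0.05 },
    { id: "letter", relevance: 0.7, ageDays: 20, salience: 0.9 },
    { id: "old-note", relevance: 0.3, ageDays: 400, salience: 0.5 },
  ],
  2,
);
```

In this example the letter outranks the fresher newsletter because salience outweighs the small recency gap, and the old note is dropped entirely.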
Engineered to be cheap
Frontier-model quality where it matters (your conversations), economy models for background compression, local embeddings for everything vector. Prompt prefixes are kept byte-stable for provider caching; oversized tool results spill to an addressable store instead of re-billing every turn. Measured on the author’s production instance: a full question-answer turn against the whole brain averages ~$0.09, and a month of real daily use ran under $5 in total LLM spend.
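The "spill to an addressable store" idea can be sketched as follows: a tool result over a size limit is stored once under a key, and only a short preview plus that key goes back into the prompt. `SpillStore`, `packToolResult`, the key format, and both thresholds are hypothetical; Mantle's real store and limits are not described here.

```typescript
// In-memory stand-in for an addressable store (hypothetical).
class SpillStore {
  private blobs = new Map<string, string>();
  private counter = 0;

  put(body: string): string {
    const key = `spill:${++this.counter}`;
    this.blobs.set(key, body);
    return key;
  }

  get(key: string): string | undefined {
    return this.blobs.get(key);
  }
}

interface PackedResult {
  inline: string;          // what actually goes into the prompt
  spillKey: string | null; // where the full result lives, if it was spilled
}

// Small results pass through unchanged; large ones are replaced by a
// preview and a reference, so later turns do not pay for the full text again.
function packToolResult(
  body: string,
  store: SpillStore,
  maxInlineChars = 2000, // assumed limit
  previewChars = 200,    // assumed preview size
): PackedResult {
  if (body.length <= maxInlineChars) {
    return { inline: body, spillKey: null };
  }
  const spillKey = store.put(body);
  const preview = body.slice(0, previewChars);
  return {
    inline: `${preview}\n[truncated: ${body.length} chars, full result at ${spillKey}]`,
    spillKey,
  };
}
```

The model can still ask for the full result by key when it needs it, which keeps the common case cheap without losing anything.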
Agents with jobs, not just a chatbot
Your main assistant has tools to act with — notes, events, email send, image generation, page authoring — and specialists it delegates to: Remy replays past conversations losslessly, Researcher searches the web and cites, Pages and Tables edit documents block-by-block. Proactive heartbeats let it check in on schedules you define. Voice in, voice out.
Nothing happens without a trace
Every ingest, every extraction, every tool call, every model invocation becomes a queryable trace with cost attribution — rendered as a live “what did the brain just do” journey view. A standing integrity audit watches the corpus for drift and says exactly how to heal each finding.
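Cost attribution of this kind can be pictured as trace events that each carry a cost, rolled up per turn. The sketch below is a minimal illustration; the `TraceEvent` fields, the kind names, and the sample figures are invented, not Mantle's trace schema or measured data.

```typescript
type TraceKind = "ingest" | "extraction" | "tool_call" | "model_invocation";

interface TraceEvent {
  turnId: string;
  kind: TraceKind;
  label: string;
  costMicroUsd: number; // integer micro-dollars, to avoid float drift when summing
}

// Sum the cost of one turn, grouped by the kind of work done.
function costByKind(events: TraceEvent[], turnId: string): Map<TraceKind, number> {
  const totals = new Map<TraceKind, number>();
  for (const e of events) {
    if (e.turnId !== turnId) continue;
    totals.set(e.kind, (totals.get(e.kind) ?? 0) + e.costMicroUsd);
  }
  return totals;
}

// Total cost of one turn in dollars.
function totalUsd(events: TraceEvent[], turnId: string): number {
  let micro = 0;
  for (const e of events) {
    if (e.turnId === turnId) micro += e.costMicroUsd;
  }
  return micro / 1e6;
}

// Invented sample events.
const traceEvents: TraceEvent[] = [
  { turnId: "t1", kind: "model_invocation", label: "main answer", costMicroUsd: 50000 },
  { turnId: "t1", kind: "tool_call", label: "search", costMicroUsd: 2000 },
  { turnId: "t1", kind: "tool_call", label: "fetch passage", costMicroUsd: 3000 },
  { turnId: "t1", kind: "extraction", label: "fact extraction", costMicroUsd: 6500 },
  { turnId: "t2", kind: "model_invocation", label: "main answer", costMicroUsd: 40000 },
];
const turnOneCosts = costByKind(traceEvents, "t1");
```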
It knows who you are — because you told it
The learned personality is one half; Life Logs are the other: short first-person entries about who you are, what you do, how you feel, distilled into an always-on identity block every agent reads on every turn. What it observes, it learns; what you declare, it never has to guess.
Measured, not promised
The numbers
- ~$0.09: average per full Q&A turn against the whole brain
- <$5/mo: total LLM spend in a month of real daily use
- $0: embeddings (computed locally; vectors never leave your box)
- 6: layers of memory, all live
- ~30: MCP tools exposing the brain to any client
- 1: Postgres (vectors, graph, FTS, queues, realtime, auth)
- 1,349: automated tests on main
- 42: color themes (you're looking at one)
Cost figures measured on the author’s production instance over 30 days of real use (June 2026) — your models and usage will vary. The whole brain restores from one pg_dump.
Quick start
Run it
Mantle is self-hosted software, not a hosted service. There is nothing to sign up for — clone it and own it.
git clone https://github.com/crossworks-engineering/mantle && cd mantle
pnpm install
cp .env.example apps/web/.env.local   # two generated secrets; see the guide
ollama pull embeddinggemma            # local dev only; production bundles it
pnpm start
Open http://localhost:3000, create your account, and the onboarding wizard takes it from there: model keys, your assistant’s personality, who you are. Voice, vision, and image generation run on your own provider keys — bring the models you trust.
The doorways
One brain, five ways in
The brain is the product — chat is just one doorway into it.