Self-Hosted AI Agent with Epistemic Integrity
Honest. Cited. Verifiable. Run your own AI agent — no vendor lock-in, no cloud dependency, no hallucination left unlabeled.
Every answer carries an epistemic label: [FACT] [OPINION] [UNKNOWN]
Every claim has a traceable source chain — auditable, verifiable, no ghost citations.
Answers pass an epistemological filter before they reach you. Skepticism is a feature.
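A labeled claim with its source chain could be modeled roughly as follows. This is a minimal sketch, not SIDIX's actual data model: the names `Label`, `Claim`, `check_claim`, and `render` are hypothetical, and the rule that an unsourced [FACT] is downgraded to [UNKNOWN] is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Label(Enum):
    FACT = "FACT"
    OPINION = "OPINION"
    UNKNOWN = "UNKNOWN"


@dataclass
class Claim:
    text: str
    label: Label
    # Ordered source chain, nearest source first (hypothetical layout).
    sources: list[str] = field(default_factory=list)


def check_claim(claim: Claim) -> Claim:
    """Downgrade a FACT that has no source chain to UNKNOWN (assumed rule)."""
    if claim.label is Label.FACT and not claim.sources:
        return Claim(claim.text, Label.UNKNOWN, [])
    return claim


def render(claim: Claim) -> str:
    """Format a claim as '[LABEL] text (source chain)'."""
    chain = " <- ".join(claim.sources) if claim.sources else "no source"
    return f"[{claim.label.value}] {claim.text} ({chain})"
```

For example, `render(check_claim(Claim("Some statement.", Label.FACT)))` yields a line tagged [UNKNOWN], because the claim arrived without any source to trace.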
What's inside
Every design decision traces back to one goal: an AI agent you can actually trust because you can verify it.
Search your local corpus without embedding API calls. BM25 + dense retrieval, fully offline.
Reason → Act → Observe → repeat. Grounded answers through iterative tool use, not single-shot guessing.
Bahasa Indonesia · Arabic · English, plus Javanese and Sundanese. Automatic language detection.
Auto-detects language, literacy level, intent, and cultural context. Answers calibrated to the reader, not a generic profile.
MIGHAN (Creative) · TOARD (Planning) · FACH (Academic) · HAYFAR (Technical) · INAN (Simple). One brain, five voices.
Guards against prompt injection, toxic output, and PII leakage. Tool execution is whitelisted: OFF by default, explicit permission required.
Qwen2.5-7B + QLoRA (r=16, 40M trainable params). Trained on a 713-pair trilingual Q&A dataset in 84 minutes on a Kaggle T4.
No OpenAI, no Anthropic, no Gemini API by default. Your data never leaves your machine without your explicit consent.
AI/ML, Coding, Web, Mobile, Game Dev, Linguistics, Visual AI — growing corpus, community-extensible.
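To illustrate the offline lexical half of the retrieval feature above, here is a minimal sketch of standard Okapi BM25 scoring using only the Python standard library. It is not SIDIX's retriever: the function names, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for the example, and the dense-retrieval half is not shown.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive lowercase whitespace tokenizer (assumption; real corpora need more).
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df: Counter[str] = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Calling `bm25_scores("cat", ["the cat sat", "dogs bark loudly", "cat and cat again"])` gives a zero score for the second document and positive scores for the other two. Nothing here calls an external API, which is the point of the offline design.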
Get running in 5 minutes
Python 3.11+ · Node 18+ · 4GB RAM minimum
1 · Clone & Install
2 · Build Index
3 · Run
4 · Ask
Completed ✓
RAG + BM25 Retriever
Local corpus search without API dependency
ReAct Agent Loop
Multi-step reasoning with tool use
Islamic Epistemology Engine
Maqasid 5-axis + Sanad validator + Constitutional check
Fine-tune v1 (Qwen2.5-7B QLoRA)
713 Q&A trilingual dataset on Kaggle T4
50+ Research Notes Corpus
AI/ML, Coding, Web, Mobile, Game, Linguistics, Visual AI
Hafidz Ledger (Merkle tamper-evident log)
Append-only verifiable corpus log
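The tamper-evident property of a Merkle-based log can be sketched in a few lines. This is an illustrative example under stated assumptions, not the Hafidz Ledger implementation: the function `merkle_root`, the SHA-256 hash, the leaf and node prefixes, and the choice to duplicate the last node on odd-sized levels are all hypothetical simplifications.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Compute a Merkle root over the log entries, in order."""
    if not entries:
        return _h(b"")
    # Prefix leaves and inner nodes differently so they cannot be confused.
    level = [_h(b"\x00" + e) for e in entries]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Simplification: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        level = [
            _h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


log = [b"note-1", b"note-2", b"note-3"]
root = merkle_root(log)

# Changing any past entry changes the root, so tampering is detectable
# by anyone who holds a previously published root.
tampered = [b"note-1", b"note-2 (edited)", b"note-3"]
assert merkle_root(tampered) != root
```

Publishing the root after each append lets a reader verify later that earlier entries were not rewritten.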
In Progress
GPU Inference Integration
Qwen2.5-7B + PeftModel local inference
Docker Deployment
VPS-ready container, sidixlab.com live
Planned
Streaming SSE fully wired to epistemology engine
Fine-tune v2 (expanded corpus)
Vision AI (LLaVA / Qwen-VL caption pipeline)
Distributed Hafidz sync (P2P corpus nodes)
Mobile app
Open source, built together
Whether you write code, research, or documentation — every contribution moves the mission forward.
Add topics to brain/public/research_notes/. Any domain welcome.
Help expand corpus coverage in more languages. Indonesian, English, Arabic, and beyond.
See translation tasks →
Add agent tools to agent_tools.py. Whitelisted, sandboxed by default.
Improve or extend docs. Especially setup guides, architecture notes, and tutorials.
See doc tasks →
Read our contribution guide, check open issues, and open your first PR.
Stay in the loop
Bug Reports
Report issues & track fixes
Feature Requests
Suggest new capabilities
Discussions
Ideas, Q&A, community
Backlog
Full project board
Feedback, questions, partnership ideas — we read every message.
Community
Updates, demos, and behind-the-scenes on the build.
Newsletter
Release notes, new features, and research updates. No spam. Unsubscribe anytime.
Or subscribe via RSS feed for changelog updates.
Support
SIDIX is MIT-licensed and solo-built. If it's useful to you, a sponsorship helps cover server costs and development time.