Enterprise RAG + Agent Runtime
Superagent
A private AI layer for teams that need precise answers from messy documents, product data, tickets, databases, and internal workflows.
365UI builds private AI systems for serious enterprise workflows
Superagent is the flagship 365UI platform: a self-hostable RAG and agent system, already deployed and operated inside an S&P 500 enterprise, built for teams that need accurate answers, auditable workflows, and private AI operations over real business data.
Answer path
Ingest → Retrieve → Rerank → Act
Evidence-grounded answers with traceable retrieval and replaceable infrastructure.
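The four-stage answer path above can be sketched as a minimal pipeline. This is an illustrative toy, not the Superagent implementation: every function name is a placeholder, retrieval is naive term overlap standing in for vector search, and the reranker is a no-op where a cross-encoder would sit.

```python
# Minimal sketch of the Ingest -> Retrieve -> Rerank -> Act answer path.
# All names are illustrative placeholders, not the actual Superagent API.

def ingest(documents):
    """Split source documents into fixed-size chunks ready for indexing."""
    chunks = []
    for doc in documents:
        for i in range(0, len(doc["text"]), 500):
            chunks.append({"source": doc["id"], "text": doc["text"][i:i + 500]})
    return chunks

def retrieve(query, chunks, k=5):
    """Toy lexical retrieval: score chunks by query-term overlap."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(c["text"].lower().split())), c) for c in chunks]
    return [c for s, c in sorted(scored, key=lambda x: -x[0])[:k] if s > 0]

def rerank(query, candidates):
    """In production a cross-encoder reranker would reorder candidates."""
    return candidates  # placeholder: keep retrieval order

def act(query, evidence):
    """Hand grounded evidence to an LLM or tool; here, just cite sources."""
    return {"query": query, "evidence": [c["source"] for c in evidence]}

docs = [{"id": "handbook.pdf", "text": "Refund processing takes 14 days."}]
query = "refund processing time"
answer = act(query, rerank(query, retrieve(query, ingest(docs))))
# every answer carries the sources it was grounded in
```

The point of the shape is traceability: each stage hands the next an inspectable artifact, which is what makes retrieval debuggable and answers auditable.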
Platform
The public story stays simple: private AI infrastructure for teams. The underlying work spans RAG, memory, inbox automation, realtime voice, agent tools, and hardware operations.
Enterprise RAG + Agent Runtime
A private AI layer for teams that need precise answers from messy documents, product data, tickets, databases, and internal workflows.
Organizational Memory
An AI-era organizational memory layer where humans and agents co-write knowledge, Markdown + Git is the source of truth, and validated execution memory evolves into durable know-how.
Executive Workflow Automation
A private assistant for high-volume inboxes: daily digest, smart triage, attachment parsing, semantic email search, and draft replies.
Realtime Personal Interface
A hands-free voice layer for reminders, local device signals, and realtime conversation with tool execution.
Autonomous Workflow Automation
Agents that turn repetitive expert workflows into reviewed automation: code-fix runs, recruiting shortlists, report generation, and tool-backed operations.
Model Evaluation + GPU Inference
Benchmarking and deployment patterns for Llama, DeepSeek, Qwen, GLM, Kimi, embedding, and reranker models across modern GPU clusters.
AI Collaboration Governance
A documentation governance standard for CLAUDE.md, AGENTS.md, .cursorrules, lesson_learned, and ADRs so human and AI teams can collaborate without context decay.
Proof of Work
365UI is built from systems that have been deployed and operated in production, including S&P 500 enterprise copilot infrastructure, high-scale retrieval, autonomous process agents, model-serving infrastructure, messaging assistants, and edge voice products.
Private enterprise copilot
Architected a fully private, self-hosted AI assistant from zero to production. The platform handles web and SharePoint-style crawling, complex PDF/Office parsing, tenant-specific configuration, and zero-code deployment across industries.
Search, tools, and reasoning
Built hybrid retrieval over 1M+ vectors and 778K+ keyword documents with RRF, HyDE, iterative multi-hop retrieval, sibling expansion, recency reranking, and ReAct orchestration across databases, web search, and sandboxed Python.
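Of the techniques listed above, Reciprocal Rank Fusion is compact enough to sketch in full. This is the standard textbook form of RRF, not code from the platform; the constant k=60 is the commonly used default from the original RRF formulation.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Merge ranked result lists via Reciprocal Rank Fusion.

    Each ranking is a list of doc IDs, best first. A document's fused
    score is the sum of 1 / (k + rank) over every list it appears in,
    so items ranked well by multiple retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # semantic search order
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # BM25 order
fused = rrf_fuse([vector_hits, keyword_hits])
# doc_b and doc_a appear in both lists, so they outrank doc_c and doc_d
```

Because RRF uses only ranks, not raw scores, it fuses vector and keyword results without any score calibration between the two engines.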
Production-grade optimization
Improved enterprise benchmark quality while making the system fast enough for real-time assistance: end-to-end latency dropped from 137 seconds to 6.6 seconds, and import throughput increased 260x.
Code and recruiting automation
Built agents for high-volume operational workflows, including an AI code-fix agent that resolved 12,000+ static analysis issues and an AI recruiter that derives screening criteria from job descriptions and generates one-click shortlists.
Model selection and serving
Benchmarked 40+ foundation, embedding, and reranker models, then deployed optimized inference on H100/H200/B300-class GPU clusters using vLLM, SGLang, and llama.cpp.
Human + agent collaboration infrastructure
Designed and open-sourced a governance standard for AI collaboration docs, separating CLAUDE.md, AGENTS.md, .cursorrules, lesson_learned, and ADRs to prevent rule sprawl, duplication, and context pollution.
Personal production product
Launched an AI assistant inside a closed messaging ecosystem with multimodal chat, intent routing, web search, finance data, deep research, group analytics, message relay, moderation bots, and a REST API gateway.
Ambient intelligence prototype
Deployed a 24/7 Raspberry Pi 5 assistant with full-duplex realtime voice, persistent memory, calendar and reminder tools, quiet-hours scheduling, and camera-based physical context awareness.
Superagent Deep Dive
Superagent turns fragmented enterprise knowledge into reliable AI workflows, with real deployment experience inside an S&P 500 enterprise. It is designed for domains where answers require multiple evidence types: documents, web pages, product catalogs, support tickets, structured databases, and live operational tools.
The key difference is control. Each layer is replaceable: crawler, parser, chunker, embedding model, vector database, keyword engine, reranker, LLM, tools, and UI. That makes the platform useful for teams that need private deployment and measurable answer quality instead of opaque SaaS behavior.
Request path
Crawl websites, parse PDFs and Office files, normalize HTML tables, enrich chunks with context, and keep indexes fresh as source data changes.
Combine semantic vector search, keyword search, RRF fusion, reranking, metadata filters, and parent/sibling expansion so answers come from the right evidence.
Expose RAG and tool-using agents through an OpenAI-compatible API, with task-specific models, structured tool calls, and a single chat/completions entry point.
Inject live product, inventory, ticket, CRM, SQL, or operational data into the answer path instead of relying only on static documents.
Trace every retrieval and generation step, compare answer quality, inspect failed recalls, and improve chunking, ranking, prompts, and tools over time.
Run inside a customer-controlled environment with replaceable models, vector stores, search engines, and data connectors.
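Because the runtime exposes an OpenAI-compatible chat/completions entry point, any OpenAI-style client can talk to it. The sketch below builds such a request; the base URL, model alias, and tool schema are hypothetical placeholders, not documented Superagent identifiers.

```python
import json

# Sketch of a request to a self-hosted OpenAI-compatible endpoint.
# BASE_URL, the model alias, and the tool definition are assumptions
# for illustration, not part of the actual Superagent API surface.

BASE_URL = "http://localhost:8000/v1"  # assumed self-hosted endpoint

payload = {
    "model": "superagent-rag",  # hypothetical task-specific model alias
    "messages": [
        {"role": "user", "content": "What is our refund policy for EU orders?"}
    ],
    # Structured tool calls are how live operational data enters the
    # answer path, e.g. a read-only lookup against the orders database.
    "tools": [{
        "type": "function",
        "function": {
            "name": "query_orders_db",
            "description": "Run a read-only SQL query against the orders DB.",
            "parameters": {
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        },
    }],
}

body = json.dumps(payload)
# POST body to f"{BASE_URL}/chat/completions" with any HTTP client.
```

Keeping the wire format OpenAI-compatible is what makes the LLM layer replaceable: the same client code works whether the model behind the endpoint is self-hosted or swapped out.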
OrgMem Deep Dive
OrgMem puts organizational knowledge, decision history, and agent execution memory into one auditable system. It is both a knowledge base and a long-term memory layer: humans keep governance, while agents gain structured read/write access and continuous learning.
Memory evolution path
Episode → Pattern → Strategy → Playbook
Failures are remembered, successes are reinforced, and repeatedly verified experience becomes durable organizational know-how.
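The promotion path above can be sketched as a simple state machine: experience moves up a tier only after repeated verification. The tier names mirror the path; the threshold and record schema are illustrative assumptions, not OrgMem's actual data model.

```python
# Toy sketch of the Episode -> Pattern -> Strategy -> Playbook promotion
# path. PROMOTE_AFTER and the record fields are illustrative assumptions.

TIERS = ["episode", "pattern", "strategy", "playbook"]
PROMOTE_AFTER = 3  # verifications required before moving up a tier

def record_verification(memory):
    """Count a successful verification; promote when the bar is met."""
    memory["verified"] += 1
    tier_idx = TIERS.index(memory["tier"])
    if memory["verified"] >= PROMOTE_AFTER and tier_idx < len(TIERS) - 1:
        memory["tier"] = TIERS[tier_idx + 1]
        memory["verified"] = 0  # reset the bar for the next tier
    return memory

m = {"tier": "episode", "verified": 0, "note": "retry flaky deploys twice"}
for _ in range(3):
    m = record_verification(m)
# after three verified successes, the episode is promoted to a pattern
```

The design choice this illustrates: promotion is earned by repetition, so one-off successes never pollute the durable tiers.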
Knowledge is not locked in a database or SaaS dashboard: it lives as Markdown under Git, so humans and agents alike can read, edit, diff, blame, branch, and review it.
Agents can create and update knowledge, but formal organizational memory goes through staging and human review before commit, reducing hallucinated memory and poisoning risk.
Qdrant vectors, Elasticsearch BM25, RRF fusion, reranking, and entity boost combine semantic recall with exact keyword precision for real organizational knowledge.
OrgMem stores episodes, failure patterns, strategies, preferences, summaries, and promoted know-how so agents remember execution traces, root causes, and effective tactics.
Session-local hot memory is zero-latency, project strategies preload at session start, and cross-project cold knowledge is retrieved only when needed to reduce context pollution.
Confidence, access counts, supersession, and decay fight memory rot. Repeatedly verified patterns can be promoted into durable playbooks and lessons.
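One way to combine confidence, access counts, and decay into a single freshness signal is sketched below. The formula and the 90-day half-life are assumptions chosen for illustration, not OrgMem's implementation.

```python
import math

# Illustrative score in the spirit of the memory-rot controls above:
# trusted, frequently accessed, recent memories score high; stale ones
# sink and become candidates for supersession. Formula is an assumption.

def memory_score(confidence, access_count, age_days, half_life_days=90):
    """Higher for trusted, frequently used, recent memories."""
    recency = 0.5 ** (age_days / half_life_days)   # exponential decay
    usage = math.log1p(access_count)               # diminishing returns
    return confidence * (1 + usage) * recency

fresh = memory_score(confidence=0.9, access_count=10, age_days=7)
stale = memory_score(confidence=0.9, access_count=10, age_days=365)
# identical confidence and usage, but a year of decay ranks stale far lower
```

A multiplicative form like this means a memory with zero confidence scores zero no matter how often it is accessed, which is the behavior a poisoning-resistant store wants.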
Capability Map
Private knowledge ingestion from web pages, PDFs, Office files, email, databases, and APIs
Hybrid retrieval with vector search, BM25, RRF fusion, reranking, and structured data injection
Agent orchestration with tool calls, long-running workflows, memory, and observability
Self-hosted deployment patterns for teams that cannot send proprietary data to generic SaaS
Operational playbooks for evaluation, tracing, rollback, prompt/version control, and data refresh
Research and production track across deep research, code agents, memory systems, AI collaboration governance, recruiting automation, WeChat assistants, edge voice, and GPU inference
Commercial Offer
Turn one messy knowledge domain into a private AI assistant with measurable answer quality.
Add multiple data sources, role-based workflows, review queues, and production observability.
Use the Superagent runtime behind your own product, portal, support flow, or internal ops stack.
Bring one workflow, one dataset, and one success metric. We will turn it into a working pilot before expanding the platform.