Under the Hood

Architecture

The operations graph, data model, and retrieval pipeline that power Workipedia. This is how raw calls, messages, and emails become structured business intelligence.

System Layers

Four layers, one operations graph.

Workipedia operates as a layered system. Raw sources flow in at the bottom, intelligence processes them in the middle, memory stores what matters, and surfaces deliver context where employees need it.

Source Layer
>Calls (speech-to-text)
>Email Inboxes
>SMS / Messages
>Calendar Events
>Document Upload + OCR
>Attachment Processing
Intelligence Layer
>Fact Extraction
>Identity Resolution
>Live Signal Detection
>Rolling Transcript Windows
>Sentiment Analysis
>Intent Classification
Memory Layer
>MemoryChunks (Vector)
>Facts (Structured)
>FieldValues (Schema)
>Procedures (Inferred)
>Customer Profiles
>Interaction History
Surface Layer
>Cockpit (Live Calls)
>Inbox (Messages)
>Email Sidebar
>AI Draft Assistance
>Context Cards
>Feedback Prompts
Data Model

Seven core entities.

The data model is designed around traceability and governance. Every piece of knowledge traces back to a source artifact, and every schema change goes through a governed proposal process.

Facts

Atomic units of knowledge — a customer preference, a gate code, a contact method. Every fact traces back to its source artifact.

FieldValues

Structured data fields inferred from expert behavior. The schema of how the business actually operates.

FactDefinitions

The meta-schema — what kinds of facts the system can learn. Grows as the business reveals new patterns.
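
One way to picture the meta-schema is a registry of allowed fact kinds that candidate values are checked against. The sketch below is illustrative only: the class names, the `key`/`value_type` fields, and the example definitions are assumptions, not Workipedia's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FactDefinition:
    """One kind of fact the system may learn (hypothetical shape)."""
    key: str            # e.g. "gate_code"
    value_type: type    # Python type a value must have
    description: str


@dataclass
class FactDefinitionRegistry:
    """In-memory stand-in for the meta-schema."""
    definitions: dict[str, FactDefinition] = field(default_factory=dict)

    def register(self, definition: FactDefinition) -> None:
        # New definitions extend the meta-schema; duplicate keys are rejected.
        if definition.key in self.definitions:
            raise ValueError(f"definition already exists: {definition.key}")
        self.definitions[definition.key] = definition

    def validate(self, key: str, value: object) -> bool:
        # A value passes only if its kind is known and its type matches.
        definition = self.definitions.get(key)
        return definition is not None and isinstance(value, definition.value_type)


registry = FactDefinitionRegistry()
registry.register(FactDefinition("gate_code", str, "Access code for a customer site"))
registry.register(FactDefinition("preferred_contact", str, "Call, text, or email"))
```

Under these assumptions, `registry.validate("gate_code", "4821")` passes, while a value for an unregistered kind is refused until a matching definition exists.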

MemoryChunks

Vectorized memory fragments for retrieval. Summaries, transcripts, and contextual snapshots.
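
Vector retrieval over memory fragments can be sketched as a nearest-neighbour lookup by cosine similarity. The three-dimensional vectors, the `MemoryChunk` fields, and the `top_k` helper below are toy assumptions; a real system would use embeddings from a model and an indexed vector store.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryChunk:
    chunk_id: str
    text: str
    embedding: tuple[float, ...]  # toy vector; hypothetical field name


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_embedding, chunks, k=2):
    # Rank every chunk by similarity to the query and keep the best k.
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk.embedding),
        reverse=True,
    )
    return ranked[:k]


chunks = [
    MemoryChunk("c1", "Call summary: customer asked to reschedule", (1.0, 0.0, 0.0)),
    MemoryChunk("c2", "Transcript: gate code discussion", (0.0, 1.0, 0.0)),
    MemoryChunk("c3", "Snapshot: open invoice status", (0.0, 0.0, 1.0)),
]
nearest = top_k((0.9, 0.1, 0.0), chunks, k=2)
```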

StewardProposals

Governed change requests to the schema. Every proposed change is reviewed before it goes live.

SignalEvents

Real-time detection events from live calls — intent shifts, sentiment changes, outcome predictions.
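
As a rough illustration of how a signal event might be raised from a rolling transcript window, the sketch below counts negative cues across the last few utterances and emits an event when the window first turns negative. The keyword lexicon, the threshold of two cues, and the `SignalEvent` fields are placeholders standing in for a real sentiment model.

```python
from collections import deque
from dataclasses import dataclass

NEGATIVE_WORDS = {"cancel", "frustrated", "refund"}  # placeholder lexicon


@dataclass(frozen=True)
class SignalEvent:
    kind: str             # e.g. "sentiment_change"
    utterance_index: int  # where in the call the shift was detected
    detail: str


def detect_sentiment_changes(utterances, window_size=3, threshold=2):
    window = deque(maxlen=window_size)  # oldest utterance drops out automatically
    events = []
    was_negative = False
    for index, utterance in enumerate(utterances):
        window.append(utterance.lower())
        hits = sum(
            word.strip(".,!?") in NEGATIVE_WORDS
            for line in window
            for word in line.split()
        )
        is_negative = hits >= threshold
        # Emit only on the transition into a negative window, not on every line.
        if is_negative and not was_negative:
            events.append(SignalEvent("sentiment_change", index, f"{hits} negative cues in window"))
        was_negative = is_negative
    return events


events = detect_sentiment_changes([
    "Hi, I'm calling about my appointment.",
    "I'm pretty frustrated with the delay.",
    "Honestly I might cancel.",
    "Can you offer a refund?",
    "Okay, thanks for sorting that out.",
    "See you Tuesday.",
])
```

With this toy input, a single event fires at the third utterance (index 2), when the window first holds two negative cues.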

RetrievalTraces

Full audit trail of what the AI saw and why it suggested something. Traceability is not optional.

Retrieval Pipeline

From raw source to schema of work.

The retrieval pipeline prioritizes confirmed facts first, then active work state, then recent communication history, then broader memory. Every AI-assisted response can answer: what did the model see, and why did it suggest this?
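
The priority order described above can be sketched as a tiered sort that also records a trace of what was selected and why. The tier names mirror the text; the `ContextItem` and `RetrievalTrace` shapes, the `budget` parameter, and the `retrieve` function are illustrative assumptions rather than the actual implementation.

```python
from dataclasses import dataclass, field

# Tier order taken from the text; a lower number means higher priority.
TIER_PRIORITY = {
    "confirmed_fact": 0,
    "active_work_state": 1,
    "recent_communication": 2,
    "broader_memory": 3,
}


@dataclass(frozen=True)
class ContextItem:
    item_id: str
    tier: str
    text: str


@dataclass
class RetrievalTrace:
    query: str
    seen: list = field(default_factory=list)  # (item_id, reason) pairs


def retrieve(query, candidates, budget=3):
    # sorted() is stable, so items within a tier keep their incoming order.
    ordered = sorted(candidates, key=lambda item: TIER_PRIORITY[item.tier])
    selected = ordered[:budget]
    trace = RetrievalTrace(query=query)
    for item in selected:
        trace.seen.append((item.item_id, f"tier={item.tier}"))
    return selected, trace


candidates = [
    ContextItem("m1", "broader_memory", "Customer mentioned a dog last spring"),
    ContextItem("e1", "recent_communication", "Email yesterday asking to reschedule"),
    ContextItem("f1", "confirmed_fact", "Gate code on file"),
    ContextItem("w1", "active_work_state", "Open job scheduled for Friday"),
]
selected, trace = retrieve("reschedule request", candidates, budget=3)
```

Here the broader-memory item is the one left out when the budget is three, and `trace.seen` lists each selected item with the tier that justified it.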

workipedia ~ pipeline
01
Raw Source Ingestion

Calls, emails, messages, PDFs, scans, attachments: the messy reality of small-business communication.

02
Extraction

Multi-modal extraction pulls facts, entities, preferences, and patterns from raw source material.

03
Evidence-Backed Facts

Every extracted fact links back to its source artifact. No fact exists without provenance.
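
The provenance rule can be enforced at construction time, so a fact without a source cannot be created at all. This is a minimal sketch; `SourceArtifact`, its `locator` field, and the example identifiers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceArtifact:
    artifact_id: str
    kind: str      # "call", "email", "pdf", ...
    locator: str   # where the evidence sits, e.g. a timestamp or page


@dataclass(frozen=True)
class Fact:
    key: str
    value: str
    source: SourceArtifact

    def __post_init__(self):
        # Refuse to build a fact that does not point at a source artifact.
        if not isinstance(self.source, SourceArtifact):
            raise ValueError("a fact requires a source artifact")


call = SourceArtifact("call-001", "call", "00:03:12")
gate_code = Fact("gate_code", "4821", call)
```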

04
Human Confirmation

Expert employees confirm, reject, or correct facts through lightweight prompts at natural moments.
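
The three feedback actions can be modelled as transitions on a candidate fact. In this sketch the `Status` values, the `apply_feedback` function, and the choice to treat a correction as a confirmed fact with a new value are all assumptions.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass(frozen=True)
class CandidateFact:
    key: str
    value: str
    status: Status = Status.PENDING


def apply_feedback(fact, action, corrected_value=None):
    # Returns a new fact; the original record is left untouched.
    if action == "confirm":
        return replace(fact, status=Status.CONFIRMED)
    if action == "reject":
        return replace(fact, status=Status.REJECTED)
    if action == "correct":
        if corrected_value is None:
            raise ValueError("a correction needs a new value")
        # Assumption: an expert's correction counts as confirmation.
        return replace(fact, value=corrected_value, status=Status.CONFIRMED)
    raise ValueError(f"unknown action: {action}")


pending = CandidateFact("gate_code", "4812")
corrected = apply_feedback(pending, "correct", corrected_value="4821")
```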

05
Nightly Steward

Batch synthesis reviews the day's signals, proposes schema changes, and updates memory.
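
A nightly batch pass might look roughly like the following: count how often unknown field names appeared in the day's signals and propose the ones with enough evidence. The `min_evidence` threshold, the `SchemaProposal` shape, and the field names are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaProposal:
    field_name: str
    evidence_count: int
    status: str = "pending_review"  # proposed only, never applied here


def nightly_steward(observed_fields, known_fields, min_evidence=3):
    # Count only field names that the current schema does not already have.
    counts = Counter(name for name in observed_fields if name not in known_fields)
    return [
        SchemaProposal(name, count)
        for name, count in sorted(counts.items())
        if count >= min_evidence
    ]


observed_today = [
    "gate_code", "dog_on_site", "dog_on_site", "parking_notes",
    "dog_on_site", "preferred_contact",
]
proposals = nightly_steward(observed_today, known_fields={"gate_code", "preferred_contact"})
```

In this example only `dog_on_site` reaches the threshold, so it is the single proposal handed to review.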

06
Schema of Work

The living, governed operating model of the business — procedures, fields, facts, and memory.

// Our advantage is not that other systems can send us clean facts.
// Our advantage is that we can find facts where other systems still
// see only emails, calls, PDFs, scans, and attachments.

Privacy + Governance

Layered redaction. Governed proposals. Full traceability.

Privacy Redaction
>Schema-aware masking
>Privacy filter span detection
>Deterministic fallbacks
>Role-aware display
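
The layers listed above could fit together along these lines. This is a sketch under stated assumptions: the sensitive-field table, the role names, the phone-number regex standing in for the deterministic fallback, and the optional `detector` callable standing in for the privacy filter are all hypothetical.

```python
import re

# Hypothetical schema flags: sensitive fields and the roles allowed to see them.
SENSITIVE_FIELDS = {"gate_code": {"dispatcher", "technician"}}

# Deterministic rule that always runs: US-style phone numbers.
PHONE_PATTERN = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def mask_field(name, value, role):
    # Schema-aware, role-aware masking of a structured field.
    allowed_roles = SENSITIVE_FIELDS.get(name)
    if allowed_roles is not None and role not in allowed_roles:
        return "[REDACTED]"
    return value


def redact_text(text, detector=None):
    spans = []
    if detector is not None:
        try:
            spans = list(detector(text))  # (start, end) character offsets
        except Exception:
            spans = []  # a failing detector must not block the deterministic pass
    # Replace from the end so earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + "[REDACTED]" + text[end:]
    return PHONE_PATTERN.sub("[PHONE]", text)


masked = redact_text("Call me at 555-123-4567 after 5")
```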
Steward Governance
>All schema changes proposed, never auto-applied
>Human review before any field goes live
>Rollback capability on every change
>Audit trail of who approved what
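
These governance rules can be sketched as a small state holder in which fields go live only through an explicit approval, every approval can be undone, and each step is logged with its actor. The `GovernedSchema` class, its method names, and the identifiers in the example are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GovernedSchema:
    live_fields: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)    # proposal_id -> field name
    applied: dict = field(default_factory=dict)    # proposal_id -> field name
    audit_log: list = field(default_factory=list)  # (action, proposal_id, actor)

    def propose(self, proposal_id, field_name, actor):
        # Proposing never changes the live schema.
        self.pending[proposal_id] = field_name
        self.audit_log.append(("proposed", proposal_id, actor))

    def approve(self, proposal_id, reviewer):
        field_name = self.pending.pop(proposal_id)  # KeyError if never proposed
        self.applied[proposal_id] = field_name
        self.live_fields.add(field_name)
        self.audit_log.append(("approved", proposal_id, reviewer))

    def rollback(self, proposal_id, actor):
        field_name = self.applied.pop(proposal_id)  # KeyError if not applied
        self.live_fields.discard(field_name)
        self.audit_log.append(("rolled_back", proposal_id, actor))


schema = GovernedSchema()
schema.propose("p-1", "dog_on_site", actor="nightly_steward")
live_before_review = set(schema.live_fields)  # still empty at this point
schema.approve("p-1", reviewer="office_manager")
```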
Retrieval Traces
>Every AI response traces to source
>What did the model see?
>Why did it suggest this?
>Full transparency, not a black box