Architecture

A detailed look at how MachinaOs works under the hood. For contributors, system designers, and the curious. For a navigable tour with hyperlinks, see the DeepWiki page.

System Overview

MachinaOs is three loosely-coupled tiers talking over a single persistent WebSocket connection:
+------------------------------------------------------------------------+
|  Frontend tier (client/)                                               |
|    React + TypeScript + React Flow + Zustand + Ant Design              |
|    Dracula-themed canvas, parameter panel, credentials modal           |
+------------------------------------------------------------------------+
                             |
                 WebSocket   |   /ws/status  (single long-lived connection)
                             v
+------------------------------------------------------------------------+
|  Backend tier (server/)                                                |
|    FastAPI + 89 WebSocket handlers + dependency injection container    |
|    WorkflowService (facade) -> NodeExecutor -> per-node handlers       |
|    ParameterResolver -> template {{node.field}} substitution           |
|    AuthService -> encrypted credentials (Fernet + PBKDF2)              |
+------------------------------------------------------------------------+
                             |
                             v
+------------------------------------------------------------------------+
|  Execution tier                                                        |
|    Temporal workers (distributed)  or  local asyncio decide loop       |
|    Redis (cache, locks, streams)   SQLite (machina.db, credentials.db) |
|    Node.js server (JS/TS code exec) WhatsApp RPC (port 9400)           |
|    Android relay WebSocket         Temporal server (ports 7233/8080)   |
+------------------------------------------------------------------------+
                             |
                             v
   External services: OpenAI, Anthropic, Gemini, OpenRouter, xAI,
   DeepSeek, Kimi, Mistral, Groq, Cerebras, Google Workspace, WhatsApp,
   Telegram, Twitter/X, Brave, Serper, Perplexity, Apify, residential
   proxies, webhooks.

96 Workflow Nodes

AI, agents, social, Android, documents, Google Workspace, code, proxies, utilities

10 LLM Providers

OpenAI, Anthropic, Gemini, OpenRouter, xAI, DeepSeek, Kimi, Mistral, Groq, Cerebras

15 Specialized Agents

Android, Coding, Web, Task, Social, Travel, Tool, Productivity, Payments, Consumer, Autonomous, Orchestrator, AI Employee, RLM, Claude Code

89 WebSocket Handlers

Push-based updates replace REST polling

49 Built-in Skills

Across 10 categories, DB-backed with SKILL.md defaults

Three Execution Modes

Temporal distributed, Redis parallel, sequential fallback

Execution Engine

Conductor’s Decide Pattern

Workflow orchestration is a single function with fork/join parallelism:
_workflow_decide(ctx):
  1. Find ready nodes (all dependencies satisfied)
  2. asyncio.gather() the ready layer -> run in parallel
  3. Checkpoint state to Redis/SQLite
  4. Recurse until every node terminal (completed / skipped / failed)
Each workflow run has its own isolated ExecutionContext. No shared global state between concurrent runs. Decide loops serialize per execution via Redis SETNX distributed locks.
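The decide loop above can be sketched as a few lines of asyncio. This is a minimal illustration, not the MachinaOs implementation: `deps`, `run_node`, and the string statuses are stand-in names, and the real loop also checkpoints state to Redis/SQLite and serializes via Redis locks.

```python
import asyncio

async def workflow_decide(nodes, deps, run_node, state):
    """Repeatedly run every node whose dependencies are satisfied, in parallel."""
    while True:
        # 1. find ready nodes: not yet run, all dependencies completed
        ready = [n for n in nodes
                 if state.get(n) is None
                 and all(state.get(d) == "completed" for d in deps.get(n, []))]
        if not ready:
            return state  # every reachable node is terminal
        # 2. fork/join: run the whole ready layer concurrently
        results = await asyncio.gather(*(run_node(n) for n in ready))
        for node, ok in zip(ready, results):
            state[node] = "completed" if ok else "failed"
        # 3. (real engine checkpoints `state` here, then recurses)
```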

Three Execution Modes

workflow.execute(workflow_id, workflow_data)
                    |
                    v
    TEMPORAL_ENABLED and server reachable?
                    |
         yes -> _execute_temporal() -- per-node activities,
                                        retries, horizontal scaling
                                        (primary production mode)
          no -> Redis available?
                    |
             yes -> _execute_parallel() -- decide loop + Kahn layers
                                            + Prefect-style input hash cache
                                            + DLQ + heartbeat recovery
              no -> _execute_sequential() -- topological walk
                                             (fallback for minimal env)

Layer Computation via Kahn’s Algorithm

Before execution, the DAG is sorted into layers. Layer 0 is the set of nodes with no dependencies; each subsequent layer depends only on earlier layers. Parallel execution runs each layer with asyncio.gather():
Layer 0: [start, cronScheduler]
Layer 1: [httpRequest, whatsappReceive]
Layer 2: [aiAgent]
Layer 3: [whatsappSend, console]
Toolkit sub-nodes (e.g., Android service nodes connected to androidTool) are detected and excluded from execution layers — they run only when the parent toolkit invokes them via tool calling.
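A layered variant of Kahn's algorithm that produces exactly this grouping can be sketched as follows (illustrative code, not the engine's implementation):

```python
def kahn_layers(nodes, edges):
    """Group a DAG into execution layers: layer 0 has no dependencies,
    each later layer depends only on earlier ones."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    layers = []
    current = [n for n in nodes if indegree[n] == 0]
    while current:
        layers.append(current)
        nxt = []
        for n in current:
            for c in children[n]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    nxt.append(c)
        current = nxt
    if sum(len(layer) for layer in layers) != len(nodes):
        raise ValueError("cycle detected")  # Kahn's algorithm doubles as a cycle check
    return layers
```

Running it on the example DAG above reproduces the four layers shown.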

Result Caching, Recovery, and DLQ

  • Prefect-style caching: every node result is stored in Redis/SQLite keyed by hash_inputs(inputs). Re-running an identical node returns the cached result with status TaskStatus.CACHED.
  • Heartbeat recovery: RecoverySweeper scans executions:active every 60s; nodes with stale heartbeats (> 5 min) are marked stuck and recovered on next startup.
  • Dead Letter Queue: failed nodes (after retry exhaustion) are quarantined with full input snapshot. Inspect, replay, or purge via the get_dlq_* / replay_dlq_entry / purge_dlq WebSocket handlers.
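The input-hash cache key can be sketched with a canonical JSON digest. This mirrors the hash_inputs idea described above; the exact serialization and hash the engine uses are assumptions here:

```python
import hashlib
import json

def hash_inputs(inputs: dict) -> str:
    """Deterministic cache key for a node's resolved inputs.
    Sorting keys makes logically identical inputs hash identically."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Because the key depends only on resolved inputs, re-running an unchanged node hits the cache regardless of when or where it last ran.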

Edge Conditions

Edges carry optional conditions for runtime branching with 20+ operators (eq, neq, gt, lt, contains, exists, matches, in, …). Unmatched branches are marked TaskStatus.SKIPPED.
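Condition evaluation amounts to a small operator dispatch table. The operator names come from the list above; the table and field names below are illustrative:

```python
import re

OPERATORS = {
    "eq":       lambda v, t: v == t,
    "neq":      lambda v, t: v != t,
    "gt":       lambda v, t: v > t,
    "lt":       lambda v, t: v < t,
    "contains": lambda v, t: t in v,
    "exists":   lambda v, t: v is not None,
    "matches":  lambda v, t: re.search(t, str(v)) is not None,
    "in":       lambda v, t: v in t,
}

def edge_matches(condition: dict, source_output: dict) -> bool:
    """Evaluate one edge condition against the source node's output."""
    value = source_output.get(condition["field"])
    op = OPERATORS[condition["operator"]]
    return op(value, condition.get("value"))
```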

Event-Driven Deployment

Deployments are event-driven: each trigger event spawns an independent concurrent execution run. There is no iteration loop.
deploy_workflow(workflow_id)
        |
        v
  Set up triggers, return
        |
        +-> cronScheduler fires   -> ExecutionRun 1  (isolated context)
        +-> cronScheduler fires   -> ExecutionRun 2  (isolated context)
        +-> whatsappReceive fires -> ExecutionRun 3  (isolated context)
        +-> webhookTrigger fires  -> ExecutionRun 4  (isolated context)
        +-> telegramReceive fires -> ExecutionRun 5  (isolated context)
        +-> taskTrigger fires     -> ExecutionRun 6  (from delegated agent)
Multiple runs execute simultaneously with no interference. The firing trigger is marked complete before downstream execution starts; every other trigger node in the same run is auto-marked _pre_executed with {not_triggered: True} so they never block as event waiters.

Push vs Polling Triggers

Push triggers (asyncio.Future + dispatch):
  whatsappReceive, webhookTrigger, chatTrigger, taskTrigger,
  telegramReceive, start

Polling triggers (asyncio.Queue + poll coroutine):
  twitterReceive   (X API has no webhook on free tier)
  gmailReceive     (Gmail push requires paid Google Cloud setup)

Scheduler:
  cronScheduler    (APScheduler directly, not through event waiter)
Every push trigger goes through the generic event_waiter module: register a Waiter with a filter closure, suspend on wait_for_event(), resume when an external service dispatches a matching event. Backend supports both in-memory (asyncio.Future) and Redis Streams modes for Temporal multi-worker deployments.
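The in-memory mode of this pattern can be sketched as a registry of (filter, future) pairs; class and method names below are illustrative, and the Redis Streams mode is omitted:

```python
import asyncio

class EventWaiterRegistry:
    """Register a filter closure, suspend on a Future, resume on a matching dispatch."""
    def __init__(self):
        self._waiters = []  # (filter closure, future) pairs

    async def wait_for_event(self, matches):
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append((matches, fut))
        return await fut  # suspend here until dispatch() resolves the future

    def dispatch(self, event):
        """Called by external services (WhatsApp, webhook, ...) on each event."""
        for matches, fut in list(self._waiters):
            if not fut.done() and matches(event):
                fut.set_result(event)
                self._waiters.remove((matches, fut))
```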

AI Agent System

10 LLM Providers

Provider      Native SDK Path               LangChain Path    Notes
OpenAI        Yes                           Yes               GPT 4.x/5.x + o-series reasoning
Anthropic     Yes                           Yes               Extended thinking via budget_tokens
Gemini        Yes                           Yes (fallback)    Native bypasses LangChain Windows hang
OpenRouter    Yes                           Yes               200+ models through one API
xAI           Yes (shared OpenAIProvider)   Yes               OpenAI-compatible
DeepSeek      Yes (shared OpenAIProvider)   Yes               Chat + Reasoner (always-on CoT)
Kimi          Yes (shared OpenAIProvider)   Yes               Moonshot K2.5 / K2-thinking
Mistral       Yes (shared OpenAIProvider)   Yes               Large / Small / Codestral
Groq          No (LangChain only)           Yes               Llama 4, Qwen3, GPT-OSS
Cerebras      No (LangChain only)           Yes               Llama, Qwen

Dual-Path Architecture

execute_chat(parameters)                        execute_agent(parameters)
  direct chat completions                         LangGraph tool-calling loop
        |                                               |
        v                                               v
is_native_provider(provider)?              create_model(provider, ...)
        |                                               |
   yes -+-- create_provider() -> LLMResponse            v
        |                                       LangChain ChatOpenAI /
    no -+-- create_model() -> chat_model                ChatAnthropic /
            .invoke() -> LLMResponse                    ChatGoogleGenerativeAI
                                                        |
                                                        v
                                                   bind_tools(...)
                                                        |
                                                        v
                                                 LangGraph StateGraph
                                                 (agent node <-> tools node loop)
The native path returns a normalized LLMResponse dataclass across all providers. The LangChain path is used for agent tool-calling because LangGraph’s checkpointer, state graph, and tool-execution callback layer have no native equivalent today.

15 Specialized Agents

All specialized agents share the same handle architecture (input-main, input-memory, input-skill, input-tools, input-task) and inherit AI_AGENT_PROPERTIES. They differ only in icon, title, theme color, and default prompt.
android_agent      coding_agent        web_agent         task_agent
social_agent       travel_agent        tool_agent        productivity_agent
payments_agent     consumer_agent      autonomous_agent  orchestrator_agent
ai_employee        rlm_agent           claude_code_agent
Routing:
  • 13 agents route to handle_chat_agent (LangGraph loop, shared code path)
  • rlm_agent routes to handle_rlm_agent -> RLMService (REPL-based recursive LM)
  • claude_code_agent routes to handle_claude_code_agent -> Claude Code SDK

Agent Teams Topology

orchestrator_agent and ai_employee have an extra input-teammates handle. Connected agents become delegate_to_<agent_type> tools automatically:
                   +-------------------------+
                   |     AI Employee         |
                   |   (orchestrator_agent)  |
                   +-----+------+------+-----+
                         |      |      |         input-teammates
           +-------------+      |      +------------+
           |                    |                   |
    +------v------+     +-------v-----+     +-------v-----+
    | coding_agent|     |  web_agent  |     | task_agent  |
    +-------------+     +-------------+     +-------------+
       delegate_         delegate_             delegate_
       to_coding_        to_web_               to_task_
       agent tool        agent tool            agent tool
The team lead’s LLM decides when to delegate based on task context. Delegation is fire-and-forget: the child spawns as asyncio.create_task(), the parent continues, and the child broadcasts its own status updates independently. Results can be retrieved via the auto-injected check_delegated_tasks tool or consumed by taskTrigger nodes elsewhere in the workflow.
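Fire-and-forget delegation reduces to asyncio.create_task. A hedged sketch, assuming a `run_agent` coroutine and an in-memory task dict; in the real system, results flow through status broadcasts and the auto-injected check_delegated_tasks tool:

```python
import asyncio

delegated: dict[str, asyncio.Task] = {}

def delegate_to(agent_type: str, task_text: str, run_agent) -> str:
    """Spawn the child agent and return immediately; the parent keeps going."""
    delegated[agent_type] = asyncio.create_task(run_agent(agent_type, task_text))
    return f"delegated to {agent_type}"

def check_delegated_tasks() -> dict:
    """Illustrative stand-in for the result-checking tool."""
    return {name: (t.result() if t.done() else "running")
            for name, t in delegated.items()}
```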

LangGraph StateGraph Flow

             +---------+
  START ---> |  agent  |
             | (LLM)   |
             +----+----+
                  |
          should_continue()?
              /        \
       tools /          \ end
            v            v
       +---------+      END
       |  tools  |
       | (exec)  |
       +----+----+
            |
            +---------> loop back to agent
Max iterations guard against infinite tool loops. Tools are built as Pydantic-schema StructuredTool instances with the node’s parameter schema.
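Stripped of LangGraph machinery, the agent-tools loop with its iteration guard behaves like this pure-Python sketch (function names and the message shape are illustrative; the real flow runs inside the StateGraph shown above):

```python
def run_agent_loop(llm_step, exec_tools, max_iterations=10):
    """Alternate LLM calls and tool execution until the LLM stops calling tools."""
    messages = []
    for _ in range(max_iterations):
        reply = llm_step(messages)       # "agent" node: one LLM call
        messages.append(reply)
        if not reply.get("tool_calls"):  # should_continue() -> end
            return messages
        messages.append(exec_tools(reply["tool_calls"]))  # "tools" node
    raise RuntimeError("max iterations reached; aborting tool loop")
```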

Memory, Skills, Tokens, and Cost

Markdown-Based Memory

The simpleMemory node stores conversation history as editable Markdown in the parameter panel. The AI Agent handler reads, parses to LangChain messages, executes, appends the new exchange, trims to a window, and archives removed messages to an InMemoryVectorStore (optional) using HuggingFace BAAI/bge-small-en-v1.5 embeddings.
AI Agent reads memoryContent (Markdown)
        |
        v
_parse_memory_markdown() -> LangChain Messages
        |
        v
(optional) vector store similarity_search(prompt, k=retrievalCount)
        |
        v
Execute LLM with full history + retrieved context
        |
        v
Append human + ai messages to Markdown
        |
        v
_trim_markdown_window(windowSize) -> (kept, removed)
        |
        v
If longTermEnabled: store.add_texts(removed)
        |
        v
Save updated Markdown back to node parameters
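The parse and trim steps can be sketched as below. The `Human:`/`AI:` line format is an assumption for illustration; _parse_memory_markdown's actual Markdown format may differ:

```python
def parse_memory_markdown(md: str):
    """Turn the memory Markdown into (role, text) message pairs."""
    messages = []
    for line in md.splitlines():
        if line.startswith("Human: "):
            messages.append(("human", line[len("Human: "):]))
        elif line.startswith("AI: "):
            messages.append(("ai", line[len("AI: "):]))
    return messages

def trim_markdown_window(messages, window_size: int):
    """Keep the last `window_size` exchanges; return (kept, removed).
    Removed messages are what gets archived to the vector store."""
    keep = window_size * 2  # one human + one ai message per exchange
    return messages[-keep:], messages[:-keep]
```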

Skill System

49 built-in skills across 10 folders under server/skills/:
assistant/ (5)           android_agent/ (12)       autonomous/ (5)
coding_agent/ (2)        productivity_agent/ (6)   rlm_agent/ (1)
social_agent/ (5)        task_agent/ (3)           travel_agent/ (2)
web_agent/ (8)
Each skill is a folder with a SKILL.md file containing YAML frontmatter (name, description, allowed-tools, metadata) and Markdown instructions. First load seeds the database; after that the database is source of truth so users can edit skill instructions in the UI. “Reset to Default” reloads the original SKILL.md. The masterSkill node aggregates multiple skills with enable/disable toggles. A split-panel editor shows the skill list on the left and the selected skill’s Markdown on the right. When connected to an agent, the backend expands the skillsConfig parameter into individual skill entries injected into the agent’s system message.
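Splitting a SKILL.md into frontmatter and instructions can be sketched with a naive stdlib parser (flat `key: value` lines only; the real loader presumably uses a proper YAML library):

```python
def load_skill_md(text: str):
    """Split SKILL.md into (metadata dict, Markdown instruction body)."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()
```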

Token Tracking and Compaction

Every AI execution stores a TokenUsageMetric row with input/output/cache/reasoning token counts and calculated costs (USD) based on server/config/pricing.json. Cumulative state per session lives in SessionTokenState. Compaction threshold priority:
  1. Per-session custom_threshold (user-set)
  2. Model-aware: 50% of the model’s context window (e.g., 500K for Claude Opus 4.6 with 1M context)
  3. Global COMPACTION_THRESHOLD fallback from .env
When the cumulative session tokens cross the threshold, CompactionService.compact_context() generates a 5-section summary (Task Overview, Current State, Important Discoveries, Next Steps, Context to Preserve) following the Claude Code pattern and replaces the memory content. Anthropic and OpenAI also have native compaction APIs (context_management edits, compact_threshold) that are configured transparently.
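The three-level threshold priority can be sketched directly (the constant name and default value below are illustrative stand-ins for the .env fallback):

```python
GLOBAL_COMPACTION_THRESHOLD = 100_000  # stand-in for COMPACTION_THRESHOLD in .env

def resolve_compaction_threshold(custom_threshold, context_window):
    """Pick the compaction threshold in priority order."""
    if custom_threshold is not None:       # 1. per-session user override
        return custom_threshold
    if context_window is not None:         # 2. 50% of the model's context window
        return context_window // 2
    return GLOBAL_COMPACTION_THRESHOLD     # 3. global fallback
```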

Communication Layer

A single persistent WebSocket at /ws/status handles all frontend-backend communication. There are 89 WebSocket handlers registered with the @ws_handler decorator in server/routers/websocket.py, covering: node parameters, tool schemas, node execution, triggers, dead letter queue, deployment, AI operations, API keys, OAuth flows (Claude, Twitter, Google), Android, WhatsApp, Telegram, workflow storage, chat messages, console logs, skills, memory, user settings, pricing, agent teams, and model registry.

Request/Response Pattern

Frontend                                Backend
   |                                       |
   |-- {type: "execute_node", data: ...}-->|
   |                                       |-- dispatch to handle_execute_node()
   |                                       |-- run node handler
   |                                       |
   |<-- {type: "node_status", ... }---------|  (broadcast)
   |<-- {type: "node_output", ... }---------|  (broadcast)
   |<-- {type: "execute_node_response", ...}|  (direct reply, request_id matched)
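The registration-and-dispatch side of this pattern can be sketched as follows. The decorator mirrors the @ws_handler idea described later in this section; the handler body and reply field names are illustrative:

```python
import asyncio
import json

HANDLERS = {}

def ws_handler(msg_type):
    """Register a coroutine as the handler for one message type."""
    def register(fn):
        HANDLERS[msg_type] = fn
        return fn
    return register

@ws_handler("execute_node")
async def handle_execute_node(data):
    # stand-in handler; the real one runs the node via the executor
    return {"status": "completed", "node": data["node_id"]}

async def dispatch(raw: str) -> str:
    """Route an incoming frame and build the direct reply."""
    msg = json.loads(raw)
    result = await HANDLERS[msg["type"]](msg.get("data", {}))
    # the direct reply echoes request_id so the frontend can match it
    return json.dumps({"type": msg["type"] + "_response",
                       "request_id": msg.get("request_id"),
                       "data": result})
```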

Broadcast Message Types

node_status         workflow_status      api_key_status
node_output         android_status       token_usage_update
variable_update     compaction_starting  node_parameters_updated
variables_update    compaction_completed

Auto-Reconnect and Keepalive

Frontend sends {"type": "ping"} every 30 seconds. On disconnect, the WebSocket context schedules a reconnect after 3 seconds with a 100ms mount delay to avoid React Strict Mode double-connect in development. Connection is gated on isAuthenticated so logged-out users never connect.

Cache, Persistence, and Security

Cache Fallback Hierarchy

MachinaOs follows the n8n cache pattern with automatic environment-based fallback:
Production (Docker):  Redis  ->  SQLite (cache_entries)  ->  in-memory dict
Local development:             SQLite (cache_entries)  ->  in-memory dict
                               (Redis disabled via REDIS_ENABLED=false)
CacheService in server/core/cache.py checks each backend in order. TTL expiration is supported across all three. A background cleanup_expired_cache() task removes expired SQLite rows.
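The pick-first-available-backend-with-TTL behavior can be sketched like this, with plain dicts standing in for the Redis and SQLite backends (illustrative, not the CacheService implementation):

```python
import time

class FallbackCache:
    """Use the first available backend in order, e.g. [redis, sqlite, memory]."""
    def __init__(self, backends):
        # None marks an unavailable backend (e.g. REDIS_ENABLED=false)
        self.backend = next(b for b in backends if b is not None)

    def set(self, key, value, ttl=None):
        expires = time.time() + ttl if ttl else None
        self.backend[key] = (value, expires)

    def get(self, key):
        entry = self.backend.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and expires < time.time():
            del self.backend[key]  # lazy TTL expiration on read
            return None
        return value
```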

Encrypted Credentials

API keys and OAuth tokens live in a separate SQLite database (credentials.db) isolated from the main machina.db. Encryption uses Fernet (AES-128-CBC + HMAC-SHA256) with keys derived from a server-scoped config key via PBKDF2HMAC (600K iterations, OWASP 2024 recommendation).
API_KEY_ENCRYPTION_KEY (.env)  +  salt (from credentials.db, 256 bits)
                    |
                    v
           PBKDF2HMAC-SHA256
           (600,000 iterations)
                    |
                    v
        urlsafe_b64encode -> Fernet key
                    |
                    v
           held in memory only
                    |
                    v
      encrypt() / decrypt() inside EncryptionService
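The derivation above can be expressed with the stdlib alone; a sketch for illustration (the real EncryptionService presumably uses the cryptography package's primitives):

```python
import base64
import hashlib

def derive_fernet_key(config_key: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256, 600,000 iterations, urlsafe base64 -> Fernet key."""
    raw = hashlib.pbkdf2_hmac("sha256", config_key.encode(), salt, 600_000, dklen=32)
    return base64.urlsafe_b64encode(raw)  # 32 raw bytes -> 44-char Fernet key
```

The derived key is deterministic for a given (config key, salt) pair, so it can be held in memory only and re-derived on every startup.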
Two separate credential systems inside credentials.db:
  • API key system (EncryptedAPIKey table): provider keys the user enters manually
  • OAuth token system (EncryptedOAuthToken table): tokens from OAuth flows (Google, Twitter, Claude.ai)
All routers access credentials through AuthService only. Direct database access is forbidden. Decrypted values are cached in AuthService memory dicts and never written to disk or Redis. For cloud deployments, CREDENTIAL_BACKEND can be switched to keyring (OS-native) or aws (Secrets Manager) via the CredentialBackend abstraction.

Authentication

JWT in HttpOnly cookies following the n8n pattern. Two auth modes, plus a local bypass:
  • AUTH_MODE=single - first user becomes owner, registration disabled after
  • AUTH_MODE=multi - open registration for cloud deployments
  • VITE_AUTH_ENABLED=false - bypass login entirely for local development
WebSocket connections check the JWT cookie before accepting. Auth context has exponential-backoff retry (5 attempts) to survive race conditions where the frontend starts before the backend is ready.

Node Catalog

Browse all 96 workflow nodes by category

AI Models

10 LLM providers with native SDK and LangChain paths

AI Agents

15 specialized agents with teams and delegation

GitHub

Source code and in-repo docs-internal/ deep dives