Architecture

A detailed look at how MachinaOs works under the hood. For contributors, system designers, and the curious. For a navigable tour with hyperlinks, see the DeepWiki page.

System Overview

MachinaOs is three loosely-coupled tiers talking over a single persistent WebSocket connection:
+------------------------------------------------------------------------+
|  Frontend tier (client/)                                               |
|    React + TypeScript + React Flow + Zustand + Ant Design              |
|    Dracula-themed canvas, parameter panel, credentials modal           |
+------------------------------------------------------------------------+
                             |
                 WebSocket   |   /ws/status  (single long-lived connection)
                             v
+------------------------------------------------------------------------+
|  Backend tier (server/)                                                |
|    FastAPI + 89 WebSocket handlers + dependency injection container    |
|    WorkflowService (facade) -> NodeExecutor -> per-node handlers       |
|    ParameterResolver -> template {{node.field}} substitution           |
|    AuthService -> encrypted credentials (Fernet + PBKDF2)              |
+------------------------------------------------------------------------+
                             |
                             v
+------------------------------------------------------------------------+
|  Execution tier                                                        |
|    Temporal workers (distributed)  or  local asyncio decide loop       |
|    Redis (cache, locks, streams)   SQLite (machina.db, credentials.db) |
|    Node.js server (JS/TS code exec) WhatsApp RPC (port 9400)           |
|    Android relay WebSocket         Temporal server (ports 7233/8080)   |
+------------------------------------------------------------------------+
                             |
                             v
   External services: OpenAI, Anthropic, Gemini, OpenRouter, xAI,
   DeepSeek, Kimi, Mistral, Groq, Cerebras, Google Workspace, WhatsApp,
   Telegram, Twitter/X, Brave, Serper, Perplexity, Apify, residential
   proxies, webhooks.

96 Workflow Nodes

AI, agents, social, Android, documents, Google Workspace, code, proxies, utilities

10 LLM Providers

OpenAI, Anthropic, Gemini, OpenRouter, xAI, DeepSeek, Kimi, Mistral, Groq, Cerebras

15 Specialized Agents

Android, Coding, Web, Task, Social, Travel, Tool, Productivity, Payments, Consumer, Autonomous, Orchestrator, AI Employee, RLM, Claude Code

89 WebSocket Handlers

Push-based updates replace REST polling

49 Built-in Skills

Across 10 categories, DB-backed with SKILL.md defaults

Three Execution Modes

Temporal distributed, Redis parallel, sequential fallback

Execution Engine

Conductor’s Decide Pattern

Workflow orchestration is a single function with fork/join parallelism:
_workflow_decide(ctx):
  1. Find ready nodes (all dependencies satisfied)
  2. asyncio.gather() the ready layer -> run in parallel
  3. Checkpoint state to Redis/SQLite
  4. Recurse until every node terminal (completed / skipped / failed)
Each workflow run has its own isolated ExecutionContext. No shared global state between concurrent runs. Decide loops serialize per execution via Redis SETNX distributed locks.
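The decide loop above can be sketched as a few lines of asyncio. This is a minimal illustration, not the MachinaOs implementation: `deps`, `run_node`, and the string statuses are stand-in names, and the real loop also checkpoints state to Redis/SQLite and serializes via Redis locks.

```python
import asyncio

async def workflow_decide(nodes, deps, run_node, state):
    """Repeatedly run every node whose dependencies are satisfied, in parallel."""
    while True:
        # 1. find ready nodes: not yet run, all dependencies completed
        ready = [n for n in nodes
                 if state.get(n) is None
                 and all(state.get(d) == "completed" for d in deps.get(n, []))]
        if not ready:
            return state  # every reachable node is terminal
        # 2. fork/join: run the whole ready layer concurrently
        results = await asyncio.gather(*(run_node(n) for n in ready))
        for node, ok in zip(ready, results):
            state[node] = "completed" if ok else "failed"
        # 3. (real engine checkpoints `state` here, then recurses)
```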

Three Execution Modes

workflow.execute(workflow_id, workflow_data)
                    |
                    v
    TEMPORAL_ENABLED and server reachable?
                    |
         yes -> _execute_temporal() -- per-node activities,
                                        retries, horizontal scaling
                                        (primary production mode)
          no -> Redis available?
                    |
             yes -> _execute_parallel() -- decide loop + Kahn layers
                                            + Prefect-style input hash cache
                                            + DLQ + heartbeat recovery
              no -> _execute_sequential() -- topological walk
                                             (fallback for minimal env)

Layer Computation via Kahn’s Algorithm

Before execution, the DAG is sorted into layers. Layer 0 is the set of nodes with no dependencies; each subsequent layer depends only on earlier layers. Parallel execution runs each layer with asyncio.gather():
Layer 0: [start, cronScheduler]
Layer 1: [httpRequest, whatsappReceive]
Layer 2: [aiAgent]
Layer 3: [whatsappSend, console]
Toolkit sub-nodes (e.g., Android service nodes connected to androidTool) are detected and excluded from execution layers — they run only when the parent toolkit invokes them via tool calling.
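A layered variant of Kahn's algorithm that produces exactly this grouping can be sketched as follows (illustrative code, not the engine's implementation):

```python
def kahn_layers(nodes, edges):
    """Group a DAG into execution layers: layer 0 has no dependencies,
    each later layer depends only on earlier ones."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    layers = []
    current = [n for n in nodes if indegree[n] == 0]
    while current:
        layers.append(current)
        nxt = []
        for n in current:
            for c in children[n]:
                indegree[c] -= 1
                if indegree[c] == 0:
                    nxt.append(c)
        current = nxt
    if sum(len(layer) for layer in layers) != len(nodes):
        raise ValueError("cycle detected")  # Kahn's algorithm doubles as a cycle check
    return layers
```

Running it on the example DAG above reproduces the four layers shown.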

Result Caching, Recovery, and DLQ

  • Prefect-style caching: every node result is stored in Redis/SQLite keyed by hash_inputs(inputs). Re-running an identical node returns the cached result with status TaskStatus.CACHED.
  • Heartbeat recovery: RecoverySweeper scans executions:active every 60s; nodes with stale heartbeats (> 5 min) are marked stuck and recovered on next startup.
  • Dead Letter Queue: failed nodes (after retry exhaustion) are quarantined with full input snapshot. Inspect, replay, or purge via the get_dlq_* / replay_dlq_entry / purge_dlq WebSocket handlers.
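The input-hash cache key can be sketched with a canonical JSON digest. This mirrors the hash_inputs idea described above; the exact serialization and hash the engine uses are assumptions here:

```python
import hashlib
import json

def hash_inputs(inputs: dict) -> str:
    """Deterministic cache key for a node's resolved inputs.
    Sorting keys makes logically identical inputs hash identically."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Because the key depends only on resolved inputs, re-running an unchanged node hits the cache regardless of when or where it last ran.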

Edge Conditions

Edges carry optional conditions for runtime branching with 20+ operators (eq, neq, gt, lt, contains, exists, matches, in, …). Unmatched branches are marked TaskStatus.SKIPPED.
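Condition evaluation amounts to a small operator dispatch table. The operator names come from the list above; the table and field names below are illustrative:

```python
import re

OPERATORS = {
    "eq":       lambda v, t: v == t,
    "neq":      lambda v, t: v != t,
    "gt":       lambda v, t: v > t,
    "lt":       lambda v, t: v < t,
    "contains": lambda v, t: t in v,
    "exists":   lambda v, t: v is not None,
    "matches":  lambda v, t: re.search(t, str(v)) is not None,
    "in":       lambda v, t: v in t,
}

def edge_matches(condition: dict, source_output: dict) -> bool:
    """Evaluate one edge condition against the source node's output."""
    value = source_output.get(condition["field"])
    op = OPERATORS[condition["operator"]]
    return op(value, condition.get("value"))
```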

Event-Driven Deployment

Deployments are event-driven: each trigger event spawns an independent concurrent execution run. There is no iteration loop.
deploy_workflow(workflow_id)
        |
        v
  Set up triggers, return
        |
        +-> cronScheduler fires   -> ExecutionRun 1  (isolated context)
        +-> cronScheduler fires   -> ExecutionRun 2  (isolated context)
        +-> whatsappReceive fires -> ExecutionRun 3  (isolated context)
        +-> webhookTrigger fires  -> ExecutionRun 4  (isolated context)
        +-> telegramReceive fires -> ExecutionRun 5  (isolated context)
        +-> taskTrigger fires     -> ExecutionRun 6  (from delegated agent)
Multiple runs execute simultaneously with no interference. The firing trigger is marked complete before downstream execution starts; every other trigger node in the same run is auto-marked _pre_executed with {not_triggered: True} so they never block as event waiters.

Push vs Polling Triggers

Push triggers (asyncio.Future + dispatch):
  whatsappReceive, webhookTrigger, chatTrigger, taskTrigger,
  telegramReceive, start

Polling triggers (asyncio.Queue + poll coroutine):
  twitterReceive   (X API has no webhook on free tier)
  gmailReceive     (Gmail push requires paid Google Cloud setup)

Scheduler:
  cronScheduler    (APScheduler directly, not through event waiter)
Every push trigger goes through the generic event_waiter module: register a Waiter with a filter closure, suspend on wait_for_event(), resume when an external service dispatches a matching event. Backend supports both in-memory (asyncio.Future) and Redis Streams modes for Temporal multi-worker deployments.
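The in-memory mode of this pattern can be sketched as a registry of (filter, future) pairs; class and method names below are illustrative, and the Redis Streams mode is omitted:

```python
import asyncio

class EventWaiterRegistry:
    """Register a filter closure, suspend on a Future, resume on a matching dispatch."""
    def __init__(self):
        self._waiters = []  # (filter closure, future) pairs

    async def wait_for_event(self, matches):
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append((matches, fut))
        return await fut  # suspend here until dispatch() resolves the future

    def dispatch(self, event):
        """Called by external services (WhatsApp, webhook, ...) on each event."""
        for matches, fut in list(self._waiters):
            if not fut.done() and matches(event):
                fut.set_result(event)
                self._waiters.remove((matches, fut))
```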

AI Agent System

10 LLM Providers

Provider      Native SDK Path               LangChain Path    Notes
OpenAI        Yes                           Yes               GPT 4.x/5.x + o-series reasoning
Anthropic     Yes                           Yes               Extended thinking via budget_tokens
Gemini        Yes                           Yes (fallback)    Native bypasses LangChain Windows hang
OpenRouter    Yes                           Yes               200+ models through one API
xAI           Yes (shared OpenAIProvider)   Yes               OpenAI-compatible
DeepSeek      Yes (shared OpenAIProvider)   Yes               Chat + Reasoner (always-on CoT)
Kimi          Yes (shared OpenAIProvider)   Yes               Moonshot K2.5 / K2-thinking
Mistral       Yes (shared OpenAIProvider)   Yes               Large / Small / Codestral
Groq          No (LangChain only)           Yes               Llama 4, Qwen3, GPT-OSS
Cerebras      No (LangChain only)           Yes               Llama, Qwen

Dual-Path Architecture

execute_chat(parameters)                        execute_agent(parameters)
  direct chat completions                         LangGraph tool-calling loop
        |                                               |
        v                                               v
is_native_provider(provider)?              create_model(provider, ...)
        |                                               |
   yes -+-- create_provider() -> LLMResponse            v
        |                                       LangChain ChatOpenAI /
    no -+-- create_model() -> chat_model                ChatAnthropic /
            .invoke() -> LLMResponse                    ChatGoogleGenerativeAI
                                                        |
                                                        v
                                                   bind_tools(...)
                                                        |
                                                        v
                                                 LangGraph StateGraph
                                                 (agent node <-> tools node loop)
The native path returns a normalized LLMResponse dataclass across all providers. The LangChain path is used for agent tool-calling because LangGraph’s checkpointer, state graph, and tool-execution callback layer have no native equivalent today.

15 Specialized Agents

All specialized agents share the same handle architecture (input-main, input-memory, input-skill, input-tools, input-task) and inherit AI_AGENT_PROPERTIES. They differ only in icon, title, theme color, and default prompt.
android_agent      coding_agent        web_agent         task_agent
social_agent       travel_agent        tool_agent        productivity_agent
payments_agent     consumer_agent      autonomous_agent  orchestrator_agent
ai_employee        rlm_agent           claude_code_agent
Routing:
  • 13 agents route to handle_chat_agent (LangGraph loop, shared code path)
  • rlm_agent routes to handle_rlm_agent -> RLMService (REPL-based recursive LM)
  • claude_code_agent routes to handle_claude_code_agent -> Claude Code SDK

Agent Teams Topology

orchestrator_agent and ai_employee have an extra input-teammates handle. Connected agents become delegate_to_<agent_type> tools automatically:
                   +-------------------------+
                   |     AI Employee         |
                   |   (orchestrator_agent)  |
                   +-----+------+------+-----+
                         |      |      |         input-teammates
           +-------------+      |      +------------+
           |                    |                   |
    +------v------+     +-------v-----+     +-------v-----+
    | coding_agent|     |  web_agent  |     | task_agent  |
    +-------------+     +-------------+     +-------------+
       delegate_         delegate_             delegate_
       to_coding_        to_web_               to_task_
       agent tool        agent tool            agent tool
The team lead’s LLM decides when to delegate based on task context. Delegation is fire-and-forget: the child spawns as asyncio.create_task(), the parent continues, and the child broadcasts its own status updates independently. Results can be retrieved via the auto-injected check_delegated_tasks tool or consumed by taskTrigger nodes elsewhere in the workflow.
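Fire-and-forget delegation reduces to asyncio.create_task. A hedged sketch, assuming a `run_agent` coroutine and an in-memory task dict; in the real system, results flow through status broadcasts and the auto-injected check_delegated_tasks tool:

```python
import asyncio

delegated: dict[str, asyncio.Task] = {}

def delegate_to(agent_type: str, task_text: str, run_agent) -> str:
    """Spawn the child agent and return immediately; the parent keeps going."""
    delegated[agent_type] = asyncio.create_task(run_agent(agent_type, task_text))
    return f"delegated to {agent_type}"

def check_delegated_tasks() -> dict:
    """Illustrative stand-in for the result-checking tool."""
    return {name: (t.result() if t.done() else "running")
            for name, t in delegated.items()}
```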

LangGraph StateGraph Flow

             +---------+
  START ---> |  agent  |
             | (LLM)   |
             +----+----+
                  |
          should_continue()?
              /        \
       tools /          \ end
            v            v
       +---------+      END
       |  tools  |
       | (exec)  |
       +----+----+
            |
            +---------> loop back to agent
Max iterations guard against infinite tool loops. Tools are built as Pydantic-schema StructuredTool instances with the node’s parameter schema.
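Stripped of LangGraph machinery, the agent-tools loop with its iteration guard behaves like this pure-Python sketch (function names and the message shape are illustrative; the real flow runs inside the StateGraph shown above):

```python
def run_agent_loop(llm_step, exec_tools, max_iterations=10):
    """Alternate LLM calls and tool execution until the LLM stops calling tools."""
    messages = []
    for _ in range(max_iterations):
        reply = llm_step(messages)       # "agent" node: one LLM call
        messages.append(reply)
        if not reply.get("tool_calls"):  # should_continue() -> end
            return messages
        messages.append(exec_tools(reply["tool_calls"]))  # "tools" node
    raise RuntimeError("max iterations reached; aborting tool loop")
```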

Memory, Skills, Tokens, and Cost

Markdown-Based Memory

The simpleMemory node stores conversation history as editable Markdown in the parameter panel. The AI Agent handler reads, parses to LangChain messages, executes, appends the new exchange, trims to a window, and archives removed messages to an InMemoryVectorStore (optional) using HuggingFace BAAI/bge-small-en-v1.5 embeddings.
AI Agent reads memoryContent (Markdown)
        |
        v
_parse_memory_markdown() -> LangChain Messages
        |
        v
(optional) vector store similarity_search(prompt, k=retrievalCount)
        |
        v
Execute LLM with full history + retrieved context
        |
        v
Append human + ai messages to Markdown
        |
        v
_trim_markdown_window(windowSize) -> (kept, removed)
        |
        v
If longTermEnabled: store.add_texts(removed)
        |
        v
Save updated Markdown back to node parameters
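The parse and trim steps can be sketched as below. The `Human:`/`AI:` line format is an assumption for illustration; _parse_memory_markdown's actual Markdown format may differ:

```python
def parse_memory_markdown(md: str):
    """Turn the memory Markdown into (role, text) message pairs."""
    messages = []
    for line in md.splitlines():
        if line.startswith("Human: "):
            messages.append(("human", line[len("Human: "):]))
        elif line.startswith("AI: "):
            messages.append(("ai", line[len("AI: "):]))
    return messages

def trim_markdown_window(messages, window_size: int):
    """Keep the last `window_size` exchanges; return (kept, removed).
    Removed messages are what gets archived to the vector store."""
    keep = window_size * 2  # one human + one ai message per exchange
    return messages[-keep:], messages[:-keep]
```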

Skill System

49 built-in skills across 10 folders under server/skills/:
assistant/ (5)           android_agent/ (12)       autonomous/ (5)
coding_agent/ (2)        productivity_agent/ (6)   rlm_agent/ (1)
social_agent/ (5)        task_agent/ (3)           travel_agent/ (2)
web_agent/ (8)
Each skill is a folder with a SKILL.md file containing YAML frontmatter (name, description, allowed-tools, metadata) and Markdown instructions. First load seeds the database; after that the database is source of truth so users can edit skill instructions in the UI. “Reset to Default” reloads the original SKILL.md. The masterSkill node aggregates multiple skills with enable/disable toggles. A split-panel editor shows the skill list on the left and the selected skill’s Markdown on the right. When connected to an agent, the backend expands the skillsConfig parameter into individual skill entries injected into the agent’s system message.
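Splitting a SKILL.md into frontmatter and instructions can be sketched with a naive stdlib parser (flat `key: value` lines only; the real loader presumably uses a proper YAML library):

```python
def load_skill_md(text: str):
    """Split SKILL.md into (metadata dict, Markdown instruction body)."""
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()
```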

Token Tracking and Compaction

Every AI execution stores a TokenUsageMetric row with input/output/cache/reasoning token counts and calculated costs (USD) based on server/config/pricing.json. Cumulative state per session lives in SessionTokenState. Compaction threshold priority:
  1. Per-session custom_threshold (user-set)
  2. Model-aware: 50% of the model’s context window (e.g., 500K for Claude Opus 4.6 with 1M context)
  3. Global COMPACTION_THRESHOLD fallback from .env
When the cumulative session tokens cross the threshold, CompactionService.compact_context() generates a 5-section summary (Task Overview, Current State, Important Discoveries, Next Steps, Context to Preserve) following the Claude Code pattern and replaces the memory content. Anthropic and OpenAI also have native compaction APIs (context_management edits, compact_threshold) that are configured transparently.
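The three-level threshold priority can be sketched directly (the constant name and default value below are illustrative stand-ins for the .env fallback):

```python
GLOBAL_COMPACTION_THRESHOLD = 100_000  # stand-in for COMPACTION_THRESHOLD in .env

def resolve_compaction_threshold(custom_threshold, context_window):
    """Pick the compaction threshold in priority order."""
    if custom_threshold is not None:       # 1. per-session user override
        return custom_threshold
    if context_window is not None:         # 2. 50% of the model's context window
        return context_window // 2
    return GLOBAL_COMPACTION_THRESHOLD     # 3. global fallback
```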

Communication Layer

A single persistent WebSocket at /ws/status handles all frontend-backend communication. There are 89 WebSocket handlers registered with the @ws_handler decorator in server/routers/websocket.py, covering: node parameters, tool schemas, node execution, triggers, dead letter queue, deployment, AI operations, API keys, OAuth flows (Claude, Twitter, Google), Android, WhatsApp, Telegram, workflow storage, chat messages, console logs, skills, memory, user settings, pricing, agent teams, and model registry.

Request/Response Pattern

Frontend                                Backend
   |                                       |
   |-- {type: "execute_node", data: ...}-->|
   |                                       |-- dispatch to handle_execute_node()
   |                                       |-- run node handler
   |                                       |
   |<-- {type: "node_status", ... }---------|  (broadcast)
   |<-- {type: "node_output", ... }---------|  (broadcast)
   |<-- {type: "execute_node_response", ...}|  (direct reply, request_id matched)
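The registration-and-dispatch side of this pattern can be sketched as follows. The decorator mirrors the @ws_handler idea described later in this section; the handler body and reply field names are illustrative:

```python
import asyncio
import json

HANDLERS = {}

def ws_handler(msg_type):
    """Register a coroutine as the handler for one message type."""
    def register(fn):
        HANDLERS[msg_type] = fn
        return fn
    return register

@ws_handler("execute_node")
async def handle_execute_node(data):
    # stand-in handler; the real one runs the node via the executor
    return {"status": "completed", "node": data["node_id"]}

async def dispatch(raw: str) -> str:
    """Route an incoming frame and build the direct reply."""
    msg = json.loads(raw)
    result = await HANDLERS[msg["type"]](msg.get("data", {}))
    # the direct reply echoes request_id so the frontend can match it
    return json.dumps({"type": msg["type"] + "_response",
                       "request_id": msg.get("request_id"),
                       "data": result})
```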

Broadcast Message Types

node_status         workflow_status      api_key_status
node_output         android_status       token_usage_update
variable_update     compaction_starting  node_parameters_updated
variables_update    compaction_completed

Auto-Reconnect and Keepalive

Frontend sends {"type": "ping"} every 30 seconds. On disconnect, the WebSocket context schedules a reconnect after 3 seconds with a 100ms mount delay to avoid React Strict Mode double-connect in development. Connection is gated on isAuthenticated so logged-out users never connect.

Cache, Persistence, and Security

Cache Fallback Hierarchy

MachinaOs follows the n8n cache pattern with automatic environment-based fallback:
Production (Docker):  Redis  ->  SQLite (cache_entries)  ->  in-memory dict
Local development:             SQLite (cache_entries)  ->  in-memory dict
                               (Redis disabled via REDIS_ENABLED=false)
CacheService in server/core/cache.py checks each backend in order. TTL expiration is supported across all three. A background cleanup_expired_cache() task removes expired SQLite rows.
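The pick-first-available-backend-with-TTL behavior can be sketched like this, with plain dicts standing in for the Redis and SQLite backends (illustrative, not the CacheService implementation):

```python
import time

class FallbackCache:
    """Use the first available backend in order, e.g. [redis, sqlite, memory]."""
    def __init__(self, backends):
        # None marks an unavailable backend (e.g. REDIS_ENABLED=false)
        self.backend = next(b for b in backends if b is not None)

    def set(self, key, value, ttl=None):
        expires = time.time() + ttl if ttl else None
        self.backend[key] = (value, expires)

    def get(self, key):
        entry = self.backend.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and expires < time.time():
            del self.backend[key]  # lazy TTL expiration on read
            return None
        return value
```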

Encrypted Credentials

API keys and OAuth tokens live in a separate SQLite database (credentials.db) isolated from the main machina.db. Encryption uses Fernet (AES-128-CBC + HMAC-SHA256) with keys derived from a server-scoped config key via PBKDF2HMAC (600K iterations, OWASP 2024 recommendation).
API_KEY_ENCRYPTION_KEY (.env)  +  salt (from credentials.db, 256 bits)
                    |
                    v
           PBKDF2HMAC-SHA256
           (600,000 iterations)
                    |
                    v
        urlsafe_b64encode -> Fernet key
                    |
                    v
           held in memory only
                    |
                    v
      encrypt() / decrypt() inside EncryptionService
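The derivation above can be expressed with the stdlib alone; a sketch for illustration (the real EncryptionService presumably uses the cryptography package's primitives):

```python
import base64
import hashlib

def derive_fernet_key(config_key: str, salt: bytes) -> bytes:
    """PBKDF2-HMAC-SHA256, 600,000 iterations, urlsafe base64 -> Fernet key."""
    raw = hashlib.pbkdf2_hmac("sha256", config_key.encode(), salt, 600_000, dklen=32)
    return base64.urlsafe_b64encode(raw)  # 32 raw bytes -> 44-char Fernet key
```

The derived key is deterministic for a given (config key, salt) pair, so it can be held in memory only and re-derived on every startup.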
Two separate credential systems inside credentials.db:
  • API key system (EncryptedAPIKey table): provider keys the user enters manually
  • OAuth token system (EncryptedOAuthToken table): tokens from OAuth flows (Google, Twitter, Claude.ai)
All routers access credentials through AuthService only. Direct database access is forbidden. Decrypted values are cached in AuthService memory dicts and never written to disk or Redis. For cloud deployments, CREDENTIAL_BACKEND can be switched to keyring (OS-native) or aws (Secrets Manager) via the CredentialBackend abstraction.

Authentication

JWT in HttpOnly cookies following the n8n pattern. Two auth modes, plus a local bypass:
  • AUTH_MODE=single - first user becomes owner, registration disabled after
  • AUTH_MODE=multi - open registration for cloud deployments
  • VITE_AUTH_ENABLED=false - bypass login entirely for local development
WebSocket connections check the JWT cookie before accepting. Auth context has exponential-backoff retry (5 attempts) to survive race conditions where the frontend starts before the backend is ready.

Node Catalog

Browse all 96 workflow nodes by category

AI Models

10 LLM providers with native SDK and LangChain paths

AI Agents

15 specialized agents with teams and delegation

GitHub

Source code and in-repo docs-internal/ deep dives