Building an Agentic Mesh for News Intelligence: A2A Communication, Temporal Workflows, and Scalable Multi-Agent Design

Subodh Mishra | Apr 27, 2026

    Introduction to Agentic Mesh and Its Challenges

    What is an agentic mesh?

    In recent years the term agentic mesh has moved from academic papers to production‑grade systems. At its core, an agentic mesh is a network of autonomous, purpose‑built AI agents that collaborate to solve complex problems. Each agent encapsulates a single capability—such as summarisation, entity extraction, or sentiment analysis—and communicates with its peers through well‑defined protocols. The mesh topology allows the system to dynamically route data, parallelise work, and recover from failures without a single point of control.

    Imagine a newsroom that receives a breaking story, enriches it with background context, extracts key entities, and then pushes personalised alerts to multiple platforms. Instead of a monolithic pipeline that runs every step sequentially, the mesh spins up the exact agents required for that story, lets them exchange messages, and discards them when the job is finished. This on‑demand, plug‑and‑play approach is what makes modern news intelligence both fast and cost‑effective.

    Challenges with monolithic AI pipelines

    Traditional AI pipelines are often built as long, linear chains of micro‑services or as a single heavyweight application. While straightforward to prototype, they suffer from several drawbacks:

    1. Scalability bottlenecks – A single service becomes the limiting factor when traffic spikes, and horizontal scaling is hard because the whole pipeline must be duplicated.
    2. Tight coupling – Changing one component (e.g., swapping a summarisation model) forces downstream services to be retested, increasing the risk of regression. 
    3. Limited fault tolerance – If any step crashes, the entire request fails, requiring expensive retry logic and manual intervention.
    4. Resource waste – Every request must pass through every stage, even if only a subset of capabilities is needed, leading to unnecessary compute consumption.
    5. Lack of temporal awareness – Traditional pipelines are typically batch-oriented and operate within fixed execution windows. They lack native support for: 

     

    • real-time event handling   
    • long-running workflows 
    • stateful retries and resumability 

     

    As a result, they struggle to react to evolving signals (e.g., breaking news, rapidly changing narratives) in a timely and reliable manner. Any attempt to add durability or real-time responsiveness often requires custom scheduling, retry logic, and state management – leading to increased system complexity. 

      

    These pain points push organisations toward a more modular, resilient architecture – precisely what an agentic mesh promises. However, moving from monolithic pipelines to a distributed mesh introduces its own set of engineering challenges: protocol design, state management, orchestration, and observability become critical concerns.  

     

    So we built an agent-first mesh architecture where: 

     

    • Orchestrator decides 
    • Specialists execute 
    • A2A connects everything 
    • Temporal handles durability 
    • OpenAI Agents SDK powers reasoning 

     

    This is not theoretical. This is what we actually built. 

    Designing a Scalable Multi‑Agent Architecture

    Layered Architecture for Agentic Systems

    Scalability in an agentic mesh is achieved by separating concerns into distinct layers. This layered approach mirrors classic software architecture but is adapted for agent-based, AI-driven systems.

    In this architecture, responsibilities are clearly divided across communication (A2A), reasoning (OpenAI Agents SDK), and durability (Temporal).

    Core Layers

    Identity & Access Layer

    Handles:

    • authentication (SSO) 
    • identity federation 
    • token issuance 
    • service identity 

     

    Ensures: 

    • every request is authenticated 
    • identity context flows through the system 

    Tenant Control Plane

    Introduces SaaS capabilities:

    • tenant provisioning   
    • organization/workspace mapping   
    • entitlements and plans   
    • connector ownership

     

    Ensures: 

    • strict isolation between tenants   
    • scalable multi-tenant architecture 

    Policy & Governance Layer

    Combines:

    • access control (RBAC/ABAC) 
    • quotas (agent, tenant, user) 
    • service-to-service authorization 

     

    Enforcement happens before execution, ensuring: 

    • controlled usage   
    • security   
    • predictable costs 
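    To make "enforcement happens before execution" concrete, here is a minimal sketch of a fixed-window quota gate. The scope keys and limits are hypothetical, and a production deployment would back the counters with Redis rather than an in-process dict:

```python
import time


class QuotaExceeded(Exception):
    """Raised when a tenant, agent, or user exceeds its allowance."""


class QuotaGate:
    """Illustrative fixed-window quota check, run before any agent executes."""

    def __init__(self, limits: dict[str, int], window_seconds: int = 60) -> None:
        self.limits = limits                      # e.g. {"tenant:acme": 100}
        self.window = window_seconds
        self.counters: dict[tuple[str, int], int] = {}

    def check_and_consume(self, scope_key: str) -> None:
        # Bucket requests into the current time window for this scope.
        window_id = int(time.time() // self.window)
        bucket = (scope_key, window_id)
        used = self.counters.get(bucket, 0)
        limit = self.limits.get(scope_key)
        if limit is not None and used >= limit:
            raise QuotaExceeded(f"{scope_key} exceeded {limit}/{self.window}s")
        self.counters[bucket] = used + 1


gate = QuotaGate({"tenant:acme": 2})
gate.check_and_consume("tenant:acme")  # first call passes
gate.check_and_consume("tenant:acme")  # second call passes; a third would raise
```

    The same check applies uniformly at agent, tenant, and user scope by varying the key, which is what makes costs predictable.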

    Memory Control Plane

    Memory is treated as a platform capability, not agent-owned.

    Supports: 

    • session memory 
    • user memory   
    • tenant memory   
    • audit/provenance 

     

    Key principles: 

    • scoped memory (user / tenant / platform) 
    • privacy-first defaults 
    • policy-aware retrieval 
    • auditability 

    Orchestration Layer

    Uses reasoning systems to:

    • interpret user intent   
    • create execution plans   
    • route tasks 

    Transport Layer (A2A Communication)

    Handles agent communication via:

    • gRPC (preferred) 
    • HTTP JSON-RPC (fallback) 

     

    Security: 

    • mTLS for service-to-service trust 
    • JWT for identity verification 

    Workflow Layer (Temporal)

    Temporal ensures: 

    • durability   
    • retries   
    • long-running workflows   
    • human-in-the-loop 

    Agent Execution Layer

    Each agent: 

    • is independently deployable   
    • runs in Docker   
    • exposes A2A interface   
    • performs one responsibility 
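    As one illustration of the "exposes A2A interface" point, the HTTP JSON-RPC fallback for a single-responsibility agent can be sketched as below. The method name, payload shape, and dispatch table are assumptions for illustration, not the A2A specification itself:

```python
import json
from typing import Any


def summarise(text: str) -> str:
    """Placeholder for the agent's single responsibility."""
    return text[:80]


# Hypothetical method table; a real agent would advertise its capabilities
# via its A2A agent card instead.
METHODS = {"summarise": summarise}


def handle_jsonrpc(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request to the agent's single capability."""
    req: dict[str, Any] = json.loads(raw)
    method = METHODS.get(req.get("method", ""))
    if method is None:
        return json.dumps(
            {"jsonrpc": "2.0", "id": req.get("id"),
             "error": {"code": -32601, "message": "Method not found"}}
        )
    result = method(**req.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "result": result})
```

    Wrapping this handler in a FastAPI route gives the agent its fallback transport while gRPC remains the preferred path.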

    Persistence & Storage

    Includes: 

    • Postgres (metadata, policy, tenant data) 
    • Vector DB (memory retrieval) 
    • Object storage (artifacts) 
    • Redis (cache/session) 

    Observability & Audit Layer

    Ensures: 

    • tracing   
    • monitoring 
    • debugging
    • audit logs 

    Secure Communication (mTLS + Tokens)

    We use a dual-layer trust model:

    • mTLS → verifies service identity at transport layer   
    • JWT tokens → carry identity, tenant, and permissions 

     

    This ensures: 

    • zero trust between services   
    • strong authentication   
    • secure A2A communication 
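    The JWT half of this trust model can be illustrated with a stdlib-only HS256 verifier. Real deployments should use a vetted library such as PyJWT with RS256 keys issued by the identity provider; the claim names below are hypothetical and the code only shows the mechanics:

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def verify_jwt_hs256(token: str, secret: bytes) -> dict:
    """Check the HS256 signature and return the token's claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload_b64))


def make_jwt_hs256(claims: dict, secret: bytes) -> str:
    """Mint a demonstration token with hypothetical claims."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


token = make_jwt_hs256({"sub": "orchestrator", "tenant": "acme"}, b"shared-secret")
claims = verify_jwt_hs256(token, b"shared-secret")
```

    mTLS establishes which service is calling at the transport layer; the verified claims then tell the receiving agent which tenant and permissions the request carries.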

    Memory-Aware Execution

    Unlike traditional systems:  

    • memory is not global   
    • memory is scoped   
    • memory access is policy-controlled 

     

    Retrieval follows: 

    1. session
    2. user  
    3. tenant 
    4. platform

    Layer Summary

    Layer | Role | Key Technologies
    User | API entry | FastAPI
    Identity & Access | Authentication, SSO, federation, token issuance | Keycloak, JWT
    Tenant Control Plane | Multi-tenancy, org/workspace management, entitlements | FastAPI, Postgres
    Policy & Governance | Access control, quotas, authorization | Redis, Postgres, Policy Service
    Memory Control Plane | Scoped memory, retrieval, lifecycle, privacy enforcement | Postgres, Vector DB (Pinecone/Weaviate/FAISS), Redis, S3
    Orchestration | Planning & routing | OpenAI Agents SDK
    Internal Agent Orchestration | Multi-step execution, branching, checkpoints | LangGraph
    Transport | Agent communication | A2A, gRPC, HTTP JSON-RPC
    Transport Security | Service-to-service trust | mTLS, TLS
    Workflow | Durable execution, retries, long-running workflows | Temporal
    Agent Execution | Domain-specific capabilities | Python services, Docker
    Persistence | Storage & audit | Postgres, S3 / Blob Storage
    Cache & Session | Ephemeral state, quotas, session memory | Redis
    Event Layer (Optional) | Async pub/sub (advanced) | Redis Streams, Kafka
    Observability & Audit | Monitoring, tracing, metrics, logging, audit | Prometheus, Grafana, OpenTelemetry, New Relic

    Key Architectural Insight

    A2A handles communication.

    OpenAI Agents SDK handles reasoning.

    Temporal handles durability.

    By keeping these concerns separate: 

    • you avoid tight coupling   
    • you prevent over-engineering   
    • and you retain flexibility to evolve each layer independently 

     

    For most systems, this minimal layered design is sufficient. End-to-end observability ensures full visibility into agent decisions, workflows, and system performance in real time. Additional layers like message buses should only be introduced when scaling demands it.  

    Key technologies enabling scalability

    • Docker – Containerisation guarantees that every agent runs in an isolated, reproducible environment. It also simplifies horizontal scaling via orchestration platforms like Kubernetes. 
    • gRPC – Binary protocol that reduces payload size and latency, crucial for real‑time mesh interactions. 
    • Temporal – Provides stateful workflow management with built‑in durability, making it easy to recover from failures without custom retry logic. 
    • Postgres – Relational store for metadata, job queues, and audit logs. Its ACID guarantees help maintain consistency across distributed agents. 
    • OpenAI Agents SDK – Offers high‑level abstractions for creating and registering agents, handling authentication, and exposing a unified API surface. 
    • Pydantic – Validates data structures in Python, preventing malformed messages from propagating through the mesh. 

     

    Together, these tools create a robust foundation that can handle the unpredictable load patterns typical of news intelligence pipelines. 
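    As a small illustration of the Pydantic point, a schema for a hypothetical inter-agent payload can reject malformed messages at the mesh boundary; the field names here are illustrative, not a fixed contract:

```python
from pydantic import BaseModel, ValidationError


class SummariseRequest(BaseModel):
    """Hypothetical inter-agent payload; field names are illustrative."""
    tenant_id: str
    text: str
    max_tokens: int = 150


def parse_request(payload: dict) -> SummariseRequest:
    """Validate at the boundary so malformed messages never reach an agent."""
    return SummariseRequest(**payload)


ok = parse_request({"tenant_id": "acme", "text": "Breaking story..."})
```

    A payload missing a required field raises `ValidationError` immediately, which is far cheaper than debugging a half-processed message three agents downstream.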

    Implementing Temporal Workflows for Durability and Orchestration

    Introduction to Temporal.io and its applications

    Temporal is an open‑source orchestration platform that treats workflows as first‑class citizens. Unlike traditional job schedulers, Temporal persists every state transition to durable storage, enabling automatic replay after crashes. For a news intelligence mesh, Temporal can manage:

    • Continuous monitoring of RSS feeds, social media streams, and internal data sources. 
    • Retry policies for flaky third‑party APIs (e.g., price‑check services or sentiment analysis providers). 
    • Human‑in‑the‑loop approvals when a story requires editorial sign‑off before distribution. 

     

    Because workflows are defined in code, developers can use familiar languages – Python, Go, or Java – to express complex branching logic without learning a new DSL. 

    Durability and retry mechanisms in Temporal workflows

    Temporal guarantees that no work is lost, even if the underlying worker process crashes. Each activity (e.g., “fetch article”, “run summarisation”) is recorded in a durable event store. If a worker dies, Temporal re‑queues the activity on a healthy worker, applying exponential back‑off and custom retry policies.
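    The retry behaviour Temporal automates can be sketched as a hand-rolled back-off loop; with Temporal you would declare an equivalent `RetryPolicy` on the activity instead of writing this yourself. The flaky activity below is a stand-in for illustration:

```python
import random
import time


def call_with_backoff(activity, *args, max_attempts: int = 5,
                      initial: float = 0.01, factor: float = 2.0):
    """Hand-rolled exponential back-off: the logic Temporal replaces."""
    delay = initial
    for attempt in range(1, max_attempts + 1):
        try:
            return activity(*args)
        except Exception:
            if attempt == max_attempts:
                raise  # exhausted; surface the failure
            time.sleep(delay + random.uniform(0, delay / 10))  # jittered wait
            delay *= factor


attempts = {"n": 0}


def flaky_fetch(url: str) -> str:
    """Simulated third-party call that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return f"article from {url}"


result = call_with_backoff(flaky_fetch, "https://example.com/feed")
```

    The difference in production is that Temporal persists each attempt to its event store, so the retry sequence survives worker crashes rather than living in one process's memory.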

    Below is a simplified flowchart that illustrates how Temporal ensures durability and automatic retries for a continuous monitoring workflow that watches a news source, extracts entities, and stores results in Postgres.

     

    Temporal’s built‑in visibility tools let operators monitor each step, see retry counts, and inspect historic runs. This level of observability is essential for production‑grade news pipelines where missed alerts can have real‑world impact. 

    Building a News Intelligence Workflow with Agents

    Breaking down a user query workflow

    When a journalist or analyst submits a query—“What are the emerging trends in renewable energy investments in Europe?”—the mesh must orchestrate several agents: a retrieval agent to gather sources, a summarisation agent, an entity‑linking agent, and finally a presentation agent that formats the answer for the UI. The following flowchart maps this end‑to‑end process.


    Each arrow represents an A2A call over gRPC. The Presentation Agent is a lightweight FastAPI service that formats the final JSON payload and streams it back to the front‑end. Because every step is an independent, containerised agent, the system can scale each component independently based on load. 

    Key considerations for designing agent boundaries

    Designing clear boundaries prevents the mesh from devolving into a tangled web of responsibilities. Below is a checklist that helps architects decide where one agent should stop and another begin.

    Component | Responsibility | Design Considerations
    Retrieval Agent | Query external sources, cache results | Use Docker for isolation; respect rate limits.
    Summarisation Agent | Condense long texts into concise abstracts | Keep model size manageable; expose temperature & max‑tokens as parameters.
    Entity‑Linking Agent | Identify entities and link to knowledge graph | Ensure deterministic output; store mappings in Postgres for reuse.
    Presentation Agent | Format and deliver results to UI | Implement FastAPI endpoints; validate schema with Pydantic; support streaming responses.
    Orchestration Layer | Coordinate workflow steps, handle retries | Leverage Temporal for durability; define retry policies per activity.

    By answering these questions – What data does the component own? How does it communicate? What are its scaling requirements? – you can create a mesh that is both flexible and maintainable. 

    Practical Implementation: Code Examples and Tools

    Planner using OpenAI Agents SDK (Reasoning Layer)

    This is your brain, not your transport.

    from __future__ import annotations 
     
    from pydantic import BaseModel 
    from agents import Agent, Runner 
     
     
    class ExecutionPlan(BaseModel): 
        mode: str 
        reason: str 
        target_agent: str | None = None 
     
     
    planner_agent = Agent(
        name="News Planner",
        instructions=(
            "You are a planner for a news intelligence system.\n"
            "Return a structured execution plan.\n"
            "Allowed modes: chat, remote_agent, workflow.\n"
            "Use target_agent='summariser' for summarisation requests.\n"
            "Use mode='workflow' for full news intelligence requests."
        ),
        output_type=ExecutionPlan,
    )
     
     
    async def build_plan(message: str) -> ExecutionPlan: 
        result = await Runner.run(planner_agent, message) 
        return result.final_output 

    Why this matters

    • Uses OpenAI Agents SDK correctly  
    • Produces structured execution plans  
    • Keeps reasoning separate from transport and workflows  

    A2A gRPC Call (Communication Layer)

    This is how your orchestrator talks to a summariser agent.

    from __future__ import annotations

    import asyncio

    from a2a.client import A2AClient
    from a2a.types import Message, Part, Role, TextPart


    async def call_summariser_grpc(text: str) -> str:
        client = A2AClient(
            url="grpc://localhost:50051",
        )

        request = Message(
            role=Role.user,
            parts=[
                Part(
                    root=TextPart(
                        text=f"Summarise this article in 150 tokens or less:\n\n{text}"
                    )
                )
            ],
        )

        final_text_parts: list[str] = []

        async for event in client.send_message(request):
            if isinstance(event, Message):
                for part in event.parts:
                    if isinstance(part.root, TextPart):
                        final_text_parts.append(part.root.text)

        return "".join(final_text_parts).strip()


    if __name__ == "__main__":
        sample = "Renewable energy investments are rapidly growing across Europe."
        result = asyncio.run(call_summariser_grpc(sample))
        print("Summary:", result)

    This call uses the A2A SDK’s gRPC transport to send a structured Message to the remote summariser agent and stream back the response. 

    Temporal Workflow (Durable Orchestration Layer)

    This is your news intelligence pipeline.

    from __future__ import annotations 
     
    from datetime import timedelta 
     
    from temporalio import workflow 
     
     
    @workflow.defn
    class NewsIntelligenceWorkflow:
        @workflow.run
        async def run(self, query: str) -> str:
            # Step 1: Retrieve articles
            article = await workflow.execute_activity(
                "retrieval_agent.fetch",
                query,
                schedule_to_close_timeout=timedelta(seconds=30),
            )

            # Step 2: Summarise content (via summariser agent)
            summary = await workflow.execute_activity(
                "summarisation_agent.summarise",
                article,
                schedule_to_close_timeout=timedelta(seconds=30),
            )

            # Step 3: Entity linking / enrichment
            enriched = await workflow.execute_activity(
                "entity_link_agent.link",
                summary,
                schedule_to_close_timeout=timedelta(seconds=30),
            )

            # Step 4: Persist results
            await workflow.execute_activity(
                "results_store.save",
                {"query": query, "answer": enriched},
                schedule_to_close_timeout=timedelta(seconds=15),
            )

            return enriched

    Use OpenAI Agents SDK for reasoning and planning, A2A for communication between services, and Temporal for durability, retries, and long-running workflows. That separation keeps the system clean: the SDK thinks, A2A connects, and Temporal remembers. The Agents SDK’s documented primitives center on agents, runners, tools, handoffs, and typed outputs, which makes it a strong fit for planner and narrative layers rather than transport or persistence. 

    Future Considerations and Best Practices

    Balancing flexibility and coupling in agent systems

    An overly flexible mesh can become a “spaghetti” of services where every agent knows about many others. To avoid this, adopt domain‑driven boundaries: group related capabilities into a bounded context and expose only the minimal interface required for collaboration. Use versioned A2A contracts so that downstream agents can continue operating when an upstream agent evolves. Monitoring tools (e.g., OpenTelemetry) should be integrated at the transport layer to surface latency spikes and coupling violations early.

    Avoiding over‑engineering in agentic mesh design

    While the allure of a fully decoupled mesh is strong, not every use‑case needs the full stack. For low‑traffic internal tools, a simple local execution path may suffice. Reserve Temporal and heavyweight gRPC setups for scenarios that truly benefit from durability, high concurrency, or multi‑region redundancy. Start with a minimal viable mesh, then iteratively introduce layers—adding a message bus, containerisation, or orchestration only when performance metrics justify the added complexity.

    By adhering to the principles outlined above—clear A2A contracts, layered scalability, durable Temporal workflows, and pragmatic engineering—you can build a production‑grade news intelligence platform that scales with demand, recovers from failures, and remains adaptable to future AI breakthroughs. The combination of Python, FastAPI, A2A SDK, gRPC, Temporal, Postgres, OpenAI Agents SDK, Pydantic, and Docker provides a powerful, vendor‑agnostic toolkit to turn the vision of an agentic mesh into reality.

