Multi-Agent Orchestration: When One Agent Isn't Enough

· 6 min read
Metadata Morph
AI & Data Engineering Team

A single agent with access to all your tools sounds like the simplest architecture. In practice, it's the architecture that breaks first. As tool count grows, context windows fill up, prompts become unwieldy, and the agent starts making worse decisions because it's trying to do too many things at once.

Multi-agent systems solve this by decomposing complex workflows into specialized agents with focused responsibilities, coordinated by an orchestrator. The result is more reliable, more observable, and — counter-intuitively — cheaper to operate.
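To make the pattern concrete, here's a minimal sketch of an orchestrator dispatching tasks to specialized agents. The agent names and the keyword-based router are illustrative stand-ins, not a real framework API:

```python
from typing import Callable, Dict

# Hypothetical specialist agents, each with one focused responsibility.
def sql_agent(task: str) -> str:
    return f"[sql-agent] ran query for: {task}"

def report_agent(task: str) -> str:
    return f"[report-agent] drafted summary for: {task}"

AGENTS: Dict[str, Callable[[str], str]] = {
    "query": sql_agent,
    "summarize": report_agent,
}

def orchestrate(task: str) -> str:
    """Route a task to the single agent responsible for it."""
    for keyword, agent in AGENTS.items():
        if keyword in task.lower():
            return agent(task)
    return "[orchestrator] no specialist matched; falling back"

print(orchestrate("Query daily revenue by region"))
```

Real orchestrators route with an LLM call rather than keywords, but the shape is the same: each specialist sees only its own tools and prompt, which keeps every context window small.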

LLM Cost Management for Data Pipelines: When to Use Claude, OpenAI, or Ollama

· 6 min read
Metadata Morph
AI & Data Engineering Team

LLM costs in production pipelines scale differently from anything else in your data infrastructure. A poorly architected pipeline that sends every event through GPT-4o can burn through thousands of dollars per day. A well-architected one running the same workload might cost a tenth of that — by routing each task to the model that's just capable enough for the job.

This post covers the cost architecture decisions that keep AI pipelines economically viable at scale.
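The core routing idea fits in a few lines: pick the cheapest model whose capability tier meets the task's requirement. Model names and per-million-token prices below are illustrative placeholders, not current vendor pricing:

```python
# Illustrative tiers: quality 1 = local model, 3 = frontier model.
MODELS = [
    {"name": "ollama-local", "quality": 1, "usd_per_mtok": 0.0},
    {"name": "small-hosted", "quality": 2, "usd_per_mtok": 0.5},
    {"name": "frontier",     "quality": 3, "usd_per_mtok": 10.0},
]

def route(required_quality: int) -> str:
    """Cheapest model whose quality tier meets the requirement."""
    eligible = [m for m in MODELS if m["quality"] >= required_quality]
    return min(eligible, key=lambda m: m["usd_per_mtok"])["name"]

print(route(1))  # -> ollama-local
print(route(3))  # -> frontier
```

The hard part in practice is assigning `required_quality` per task type — that classification is where most of the savings come from.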

Automated KPI Commentary: Teaching an AI Agent to Write the 'So What'

· 5 min read
Metadata Morph
AI & Data Engineering Team

Every metrics review has the same pattern: someone pulls up the dashboard, sees that revenue is up 8% week-over-week, and then spends 20 minutes writing a sentence explaining why. Then they do it again for conversion rate. Then for churn. Then for CAC.

The numbers are already in your warehouse. The context — seasonality, campaigns, product launches, prior period comparisons — is also already in your warehouse. The gap is the synthesis, and that's exactly what a KPI commentary agent closes.
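A sketch of the synthesis step: compute the period-over-period delta in code, then hand the number plus its context to the model as one pre-assembled line. The metric values and campaign note here are toy inputs:

```python
def kpi_context(name: str, current: int, prior: int, notes: str) -> str:
    """Assemble the deterministic part of the commentary prompt."""
    change = (current - prior) / prior * 100
    direction = "up" if change >= 0 else "down"
    return (f"{name} is {direction} {abs(change):.1f}% week-over-week "
            f"({prior:,} -> {current:,}). Context: {notes}")

line = kpi_context("Revenue", 540_000, 500_000,
                   "spring campaign launched Tuesday")
print(line)
```

Doing the arithmetic outside the LLM matters: the model writes the "so what," but it should never be trusted to compute the percentage itself.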

dbt Testing Strategies Before Feeding Data to LLMs: Preventing Garbage-In, Garbage-Out

· 5 min read
Metadata Morph
AI & Data Engineering Team

An AI agent is only as reliable as the data it reasons from. Feed it nulls, duplicates, or stale data and it will produce confident, coherent, and wrong answers — often without any obvious signal that something is off. The LLM doesn't know what it doesn't know.

dbt's testing framework is the right place to enforce data quality before data reaches your agents. This post covers a layered testing strategy that catches the most common failure modes before they become AI failures.
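As a taste of the layered approach, here's a hypothetical `schema.yml` fragment using dbt's built-in generic tests on a model that feeds an agent. Model and column names are illustrative:

```yaml
models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - unique      # no duplicates reach the agent
          - not_null    # no silent nulls in the key
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
```

Uniqueness, non-null keys, and constrained enumerations catch exactly the failure modes — duplicates, nulls, unexpected categories — that LLMs turn into confident wrong answers.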

Real-Time Agent Context with Kafka: Sub-Second Data Freshness for AI Pipelines

· 5 min read
Metadata Morph
AI & Data Engineering Team

Batch pipelines are sufficient for most analytical workloads. They're not sufficient for AI agents making time-sensitive decisions. An anomaly detection agent that works on yesterday's data misses the incident happening right now. A customer churn agent fed weekly snapshots can't act on a user who disengaged three hours ago.

Real-time streaming closes this gap. With Kafka as the event backbone and Flink for stream processing, your agents can operate on data that is seconds old rather than hours or days.
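The freshness contract can be sketched without the infrastructure. Below, an in-memory buffer stands in for the consumer loop (a real pipeline would read from a Kafka topic and let Flink handle the windowing); the point is the shape of the check that keeps stale events away from the agent:

```python
import time
from collections import deque

class AgentContext:
    """Hand the agent only events younger than max_age_s."""

    def __init__(self, max_age_s: float = 5.0):
        self.max_age_s = max_age_s
        self.events: deque = deque()

    def ingest(self, event: dict) -> None:
        # In production this would be the Kafka consumer callback.
        self.events.append((time.monotonic(), event))

    def fresh(self) -> list:
        cutoff = time.monotonic() - self.max_age_s
        return [e for ts, e in self.events if ts >= cutoff]

ctx = AgentContext()
ctx.ingest({"user": "u1", "action": "checkout_abandoned"})
```

An agent that reads from `fresh()` degrades explicitly — an empty context — rather than silently reasoning from hours-old state.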

Self-Writing Data Quality Reports: An Agent That Monitors Your Pipelines Overnight

· 4 min read
Metadata Morph
AI & Data Engineering Team

Every data team has the same Monday morning ritual: someone checks whether last night's pipelines ran cleanly, hunts through logs for failures, and manually compiles a status update for stakeholders. It's important work — and it's entirely automatable.

A data quality reporting agent runs overnight, checks every layer of your pipeline, and delivers a clear, human-readable report before anyone opens their laptop. When something is wrong, the report explains what failed, what downstream models are affected, and what the likely cause is.
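The report-assembly step is the deterministic core of such an agent. A minimal sketch, with hypothetical check names and one toy failure:

```python
def render_report(checks: list) -> str:
    """Compile overnight check results into a readable morning report."""
    failed = [c for c in checks if not c["ok"]]
    lines = [f"Pipeline health: {len(checks) - len(failed)}/"
             f"{len(checks)} checks passed"]
    for c in failed:
        lines.append(f"FAIL {c['name']}: {c['detail']} "
                     f"(downstream: {', '.join(c['downstream'])})")
    return "\n".join(lines)

report = render_report([
    {"name": "stg_orders rowcount", "ok": True},
    {"name": "fct_revenue freshness", "ok": False,
     "detail": "last load 26h ago", "downstream": ["exec_dashboard"]},
])
print(report)
```

The LLM's job sits on top of this: turning the structured failure list into the "likely cause" narrative, not deciding what failed.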

Building a RAG Pipeline on Your Existing Data Warehouse

· 6 min read
Metadata Morph
AI & Data Engineering Team

The most common failure mode in enterprise AI projects is asking an LLM questions about your business data and getting confidently wrong answers. The model doesn't know your revenue figures, your customer data, or your internal processes — it only knows what it was trained on.

Retrieval-Augmented Generation (RAG) fixes this by giving the model the relevant context it needs at query time, retrieved from your actual data. The surprising part: you probably don't need a new data infrastructure to do it. Your existing warehouse already has the data — you just need the retrieval layer on top.
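The retrieval layer reduces to similarity search over embedded snippets. A toy sketch with hand-written three-dimensional vectors standing in for real embeddings, and made-up snippet text for illustration:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# (snippet derived from a warehouse row, pretend embedding vector)
DOCS = [
    ("Q1 revenue was $4.2M, up 12% YoY.", [0.9, 0.1, 0.0]),
    ("Churn rose to 3.1% in March.",      [0.1, 0.9, 0.1]),
]

def retrieve(query_vec, k=1):
    """Return the k snippets most similar to the query embedding."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# A "revenue-like" query vector should pull the revenue snippet.
print(retrieve([1.0, 0.0, 0.0]))
```

In production the vectors come from an embedding model and the snippets from warehouse rows or table metadata — but the ranking logic is exactly this.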

Replacing Manual Month-End Close Reporting with an AI Agent

· 4 min read
Metadata Morph
AI & Data Engineering Team

Month-end close is one of the most labor-intensive rituals in any finance team's calendar. Data analysts spend days pulling figures from ERPs, reconciling discrepancies across systems, and formatting reports that executives will read in five minutes. The underlying work is predictable, rule-based, and repeatable — the exact profile for an AI agent to take over.

This post walks through how to build a monthly close reporting agent that handles the full cycle: data extraction, reconciliation, anomaly flagging, and narrative generation.
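The reconciliation step, for instance, is plain deterministic code the agent orchestrates rather than improvises. A sketch with illustrative account names, amounts, and tolerance:

```python
def reconcile(gl: dict, subledger: dict, tolerance: float = 1.0) -> list:
    """Flag GL-vs-subledger differences above a materiality tolerance."""
    flags = []
    for account, gl_amount in gl.items():
        diff = gl_amount - subledger.get(account, 0.0)
        if abs(diff) > tolerance:
            flags.append({"account": account,
                          "difference": round(diff, 2)})
    return flags

flags = reconcile(
    {"accounts_receivable": 120_450.00, "deferred_revenue": 80_000.00},
    {"accounts_receivable": 119_800.00, "deferred_revenue": 80_000.50},
)
print(flags)
```

The agent's value-add is downstream of this: explaining each flagged difference and drafting the close narrative, with the arithmetic already settled.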

Lakehouse Architecture Deep Dive: Iceberg, Delta Lake, and Hudi Compared

· 6 min read
Metadata Morph
AI & Data Engineering Team

The data lakehouse is built on a deceptively simple idea: add a metadata and transaction layer on top of cheap object storage, and you get warehouse-grade reliability at lake-scale costs. The table format is that layer — and three formats dominate the market: Apache Iceberg, Delta Lake, and Apache Hudi.

All three solve the same core problem. The differences in how they solve it have real consequences for query performance, streaming support, tooling compatibility, and operational complexity. This post cuts through the marketing to give you a technical basis for choosing.

Ingesting Massive Data Loads: Patterns for High-Performance Batch Pipelines

· 6 min read
Metadata Morph
AI & Data Engineering Team

Moving data from source systems into your lake or warehouse sounds simple until you're doing it at scale. A pipeline that works fine at 10M rows starts breaking at 1B — queries time out, storage costs spike, and the load that should fit a 2-hour batch window starts taking 14.

This post covers the patterns that separate pipelines that scale from pipelines that collapse under their own weight.
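One of those patterns in miniature: ingest in bounded partitions instead of one monolithic load, so memory stays flat and a failure restarts at the last completed chunk. The loader body is a placeholder for a real bulk insert:

```python
from itertools import islice

def chunked(rows, size):
    """Yield successive bounded batches from any iterable."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

def ingest(rows, size=10_000):
    loaded = 0
    for batch in chunked(rows, size):
        # placeholder for a COPY / bulk-insert of one partition,
        # committed per batch so retries resume here
        loaded += len(batch)
    return loaded

print(ingest(range(25_000), size=10_000))  # -> 25000
```

Bounding the unit of work is what keeps the 10M-row pipeline alive at 1B rows: cost and memory scale with the chunk, not the total.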