I discovered I was designing my AI tools backwards.

Here’s an example. My newsletter processing chain read emails, ran a newsletter processor, extracted companies, and then added them to the CRM: four separate steps, costing $3.69 for every thousand newsletters processed.

Before: Newsletter Processing Chain

# Step 1: Find newsletters (separate tool)
ruby read_email.rb --from "newsletter@techcrunch.com" --limit 5
# Output: 340 tokens of detailed email data

# Step 2: Process each newsletter (separate tool)
ruby enhanced_newsletter_processor.rb
# Output: 420 tokens per newsletter summary

# Step 3: Extract companies (separate tool)
ruby enhanced_company_extractor.rb --input newsletter_summary.txt
# Output: 280 tokens of company data

# Step 4: Add to CRM (separate tool)
ruby validate_and_add_company.rb startup.com
# Output: 190 tokens of validation results

# Total: 1,230 tokens, 4 separate tool calls, no safety checks
# Cost: $3.69 per 1,000 newsletter processing workflows

Then I created a unified newsletter tool that combined everything, using the Google Agent Development Kit to build more capable tools:

# Single consolidated operation
ruby unified_newsletter_tool.rb --action process --source "techcrunch" --format concise --auto-extract-companies
# Output: 85 tokens with all operations completed

# 93% token reduction, built-in safety, cached results
# Cost: $0.26 per 1,000 newsletter processing workflows
# Savings: $3.43 per 1,000 workflows (93% cost reduction)

Why is the unified newsletter tool more complicated? It exposes multiple actions through a single interface (process, search, extract, validate), implements state management that tracks usage patterns and caches results, builds in safety callbacks and rate limiting, and produces structured JSON output with metadata instead of plain text. All of that requires more setup code and configuration than the simple tools did.

But here’s the counterintuitive part: despite being more complex internally, the unified tool is SIMPLER for the LLM to use because it provides consistent, structured outputs that are easier to parse, even though those outputs are longer.
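
The single-interface idea can be sketched in a few lines. This is a hypothetical illustration of the pattern, not the actual implementation: the class name, action set, and per-action logic are all assumptions, but it shows how one entry point dispatches several actions and always returns structured JSON with metadata.

```ruby
require "json"

# Illustrative sketch of the unified-tool pattern: one entry point, many
# actions, one consistent structured output shape. Names are assumptions.
class UnifiedNewsletterTool
  ACTIONS = %i[process search extract validate].freeze

  def call(action:, **params)
    raise ArgumentError, "unknown action: #{action}" unless ACTIONS.include?(action)
    # Every action returns the same envelope: easy for an LLM to parse
    { action: action, ok: true, data: send(action, **params) }.to_json
  end

  private

  # Summarize newsletters from a source in the requested format
  def process(source:, format: "concise")
    { source: source, format: format, summary: "..." }
  end

  # Search previously processed newsletters
  def search(query:)
    { query: query, hits: [] }
  end

  # Pull company domains out of free text
  def extract(text:)
    { companies: text.scan(/\b[\w-]+\.com\b/) }
  end

  # Cheap structural check on a domain before it reaches the CRM
  def validate(domain:)
    { domain: domain, valid: domain.match?(/\A[\w-]+\.[a-z]{2,}\z/) }
  end
end
```

Because every action shares one envelope, the model never has to guess which of four output formats it is looking at.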

The transformation delivered measurable improvements across every metric. To quantify the impact, I ran 30 iterations per test scenario. The results:

| Metric | Before | After | Improvement |
|---|---|---|---|
| LLM Tokens per Op | 112.4 | 66.1 | 41.2% reduction |
| Cost per 1K Ops | $1.642 | $0.957 | 41.7% savings |
| Success Rate | 87% | 94% | 8% improvement |
| Tools per Workflow | 3-5 | 1 | 70% reduction |
| Cache Hit Rate | 0% | 30% | Performance boost |
| Error Recovery | Manual | Automatic | Better UX |

In short: tokens dropped by 41%, which translated linearly into cost savings; the success rate improved by 8%; and 30% of operations hit the cache, skipping API calls entirely for further savings.

While individual tools produced shorter, “cleaner” responses, they forced the LLM to work harder parsing inconsistent formats. Structured, comprehensive outputs from unified tools enabled more efficient LLM processing, despite being longer.

Debugging dozens of small, disconnected tools is hard, which makes automated error recovery essential. When an operation fails, the unified tools provide actionable suggestions, such as proposing a valid domain format for a malformed input, or automatically retry the failed API call with exponential backoff. Instead of cryptic error messages that require manual debugging, the tools guide users toward successful completion.
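
The retry half of that behavior is simple to sketch. The helper below is an assumption for illustration (`with_backoff` and the rescued error classes are my names, not the author’s actual code), but it shows the exponential-backoff shape: wait 0.5s, then 1s, then 2s, and only surface the error once the attempts are exhausted.

```ruby
require "timeout"

# Hedged sketch of retry-with-exponential-backoff for transient failures.
def with_backoff(max_attempts: 3, base_delay: 0.5)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue IOError, Timeout::Error
    raise if attempts >= max_attempts        # out of retries: surface the error
    sleep(base_delay * 2**(attempts - 1))    # 0.5s, 1s, 2s, ...
    retry
  end
end
```

A flaky API call wrapped in `with_backoff { fetch_page(url) }` then succeeds transparently as long as the failure clears within three attempts.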

My workflow relied on dozens of specialized Ruby tools for email, research, and task management. Each tool had its own interface, error handling, and output format.

This approach created several problems: context pollution, because every tool’s output accumulated in Claude’s context; token waste, because outputs were verbose and written for human readers; inconsistent error handling from tool to tool; and state loss, because nothing was remembered between operations.

The Google ADK documentation revealed five key architectural patterns that could solve these issues.

The Unified Tool Pattern replaces separate tools per operation with a single tool exposing multiple actions. Format Control Systems optimize the response format for the use case: concise formats offer 70-85% token reduction for chaining operations, detailed formats carry full information for final display, and ids_only formats achieve 85-95% reduction for bulk operations.
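
Format control is easy to picture with a concrete example. The records and `render` helper below are made up for demonstration; the point is that the same data can be serialized at three very different token footprints depending on what the next step needs.

```ruby
require "json"

# Illustrative data: what a company-research tool might hold internally.
RECORDS = [
  { id: 1, name: "Acme", domain: "acme.com", notes: "Series A, SF, 40 employees" },
  { id: 2, name: "Globex", domain: "globex.com", notes: "Seed stage, NYC" }
].freeze

# Render the same records at the requested level of detail.
def render(records, format:)
  case format
  when "ids_only" then records.map { |r| r[:id] }.to_json              # bulk operations
  when "concise"  then records.map { |r| r.slice(:id, :name) }.to_json # chaining
  when "detailed" then records.to_json                                 # final display
  else raise ArgumentError, "unknown format: #{format}"
  end
end
```

An agent chaining operations asks for `"concise"` or `"ids_only"` and only pays for `"detailed"` on the final answer shown to the user.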

Safety Callbacks implement input validation and guardrails before operations execute. State Management creates persistent memory across operations with intelligent caching. Tool Delegation enables smart routing and batch processing capabilities.
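
A safety callback plus rate limiter can be sketched as a small guard object that runs before any operation executes. The keyword list and limits below are illustrative, not the production configuration.

```ruby
# Sketch of a pre-execution guard: keyword blocking plus a sliding-window
# rate limiter. Blocked words and window size are assumptions.
class Guard
  BLOCKED = %w[password ssn credit_card].freeze

  def initialize(limit:, window: 60)
    @limit  = limit
    @window = window
    @calls  = []
  end

  # Raises before the operation runs if input is unsafe or over the limit.
  def check!(input)
    word = BLOCKED.find { |w| input.downcase.include?(w) }
    raise ArgumentError, "blocked keyword: #{word}" if word
    now = Time.now.to_f
    @calls.reject! { |t| now - t > @window }  # drop calls outside the window
    raise "rate limit: #{@limit}/#{@window}s" if @calls.size >= @limit
    @calls << now
    true
  end
end
```

Putting the guard in front of every action means each of the seven tools gets the same safety behavior without duplicating validation logic.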

After building a comprehensive testing framework, I measured the actual impact on LLM token usage and cost, running 30 iterations per test for adequate statistical power and comparing groups with Welch’s t-test for unequal variances.
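
For reference, the Welch statistic used for those comparisons is standard: the difference of sample means divided by the square root of the summed variance-over-n terms. The helper below is a from-scratch sketch, not the author’s test harness.

```ruby
# Welch's t-statistic for two samples with (possibly) unequal variances:
#   t = (mean(a) - mean(b)) / sqrt(var(a)/n_a + var(b)/n_b)
def welch_t(a, b)
  mean = ->(x) { x.sum(0.0) / x.size }
  var  = ->(x) { m = mean.(x); x.sum(0.0) { |v| (v - m)**2 } / (x.size - 1) }
  (mean.(a) - mean.(b)) / Math.sqrt(var.(a) / a.size + var.(b) / b.size)
end
```

With 30 iterations per group, the resulting t value is compared against the t-distribution (Welch-Satterthwaite degrees of freedom) to get the p-value reported below.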

For email search & summary workflows, total LLM tokens showed a 41.2% reduction (p=0.01, statistically significant) with legacy tools averaging 112.4 tokens per operation versus unified tools averaging 66.1 tokens per operation. Cost savings reached 41.7% reduction per operation.

For company research workflows, legacy tools had a 0% success rate due to dependency issues while unified tools achieved 100% success rate with 30/30 successful operations.

I implemented these patterns across seven core tools, creating a modernized ecosystem.

The Enhanced Task Manager includes safety guardrails blocking sensitive keywords, rate limiting at 30 operations per minute, batch operations with validation, and state management for preferences.

The Unified Email Tool consolidated 5 separate email tools into one interface with safety blocking for sensitive content and test domains, rate limiting at 100 operations per hour, and contact state management.

The Unified Research Tool provides multi-source aggregation from various APIs, intelligent caching with TTL, service-specific rate limiting, and batch enrichment capabilities.

The unified approach not only reduced tokens by 93% but also added automatic company extraction and validation, rate limiting to prevent API abuse, content filtering for spam/irrelevant newsletters, batch processing for multiple newsletters, and state management to cache summaries and remember preferences.

State management provides persistent memory across operations and turns out to be both free and straightforward to implement. The state system remembers context between operations by tracking usage patterns, recent contacts, search cache with timestamps, and rate limiting across services.

This enables smart auto-completion for frequent contacts, cached search results to avoid duplicate API calls, usage tracking for optimization insights, and rate limit awareness across all operations. The persistent state file grows incrementally with each operation, learning user preferences and optimizing future interactions without requiring complex infrastructure.
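
The persistent-state idea really is as simple as a JSON file that grows with each operation. The class below is a minimal sketch; the file layout, key names, and TTL are assumptions rather than the actual schema.

```ruby
require "json"

# Minimal sketch of persistent tool state: usage counts plus a search cache
# with timestamps, all in one JSON file. Schema and TTL are assumptions.
class ToolState
  def initialize(path, ttl: 3600)
    @path = path
    @ttl  = ttl
    @data = File.exist?(path) ? JSON.parse(File.read(path)) : { "usage" => {}, "cache" => {} }
  end

  # Track how often each action runs, for optimization insights
  def record_usage(action)
    @data["usage"][action] = @data["usage"].fetch(action, 0) + 1
    save
  end

  # Return a cached result if it is still fresh, else nil
  def cached(query)
    entry = @data["cache"][query]
    entry && Time.now.to_f - entry["at"] < @ttl ? entry["result"] : nil
  end

  def cache(query, result)
    @data["cache"][query] = { "at" => Time.now.to_f, "result" => result }
    save
  end

  private

  def save
    File.write(@path, JSON.pretty_generate(@data))
  end
end
```

Because every instance reads the file back on startup, state survives across tool invocations with no infrastructure beyond the filesystem.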

The Google ADK framework provides a blueprint for building enterprise-grade agent tools. Even if you’re not using Python, the architectural patterns of unified interfaces, format control, safety callbacks, and state management can transform any agent toolkit. You can find complete implementation examples on GitHub to start building your own unified tools.

The 41.2% LLM cost reduction alone justifies the effort, but the safety, reliability, and user experience improvements make this essential for any serious agent automation system.