Eventbrite: How I Cut External API Costs from $15K/Day to $40/Month
As a Senior Software Engineer at Eventbrite (March 2022 - February 2025), I owned cost and reliability for critical request paths in the SEO/Growth org. This is the story of two related engineering efforts: a cross-service caching layer that eliminated $15K/day in external API spend, and an Ads platform transition where improved observability led to sunsetting a $60K/month third-party ML system.
The Problem: Expensive, Fragile External Dependencies
Eventbrite's core homepage and discovery surfaces depended on multiple external API integrations. Each page load triggered redundant calls across dependent services with no shared caching boundary. The costs were concrete:
$15K per day in external API charges, accumulating from redundant calls across services
No deduplication — the same data was fetched multiple times across different request paths for the same page render
Fragile crawl/index pipeline — when external APIs were slow or rate-limited, search engine crawlers encountered degraded pages, hurting organic visibility
The team knew the invoice total. We did not have per-service, per-call-path visibility into where the money was going.
The Architecture: Centralized Caching and Deduplication
I designed and built a cross-service caching and deduplication layer that sat between our services and external API integrations. The goals were straightforward: eliminate redundant calls, reduce latency, and gain per-path cost visibility.
Centralized retrieval layer. Instead of each service making its own external API calls, I introduced a shared retrieval layer with a unified cache. When Service A fetched data that Service B also needed within the same request cycle, the second call hit cache instead of the network.
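The shared retrieval layer can be sketched as a read-through cache. This is a minimal illustration, not Eventbrite's actual code; the names (`SharedCache`, `get_or_fetch`) are mine.

```python
import time

class SharedCache:
    """Read-through cache shared across services within a request cycle.
    The first caller pays for the network fetch; later callers hit cache."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get_or_fetch(self, key, fetch, ttl_seconds):
        entry = self._store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]  # cache hit: no outbound call
        value = fetch()      # cache miss: exactly one outbound call
        self._store[key] = (value, time.monotonic() + ttl_seconds)
        return value
```

When Service A and Service B request the same key, only the first call reaches the network; the second is served from the shared store.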
Request deduplication. I identified that many external calls were semantically identical requests made milliseconds apart from different code paths. The deduplication layer collapsed these into a single outbound request and fanned the response back to all callers.
Cache invalidation strategy. External data had varying freshness requirements. I designed tiered TTLs based on data volatility — event metadata cached aggressively (changes rarely), pricing data cached with short TTLs (changes frequently), availability data always fetched fresh. This ensured we never served stale data where it mattered.
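The tiered-TTL idea reduces to a small policy table. The specific durations below are hypothetical placeholders, chosen only to show the ordering the text describes:

```python
# Hypothetical TTL policy keyed by data volatility (seconds).
TTL_POLICY = {
    "event_metadata": 24 * 3600,  # changes rarely: cache aggressively
    "pricing": 60,                # changes frequently: short TTL
    "availability": 0,            # must be fresh: never serve from cache
}

def ttl_for(data_kind: str) -> int:
    # Unknown kinds get a conservative default rather than aggressive caching.
    return TTL_POLICY.get(data_kind, 300)
```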
Per-path cost instrumentation. Every external call was tagged with the originating service and request path, feeding into observability dashboards. For the first time, the team could see exactly which code paths were generating the most external API spend.
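The instrumentation amounts to tagging each outbound call and aggregating spend per (service, path) pair. A minimal sketch, with an assumed flat cost-per-call for illustration:

```python
import collections

class CostMeter:
    """Attribute external API spend to the (service, request_path)
    that originated each call."""

    def __init__(self, cost_per_call):
        self.cost_per_call = cost_per_call
        self.calls = collections.Counter()

    def record(self, service, path):
        self.calls[(service, path)] += 1

    def spend_by_path(self):
        # Dashboard-ready: estimated dollars per originating code path.
        return {k: n * self.cost_per_call for k, n in self.calls.items()}
```

In production this would feed a metrics pipeline rather than an in-process counter, but the attribution key is the important part: spend becomes answerable per code path, not just per invoice.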
Results: SEO/Growth Platform
| Metric | Before | After |
|--------|--------|-------|
| External API call volume | Baseline | -90% |
| External API spend | ~$15,000/day | ~$40/month |
| Organic impressions (YoY) | Baseline | +482% |
| LCP (Core Web Vital) | 4.6s | 2.3s |
The cost reduction was dramatic: from roughly $450K/month to approximately $40/month, a reduction of more than 99.9%. But the performance gains mattered just as much. By eliminating redundant network round-trips, page load times dropped significantly. The LCP improvement from 4.6s to 2.3s (measured via CrUX) came from the same architectural redesign: centralized retrieval removed frontend request waterfalls across dependent services.
The organic impression growth (+482% YoY) was not caused by caching alone. It was the compound effect of faster pages, better crawl efficiency, and the distributed indexing pipelines my teammate and I were building in parallel (covered in my SEO/Growth writeup). But the caching layer was a prerequisite — crawlers cannot index pages they time out on.
The Ads Platform Transition
In my final year at Eventbrite, I transitioned to the Ads platform, where I re-architected development workflows across three sub-teams and introduced real-time budget observability.
The problem: Budget fulfillment — how much of an advertiser's spend actually gets deployed — was a black box. Teams could not see whether campaigns were pacing correctly until invoicing, days or weeks later.
What I built: A real-time budget observability layer that gave the team immediate visibility into spend pacing, fulfillment rates, and campaign performance. This was instrumentation, not a new feature — but instrumentation that changed decisions.
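The core pacing math is simple enough to sketch. These function names and the linear pacing model are illustrative assumptions, not the actual Ads platform implementation:

```python
def fulfillment_rate(spent, budget):
    """Fraction of an advertiser's budget actually deployed."""
    return 0.0 if budget <= 0 else min(spent / budget, 1.0)

def expected_pace(elapsed_hours, total_hours):
    """Linear pacing target: fraction that should be spent by now."""
    return min(elapsed_hours / total_hours, 1.0)

def pacing_delta(spent, budget, elapsed_hours, total_hours):
    """Negative means under-pacing (budget going undeployed),
    positive means over-pacing. Surfacing this in real time is what
    turned invoicing-lag guesswork into an immediate signal."""
    return fulfillment_rate(spent, budget) - expected_pace(elapsed_hours, total_hours)
```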
The result: Budget fulfillment improved by 5%. More importantly, the improved visibility enabled a data-driven decision to sunset a third-party ML ranking system that was costing approximately $60K/month. The data showed that the ML system's marginal ranking improvement did not justify its cost. Without the observability layer, that call would have remained a debate rather than a decision.
What I Learned
Cost visibility changes behavior faster than cost optimization. The caching layer was technically straightforward. What made it impactful was the per-path instrumentation that came with it. When engineers could see that their code path was generating $3K/day in external calls, they started designing differently without being asked. The same pattern repeated on the Ads platform — real-time budget visibility changed team decisions within days.
Deduplication is the highest-leverage caching pattern. Traditional caching asks "have we seen this exact query before?" Deduplication asks "are we making the same request right now, in parallel, from different callers?" In a microservices architecture where the same page render triggers multiple services, deduplication catches waste that TTL-based caching misses entirely.
Infrastructure changes compound. The caching layer improved page speed, which improved crawl efficiency, which improved indexing, which improved organic visibility, which drove more traffic, which generated more revenue. No single metric tells the story. The 482% impression lift was the compound result of multiple infrastructure improvements landing in sequence. Systems thinking means understanding that a caching change is also a performance change is also an SEO change is also a revenue change.
Measure before you cut. On the Ads platform, I could have spent months trying to optimize the ML ranking system. Instead, I built the observability to measure whether it was working. It was not — at least not enough to justify $60K/month. The cheapest optimization is removing something that is not earning its cost.