Arepa.AI: Agentic AI Platform for Spanish-Speaking SMBs

Small businesses in Latin America operate in a different reality than Silicon Valley startups. They don't have engineering teams. They don't have data infrastructure. They often don't have a website. But they have the same operational problems that AI can solve: answering customer questions, scheduling appointments, managing inventory, and following up on leads.

Arepa.AI is the platform I'm building to bridge that gap. The name is a nod to my Venezuelan roots - the arepa is the most universal food in the culture, and this project aims to make AI equally accessible.

The Problem

Most AI tooling assumes English-first, enterprise-scale, technical users. That leaves out millions of small businesses across Latin America that could benefit from automation but can't afford a $200K consulting engagement or navigate English-language documentation.

The specific gaps I'm targeting:

Language: LLM performance degrades significantly in Spanish relative to English, especially with regional dialects and informal business communication
Cost: SMBs can't justify $0.50/query. The unit economics need to work at $0.01/query or less
Complexity: Business owners need to interact with AI through voice and WhatsApp, not dashboards

Tech Stack

| Layer | Technology | Rationale |
|---|---|---|
| Agent Orchestration | LangGraph | State machines over chains - business workflows map naturally to state graphs |
| Observability | LangSmith | Full trace visibility for debugging Spanish-language edge cases |
| Vector Store | Supabase (pgvector) | Isolated namespace per business, multilingual embedding support |
| Embeddings | Multilingual model | Preserves semantic accuracy across Spanish dialects |
| Infrastructure | AWS (Lambda, S3, CloudWatch) | Serverless execution with per-business cost isolation |
| IaC | Terraform | All infrastructure as code from day one |
| Voice | LiveKit | Same stack as celestino.ai |

Architecture Decisions

LangGraph for Agent Orchestration

I chose LangGraph over raw LangChain or custom orchestration for three reasons:

State machines over chains: Business workflows (lead qualification, appointment booking, follow-up sequences) map naturally to state graphs, not linear chains
Human-in-the-loop: SMB owners need to approve actions before the agent takes them. LangGraph's interrupt/resume model handles this cleanly
Observability: LangSmith integration gives me full trace visibility, which matters when debugging Spanish-language edge cases
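
To make the state-machine and human-in-the-loop points concrete, here is a minimal sketch in plain Python. It deliberately does not use LangGraph's API; it only illustrates the pause-and-resume pattern that an interrupt/resume model provides. The `Step` states, the `BookingState` fields, and the `run` function are hypothetical names chosen for this example, not part of Arepa.AI's codebase.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class Step(Enum):
    QUALIFY = auto()
    AWAIT_APPROVAL = auto()
    BOOK = auto()
    DONE = auto()


@dataclass
class BookingState:
    customer: str
    requested_slot: str
    step: Step = Step.QUALIFY
    log: list = field(default_factory=list)


def run(state: BookingState, approved: Optional[bool] = None) -> BookingState:
    """Advance the workflow until it finishes or needs the owner's decision."""
    while state.step is not Step.DONE:
        if state.step is Step.QUALIFY:
            state.log.append(f"qualified lead: {state.customer}")
            state.step = Step.AWAIT_APPROVAL
        elif state.step is Step.AWAIT_APPROVAL:
            if approved is None:
                # Pause: a real system would persist the state here and
                # ask the owner for a decision before continuing.
                return state
            state.log.append("owner approved" if approved else "owner declined")
            state.step = Step.BOOK if approved else Step.DONE
        elif state.step is Step.BOOK:
            state.log.append(f"booked {state.requested_slot}")
            state.step = Step.DONE
    return state


# First call stops at the approval gate; second call resumes with the decision.
state = run(BookingState(customer="Maria", requested_slot="2024-06-01 10:00"))
state = run(state, approved=True)
```

The state object is mutated in place, so the same instance carries the log across the pause. In a graph framework the equivalent of `AWAIT_APPROVAL` would be an interrupt point backed by a checkpointer rather than a hand-written loop.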

RAG Pipeline

The retrieval layer ingests business-specific content: menus, service lists, pricing, FAQs, and operating hours. Each business gets an isolated vector namespace in Supabase (pgvector).

Key design choices:

Chunking strategy: Semantic chunking tuned for Spanish sentence boundaries
Embedding model: Multilingual model (not English-only) to preserve semantic accuracy
Hybrid search: Combining vector similarity with keyword matching for proper nouns (business names, product names) that embeddings handle poorly
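
The hybrid search idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the production retrieval code: the `Chunk` type, the `alpha` weighting, and the token-overlap keyword score are all hypothetical choices for this example, and the in-memory `business_id` filter only stands in for the per-business namespace isolation that the vector store would enforce.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    business_id: str
    text: str
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tokens(s):
    # \w is Unicode-aware for str patterns, so accented Spanish letters are kept.
    return set(re.findall(r"\w+", s.lower()))


def keyword_score(query, text):
    """Fraction of query tokens that appear verbatim in the chunk."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def hybrid_search(query, query_embedding, chunks, business_id, alpha=0.5, top_k=3):
    """Rank one business's chunks by a weighted mix of vector and keyword scores."""
    scored = [
        (alpha * cosine(query_embedding, c.embedding)
         + (1 - alpha) * keyword_score(query, c.text), c)
        for c in chunks
        if c.business_id == business_id  # tenant isolation (assumed filter)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

With `alpha=1.0` this reduces to pure vector search; lowering `alpha` lets an exact match on a product name such as "Reina Pepiada" outrank a chunk that is only close in embedding space.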

Infrastructure

AWS: Lambda for agent execution, S3 for document storage, CloudWatch for monitoring
Terraform: All infrastructure is IaC from day one. No clicking in consoles
Cost ceiling: Hard limits on per-business monthly spend. If a business's agent costs exceed $50/month, something is wrong with the architecture
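
The two cost numbers in this write-up fit together: a $50/month ceiling at the $0.01/query target leaves room for roughly 5,000 queries per business per month. A minimal sketch of a per-business budget guard follows; the `BusinessUsage` class and its method names are hypothetical, and a real implementation would persist usage and reset it monthly rather than keep it in memory.

```python
from dataclasses import dataclass

MONTHLY_CEILING_USD = 50.00    # hard per-business limit stated above
TARGET_COST_PER_QUERY = 0.01   # unit-economics target from the problem statement


@dataclass
class BusinessUsage:
    business_id: str
    spend_usd: float = 0.0
    queries: int = 0

    def can_spend(self, estimated_cost_usd):
        """True if this query would keep the business within its monthly ceiling."""
        return self.spend_usd + estimated_cost_usd <= MONTHLY_CEILING_USD

    def record(self, cost_usd):
        """Record a query's cost, refusing anything that breaches the ceiling."""
        if not self.can_spend(cost_usd):
            raise RuntimeError(f"{self.business_id}: monthly cost ceiling reached")
        self.spend_usd += cost_usd
        self.queries += 1

    @property
    def avg_cost_per_query(self):
        return self.spend_usd / self.queries if self.queries else 0.0


# Queries that fit under the ceiling if every query hits the target cost.
queries_within_ceiling = round(MONTHLY_CEILING_USD / TARGET_COST_PER_QUERY)
```

Checking `can_spend` before invoking the agent turns the ceiling into a hard limit instead of an after-the-fact alert, and `avg_cost_per_query` gives a direct signal when a business drifts away from the target.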

Current Status

This project is in active development. What's working:

Core agent framework with LangGraph state management
RAG pipeline with Spanish-optimized chunking and retrieval
Voice interface prototype using the same LiveKit stack from celestino.ai
Terraform modules for multi-tenant AWS deployment

What's next:

WhatsApp Business API integration for the primary customer channel
Billing and usage metering per business
Onboarding flow that lets business owners configure their agent without code

Technical Rationale

I'm building Arepa.AI because it sits at the intersection of everything I've learned: production AI engineering from Eventbrite, data pipeline design from FlowWest, and product thinking from building celestino.ai. It's also the hardest version of the problem - making AI work reliably in a language and market that most tooling ignores.

This isn't a demo. It's a business I'm building in public, with the same engineering rigor I'd bring to any production system.

Work With Me On Something Similar

If you're building AI for non-English markets, multilingual SMB automation, or production RAG pipelines with real cost constraints, the architecture decisions documented here apply directly. The same systems thinking - unit economics, multilingual grounding, multi-tenant infrastructure - transfers to any domain.

Explore AI Consulting Services or submit an inquiry - I'll respond within one business day.
