Vapi Track/Vapi Fundamentals
Vapi Track
Module 1 of 6

Vapi Fundamentals

Set up your Vapi account, navigate the dashboard, configure API keys, and build your first voice agent from a system prompt.

16 min read

What You'll Learn

  • Understand what Vapi is, how it works as a voice AI infrastructure layer, and where it fits in the AI stack
  • Create a Vapi account, navigate the dashboard, and configure your first assistant with a model, voice, and system prompt
  • Generate and secure API keys for programmatic access to the Vapi REST API
  • Use the Vapi web dashboard to test a voice agent live in the browser before connecting a real phone number
  • Understand the core Vapi pricing model including per-minute costs for speech-to-text, LLM, and text-to-speech

What Is Vapi and Why It Matters

Vapi is a developer-first platform for building, testing, and deploying voice AI agents. Think of it as the infrastructure layer sitting between your LLM (like GPT-4o or Claude) and a real phone call. Vapi handles the hard parts: real-time speech-to-text transcription, streaming LLM responses, text-to-speech conversion, telephony integration, and low-latency orchestration - all in one API.

Before Vapi, building a voice agent meant stitching together Twilio for telephony, Deepgram or Whisper for STT, an LLM API for intelligence, ElevenLabs for voice output, and a WebSocket server to coordinate everything in real time. That stack is complex, expensive to maintain, and easy to get wrong on latency. Vapi collapses all of it into a single platform.

The core architecture: a caller dials in (or your system dials out), Vapi streams audio to its STT provider, the transcript goes to your chosen LLM with your system prompt, the LLM response streams back through TTS, and audio plays to the caller. This full round-trip takes 500-900ms end-to-end on Vapi, which is critical for natural-feeling conversation.

Vapi is language and framework agnostic. You interact with it via REST API, webhooks, or one of its SDKs (JavaScript, Python). It is used for AI receptionists, appointment booking bots, customer support agents, outbound sales dialers, and IVR replacement systems. The platform has grown rapidly because it dramatically reduces the time to go from "idea" to "working phone agent" from weeks to hours.

Quick Test: Create Your First Vapi Assistant

Go to dashboard.vapi.ai and create a free account (no credit card required).

Click "Assistants" in the left sidebar, then "Create Assistant."

Choose the Blank template, name it "Test Agent," and click "Create."

Note the assistant ID in the URL - you will use this in API calls later.

Dashboard Tour and First Assistant Setup

The Vapi dashboard has several key sections. Assistants is where you define your AI agents - their personality, LLM model, voice, and behavior. Phone Numbers is where you attach real phone numbers to assistants. Calls shows your call history and recordings. Files stores knowledge base documents. Analytics tracks usage and costs.

When creating an assistant, the most important settings are:

  • First Message: What the agent says when the call connects. Be specific - "Hello! Thank you for calling Acme Dental, this is Aria. How can I help you today?" is far better than a generic greeting.
  • System Prompt: The instructions that define the agent's role, persona, knowledge, and rules. This is your most powerful tool.
  • Model: The LLM provider and model (GPT-4o mini is the best cost-performance option for most use cases; Claude 3.5 Haiku is excellent for complex reasoning).
  • Voice: The TTS voice used to speak responses. Vapi supports ElevenLabs, PlayHT, Deepgram, OpenAI TTS, and others.
  • Transcriber: The STT provider. Deepgram Nova-2 is the default and generally best for accuracy and speed.

The Test Call button in the dashboard lets you call your agent directly from the browser using your microphone. This is the fastest way to iterate on prompts - no phone number required. Each test call appears in your Calls log with full transcript and recording.

System Prompt Best Practices

Start your system prompt with three things: role definition ("You are Aria, an AI receptionist for Acme Dental"), context ("You handle appointment scheduling and general inquiries"), and personality ("You are friendly, efficient, and empathetic"). Then add specific instructions and constraints. Keep it under 800 tokens for latency reasons - the entire prompt goes to the LLM on every turn.

API Keys and Programmatic Access

To use Vapi programmatically, you need two types of credentials: a Private API Key for server-side operations and a Public Key for client-side SDKs.

Find both under Organization Settings > API Keys. Your Private Key should never be exposed in client-side code or committed to Git. Use it only in server environments, backend workflows, or n8n. Your Public Key is safe to use in browser JavaScript for things like starting a web call from your website.

The Vapi REST API base URL is https://api.vapi.ai. Authentication uses Bearer token headers:

Authorization: Bearer your-private-api-key

Key API endpoints you will use constantly:

  • POST /assistant - Create a new assistant
  • GET /assistant/{id} - Retrieve assistant config
  • PATCH /assistant/{id} - Update assistant settings
  • POST /call - Initiate an outbound call
  • GET /call - List calls with filtering
  • GET /call/{id} - Get call details including transcript

For LLM provider keys, Vapi can use its own managed API keys (you pay Vapi) or you can supply your own OpenAI, Anthropic, or other provider keys under Provider Keys settings. Using your own keys is cheaper at scale but requires managing key rotation.

Understanding Vapi Pricing

Vapi pricing is usage-based and composed of three layers: the STT cost, the LLM cost, and the TTS cost. There is also a Vapi platform fee on top of the provider costs.

A typical call minute costs roughly:

  • Transcription (STT): ~$0.006-$0.01 per minute (Deepgram Nova-2)
  • LLM: ~$0.01-$0.05 per minute depending on model and conversation density
  • Voice (TTS): ~$0.01-$0.03 per minute (ElevenLabs being the most expensive)
  • Vapi platform fee: ~$0.05 per minute

Total: most simple assistants cost $0.05-$0.15 per minute. A 5-minute customer service call runs $0.25-$0.75. At scale, this adds up - a call center handling 10,000 minutes per month pays $500-$1,500 on Vapi alone.

Cost optimization strategies include using GPT-4o mini instead of GPT-4o (4x cheaper), choosing Deepgram Aura for TTS instead of ElevenLabs (3x cheaper), keeping system prompts concise, and enabling early call termination when the task is complete. Vapi provides cost analytics per assistant so you can see exactly where money is going.

The free tier includes $10 of credits with no time limit, which is enough to build and test a complete working agent before needing to add a payment method.

Try This Yourself

In the Vapi dashboard, go to Assistants and open the assistant you created. Set the First Message to "Hello! I am a test assistant. Ask me anything." and set the System Prompt to "You are a helpful assistant. Answer questions clearly and concisely. Keep responses under 3 sentences." Click Save, then click the Test Call button. Have a short conversation. Then check the Calls log - you will see the full transcript, duration, and cost breakdown.

Core Insights

  • Vapi is a full-stack voice AI infrastructure platform that handles STT, LLM orchestration, TTS, and telephony - eliminating the need to stitch together separate services
  • The Test Call button in the dashboard lets you iterate on prompts without a phone number, dramatically speeding up development
  • Private API keys are for server-side code only; Public keys are for browser SDKs - never mix them up
  • Call cost is the sum of STT + LLM + TTS + Vapi platform fees, typically $0.05-$0.15 per minute for standard agents
  • A well-crafted system prompt (role, context, personality, constraints) under 800 tokens is the single biggest driver of agent quality