HeyGen Track
Module 1 of 6

HeyGen Fundamentals

Set up your account, browse the avatar gallery, create photo avatars, and understand the core use cases for AI avatar video.

16 min read

What You'll Learn

  • Navigate the HeyGen interface and understand the core workspace layout
  • Browse and select avatars from the public gallery of 300+ stock presenters
  • Create a basic AI video using a text script and a stock avatar
  • Use photo avatars to turn static images into speaking presenters
  • Identify the right HeyGen use case for training, marketing, and internal comms

What HeyGen Is and Why It Matters

HeyGen is an AI video generation platform that lets you produce professional presenter-led videos without a camera, crew, or studio. At its core, the platform combines three elements: a library of realistic AI avatars, a text-to-speech voice engine, and a video editor that assembles everything into a polished final product.

The typical use cases fall into three buckets. First, content at scale - marketing teams, course creators, and agencies that need to publish dozens of videos per month without proportional increases in production cost. Second, personalization - sales teams sending individualized outreach videos where the avatar addresses the recipient by name. Third, localization - companies that want to publish the same video in 10 or 20 languages without re-recording the presenter.

The platform was founded in 2020 and by 2025 had processed over 100 million videos. Enterprise customers include HubSpot, Zoom, and Salesforce. The key value proposition is simple: what once required a studio, camera operator, lighting crew, and professional presenter can now be done in a browser in under 30 minutes.

The HeyGen workspace has four main areas:

  • Video Studio - where you build multi-scene videos from templates or from scratch
  • Avatar Library - the gallery of stock and custom avatars
  • Video Translate - the standalone tool for dubbing and localizing existing footage
  • API and Integrations - for programmatic video generation at scale

For new users, the recommended starting point is the Video Studio with a stock avatar. You pick a template, drop in your script, choose a voice, and generate. The first video typically takes under 10 minutes to produce.

Quick Test: Complete the Full HeyGen Creation Loop

  1. Sign up for a free HeyGen account.
  2. Pick any stock avatar and type a 3-sentence script about your role or company.
  3. Choose a voice and click Generate.
  4. Watch the full loop: script to rendered MP4.

Do not overthink it - the goal is just to experience the end-to-end workflow once.

Navigating the Avatar Gallery

The HeyGen avatar gallery is divided into two main sections: public avatars and your personal avatars. Public avatars are the 300+ stock presenters HeyGen provides on all paid plans. They cover a wide range of demographics, styles, and settings - business casual, formal, casual lifestyle, studio white backgrounds, outdoor environments, and more.

Each avatar entry in the gallery shows a short preview clip so you can evaluate the gestures and natural movement before committing. Many avatars offer multiple "looks" - essentially the same face but in different outfits or backgrounds. This lets you maintain presenter consistency across videos while varying the visual context.

Filtering the gallery effectively:

  • Use the gender, age, and ethnicity filters to find presenters that match your audience
  • Filter by background type (plain, office, outdoor) based on your brand needs
  • Sort by "Most Used" to see which avatars other creators trust for professional content
  • Check the resolution badge - some avatars are available in 4K for premium accounts

Photo avatars are a different category entirely. Instead of using HeyGen's stock presenters, you upload a still image of a person or character and HeyGen animates it to speak your script. This is useful for bringing product mascots, illustrated characters, or historical figures (for educational content) to life. Quality depends heavily on the source image - a clear, front-facing photo with good lighting will produce the most natural result.

One important distinction: photo avatars use a different generation pipeline than the custom video avatars covered in Module 2. Photo avatars are instant (no training required) but offer a narrower gesture range and less realism than a proper Instant or Studio Avatar built from video footage.

Creating Your First Video from Script

The Script to Video workflow is the foundation of everything in HeyGen. Understanding it deeply makes every other feature more intuitive.

The core steps:

  1. Start a new project from the Video Studio. Choose blank canvas or pick a template as a starting point.
  2. Add your script by typing directly into the script panel, or pasting text. HeyGen supports scripts up to several thousand words split across multiple scenes.
  3. Select your avatar from the gallery. Position and resize the avatar within the scene using the visual editor.
  4. Choose a voice. HeyGen has 300+ voices across 40+ languages. You can also clone your own voice (covered in Module 2) or upload your own audio if you prefer to record the narration yourself.
  5. Set the scene background. Options include solid colors, gradient presets, uploaded images, video backgrounds, and screen share content.
  6. Add supporting elements - text overlays, logos, shapes, and transitions between scenes.
  7. Generate the video. HeyGen renders the full video in the cloud. Short videos (under 2 minutes) typically render in 2-5 minutes. Longer videos take proportionally more time.

Script writing tips that matter for AI video:

  • Write in short, punchy sentences. The AI lip sync performs best when words are not too long or densely packed.
  • Avoid acronyms the TTS engine will not pronounce correctly. Spell them out or use phonetic approximations.
  • Add punctuation deliberately. Commas and periods control pacing. A comma produces a short pause; a period produces a longer one.
  • Test a 30-second clip before generating the full video to catch pronunciation issues early.
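The tips above can be turned into a rough pre-flight check that you run on a script before pasting it into HeyGen. The sketch below is only an illustration and does not use any HeyGen feature or API: the 20-word sentence limit and the 150 words-per-minute pace are assumptions, and all function names are hypothetical. It flags long sentences, lists all-caps acronyms to spell out, and cuts a preview of roughly 30 seconds from the start of the script.

```python
import re

# Both values are assumptions for illustration, not HeyGen limits.
MAX_WORDS_PER_SENTENCE = 20
WORDS_PER_MINUTE = 150


def split_sentences(script: str) -> list[str]:
    """Split on sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]


def find_acronyms(script: str) -> list[str]:
    """Return unique all-caps tokens of two or more letters, in order."""
    seen: list[str] = []
    for token in re.findall(r"\b[A-Z]{2,}\b", script):
        if token not in seen:
            seen.append(token)
    return seen


def long_sentences(script: str, limit: int = MAX_WORDS_PER_SENTENCE) -> list[str]:
    """Return sentences whose word count exceeds the limit."""
    return [s for s in split_sentences(script) if len(s.split()) > limit]


def preview_clip(script: str, seconds: int = 30, wpm: int = WORDS_PER_MINUTE) -> str:
    """Take whole sentences from the start until the word budget is used up."""
    budget = seconds * wpm // 60
    chosen: list[str] = []
    used = 0
    for sentence in split_sentences(script):
        count = len(sentence.split())
        # Always keep the first sentence, even if it alone exceeds the budget.
        if chosen and used + count > budget:
            break
        chosen.append(sentence)
        used += count
    return " ".join(chosen)
```

Render the text returned by `preview_clip` first. If the acronyms or long sentences it contains sound wrong, fix the script before generating the full video.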

Try This Yourself

Create a two-scene video. Scene 1: introduce a product or topic (30 seconds). Scene 2: summarize three key benefits (30 seconds). Use a different background for each scene. This teaches you multi-scene editing and transition controls in a single short project.
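To hit the 30-second target per scene, it helps to know roughly how many words fit. The sketch below assumes a narration pace of 150 words per minute, which is an assumption rather than a HeyGen figure; time one real render and adjust it for your chosen voice. The function names and the 5-second tolerance are hypothetical.

```python
WORDS_PER_MINUTE = 150  # assumed pace; calibrate against a real render


def estimate_seconds(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate narration time from the word count."""
    return len(script.split()) * 60 / wpm


def word_budget(seconds: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Return how many words fit in the given duration."""
    return int(seconds * wpm / 60)


def check_scenes(
    scenes: dict[str, str],
    target_seconds: float = 30,
    tolerance: float = 5,
) -> dict[str, str]:
    """Label each scene as 'ok', 'too long', or 'too short' against the target."""
    report: dict[str, str] = {}
    for name, script in scenes.items():
        estimate = estimate_seconds(script)
        if estimate > target_seconds + tolerance:
            report[name] = "too long"
        elif estimate < target_seconds - tolerance:
            report[name] = "too short"
        else:
            report[name] = "ok"
    return report
```

Under this assumption, a 30-second scene holds about 75 words, so the whole exercise needs a script of roughly 150 words.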

Core Use Cases and Business Applications

HeyGen is not a single-purpose tool. The platform is used across remarkably different industries for different reasons, but a few use case patterns emerge consistently.

eLearning and training content is the most common starting point for teams. Traditional course production requires scheduling a presenter, booking a studio, setting up lighting, recording, editing, and re-recording whenever the script changes. With HeyGen, a learning and development team can update a course module by simply editing the script and re-generating. No studio booking, no presenter availability conflict. The School of AI case study shows this in practice: they scaled from producing a handful of courses to 10x output using the same headcount.

Marketing and product videos are another high-volume use case. Landing page explainer videos, product walkthroughs, feature announcement videos, and ad creative can all be produced in the same workflow. The key advantage is iteration speed - if the marketing message changes, you update the script, not the video shoot.

Internal communications represent an underrated application. Executive announcements, HR onboarding modules, compliance training, and all-hands summaries can be delivered via a consistent AI presenter. This is especially valuable for globally distributed teams where asynchronous video communication is preferred over live meetings.

Sales outreach with personalized video is covered in depth in Module 4, but it is worth noting here as a fundamentally different use case from the others. Instead of one video published to many viewers, you generate hundreds of unique videos, each customized with the prospect's name, company, or specific context.
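The "one template, many scripts" idea can be sketched before Module 4 introduces the actual tooling. The example below only prepares the per-prospect script text from a CSV export; it does not call HeyGen. The column names (`name`, `company`, `context`), the template wording, and the function name are all hypothetical.

```python
import csv
import io
from string import Template

# Hypothetical template; the $placeholders must match the CSV column names.
SCRIPT_TEMPLATE = Template(
    "Hi $name, I noticed $company is $context. "
    "I recorded this short video to show how we can help."
)
REQUIRED_FIELDS = ("name", "company", "context")


def build_scripts(csv_text: str) -> list[dict[str, str]]:
    """Return one personalized script per prospect row."""
    scripts = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Reject rows with missing or blank fields so no video says "Hi None".
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            raise ValueError(f"row {row!r} is missing fields: {missing}")
        scripts.append(
            {"prospect": row["name"], "script": SCRIPT_TEMPLATE.substitute(row)}
        )
    return scripts
```

Each resulting script would then be rendered as its own video. Validating every row first matters more here than in one-to-many publishing, because a single bad field produces a visibly broken video for a real prospect.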

The common thread across all these use cases is the ratio of production time to output quality. HeyGen does not produce the same cinematic quality as a Hollywood-grade production, but for the vast majority of business video needs, the quality is entirely sufficient and the time savings are dramatic.

Core Insights

  • HeyGen combines AI avatars, text-to-speech, and a visual editor into a single platform that replaces the traditional video production workflow for most business use cases.
  • The public avatar gallery includes 300+ presenters with multiple looks - filtering by demographic and watching preview clips before selecting saves significant time.
  • The Script to Video workflow follows seven steps from project creation to render, and short test clips before full generation prevent wasted render time.
  • Photo avatars provide an instant path to animated presenters from still images, but produce less natural movement than custom video avatars.
  • The highest-ROI use cases for HeyGen are eLearning content at scale, marketing video iteration, internal communications, and personalized sales outreach.