Presentation livestreams look simple until you operate one for 3 hours.
A “boring” talk typically needs:
- a clean speaker shot (ideally tracked)
- slides when they matter
- a wide stage shot as safety
- audience cutaways (optional but improves production value)
- PiP layouts (speaker + slides)
Doing this manually is repetitive and error-prone, especially with small crews or long programs.
This guide shows how to build an automated, operator-friendly workflow using PTZ cameras with hardware switchers like Blackmagic ATEM or software mixers like vMix. The emphasis is on production-quality rules: clean cuts, no live PTZ moves, and predictable behavior.
Baseline automation:
- Track the presenter (framing stays correct)
- Switch between speaker / slides / wide / PiP at the right moments
- Use audio to detect “who is speaking” (single or multi-speaker)
Advanced goals:
- Detect slide changes and react
- Detect applause/laughter and cut to audience
- Multi-speaker panel automation using per-speaker mics + presets
A reliable automation workflow starts with a stable shot plan:
Recommended minimal set
- Stage (wide): static safety shot covering the whole stage (required)
- Speaker: PTZ that follows the presenter
- Presentation: direct feed of slides (HDMI/SDI capture or NDI®)
- Optional: Audience camera for reactions
- Optional: PiP input (speaker + slides combined)
Select your cameras: Depending on your budget constraints and requirements, you can choose from a wide range of PTZ cameras from Telycam, Panasonic, Sony, Canon, Marshall, BirdDog, or Obsbot.
Tracking and good switching depend on latency.
Common input paths:
- Capture card (SDI/HDMI) → usually the lowest latency
- NDI® → moderate latency (often acceptable); many PTZ cameras support NDI® natively
- MJPEG streams → moderate latency (often acceptable); almost all PTZ cameras support this
- SRT and RTSP → high latency; useful for sending video feeds over the internet but not for tracking
Slides are often the source of last-minute chaos.
Options:
- HDMI out of the presentation machine → capture → switcher
- NDI® output from the presentation machine → receiver → switcher/automation
Rules:
- Use a stable resolution
- Avoid OS popups / notifications
Single-speaker talk
- One clean mic feed + optional room mic
Multi-speaker panel / town hall
- You’ll get better automation with:
  - one mic channel per speaker (or at least per speaking position)
  - plus optional room/audience mic
Why it matters:
- Active-speaker logic usually requires isolated channels.
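To illustrate why isolated channels matter, here is a minimal sketch of level-based active-speaker detection. It assumes each mic arrives as a block of float samples in the range -1.0 to 1.0. The function names, the -40 dBFS threshold, and the 6 dB margin are illustrative assumptions, not values from any particular product.

```python
import math
from typing import Dict, Optional, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """Return the RMS level of one audio block in dBFS (samples in -1.0..1.0)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def active_speaker(blocks: Dict[str, Sequence[float]],
                   threshold_db: float = -40.0,   # assumed gate level
                   margin_db: float = 6.0) -> Optional[str]:
    """Return the loudest channel if it is above the gate and clearly ahead of the rest."""
    levels = sorted(((rms_dbfs(b), name) for name, b in blocks.items()), reverse=True)
    if not levels or levels[0][0] < threshold_db:
        return None                      # nobody is speaking
    if len(levels) > 1 and levels[0][0] - levels[1][0] < margin_db:
        return None                      # crosstalk: no clear winner
    return levels[0][1]
```

With a single shared room mic there is only one level to compare, so the margin test has nothing to work with. That is the practical reason per-speaker channels make this logic reliable.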
You need a switcher or software that can be driven by automation.
Typical choices:
- Blackmagic ATEM (hardware)
- vMix (software)
- Ross / Roland / OBS
Your automation layer must know:
- input mapping
- preview/program states
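A minimal sketch of what that knowledge can look like in code, assuming a hypothetical `SwitcherState` class that only records actions in a log rather than talking to a real switcher. The shot names and input numbers are placeholders. The `recall_preset` method shows one way to enforce the "no live PTZ moves" rule: a camera is taken off program before it is allowed to move.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SwitcherState:
    """Tracks input mapping plus program/preview; records actions instead of sending them."""
    inputs: Dict[str, int]                 # shot name -> switcher input number (hypothetical)
    program: str = "wide"
    preview: str = "speaker"
    log: List[str] = field(default_factory=list)

    def cut(self, shot: str) -> None:
        """Put `shot` on program with a clean cut; the old program shot moves to preview."""
        if shot not in self.inputs:
            raise KeyError(f"unknown shot: {shot}")
        if shot == self.program:
            return
        self.preview, self.program = self.program, shot
        self.log.append(f"CUT -> input {self.inputs[shot]} ({shot})")

    def recall_preset(self, shot: str, preset: int, fallback: str = "wide") -> None:
        """Recall a PTZ preset only while that camera is off program."""
        if shot == fallback:
            raise ValueError("the fallback shot is static and has no presets")
        if shot == self.program:
            self.cut(fallback)             # get the camera off air before it moves
        self.log.append(f"PRESET {preset} on {shot}")
```

In a real setup the log entries would be replaced by calls to whatever control interface the switcher and cameras expose.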
There are three common models:
- Start with wide
- Show slides at fixed segments
- Return to speaker
- Manual override for special moments
Good for:
- scripted / rehearsed talks
This can be easily implemented with most switchers using macros or plugins (for software mixers). However, it might not be flexible enough for dynamic presentations.
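As a rough sketch of this timeline model outside of any specific switcher, a cue list can be a sorted list of (start time, shot) pairs with a lookup by elapsed time. The cue times and shot names below are invented for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

Cue = Tuple[float, str]   # (start time in seconds, shot name)

# Hypothetical rundown; must be sorted by start time.
CUES: List[Cue] = [
    (0.0, "wide"),
    (30.0, "speaker"),
    (300.0, "slides"),
    (360.0, "speaker"),
]


def shot_at(cues: List[Cue], elapsed_s: float, default: str = "wide") -> str:
    """Return the shot whose cue most recently started at or before `elapsed_s`."""
    times = [t for t, _ in cues]
    index = bisect_right(times, elapsed_s) - 1
    return cues[index][1] if index >= 0 else default
```

For example, `shot_at(CUES, 45.0)` returns `"speaker"`. The limitation is visible in the data: if the presenter reaches the slide segment a minute late, the cue list does not know.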
- Speaker mic active → show speaker
- Silence → wide
- Audience mic active → audience
- Optional: slide change triggers slide shots
Good for:
When implemented well, this gets you about 80% of the way there, but getting it right takes substantial programming effort.
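The core of these rules fits in a few lines. The sketch below assumes mic levels are already available in dBFS; the function name, the -40 dBFS threshold, and the priority order (slides, then speaker, then audience, then wide) are illustrative assumptions.

```python
def choose_shot(speaker_db: float, audience_db: float,
                slide_changed: bool = False,
                threshold_db: float = -40.0) -> str:
    """Map the current signals to a shot name using fixed-priority rules."""
    if slide_changed:
        return "slides"        # optional rule: a new slide gets shown
    if speaker_db >= threshold_db:
        return "speaker"       # speaker mic active
    if audience_db >= threshold_db:
        return "audience"      # audience mic active
    return "wide"              # silence falls back to the safety shot
```

The rules themselves are the easy part. Most of the programming effort goes into what surrounds them: smoothing noisy levels, enforcing minimum shot lengths, and keeping cameras from moving while on program.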
- Uses audio + video AI analysis to decide:
  - when slides matter
  - when to show speaker
  - when to show wide / audience
- Coordinates PTZ moves with switcher states
Good for:
- presentations, panel discussions, long conferences, universities, town halls
This is the class of workflow MiruSuite is designed for: automated, high-quality production with minimal operator intervention.
MiruSuite is a ready-made solution that can be integrated into existing AV workflows, without the need for custom programming.
You need:
- at least one tracking-capable speaker camera (PTZ + tracking)
- a slide feed
- a switcher that can be controlled
- an automation layer like MiruSuite that handles switching and prevents live camera moves
Start with: wide + speaker + slides.
You either:
- drive switching with rules (audio activity + slide change detection)
- or use an AI auto-cut layer that analyses the slide feed and decides when it’s “interesting” to show slides.
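For the rule-based option, slide change detection can be approximated with simple frame differencing. This is a minimal sketch, assuming each frame of the slide feed has already been downscaled to a flat list of grayscale values from 0 to 255. The function names and both thresholds are illustrative, and this is plain pixel comparison, not the analysis an AI auto-cut layer performs.

```python
from typing import Sequence


def changed_fraction(prev: Sequence[int], curr: Sequence[int],
                     pixel_delta: int = 25) -> float:
    """Return the fraction of pixels whose brightness changed by more than `pixel_delta`."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    if not prev:
        return 0.0
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_delta)
    return changed / len(prev)


def is_slide_change(prev: Sequence[int], curr: Sequence[int],
                    min_fraction: float = 0.2) -> bool:
    """Treat the frame as a new slide when enough of the image has changed."""
    return changed_fraction(prev, curr) >= min_fraction
```

A threshold like this ignores a blinking cursor or a small animation, but it cannot tell whether the new slide is worth showing. That judgment is what the second option adds.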
Yes, if you have:
- mic channels per speaker (or per position)
- presets for each speaker from at least one camera (two cameras is better)
- logic that avoids flip-flops during crosstalk
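One common way to avoid flip-flops is to combine a minimum hold time with a short confirmation window. The sketch below assumes a per-frame speaker candidate is already available (for example from per-channel mic levels). The class name and the default timings are hypothetical.

```python
from typing import Optional


class StableSpeaker:
    """Debounces speaker changes so brief crosstalk does not cause a cut."""

    def __init__(self, min_hold_s: float = 2.0, confirm_s: float = 0.5) -> None:
        self.min_hold_s = min_hold_s      # shortest time a speaker stays selected
        self.confirm_s = confirm_s        # how long a new speaker must persist
        self.current: Optional[str] = None
        self._since = 0.0                 # when `current` was selected
        self._pending: Optional[str] = None
        self._pending_since = 0.0         # when `_pending` was first seen

    def update(self, candidate: Optional[str], now: float) -> Optional[str]:
        """Feed the latest detection and timestamp; return the speaker to show."""
        if candidate is None or candidate == self.current:
            self._pending = None          # interruption ended, keep the current shot
            return self.current
        if candidate != self._pending:
            self._pending, self._pending_since = candidate, now
        held = self.current is None or now - self._since >= self.min_hold_s
        confirmed = now - self._pending_since >= self.confirm_s
        if held and confirmed:
            self.current, self._since, self._pending = candidate, now, None
        return self.current
```

A short interjection from another panelist resets the pending candidate and never reaches program, while a genuine handover is followed once it has lasted long enough.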
If you’re producing talks that are:
- long
- repetitive
- high volume (universities, conferences, internal town halls)
MiruSuite is designed for exactly these formats:
- automated person tracking
- automated live cutting (“AutoCut”) for presentation scenarios
- slide-aware decisions
- switcher integration so cameras don’t move on program
- manual override when needed
MiruSuite supports common PTZ ecosystems (Panasonic, Sony, Canon, Marshall, BirdDog, Telycam) and typical switching stacks (Blackmagic ATEM, Ross Video, vMix, OBS, Roland AV).