斜杠中年 | AI × Communication × Business × Life
AI Creation & Tools

Creating AI-Generated Videos with Wan2GP, LTX 2.3, and OmniVoice: Behind-the-Scenes of the Johor Election Stone Lion Podcast Video

An in-depth look at how to combine Wan2GP, LTX 2.3, and OmniVoice AI tools to produce timely commentary videos, using the Johor election stone lion podcast as a case study.

2026-06-09 | Updated: 2026-06-09 | 8 min read | Wesley Chong
Tags: #Wan2GP, #LTX 2.3, #OmniVoice, #AI video generation, #AI voice synthesis, #content creation, #Johor election

Summary

Learn how to quickly produce commentary content with three AI tools: Wan2GP, a free open-source video generator, for prototyping; LTX 2.3 for quality enhancement; and OmniVoice for realistic multilingual voice synthesis.

Introduction

A recent YouTube video titled 《居銮石狮子都要下来做Podcast,柔佛州选大乱斗?》 (roughly, "Even the Kluang Stone Lion Has to Come Down and Host a Podcast: Is the Johor State Election a Free-for-All?") has sparked discussion across Malaysian social media. The video features the stone lion of Kluang as a podcast host, humorously commenting on the latest developments in Johor's state election. What makes the video particularly interesting is that it was generated entirely with AI tools: the creator used Wan2GP for video generation, LTX 2.3 for quality enhancement, and OmniVoice for realistic podcast voice synthesis.

This article details the characteristics of these tools, their collaborative workflow in video creation, and the implications of using AI tools for producing timely commentary content.


Tool Overview

Wan2GP: Low-VRAM-Friendly AI Video Generator

Wan2GP is an open-source AI video generation tool designed specifically for low-memory GPUs, capable of producing high-quality videos on consumer-grade graphics cards. It is based on optimized Wan series models, making it ideal for quickly generating short clips and social content.

LTX 2.3: Latest Open-Source AI Video Model

LTX 2.3 is an open-source AI video generation model released by Lightricks, supporting up to 4K resolution and 50 FPS video output with native audio generation. The model excels in text-to-video and image-to-video tasks.

OmniVoice: Multilingual AI Voice Cloning and TTS

OmniVoice is a voice generation platform supporting 600+ languages, featuring zero-shot voice cloning and natural speech synthesis capabilities. It can generate target voices from short audio samples or synthesize multilingual speech directly from text.


Video Creation Workflow

1. Conceptualization and Script Writing

First, the author wrote a podcast script based on the latest Johor election news, working in humorous elements such as the stone lion "coming down" from its pedestal, the election chaos, and local public reactions. The script mixed Chinese and English to make it more entertaining and shareable.
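Before any audio is generated, it can help to estimate roughly how long a mixed Chinese and English script will run, since that determines how much video is needed. The following is a minimal sketch, not part of the author's workflow: the function name and both speaking rates (about 4 Chinese characters per second and 2.5 English words per second) are assumptions chosen for illustration and should be tuned against a real voiceover.

```python
import re

# Common CJK ideographs (basic block only; an assumption that covers most script text).
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")
# Runs of Latin letters, optionally with apostrophes (e.g. "don't").
LATIN_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def estimate_voiceover_seconds(
    script: str,
    cjk_chars_per_second: float = 4.0,    # assumed speaking rate
    latin_words_per_second: float = 2.5,  # assumed speaking rate
) -> float:
    """Return a rough narration length for a mixed Chinese-English script."""
    cjk_chars = len(CJK_CHAR.findall(script))
    latin_words = len(LATIN_WORD.findall(script))
    return cjk_chars / cjk_chars_per_second + latin_words / latin_words_per_second


if __name__ == "__main__":
    line = "居銮石狮子都要下来做Podcast"
    print(f"{estimate_voiceover_seconds(line):.2f} seconds")
```

Counting Chinese characters and English words separately avoids treating a whole Chinese sentence as a single "word", which a plain whitespace split would do.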

2. Audio Generation (OmniVoice)

In OmniVoice, the author chose a middle-aged male voice for the stone lion. From an uploaded sample audio clip (or a voice from the built-in library), OmniVoice generated the complete podcast voiceover as a single audio file. The tool's multilingual support kept the Chinese pronunciation natural and fluent.

3. Base Video Generation (Wan2GP)

With the voiceover ready, the author input text descriptions of key scenes into Wan2GP. Examples:

  • "An ancient stone lion walking through the streets of Kluang town, with the Johor state government building in the background"
  • "The stone lion holding a microphone, speaking seriously about election results"

Wan2GP quickly generated base video clips for these scenes in a low-VRAM environment, though resolution and detail might be limited.
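One way to organize this step is to plan a shot list up front so that the clips add up to the length of the voiceover. The sketch below is only an illustration and does not call Wan2GP: the `Shot` class, the `plan_shots` helper, the 5-second clip cap, and the 16 FPS frame rate are all assumptions, so substitute whatever limits your model and settings actually impose.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    prompt: str
    seconds: float
    fps: int = 16  # assumed generation frame rate

    @property
    def frames(self) -> int:
        """Number of frames to request for this clip."""
        return round(self.seconds * self.fps)


def plan_shots(prompts, voiceover_seconds, max_clip_seconds=5.0, fps=16):
    """Split the voiceover into equal clips no longer than max_clip_seconds,
    cycling through the scene prompts in order."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    if voiceover_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    clip_count = math.ceil(voiceover_seconds / max_clip_seconds)
    clip_seconds = voiceover_seconds / clip_count
    return [
        Shot(prompts[i % len(prompts)], clip_seconds, fps)
        for i in range(clip_count)
    ]


PROMPTS = [
    "An ancient stone lion walking through the streets of Kluang town, "
    "with the Johor state government building in the background",
    "The stone lion holding a microphone, speaking seriously about election results",
]

if __name__ == "__main__":
    for number, shot in enumerate(plan_shots(PROMPTS, 42.0), start=1):
        print(number, f"{shot.seconds:.2f}s", shot.frames, shot.prompt[:40])
```

Using equal-length clips keeps the arithmetic simple; in practice you would likely adjust individual durations so that cuts fall on sentence boundaries in the voiceover.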

4. Video Enhancement (LTX 2.3)

To improve video quality, the author ran the initial Wan2GP clips through LTX 2.3 for a second pass. LTX 2.3's super-resolution and frame interpolation features improved clarity and smoothness, particularly in the stone lion's texture and motion details.

5. Audio-Video Synthesis and Post-Production

Finally, using video editing software (such as DaVinci Resolve or CapCut), the author synchronized the OmniVoice-generated audio with the LTX 2.3-enhanced video track. Subtitles, background music, and simple transition effects were added to complete the final video.
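Subtitles can also be prepared outside the editor and imported as a file; editors such as DaVinci Resolve and CapCut can generally import the widely used SubRip (.srt) format. The sketch below builds an SRT document from timed script lines. The cue timings and text are made-up sample values, and the helper names are hypothetical, not part of any tool mentioned here.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        if end <= start:
            raise ValueError(f"cue {index} ends before it starts")
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample_cues = [  # illustrative timings only
        (0.0, 2.5, "大家好,我是居銮石狮子。"),
        (2.5, 6.0, "Welcome to my podcast."),
    ]
    with open("stone_lion.srt", "w", encoding="utf-8") as handle:
        handle.write(build_srt(sample_cues))
```

Writing the file as UTF-8 matters here because the script mixes Chinese and English text.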


Results and Reflections

Through this workflow, the author produced a timely, entertaining AI-generated video on a very short turnaround. The video drew thousands of views and numerous comments on YouTube, with viewers frequently praising the realism of the stone lion's voice and the video's satirical tone.

Key Advantages:

  • Extremely Low Cost: All tools offer free tiers or open-source versions, avoiding traditional video production labor and equipment expenses.
  • Remarkable Speed: From concept to finished product in under 24 hours, enabling rapid response to hot topics.
  • Creative Freedom: AI tools made previously difficult-to-achieve concepts (like a stone lion podcast) realizable.

Limitations and Improvement Directions:

  • Generated videos occasionally exhibit slight "unnaturalness" (e.g., lip-sync mismatches), requiring manual adjustment.
  • For complex camera movements and multi-character interactions, AI still struggles to fully replace live-action filming.
  • Future exploration could involve more advanced models (e.g., Wan 2.2) or motion control techniques to improve consistency.

Conclusion

This case demonstrates the potential of modern AI toolchains in content creation. By combining Wan2GP (rapid prototyping), LTX 2.3 (quality enhancement), and OmniVoice (voice synthesis), creators can produce polished video content with a very low barrier to entry. The workflow is particularly well suited to rapid response and experimental expression in news commentary, social satire, and educational content.

As AI video and speech models continue to advance, we can expect more creators to use similar toolchains to express viewpoints, tell stories, and perhaps give even more stone lions their own podcast channels.


FAQs

Are these tools really free to use?

Yes, Wan2GP and LTX 2.3 are open-source projects, and OmniVoice offers a free tier. However, advanced features or commercial use may require payment.

How does the generated video quality compare to professional production?

For quick news content and social commentary, these tools are sufficient. However, for highly customized and complex shots in professional productions, there's still room for improvement.

What hardware do I need to run these tools?

Wan2GP is designed for low-VRAM GPUs, and LTX 2.3 and OmniVoice also have online versions, making them usable on consumer-grade computers.


Author

Wesley Chong

Software developer, digital consultant, and Toastmasters speaker from Kluang, Malaysia.

Focusing on helping ordinary people upgrade communication, expression, business, and life with AI.
