From Static to Motion: Creating AI Videos with LTX-Video
For the past few years, I’ve been exploring AI-generated imagery with Stable Diffusion, running locally. Recently, I discovered LTX-Video, a game-changing tool that can transform my static creations into dynamic videos of up to 257 frames (about ten seconds at 24 FPS), all processed locally on my machine. Paired with MMAudio, it lets me create complete audiovisual experiences without relying on cloud services.
Bringing Liminal Spaces to Life
I’ve always been fascinated by creepy, eerie, and liminal spaces in my AI art. Now, by adding the dimension of time, these unsettling environments take on a new level of immersion. The result? Lo-fi videos that capture the essence of classic creepypasta aesthetics.
Examples
Note: While some of the source images used in these examples were generated using Stable Diffusion, others are sourced from real photographs and CG artwork created by other artists. All motion and audio effects were applied locally using LTX-Video and MMAudio.
Whimsical and Liminal Spaces
A dreamy bedroom suspended in the clouds, where reality meets fantasy.
Nostalgic animation inspired by Super Mario 64’s iconic skyboxes.
A surreal journey through the skies on an endless train ride.
Liminal Spaces and Urban Exploration
An empty indoor pool, frozen in time.
The familiar made unfamiliar: empty suburban spaces.
Exploring abandoned urban spaces.
A familiar yet unsettling backrooms-inspired environment.
AI Quirks
An interesting AI failure case: when the model doesn’t know how to handle the input, it creates unexpected results.
Personal Projects
A recreation of a personal dream using AI imagery.
⚠️ Content Warning: Disturbing Imagery
The following section contains disturbing imagery.
A haunting scene in an abandoned house.
Inspired by SCP Foundation documentation.
An unsettling experimental piece.
Abstract horror exploration.
Technical Setup
For those interested in the technical aspects, I’m running:
- Stable Diffusion for the base images
- LTX-Video for motion generation (run locally), capable of generating 24 FPS video at 768x512 resolution in real time
- MMAudio for synchronized audio generation and atmospheric sound design
- ComfyUI as the interface to orchestrate the workflow
Processing everything locally gives me complete control over the creative process and allows for rapid iteration and experimentation. LTX-Video’s real-time generation, combined with MMAudio’s multimodal joint-training approach, enables me to create fully synchronized audiovisual experiences entirely on my own hardware.
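For those who would rather script this step than build a node graph, here is a minimal sketch of the image-to-video stage using the Hugging Face diffusers port of LTX-Video rather than my actual ComfyUI workflow. The input image, prompts, and file names are illustrative placeholders, and the parameters simply mirror the numbers above (768x512, 24 FPS, up to 257 frames); treat it as a starting point, not my exact setup.

```python
import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Load the LTX-Video image-to-video pipeline (weights download on first run).
pipe = LTXImageToVideoPipeline.from_pretrained(
    "Lightricks/LTX-Video", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# A still image to animate, e.g. a Stable Diffusion render (placeholder path).
image = load_image("liminal_pool.png")

prompt = (
    "An empty indoor swimming pool at night, flickering fluorescent lights, "
    "slow camera push forward, grainy lo-fi VHS footage"
)

# 768x512 at 24 FPS matches the resolution the model targets;
# 257 frames is the maximum clip length mentioned above.
video = pipe(
    image=image,
    prompt=prompt,
    negative_prompt="worst quality, inconsistent motion, blurry, jittery",
    width=768,
    height=512,
    num_frames=257,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "liminal_pool.mp4", fps=24)
# The silent clip is then handed to MMAudio to generate a synchronized soundtrack.
```

In ComfyUI the same stages exist as nodes, which makes it easy to tweak a prompt or seed and re-run only the part of the graph that changed.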
Acknowledgments
The workflow and inspiration for this project came from a post by Reddit user Qparadisee on r/StableDiffusion, which demonstrated the powerful combination of Stable Diffusion and LTX-Video for creating animated liminal spaces.