name: create-movie
description: >
  Orchestrated movie creation for Horus persona. Guides through phases: Research → Script → Build Tools → Generate → Assemble. Uses Docker-isolated coding environment, free/open-source tools only, with full memory integration.
allowed-tools: [Bash, Read, Write, Task, WebFetch, WebSearch]
triggers:
- create movie
- make movie
- make film
- create film
- horus filmmaking
- horus movie
- create mockumentary
- create short film
- create music video
- vibe coding movie
- ai movie creation
- study filmmaking
- learn cinematography
- horus study
- horus learn filmmaking
metadata:
  short-description: "Orchestrated movie creation (Research → Script → Build → Generate → Assemble)"
  author: "Horus"
  version: "0.1.0"
create-movie
Orchestrated movie creation for Horus persona. Creates mockumentaries, short films, music videos, and educational content through a phased workflow.
Philosophy
"AI isn't the artist, it's the amplifier" - Nobody & The Computer
Horus uses AI to turn imagination into audiovisual reality. He doesn't just use pre-built tools - he writes code to create his own tools.
Phases
HARDWARE CHECK → RESEARCH → SCRIPT → BUILD TOOLS → GENERATE → ASSEMBLE → LEARN
Phase 0: Hardware Detection (Automatic)
Before any generation, the orchestrator automatically detects hardware via /ops-workstation:
# Automatic hardware check on startup
./run.sh create "prompt"
# → Calls /ops-workstation gpu to detect VRAM
# → Calls /ops-workstation memory to detect RAM
# → Auto-selects optimal model variant
Auto-Selection Logic:
| Detected VRAM | Model Selected | Settings |
|---|---|---|
| ≥24GB | LTX-2 19B FP8 | 720p/1080p, audio on, batch=1 |
| 16-23GB | LTX-2 19B FP4 | 720p only, audio on, batch=1 |
| 12-15GB | LTX-2 Distilled 2B | 720p, audio optional, batch=1 |
| <12GB | RunPod suggested | Prompts to use /ops-runpod |
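The auto-selection table can be read as a chain of lower-bound checks. The sketch below is only an illustration of that mapping, not the orchestrator's actual code; `ModelChoice` and `select_model` are hypothetical names, and fractional VRAM values (for example 15.5GB) are assumed to fall into the next lower tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    """Hypothetical container for one row of the auto-selection table."""
    model: str
    settings: str


def select_model(vram_gb: float) -> ModelChoice:
    """Map detected VRAM (in GB) to a model variant, treating each tier as a lower bound."""
    if vram_gb >= 24:
        return ModelChoice("LTX-2 19B FP8", "720p/1080p, audio on, batch=1")
    if vram_gb >= 16:
        return ModelChoice("LTX-2 19B FP4", "720p only, audio on, batch=1")
    if vram_gb >= 12:
        return ModelChoice("LTX-2 Distilled 2B", "720p, audio optional, batch=1")
    # Below 12GB the table suggests cloud generation instead of a local model.
    return ModelChoice("RunPod suggested", "Prompts to use /ops-runpod")
```

For example, `select_model(24).model` gives the FP8 variant, while `select_model(8)` returns the RunPod suggestion.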
RAM-Based Optimizations:
| Detected RAM | Optimization |
|---|---|
| ≥128GB | Weight streaming enabled (offload to RAM) |
| 64-127GB | Partial offloading |
| <64GB | No offloading, strict VRAM limits |
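The RAM tiers follow the same pattern. This is a minimal sketch under the assumption that the thresholds are lower bounds; `offload_strategy` is a hypothetical helper name, and the returned strings simply mirror the table.

```python
def offload_strategy(ram_gb: float) -> str:
    """Pick an offloading mode from detected system RAM (in GB), per the table above."""
    if ram_gb >= 128:
        return "Weight streaming enabled (offload to RAM)"
    if ram_gb >= 64:
        return "Partial offloading"
    # Under 64GB there is no spare RAM to offload into.
    return "No offloading, strict VRAM limits"
```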
Override Auto-Detection:
# Force specific model variant
./run.sh create "prompt" --model ltx2-fp4
./run.sh create "prompt" --model ltx2-distilled
./run.sh create "prompt" --runpod # Force cloud generation
Phase 1: Research (Library-First)
- Check Horus's Library First:
  - horus-filmmaking scope (past techniques, learnings)
  - horus_lore scope (YouTube transcripts, film analysis)
  - Ingested movies with emotion tags
  - Episodic archive (past filmmaking sessions)
- Search for New Resources:
  - /ingest-movie search for films to watch
  - /ingest-youtube search for tutorials
- Deep Web Research:
  - /dogpile for comprehensive multi-source search
  - /surf for specific tutorials/references
Phase 2: Script (via /create-story)
- Integrates with /create-story skill for screenplay generation
- Uses Chutes models (chimera, qwen, deepseek-r1) for creative writing
- Parses INT./EXT. headings, dialogue, action, audio cues
- Outputs structured scene breakdown with visual descriptions
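To make the INT./EXT. parsing step concrete, here is a rough sketch of how a screenplay could be split into scenes. It is not the skill's actual parser: `split_scenes` and the dictionary keys are hypothetical, and dialogue, action, and audio cues are all kept as undifferentiated lines.

```python
import re

# A scene heading starts with INT. or EXT. followed by the location text.
SCENE_HEADING = re.compile(r"^(INT\.|EXT\.)\s+(.+)$")


def split_scenes(screenplay: str) -> list[dict]:
    """Group screenplay lines under the most recent INT./EXT. heading."""
    scenes: list[dict] = []
    for raw in screenplay.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = SCENE_HEADING.match(line)
        if match:
            scenes.append(
                {"setting": match.group(1), "location": match.group(2), "lines": []}
            )
        elif scenes:
            scenes[-1]["lines"].append(line)
        # Lines that appear before the first heading are dropped in this sketch.
    return scenes
```

Each resulting dictionary could then be enriched with visual descriptions before being written to script.json.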
Format Options:
- screenplay (default) - Standard INT./EXT. scene headings
- mockumentary - Interview segments with talking heads + B-roll
- reconstruction - Historical recreation with narrator framing
Phase 3: Build Tools
- Write code in Docker-isolated sandbox
- Create custom tools for specific effects
- Iterate on approaches
Phase 4: Generate
- Use ComfyUI, Stable Diffusion for images
- Use auto-selected video model based on hardware (LTX-2 FP8/FP4/Distilled)
- Use Whisper, IndexTTS2 for audio
- If hardware is insufficient, automatically suggests /ops-runpod
Phase 5: Assemble
- Combine assets with FFmpeg
- Output MP4 video or interactive HTML
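One common way to combine clips with FFmpeg is its concat demuxer, which reads a text file listing the inputs. The sketch below only builds the list text and the argument vector; whether the assemble phase works this way is an assumption, and `concat_list_text` and `concat_command` are hypothetical helpers. It also assumes clip paths contain no single quotes and that all clips share the same codec settings, since `-c copy` does not re-encode.

```python
def concat_list_text(clips: list[str]) -> str:
    """Render the input list expected by FFmpeg's concat demuxer, one clip per line."""
    return "".join(f"file '{clip}'\n" for clip in clips)


def concat_command(list_path: str, output: str) -> list[str]:
    """Build the ffmpeg argument vector that joins the listed clips without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # allow paths the demuxer would otherwise reject as unsafe
        "-i", list_path,
        "-c", "copy",     # stream copy: fast, but requires matching codecs
        output,
    ]
```

The list text would be written to a file such as `clips.txt`, and the command passed to `subprocess.run(..., check=True)`.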
Phase 6: Learn
- Store successful techniques in /memory
- Remember what worked for future movies
Quick Start
cd .pi/skills/create-movie
# Full orchestrated workflow (recommended)
./run.sh create "A 30-second film about discovering colors"
# With options
./run.sh create "film noir detective" \
--duration 60 \
--style "high contrast, shadows, venetian blinds" \
--format mp4 \
--work-dir ./noir_project
# Individual phases (for manual control)
./run.sh research "film noir lighting techniques"
./run.sh script --from-research research.json --duration 30 --use-create-story
./run.sh build-tools --script script.json
./run.sh generate --tools ./tools --script script.json --style "cinematic"
./run.sh assemble --assets ./assets --output movie.mp4 --format mp4
./run.sh learn --project-dir ./movie_project
CLI Commands
create
Full orchestrated workflow through all phases.
./run.sh create PROMPT [OPTIONS]
--output, -o Output file (default: movie.mp4)
--work-dir, -w Working directory (default: ./movie_project)
--duration, -d Target duration in seconds (default: 30)
--style, -s Visual style (e.g., 'cinematic', 'film noir')
--format, -f Output format: mp4 or html (default: mp4)
--store-learnings Store learnings in memory (default: true)
--skip-research Skip research phase if research.json exists
research
Library-first research: checks Horus's memory and ingested content before external search.
./run.sh research TOPIC [OPTIONS]
--output, -o Output file (default: research.json)
--skip-external Only search library, skip external sources
script
Generate screenplay with scene breakdown. Integrates with /create-story.
./run.sh script [OPTIONS]
--from-research, -r Research JSON file (required)
--prompt, -p Override topic from research
--duration, -d Target duration in seconds
--use-create-story Use /create-story skill for screenplay
--model, -m LLM model (default: chimera)
--output, -o Output file (default: script.json)
build-tools
Generate custom tools in Docker sandbox.
./run.sh build-tools [OPTIONS]
--script, -s Script JSON file (required)
--output-dir, -o Output directory (default: ./tools)
--skip-docker Use host instead of Docker sandbox
generate
Create images, video, and audio assets.
./run.sh generate [OPTIONS]
--tools, -t Tools directory (default: ./tools)
--script, -s Script JSON file (required)
--output-dir, -o Assets output directory (default: ./assets)
--style Visual style to apply
assemble
Combine assets into final output.
./run.sh assemble [OPTIONS]
--assets, -a Assets directory (required)
--output, -o Output file/directory (required)
--format, -f Output format: mp4 or html (default: mp4)
--fps Frames per second for MP4 (default: 24)
learn
Store filmmaking insights in memory after a project.
./run.sh learn [OPTIONS]
--project-dir, -p Project directory (required)
--scope Memory scope (default: horus-filmmaking)
--dry-run Show learnings without storing
study
Pre-phase: Learn filmmaking topics BEFORE creating movies. Runs a targeted /dogpile search across internal (memory) and external (web) sources, then stores the results via /memory learn.
./run.sh study TOPIC [OPTIONS]
--scope Memory scope (default: horus-filmmaking)
--deep/--quick Deep research (dogpile) vs quick (YouTube search)
--list-topics Show suggested filmmaking topics
# Examples:
./run.sh study "cinematography lighting techniques" --deep
./run.sh study "camera framing composition" --deep
./run.sh study --list-topics
study-all
Comprehensive learning session - studies all core filmmaking topics.
./run.sh study-all [OPTIONS]
--scope Memory scope (default: horus-filmmaking)
Output Formats
MP4 Video
Standard video file, playable anywhere.
Interactive HTML
Web-based experience with:
- Frame-by-frame navigation
- Audio controls
- Scene metadata viewer
Available Skills
Horus has access to all skills in .pi/skills/:
| Skill | Purpose in Movie Creation |
|---|---|
| /dogpile | Deep research on techniques, references |
| /surf | Visit websites, tutorials, references |
| /memory | Recall prior techniques, store learnings |
| /create-image | Generate images for scenes |
| /tts-train | Horus's voice for narration |
| /ingest-movie | Ingest reference movies for style analysis |
| /create-paper | Write stories, scripts, creative content |
| /episodic-archiver | Archive movie creation sessions |
| /anvil | Debug and harden custom tools |
| /ingest-book | Search books for story inspiration |
Free/Open-Source Tools
| Purpose | Tool |
|---|---|
| Image Generation | Stable Diffusion (ComfyUI) |
| Video Generation | LTX-2 (recommended), Mochi 1, CogVideoX (fallbacks) |
| Video Processing | FFmpeg |
| Speech-to-Text | faster-whisper |
| Text-to-Speech | IndexTTS2 |
Video Model Selection Guide
Choose a video model based on your GPU VRAM and use case. VRAM figures assume batch=1 and FP8/FP4 where noted, and include 3-5GB of headroom for pipeline overhead (ComfyUI, loader, audio).
| VRAM | Recommended Models | Best For |
|---|---|---|
| 12GB (RTX 3060/4070) | LTX-2 Distilled (2B), CogVideoX-2B | Quick iterations, pre-viz |
| 16GB (RTX 4080/A4000) | LTX-2 19B FP4 (720p, ≤10s), WAN 2.2, SVD | Medium quality production |
| 24GB (RTX 4090/A5000) | LTX-2 19B FP8 (recommended), WAN 2.2, Mochi | High quality production |
| 40GB+ (A100/H100) | LTX-2 BF16 (43GB), Full Mochi, Open-Sora 2.0 | Maximum quality |
Safe Defaults (RTX A5000 24GB)
Model: LTX-2 19B FP8
Resolution: 720p
Clip length: 10s
Batch size: 1
Seed: fixed
Audio: on
If runtime VRAM >22GB or instability occurs: lower resolution to 540p, disable audio, or shorten clips. Avoid parallel jobs on 24GB.
Model Characteristics
| Model | Speed | Quality | Audio | Best Use Case |
|---|---|---|---|---|
| LTX-2 19B FP8 ⭐ | Fast | High | Yes | Recommended - Camera controls, audio sync |
| LTX-2 Distilled | Fastest | Medium | Yes | Rapid iteration, light VRAM |
| WAN 2.2 14B | Slow | Very High | No | Silent films, German Expressionism, art films |
| Mochi 1 | Slow | High | No | Final renders, prompt adherence |
| HunyuanVideo | Medium | High | No | Production quality |
| CogVideoX-5B | Medium | High | No | General purpose (fallback) |
Recommendation:
- Use LTX-2 19B FP8 for production work with audio sync and camera controls
- Use WAN 2.2 for silent films or when audio isn't needed (higher visual quality for same VRAM)
- Fall back to Mochi for maximum quality or CogVideoX for compatibility
LTX-2: Recommended Video Model
LTX-2 is a 19B parameter DiT-based audio-video foundation model.
Model Variants:
| Model | Size | VRAM | Quality | Recommended For |
|---|---|---|---|---|
| LTX-2 19B FP8 ⭐ | ~19GB (+3-5GB overhead) | 24GB | High | Production (A5000, 720p/1080p ≤12-15s, batch=1) |
| LTX-2 19B FP4 | ~12GB (+3-5GB overhead) | 16GB | High | Faster, slightly less quality (720p ≤10s) |
| LTX-2 BF16 (full) | ~43GB | 40GB+ | Highest | RunPod/A100 only |
| LTX-2 Distilled 2B | ~4GB | 12GB | Medium | Rapid iteration |
FP8 Compatibility: Requires compatible CUDA/cuDNN/PyTorch builds. Follow LTX-Video docs for driver requirements.
Key Features:
- Synchronized Audio-Video Generation: Generates coherent audio + video together
- Camera Controls: Dolly, jib, static shots with natural camera motion
- IC-LoRA: Style transformations (anime, sketch, etc.) with ~1GB VRAM
- Keyframe Interpolation: Morphing between keyframes
- Pose/Depth/Canny Controls: Precise composition control (Canny edge detection)
- Text-to-Video and Image-to-Video: Both workflows supported
ComfyUI Templates:
| Template | Use Case |
|---|---|
| LTX2 Text-to-Video | Generate from text prompts |
| LTX2 Image-to-Video | Animate a still image |
| LTX2 Canny-to-Video | Edge detection guided generation |
| LTX2 Distilled | Fast iteration, lower VRAM |
Installation:
# ComfyUI (recommended)
# Install "LTX-Video" from ComfyUI Manager
# Templates appear automatically
# Or standalone
pip install ltx-video
ComfyUI VRAM Optimization Flags:
# Run these from the ComfyUI root directory
# Reserve VRAM (in GB) for other operations (prevents OOM during generation)
python main.py --reserve-vram 5
# Low VRAM mode - offloads to system RAM (slower but prevents OOM)
python main.py --lowvram
# Weight streaming - NVIDIA/ComfyUI collaboration for 256GB RAM systems
# Automatically offloads model weights to system RAM when VRAM exhausted
Additional Resources:
- ComfyUI_LTX-2_VRAM_Memory_Management - Nodes for long videos on consumer GPUs
Camera Control Reference (LTX-2)
LTX-2 supports cinematic camera movements via prompt keywords:
| Movement | Prompt Keywords | Effect |
|---|---|---|
| Static | static shot, locked camera | Fixed camera position |
| Dolly | dolly in, dolly out, push in | Camera moves toward/away from subject |
| Jib/Crane | jib up, jib down, crane shot | Vertical camera sweep |
| Pan | pan left, pan right | Horizontal rotation |
| Tilt | tilt up, tilt down | Vertical rotation |
| Tracking | tracking shot, follow shot | Camera follows subject |
| Zoom | zoom in, zoom out | Focal length change |
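Because camera movements are expressed as plain prompt keywords, prompts can be composed programmatically. The helper below is a hypothetical sketch, not part of the skill: `CAMERA_KEYWORDS` holds a subset of the table, and joining simultaneous movements with "while" is a stylistic assumption.

```python
# Subset of the keyword table above; the dictionary keys are illustrative identifiers.
CAMERA_KEYWORDS = {
    "static": "static shot",
    "dolly_in": "dolly in",
    "dolly_out": "dolly out",
    "jib_up": "jib up",
    "jib_down": "jib down",
    "pan_left": "pan left",
    "pan_right": "pan right",
    "tracking": "tracking shot",
}


def build_prompt(subject: str, movements: list[str], style: list[str]) -> str:
    """Compose a prompt as: camera movement(s), subject, style tags."""
    unknown = [m for m in movements if m not in CAMERA_KEYWORDS]
    if unknown:
        raise ValueError(f"unknown camera movement(s): {unknown}")
    camera = " while ".join(CAMERA_KEYWORDS[m] for m in movements)
    parts = ([camera] if camera else []) + [subject, *style]
    prompt = ", ".join(parts)
    # Capitalize only the first character so the rest of the prompt is untouched.
    return prompt[:1].upper() + prompt[1:]
```

Calling `build_prompt("revealing vast landscape", ["jib_up", "dolly_out"], ["golden hour", "cinematic"])` yields a combined jib-and-dolly prompt in the same shape as hand-written ones.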
Example Prompts:
# Dramatic reveal
"Dolly in slowly to a detective examining evidence, noir lighting, static hold on face"
# Action sequence
"Tracking shot following runner through city streets, handheld, dynamic"
# Interview setup
"Static medium shot, subject centered, shallow depth of field, jib down to hands"
Combining Movements:
"Jib up while dolly out, revealing vast landscape, golden hour, cinematic"
WAN 2.2: Silent Film Alternative
WAN 2.2 is a 14B parameter model optimized for visual quality without audio:
Best For:
- Silent films and art cinema
- German Expressionism era aesthetics (Nosferatu, Metropolis, Cabinet of Dr. Caligari)
- High visual fidelity when audio isn't needed
- Projects where audio will be added separately
Comparison to LTX-2:
| Aspect | LTX-2 19B FP8 | WAN 2.2 14B |
|---|---|---|
| Audio | Synchronized | None |
| Speed (10-sec HD, A5000) | ~3.5-4.5 min | ~5-6 min |
| Visual Quality | High | Very High |
| VRAM (24GB) | Works | Works |
When to Choose WAN 2.2:
- Creating silent films with intertitles
- German Expressionism homages
- Music videos where audio is pre-recorded
- Art films with separate sound design
Practical Notes: Seed control recommended for stable multi-shot outputs. 720p preferred on 24GB for consistent speeds.
Performance Expectations
Video generation is compute-intensive. Plan for overnight batch processing rather than real-time iteration.
Local Generation Times (RTX A5000, 24GB VRAM)
| Video Length | Resolution | Model | Time |
|---|---|---|---|
| 5 seconds | HD (720p) | LTX-2 19B FP8 | ~1-1.5 min |
| 10 seconds | HD (720p) | LTX-2 19B FP8 | ~3.5-4.5 min |
| 10 seconds | Full HD (1080p) | LTX-2 19B FP8 | ~5-6.5 min |
| 15 seconds | HD (720p) | LTX-2 19B FP8 | ~6-7.5 min |
| 10 seconds | HD (720p) | WAN 2.2 | ~5-6 min |
Notes:
- Timings based on Alex Ziskind's benchmarks (RTX 5080) with +15-25% buffer for A5000
- Audio synchronization adds ~10-15% time vs video-only runs
- IO/storage affects throughput; prefer local NVMe, avoid network mounts
Realistic Workflow
For a 2-minute film (12 x 10-second clips):
- Generation time: ~42-54 min (LTX-2, 720p) to ~60-72 min (WAN 2.2)
- With retakes and iterations: 2-4 hours
- Full production with assembly: overnight task
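The generation-time figure above is the per-clip range multiplied by the clip count. A small worked sketch, using the 10-second 720p timings from the table and hypothetical names (`PER_CLIP_MINUTES`, `estimate_minutes`):

```python
import math

CLIP_SECONDS = 10  # the timings below are for 10-second 720p clips

# (low, high) minutes per clip on an RTX A5000, taken from the timing table.
PER_CLIP_MINUTES = {
    "LTX-2 19B FP8": (3.5, 4.5),
    "WAN 2.2": (5.0, 6.0),
}


def estimate_minutes(film_seconds: int, model: str) -> tuple[int, float, float]:
    """Return (clip count, low estimate, high estimate) in minutes for a single pass."""
    clips = math.ceil(film_seconds / CLIP_SECONDS)
    low, high = PER_CLIP_MINUTES[model]
    return clips, clips * low, clips * high


print(estimate_minutes(120, "LTX-2 19B FP8"))  # (12, 42.0, 54.0)
print(estimate_minutes(120, "WAN 2.2"))        # (12, 60.0, 72.0)
```

These numbers cover one pass only; retakes and assembly come on top, which is why the total stretches to hours.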
Recommendation: Queue video generation as overnight background tasks. Use /task-monitor to track progress.
# Example: Run generation overnight
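# nohup keeps the job alive after the terminal closes; generate.log is an arbitrary log file name
nohup ./run.sh generate --script script.json --output-dir ./assets > generate.log 2>&1 &
# Check progress next morning
tail -n 50 generate.log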
RunPod for Large Tasks
Use /ops-runpod when local generation would cause OOM errors.
When to Use RunPod
| Scenario | Local (A5000 24GB) | RunPod Needed |
|---|---|---|
| LTX-2 19B FP8, 10-sec HD | Works | No |
| LTX-2 19B FP8, 15-sec 1080p | Works (batch=1; close to the OOM threshold) | No |
| 1080p clips >12-15 sec (FP8) | May OOM | Prefer 720p or split; RunPod optional |
| LTX-2 BF16 (43GB full model) | OOM | Yes (A100 40GB+) |
| Very long videos (>20 sec 1080p) | Likely OOM | Yes |
| Batch processing (10+ clips) | Slow but works | Optional (faster) |
| WAN 2.2 + LTX-2 parallel | High OOM risk | Prefer sequential or RunPod |
OOM Threshold Guidance (A5000 24GB):
- LTX-2 FP8: 1080p clips over ~12-15s may OOM with audio; use 720p, shorten clips, or disable audio
- Control nets (pose/depth/canny) and multiple LoRAs increase memory; enable selectively
- Monitor runtime VRAM; keep ≤22GB to avoid instability
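Runtime VRAM can be polled with `nvidia-smi`, which reports memory in MiB when queried with `nounits`. The sketch below is one possible check against the 22GB guideline, not the orchestrator's own monitor; the function names are hypothetical, and treating 22GB as 22 × 1024 MiB is an assumption.

```python
import subprocess

VRAM_LIMIT_MIB = 22 * 1024  # assumption: the 22GB guideline expressed in MiB


def parse_used_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one used-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def read_used_mib() -> list[int]:
    """Query nvidia-smi for the memory currently used on each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def over_limit(used_mib: list[int], limit_mib: int = VRAM_LIMIT_MIB) -> bool:
    """True if any GPU exceeds the limit, signalling that settings should be lowered."""
    return any(used > limit_mib for used in used_mib)
```

A generation loop could call `over_limit(read_used_mib())` between clips and lower resolution or disable audio when it returns True.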
RunPod Workflow
# Provision GPU for large task
/ops-runpod provision --gpu a100-40gb --task "LTX-2 BF16 generation"
# Run generation on RunPod
/ops-runpod run --script generate.sh
# Download results and terminate
/ops-runpod download --output ./assets
/ops-runpod terminate
RunPod GPU Options:
- BF16/full precision: A100 40-80GB, H100 (required)
- FP8/FP4 tasks: L40S 48GB, A10G 24GB (cheaper alternatives)
Cost Consideration: RunPod charges by the hour. For overnight tasks, local generation is more cost-effective. Consider spot/preemptible instances for savings.
Troubleshooting & Fallbacks
OOM Mitigation:
- Reduce resolution (720p → 540p)
- Shorten clip length
- Set batch=1
- Switch FP mode (BF16 → FP8 → FP4)
- Disable audio
- Split long clips into segments
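The mitigation steps can be applied one at a time as a fallback ladder. This is a sketch under stated assumptions: `GenSettings` and `next_fallback` are hypothetical, the ordering (batch, resolution, audio, clip length) is one reasonable choice rather than a documented policy, the 5-second floor is arbitrary, and precision switching is left out.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GenSettings:
    """Hypothetical generation settings, defaulting to the safe 24GB profile."""
    resolution: str = "720p"
    clip_seconds: int = 10
    audio: bool = True
    batch: int = 1


def next_fallback(s: GenSettings) -> Optional[GenSettings]:
    """Return the next lighter settings to try after an OOM, or None when exhausted."""
    if s.batch > 1:
        return replace(s, batch=1)
    if s.resolution == "1080p":
        return replace(s, resolution="720p")
    if s.resolution == "720p":
        return replace(s, resolution="540p")
    if s.audio:
        return replace(s, audio=False)
    if s.clip_seconds > 5:
        # Halve the clip length, but never go below the assumed 5-second floor.
        return replace(s, clip_seconds=max(5, s.clip_seconds // 2))
    return None
```

When `next_fallback` returns None, the remaining options are the ones outside this sketch: a lower-precision variant, splitting the clip, or RunPod.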
Stability:
- Fix seed for reproducibility
- Avoid parallel jobs on 24GB
- Reduce control nets and LoRA stacks
Fallback Path: If LTX-2 fails, switch to WAN 2.2 (video-only) or CogVideoX; add audio separately in post.
Memory Integration
After each movie, the orchestrator stores:
- Successful prompts
- Working tool code
- Technique insights
- Concept relationships
Scope: horus-filmmaking
Workflow Patterns (from Nobody & The Computer)
Multi-Model Collaboration
Different AI models handle different creative aspects, inspired by "Bach x Coltrane x Kuti x Takemitsu":
- Model A (Claude): Structure, composition, narrative arc
- Model B (GPT): Improvisation, dialogue, variation
- Model C (Grok): Energy, rhythm, pacing
- Model D (DeepSeek): Texture, atmosphere, silence
Each model builds on previous work. Constraints: 100 words max per turn for focused output.
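The turn-taking pattern can be sketched as a round-robin over callables, with each turn capped at 100 words and appended to a shared context. Everything here is illustrative: `collaborate` and `cap_words` are hypothetical, and each "model" is just a function from context to text rather than a real API client.

```python
from typing import Callable

MAX_WORDS = 100  # per-turn limit from the workflow description


def cap_words(text: str, limit: int = MAX_WORDS) -> str:
    """Keep at most `limit` whitespace-separated words."""
    return " ".join(text.split()[:limit])


def collaborate(
    models: dict[str, Callable[[str], str]], brief: str, rounds: int = 1
) -> list[tuple[str, str]]:
    """Let each model respond in order, seeing the brief plus all previous turns."""
    transcript: list[tuple[str, str]] = []
    context = brief
    for _ in range(rounds):
        for name, model in models.items():
            turn = cap_words(model(context))
            transcript.append((name, turn))
            context = f"{context}\n{name}: {turn}"  # later models build on earlier turns
    return transcript
```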
Critique Loop
From "A.I.thoven" sessions - "roast the piece with love":
- Generate initial draft
- Critique constructively (what works, what doesn't)
- Iterate based on feedback
- Repeat until satisfied
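The critique loop maps onto a simple bounded iteration. In this sketch, `critique_loop` is a hypothetical name, "satisfied" is assumed to mean the critic returns no notes, and a round limit is added so the loop always terminates.

```python
from typing import Callable


def critique_loop(
    draft: str,
    critique: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> list[str]:
    """Alternate critique and revision; return every version, newest last."""
    history = [draft]
    for _ in range(max_rounds):
        notes = critique(history[-1])
        if not notes:
            break  # no remaining notes: treat the piece as finished
        history.append(revise(history[-1], notes))
    return history
```

Keeping the full history makes it easy to store both the final version and the intermediate notes as learnings.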
Iteration Speed
Use LTX-2 Distilled for rapid iterations during creative exploration. Use LTX-2 19B FP8 for production with camera controls and audio sync. Fall back to Mochi for maximum quality when camera control isn't needed.
Example Session
Horus: I want to create a mockumentary about AI learning to paint.
[RESEARCH] Searching for documentary interview techniques, AI art history...
[SCRIPT] Breaking into 5 scenes: intro, discovery, struggle, breakthrough, reflection
[BUILD TOOLS] Writing code for interview framing effect, paint brush animation...
[GENERATE] Creating 45 frames, 3 audio tracks, 2 voice segments...
[ASSEMBLE] Combining into 2-minute video with transitions...
[LEARN] Storing 8 insights in memory for future films.
Output: ai_painter_mockumentary.mp4 (2:14)
Dependencies
- Docker (for isolated code execution)
- FFmpeg (video processing)
- Python 3.11+ (orchestrator)
- GPU recommended (for Stable Diffusion, video models)