Building Anna: A ComfyUI Pipeline for Virtual Influencers
After spending months testing different image and video models, we finally cracked the code for a virtual influencer pipeline that actually works. No manual uploads, no weird hacks – just ComfyUI orchestrating everything while Flux handles the pretty pictures and Wan makes them move. The best part? It posts straight to Instagram and TikTok using their official APIs.
I'm sharing the high-level approach here because honestly, the concept matters more than the specific tools. This stuff changes so fast that whatever we're using today might be different next month, but the workflow principles stay solid.
How We Think About It
The whole system follows one simple rule: create once, distribute everywhere. Here's the breakdown:
Persona Development
This is like creating a brand bible for a person who doesn't exist. We define everything – personality, style, what she wears, what colors work, what she'd never say. Think of it as character development for a digital human.
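A minimal sketch of how a brand bible like this could be captured as data, so every later step reads from one source. The class, the field names, and the sample traits are all placeholders made up for illustration, not the actual persona definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical brand-bible record; every field name here is illustrative."""
    name: str
    personality: tuple[str, ...]
    wardrobe: tuple[str, ...]
    palette: tuple[str, ...]
    banned_phrases: tuple[str, ...] = ()

    def violations(self, text: str) -> list[str]:
        """Return the banned phrases that appear in text (case-insensitive)."""
        lowered = text.lower()
        return [p for p in self.banned_phrases if p.lower() in lowered]


# Placeholder values only; the real persona would be defined by the creative team.
anna = Persona(
    name="Anna",
    personality=("warm", "curious"),
    wardrobe=("oversized blazer", "white sneakers"),
    palette=("sage green", "cream"),
    banned_phrases=("guaranteed results", "buy now"),
)
```

Freezing the dataclass is a deliberate choice here: the persona should change through a reviewed edit, not by accident somewhere in the pipeline.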
Image Generation with Flux
Flux cranks out the hero shots – thumbnails, carousel images, and clean keyframes that lock in composition and styling. It's remarkably consistent once you dial in the prompts.
Video Creation with Wan
Take those keyframes, add some direction, and Wan turns them into smooth vertical videos. We keep everything short, punchy, and story-driven because that's what works on social.
Automated Publishing
ComfyUI triggers a small service that talks directly to the official Instagram and TikTok APIs. No sketchy browser automation or manual posting – everything's above board and scalable.
Performance Tracking
We pull metrics from both platforms, analyze what's working, and feed that back into the system. The prompts get better, the content gets more targeted, and the whole thing keeps improving.
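One way the feedback step could work, as a rough sketch: pool metrics per prompt template and rank templates by engagement rate. The dictionary keys (`template`, `likes`, `comments`, `shares`, `views`) are assumed names for whatever the posting service logs, and "interactions divided by views" is just one reasonable definition of engagement rate.

```python
from collections import defaultdict


def rank_templates(posts: list[dict]) -> list[tuple[str, float]]:
    """Rank prompt templates by pooled engagement rate (interactions / views)."""
    interactions = defaultdict(int)
    views = defaultdict(int)
    for post in posts:
        key = post["template"]  # assumed field: which prompt template made the post
        interactions[key] += post["likes"] + post["comments"] + post["shares"]
        views[key] += post["views"]
    # Skip templates with zero views to avoid dividing by zero.
    rates = {k: interactions[k] / views[k] for k in views if views[k] > 0}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
```

Pooling per template, instead of averaging per-post rates, keeps one low-view outlier from dominating the ranking.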
Why These Tools Work Together
We tried a bunch of different combinations before settling on this stack. Here's why these specific tools won out:
Flux for Static Content
Flux excels at controlled, consistent imagery. When you need the same character to look the same across hundreds of posts, consistency beats creativity every time. It also composes shots well, which matters for thumbnails that need to grab attention.
Wan for Motion
Wan keeps the character looking like herself across video frames, which is harder than it sounds. Combined with solid keyframes from Flux, you get smooth motion that doesn't drift into weird territory halfway through a clip.
Together, they eliminate most of the guesswork. Flux establishes the visual language, Wan brings it to life.
The Technical Setup (Bird's Eye View)
I'm not going to get into specific node configurations because they change constantly, but here's how the pipeline flows:
Image Production Line
- Prompt templates with personality and style locked in
- Reference controls to keep the character consistent
- Flux generation with automatic cropping for different platform formats
- Output: thumbnails, carousel content, and video keyframes
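The first and third bullets can be sketched in plain Python: a prompt template where the identity text is fixed and only the scene varies, plus a helper that computes a centered crop box for each platform format. The template wording, the `LOCKED` values, and the `FORMATS` table are assumptions for illustration (only 9:16 comes from this post), and the crop box would be handed to whatever image library or ComfyUI node does the actual cropping.

```python
from string import Template

# Hypothetical template: identity and style stay fixed, outfit and scene vary.
PROMPT = Template("photo of $name, $identity, wearing $outfit, $scene, $style")

LOCKED = {
    "name": "Anna",
    "identity": "placeholder identity description",
    "style": "placeholder style description",
}

# Aspect ratios as (width, height); assumed values for illustration.
FORMATS = {"square": (1, 1), "portrait": (4, 5), "vertical": (9, 16)}


def build_prompt(outfit: str, scene: str) -> str:
    """Fill the variable slots while the locked persona fields stay untouched."""
    return PROMPT.substitute(LOCKED, outfit=outfit, scene=scene)


def center_crop_box(width: int, height: int, ratio: tuple[int, int]) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop with this ratio."""
    rw, rh = ratio
    # Compare width/height with rw/rh by cross-multiplying to stay in integers.
    if width * rh > height * rw:
        # Source is too wide: keep full height, trim the sides.
        new_w, new_h = height * rw // rh, height
    else:
        # Source is too tall (or already matches): keep full width.
        new_w, new_h = width, width * rh // rw
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)
```

For example, `center_crop_box(1024, 1024, FORMATS["vertical"])` gives `(224, 0, 800, 1024)`, a 576x1024 strip from the middle of a square render.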
Video Production Line
- Hero keyframes from the image line plus scene direction
- Wan generates 9:16 vertical videos (short clips work best)
- Basic post-processing: stabilization, cleanup, watermarks if needed
Content Packaging
- Caption templates that match the character's voice
- Safety checks for brand compliance and content guidelines
- Audio beds and sound effects where appropriate
- Final output: ready-to-post MP4s with cover images
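A packaging step along these lines could be as small as the sketch below: pick a caption template by post type, fill it in, and bundle it with the MP4 and cover paths. The template wording, the post types, and the `build_package` name are hypothetical.

```python
from string import Template

# Hypothetical caption templates keyed by post type; the wording is a placeholder.
CAPTIONS = {
    "routine": Template("$hook\n\n$body\n\n$hashtags"),
    "tip": Template("$hook\n$body\n\nSave this for later. $hashtags"),
}


def build_package(post_type: str, video: str, cover: str, **fields: str) -> dict:
    """Bundle one ready-to-post item: MP4 path, cover image path, and caption."""
    # substitute() raises KeyError if a slot is missing, so incomplete
    # captions fail here instead of reaching the publishing step.
    caption = CAPTIONS[post_type].substitute(fields).strip()
    return {"post_type": post_type, "video": video, "cover": cover, "caption": caption}
```

Using `substitute` instead of `safe_substitute` is intentional: a caption with an unfilled `$hook` should stop the pipeline, not get posted.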
Publishing & Analytics
- ComfyUI makes HTTP calls to our posting service
- Service handles Instagram Graph API and TikTok API calls
- Post IDs get logged for performance tracking
- Weekly dashboards for content optimization
Keeping It Legal and Clean
We use a simple Python microservice with proper OAuth tokens to handle all API calls. ComfyUI sends "ready to publish" payloads, and the service takes care of rate limits, retries, and error reporting. Everything goes through official channels – no gray area automation or ToS violations.
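The retry part of such a service might look like the sketch below, which uses exponential backoff. It is not the actual microservice: `send` stands in for whatever function wraps the real platform call and returns a post ID, and `RetryableError` is a hypothetical exception that wrapper would raise on rate-limit or transient server responses.

```python
import time
from typing import Callable


class RetryableError(Exception):
    """Raised by the sender for failures worth retrying (e.g. HTTP 429 or 5xx)."""


def publish_with_retry(
    send: Callable[[dict], str],
    payload: dict,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send(payload) until it returns a post ID or attempts run out.

    Waits base_delay * 2**n seconds after the n-th failure (exponential backoff).
    """
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` in as a parameter keeps the function testable without real waiting, and any non-retryable exception from `send` propagates immediately so bad payloads are not retried.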
The Rules That Keep Us Sane
After making every mistake in the book, here are the guidelines that actually matter:
Visual Consistency
Keep a reference set of her best looks. Don't let the character drift – hair color, eye shape, and facial structure should stay locked in. Audiences notice when their favorite influencer suddenly looks different.
Content Strategy
Short beats work better than complex narratives. Wan handles 6-15 second clips beautifully, but longer content tends to get wobbly unless you're really specific about scene direction.
Voice and Tone
Define how she talks and stick to it. We have templates for different post types, but the underlying personality stays consistent. Audiences follow people, not random content generators.
Quality Gates
Auto-flag weird hands, text artifacts, or anything that doesn't match brand guidelines before it goes live. It's way easier to catch problems in the pipeline than to deal with them after posting.
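Here's a minimal sketch of a gate like this, assuming some upstream detectors already produce a score between 0 and 1 for each kind of problem (higher means more likely a problem). The check names and thresholds are invented for illustration; the detectors themselves are out of scope here.

```python
from dataclasses import dataclass

# Assumed thresholds on detector scores in [0, 1]; tune against real rejects.
THRESHOLDS = {"hand_artifact": 0.5, "text_artifact": 0.5, "identity_drift": 0.3}


@dataclass
class GateResult:
    passed: bool
    reasons: list[str]


def quality_gate(scores: dict[str, float]) -> GateResult:
    """Flag an asset if any score exceeds its threshold or a required score is missing."""
    reasons = []
    for check, limit in THRESHOLDS.items():
        if check not in scores:
            # A missing score counts as a failure so nothing skips a check silently.
            reasons.append(f"{check}: no score")
        elif scores[check] > limit:
            reasons.append(f"{check}: {scores[check]:.2f} > {limit:.2f}")
    return GateResult(passed=not reasons, reasons=reasons)
```

Anything that fails would go to the human review queue with its `reasons` attached, so the reviewer knows what to look at.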
Content That Actually Performs
Through trial and error, we've found these formats work consistently:
- Micro-stories: 10-20 seconds with a setup, reveal, and call-to-action
- Routine content: Morning routines, outfit checks, quick tutorials – stuff people can relate to
- Carousel tips: Clean, simple slides with useful information (Flux is perfect for this)
- Response videos: Clean reaction shots that work for duets and replies
What We Learned the Hard Way
Here's what nobody tells you about virtual influencers:
Consistency beats novelty. Followers want to see the same person every time. That amazing new style you generated? Save it for a special occasion, not Tuesday's post.
Shorter is almost always better. Wan works great on brief clips. Try to stretch a single clip beyond 15 seconds and you'll spend more time fixing problems than creating content.
Cover images drive everything. A great Flux thumbnail can make or break your reach. It's worth spending time on getting those first frames perfect.
You still need a human. Automation handles the grunt work, but someone needs to review content, choose prompts, and shape the overall narrative. The best virtual influencers feel authentic because real people are making creative decisions.
Minimal Starting Stack
- ComfyUI with identity control nodes
- Flux for image generation
- Wan for video generation
- Simple API bridge service (Python/Node.js)
- Object storage for assets and logs
- Basic dashboard for performance metrics
The Ethics Side
Look, virtual influencers are weird territory. We try to keep things transparent and ethical:
- Use original or properly licensed music and voices
- Disclose that the influencer is AI-generated where appropriate
- Have human moderation for all content before it goes live
- Respect platform guidelines and API terms of service
- Avoid misleading claims or impersonation of real people
The technology is powerful, but it comes with responsibility. Use it thoughtfully.
What You Get
When everything's working, you have a content engine that can produce daily posts across multiple platforms without burning out your creative team. The character stays consistent, the quality stays high, and you can scale up or down based on what's actually performing.
It's not magic – it's just good engineering applied to social media. But sometimes that's exactly what you need to cut through all the noise and build something that lasts.
Want help setting up something similar for your project? The basic framework is adaptable to different characters, brands, and use cases. Just keep the core principles intact: consistency, quality gates, and official APIs only.