Beyond AI's "Soy-Sauce Duck": Dissecting the "Saving a Fox in the Snowy Mountains" Prompts, With a Cheat Sheet You Can Copy

The meme took off in March 2026 and spread across major platforms like wildfire. How did an AI short film that reportedly cost less than NT$40 to make rack up an estimated hundred million views? Unpack the "Soy-Sauce Duck" meme and what you find is a reusable methodology for AIGC content production.

A woman in a red and black embroidered warrior costume approaches a farmer chopping wood and asks, "Have you ever saved a fox in the snowy mountains?" The farmer expects the romantic opening of a traditional tale in which a fox spirit repays a kindness, but the woman coldly replies, "No, I am that soy-sauce duck." The video is a humorous parody of a scene from a Shaw Brothers martial arts film.

Caption - Original "Ai Lin Jun Ji" scene from the Soy-Sauce Duck AI video

🦊 Copy this "Saving a Fox in the Snowy Mountains" Prompt Pack

This AI short film, released by TikTok account "Ai Lin Jun Ji," reportedly cost less than NT$40 and took 5 hours to make. It's estimated to have garnered over a hundred million views across all platforms, even inspiring Jolin Tsai to re-create it. Suddenly, "saving a fox in the snowy mountains" became a phenomenal cultural meme, with countless people asking the same question: How exactly were these prompts written?

Digital Age recently published an article that fully dissected the three-stage prompts revealed by Threads creator Chris, the AI Explorer. It covered everything from anchoring to the Shaw Brothers aesthetic, to the rhythm of character appearances, and finally, to a "Manchu Han Imperial Feast" style twist. However, if we only focus on "copying the prompts," we'd be missing the real value.

In today's article, we want to go a step further: to extract 3 advanced principles from the "Soy-Sauce Duck" prompts that can be applied to any AIGC creation.

Caption - The backgrounds in Shaw Brothers' "The Flying Fox of Snowy Mountain" were all studio-filmed.

 

📽️ Prompting Tip #1: Style is Narrative – Use "Visual Anchors" Instead of "Scene Descriptions"

Most people's habit when writing AI video prompts is to describe the scene content.

"In an ancient courtyard, a farmer's wife is chopping wood..." Written this way, the prompt leaves the AI to generate whatever it "understands" as the most standard ancient courtyard: it might look like a Hengdian World Studios production or like "Crouching Tiger, Hidden Dragon," and the result is highly random.

But the prompts for "Saving a Fox in the Snowy Mountains" did something completely different from the start:

Original style of 1960s Shaw Brothers martial arts films, 100% faithful re-creation of an artificial studio-set fake farm courtyard, Technicolor high saturation, non-naturalistic dramatic lighting…

This is not describing "what is in the picture," but rather defining "which aesthetic system the picture belongs to."

Why is it effective? AI video models (such as Kling, Jimeng, and Sora) are trained on massive amounts of footage associated with style labels. When you provide keywords like "Shaw Brothers film," "Technicolor," and "studio set," the model tends to call up a whole visual paradigm it learned in training, including colors, lighting, composition, and even flaws (like film grain).

Prompting Tip #1: Instead of telling the AI "what to draw," tell it "what it should look like."

"What it should look like" is more stable than "what it is," because "what it is" can be drawn in countless ways, while "what it should look like" locks down a single stylistic coordinate system. This is also why the original prompt emphasized "incorporating the stylistic foundation into each segment": if the style drifts, the audience is pulled out of the scene, and nothing kills absurd comedy faster than not being played "seriously enough."
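One way to keep the style from drifting is to treat the stylistic foundation as a fixed prefix and prepend it to every segment programmatically. The sketch below is only an illustration, not the original creator's workflow: the names `STYLE_ANCHOR` and `build_segment_prompts`, the `[Segment N]` label format, and the example segment descriptions are all assumptions made for this example.

```python
# Hypothetical helper: prepend one shared style anchor to every segment prompt
# so each clip is generated inside the same aesthetic system.

STYLE_ANCHOR = (
    "1960s Shaw Brothers martial arts film style, "
    "artificial studio-set fake farm courtyard, "
    "Technicolor high saturation, non-naturalistic dramatic lighting"
)


def build_segment_prompts(style_anchor: str, segments: list[str]) -> list[str]:
    """Return one prompt per segment, each starting with the same style anchor."""
    prompts = []
    for index, description in enumerate(segments, start=1):
        description = description.strip()
        if not description:
            raise ValueError(f"segment {index} is empty")
        prompts.append(f"[Segment {index}] {style_anchor}. {description}")
    return prompts


if __name__ == "__main__":
    # Illustrative segment descriptions, paraphrased from the scene summary.
    segments = [
        "A farmer chops wood in the courtyard.",
        "A woman in a red and black embroidered warrior costume approaches.",
        "She delivers the twist line with a cold expression.",
    ]
    for prompt in build_segment_prompts(STYLE_ANCHOR, segments):
        print(prompt)
```

Because the anchor lives in one place, changing the aesthetic system means editing a single string rather than hunting through every segment.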

Drawing parallels: If you want to create an AI short film in the style of an "80s Hong Kong ghost movie"...

"1980s Hong Kong zombie film style, artificially constructed village house set, fluorescent green tones, crude lighting glitches, film scratches, actors in obviously theatrical makeup..."

If you want to create a "Wong Kar-wai-esque urban ambiguity" style:

"90s Hong Kong art-house film, handheld camera slight wobble, high-saturation neon lights, wet rainy night ground reflections, characters' half-faces hidden in shadows, slow-motion frame skipping..."

Core logic: Style is not an embellishment; it is part of the narrative. In "Saving a Fox in the Snowy Mountains," the cheap, theatrical feel of Shaw Brothers films is a crucial source of absurdity. Had it been too refined, it would feel less like a "meme" and more like a serious martial arts drama.

Caption - The moment of "ingratitude" reversal in "Saving a Fox in the Snowy Mountains"

 

🥁 Prompting Tip #2: Rhythm is the Punchline – Control AI with "Dramatic Beats"

The original prompt also spells out timing, with instructions like these:

"Special attention needed here: the BGM stops abruptly while a heavy, clanging martial arts sound effect hits, and the shot finally transitions with a flash to white..."

"6–14 seconds: extremely rapid cuts, 0.6 seconds per shot, maintaining the frozen-pose cinematography of Shaw Brothers martial arts films."

This is not describing the visual content, but rather arranging time.

Why is it effective? The core of comedy is "rhythm." Traditional comedy relies on the editor's intuition and experience to control it, while AI video tools' default output is often "smooth and continuous"—because the model tends to produce the most "natural" transitions.

But what makes "Saving a Fox in the Snowy Mountains" so funny is precisely its unnatural breaks: the abrupt BGM stop, the flash to white, the rapid cuts and freeze-frames. These are the "formulaic language" of old-school martial arts films, and in today's context they create a deliberately affected kind of humor.

Prompting Tip #2: AI video prompts should describe not only "space" but also "time."

Most people only tell the AI what the visuals are, but experts tell the AI how the visuals switch, how the sound coordinates, and how the rhythm changes. This can be broken down into three dimensions:

Visual rhythm: shot length (0.6 seconds per shot), transition method (flash to white, freeze-frame, morph)

Auditory rhythm: when the BGM starts, when it stops, when sound effects cut in

Narrative rhythm: pause (the farmer stops his axe, looks up, and raises an eyebrow), acceleration (quick cuts for the entrance), climax (striking a pose on a heavy accent beat)
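The three dimensions above can be captured as a small "beat sheet" before they are turned into prompt text. The following is a rough sketch under stated assumptions: the `Beat` class, `shots_in_window`, and `render_beat` are hypothetical names invented here, and the calculation treats the 0.6-second figure as the length of each shot inside the 6–14 second window.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    start: float     # seconds from the beginning of the clip
    end: float       # seconds from the beginning of the clip
    visual: str      # shot length and transition method
    audio: str       # BGM and sound-effect cues
    narrative: str   # pause, acceleration, or climax


def shots_in_window(start: float, end: float, shot_length: float) -> int:
    """Number of complete shots of shot_length seconds that fit in the window."""
    if shot_length <= 0 or end <= start:
        raise ValueError("need shot_length > 0 and end > start")
    # Small epsilon guards against floating-point error, e.g. 0.3 / 0.1.
    return int((end - start) / shot_length + 1e-9)


def render_beat(beat: Beat) -> str:
    """Format one beat as a single timeline line for a prompt."""
    return (f"{beat.start:g}-{beat.end:g}s: {beat.visual}; "
            f"audio: {beat.audio}; beat: {beat.narrative}")


if __name__ == "__main__":
    count = shots_in_window(6, 14, 0.6)
    rapid_cuts = Beat(
        start=6,
        end=14,
        visual=f"rapid cuts, 0.6 seconds per shot (about {count} shots)",
        audio="BGM stops abruptly, heavy sound effect cuts in",
        narrative="acceleration",
    )
    print(render_beat(rapid_cuts))
```

Working out the shot count up front is a quick sanity check: an 8-second window at 0.6 seconds per shot leaves room for 13 complete shots, which tells you how many distinct poses the segment actually needs.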

Drawing parallels: If you want a "suspenseful reversal" AI short film, you need to control the rhythm.

"The first 10 seconds: a fixed camera slowly pushes in, no music, only ambient sound. 11th second: BGM suddenly cuts in with heavy low-frequency bass, and the camera quickly zooms in on the character's face. 12th second: black screen for 2 seconds. 14th second: quick-cut flashback showing clue visuals from 3 different angles..."
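Second-by-second timelines like this are easy to get slightly wrong when written by hand, for example by scheduling a new cue while a multi-second black screen is still running. A quick check such as the one below can catch that before the prompt is sent. This is a minimal sketch: `find_overlaps` and the cue tuples are hypothetical, and each cue is assumed to occupy the half-open interval from its start time up to, but not including, its end time.

```python
def find_overlaps(cues: list[tuple[float, float, str]]) -> list[tuple[str, str]]:
    """Return pairs of cue labels whose [start, end) intervals overlap.

    Each cue is (start_seconds, end_seconds, label).
    """
    ordered = sorted(cues, key=lambda cue: cue[0])
    overlaps = []
    for i, (_start_a, end_a, label_a) in enumerate(ordered):
        for start_b, _end_b, label_b in ordered[i + 1:]:
            if start_b >= end_a:
                break  # cues are sorted by start, so nothing later overlaps A
            overlaps.append((label_a, label_b))
    return overlaps


if __name__ == "__main__":
    # Illustrative cue list for a suspense-style timeline.
    clean = [
        (0, 10, "slow push-in, ambient sound only"),
        (10, 11, "bass hit and fast zoom"),
        (11, 13, "black screen"),
        (13, 15, "quick-cut flashback"),
    ]
    print(find_overlaps(clean))

    # Same timeline, but the flashback starts before the black screen ends.
    clashing = clean[:3] + [(12, 14, "quick-cut flashback")]
    print(find_overlaps(clashing))
```

An empty list means every cue has its own slot on the timeline; any pair that comes back is a place where the model would receive contradictory instructions for the same moment.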

Core logic: AI models can only act on the "rhythm" you spell out in your timeline descriptions. The more precise the description, the more controllable the output. Don't expect AI to automatically generate "just right" pauses; you need to write those pauses into the prompts.

Caption - The "Are you that fox? No, I'm a soy-sauce duck" line spawned various online meme variations.

 

🔥 Prompting Tip #3: Use "Cultural Archetype + Subverted Expectation" to Create a Viral Meme

Looked at in isolation, the reversal is not inherently funny. What truly makes it work is that it piggybacks on a strong cultural archetype.

Why is it effective? "Fox repaying kindness" is one of the most classic motifs in Chinese folklore, like in "Strange Tales from a Chinese Studio": a man saves a fox, the fox transforms into a beautiful woman, and pledges herself to him. When the heroine asks, "Have you ever saved a fox in the snowy mountains?", the audience's brain automatically completes the entire story template, entering a comfortable expectation of "I know what's going to happen next."

Then, the line "I am a soy-sauce duck" punctures this expectation precisely like a needle. A good reversal is built upon strong "expectation." Without expectation, there is no reversal.

This is also why many AI short films with "forced reversals" aren't funny—no matter how bizarre the reversal itself, if the audience doesn't form a clear expectation at the beginning, they will only feel confused, not surprised.

Drawing parallels: Apply this formula: classic cultural archetype + use AI to generate its standard visual paradigm + insert a completely unrelated contemporary element at a key moment.

Cultural Archetype: Legend of the White Snake (Xu Xian and Madam White Snake's reunion on the broken bridge)

Visual Paradigm: Classical opera costume style, soft lighting, misty Jiangnan scenery

Reversal Setting: Madam White Snake hands Xu Xian an umbrella, with "XX Delivery, 30-minute delivery" printed on it.

Cultural Archetype: Romance of the Three Kingdoms (Oath of the Peach Garden)

Visual Paradigm: the look of the 1994 TV series adaptation, yellow earth, coarse cloth clothing

Reversal Setting: Guan Yu says, "Though Guan is but a warrior, he knows the meaning of loyalty and righteousness," then pulls out a mobile phone to scan a QR code and join a group chat.
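The formula used in these two examples (archetype, visual paradigm, reversal) can be treated as a three-slot template. The sketch below is one hedged way to express it. `ReversalConcept` and `to_prompt` are names invented for illustration, and the wording of the generated prompt is an assumption rather than a tested recipe for any particular video model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReversalConcept:
    archetype: str        # the classic story the audience already knows
    visual_paradigm: str  # the style anchor used to render it "straight"
    reversal: str         # the unrelated contemporary element


def to_prompt(concept: ReversalConcept) -> str:
    """Assemble the three slots into one prompt, refusing empty slots."""
    for field_name, value in vars(concept).items():
        if not value.strip():
            raise ValueError(f"{field_name} must not be empty")
    return (
        f"Style: {concept.visual_paradigm}. "
        f"Setup (play it completely straight): {concept.archetype}. "
        f"Final beat: {concept.reversal}"
    )


if __name__ == "__main__":
    white_snake = ReversalConcept(
        archetype="Xu Xian and Madam White Snake reunite on the broken bridge",
        visual_paradigm=(
            "classical opera costume style, soft lighting, "
            "misty Jiangnan scenery"
        ),
        reversal=(
            "Madam White Snake hands Xu Xian an umbrella printed with "
            "a delivery-service slogan."
        ),
    )
    print(to_prompt(white_snake))
```

Rejecting empty slots is deliberate: a concept with no clear archetype has no expectation to subvert, so the check surfaces the missing piece before any video is generated.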

Core logic: AI excels at "precisely recreating a certain style," and that is its advantage as a tool. The human creator's job is to inject, at that moment of precise recreation, a "foreign element" from outside the cultural archetype. The clash between the two is the fuel for virality.

Caption - Viral AI videos allow for parodies of perfectly reproduced scenes.

 

📔 Here's a cheat sheet for you: Prompts are just the surface, the thinking model is the core.

The viral success of "Saving a Fox in the Snowy Mountains" seems accidental. But when we dissect that set of prompts, we see three clear layers of methodology:

Replace "scene description" with "style anchors" – Lock in visual paradigms to ensure stable output.

Control the timeline with "dramatic beats" – Incorporate editing mindset into prompts to master rhythm.

Create reversals with "cultural archetype + subverted expectation" – Let AI serve classic templates, then deviate at the last moment.

These three points don't rely on a specific tool or a fixed set of phrases. They are the underlying thinking models for AI video creation.

So, when we say "dissecting prompts," what we're really doing isn't just copying homework, but understanding the problem-solving mindset behind it. Soy-sauce duck will become outdated, and memes will iterate, but this methodology can be applied to the next viral hit. And you will be the creator of that next hit.

💡 Youuxi Conclusion: 2026, are you ready to farm shrimp?

  • 🚀 Practical Course: Don't just browse, come and take the [AI+Super Drama Online Course]! Hand-in-hand, we transform AI novices into AI command masters.
  • 🔔 Real-time Updates: A day in the AI world is a year in the human world. Follow our [YT / IG / FB] immediately to get the latest AI news and prompt tips!

Youuxi AI will continue to navigate for you, let's elegantly evolve together in the AI wave of 2026!
