HYUN SOO LEE

How We Built an Automated Korean Saju Content Pipeline with Claude Vision + Python

The Problem: Longform Content at Scale, Without Losing Precision

Korean Saju (四柱, Four Pillars of Destiny) content is structurally rich — each reading contains a natal chart (원국), a current major cycle (대운), an annual cycle (세운), and layered interpretations across career, wealth, relationships, and health. A single public-figure reading can run 3,000–5,000 Korean words, with dozens of named variables: Heavenly Stems (天干), Earthly Branches (地支), Ten Gods (十星), and classical star markers like Heavenly Noble (天乙貴人) or Peach Blossom (桃花殺).

The manual workflow looked like this: a senior analyst runs the chart → captures screenshots → writes interpretation → editor formats per channel → QA checks for factual drift. That pipeline took 4–6 hours per reading, per channel.

Our goal: reduce that to under 45 minutes of human time, with zero factual hallucination on the chart data.


The Architecture: Four Stages

[Manse Calendar UI]
→ [Claude Vision Extraction]
→ [Prompt Chain Interpretation]
→ [Channel Formatter]
→ [QA Gate]
→ [Publish]

Each stage has a distinct responsibility and a distinct failure mode. We'll walk through each.


Stage 1 — Vision Extraction: Reading the Chart Image

The Manse calendar (만세력) is rendered as a UI screenshot, not a structured data export. That means the ground truth for every reading lives in an image.

We feed three screenshot types (six images in total) into Claude Vision:

Image                            Content
01_input_confirm.png             Name, gender, solar date, birth hour
02_manse_calendar_complete.png   Full four-pillar grid with Ten Gods, 12-phase markers, star markers
03–06_total_luck_N.png           Longform interpretation scroll captures

The extraction prompt is strict and non-inferential:

Extract exactly what is visible in the image.
Do not infer, translate, or rephrase any character or label.
If the Hour Pillar Heavenly Stem Ten God reads "正財" (Direct Wealth),
output "正財". Do not substitute "偏財".
Output as structured JSON with keys:
name, gender, solar_date, lunar_date, birth_hour,
pillars: { year, month, day, hour } each containing
{ stem, branch, stem_god, branch_god, twelve_phase, star_markers[] },
current_daewoon: { stem, branch, stem_god, branch_god },
annual_cycle_2026: { stem, branch, stem_god, branch_god, star_markers[] }

Why this matters: In our live case, Kang Daniel (강다니엘; male, solar 1996-12-10, birth hour unknown), the Hour Pillar (時柱) Heavenly Stem is 甲 with Ten God label 正財 (Direct Wealth, 정재). An early prompt version that said "identify the Ten God" without the non-substitution guard produced 偏財 (Indirect Wealth): instead of reading the label printed in the image, the model derived one itself from 甲 relative to the 辛 Day Master, and derived it incorrectly. The guard clause eliminated that class of error entirely.
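
A structural check on the Stage 1 output can catch missing keys or malformed labels before the interpretation chain runs. The sketch below is a minimal one, assuming the JSON arrives as a Python dict with the keys named in the prompt; `validate_chart`, `TEN_GODS`, and `PILLAR_KEYS` are hypothetical names, not part of the pipeline described here.

```python
# Hypothetical validator for the Stage 1 JSON; key names follow the prompt above.
TEN_GODS = {"比肩", "劫財", "食神", "傷官", "偏財", "正財", "偏官", "正官", "偏印", "正印"}
PILLAR_KEYS = {"stem", "branch", "stem_god", "branch_god", "twelve_phase", "star_markers"}

def validate_chart(chart: dict) -> list[str]:
    """Return a list of problems; an empty list means the structure looks complete."""
    problems = []
    pillars = chart.get("pillars", {})
    for name in ("year", "month", "day", "hour"):
        pillar = pillars.get(name)
        if pillar is None:
            problems.append(f"missing pillar: {name}")
            continue
        for key in sorted(PILLAR_KEYS - set(pillar)):
            problems.append(f"{name}: missing key {key}")
        for key in ("stem_god", "branch_god"):
            label = pillar.get(key)
            # Catches look-alike characters, e.g. a simplified 财 in place of 財.
            if label is not None and label not in TEN_GODS:
                problems.append(f"{name}: unknown Ten God label {label!r}")
    return problems
```

This only confirms that the output is well-formed. It cannot tell whether a valid label such as 偏財 was substituted for 正財, which is why the guard clause in the prompt still matters.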

Extracted ground truth (live case):

  • Day Pillar (日柱): 辛(신) / 巳(사) — Ten Gods: 比肩(비견) / 正官(정관)
  • Month Pillar (月柱): 庚(경) / 子(자) — Ten Gods: 劫財(겁재) / 食神(식신)
  • Year Pillar (年柱): 丙(병) / 子(자) — Ten Gods: 正官(정관) / 食神(식신)
  • Hour Pillar (時柱): 甲(갑) / 午(오) — Ten Gods: 正財(정재) / 偏官(편관)
  • Current Daewoon: 甲辰(갑진) — 正財(정재) / [辰子 semi-combination, Water reinforcement]
  • 2026 Annual Cycle: 丙午(병오) — 正官(정관) / 偏官(편관)
  • Star markers extracted: 天乙貴人(천을귀인) at Hour Branch 午, 天德貴人(천덕귀인) at Hour Branch 午, 暗祿(암록) at Month Pillar
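
Because a stem's Ten God follows from fixed rules (its element relationship and yin/yang polarity relative to the Day Master), the extracted stem labels can be cross-checked deterministically instead of asking the model. The following is a sketch of such a check, not something the pipeline above is stated to include; `stem_ten_god` is a hypothetical helper, and a mismatch should flag the extraction for human review rather than overwrite it.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"  # element = index // 2 (wood, fire, earth, metal, water)

# Indexed by (other element - Day Master element) mod 5, then by polarity:
# first label = same polarity as the Day Master, second = opposite polarity.
RELATION_LABELS = [
    ("比肩", "劫財"),  # same element
    ("食神", "傷官"),  # Day Master generates it
    ("偏財", "正財"),  # Day Master controls it
    ("偏官", "正官"),  # it controls the Day Master
    ("偏印", "正印"),  # it generates the Day Master
]

def stem_ten_god(day_master: str, other: str) -> str:
    """Ten God of `other` stem relative to the `day_master` stem."""
    dm, ot = STEMS.index(day_master), STEMS.index(other)
    relation = (ot // 2 - dm // 2) % 5
    same_polarity = dm % 2 == ot % 2
    return RELATION_LABELS[relation][0 if same_polarity else 1]

# Stem labels from the extracted ground truth above (Day Master 辛).
extracted = {"甲": "正財", "庚": "劫財", "丙": "正官"}
mismatches = {s: g for s, g in extracted.items() if stem_ten_god("辛", s) != g}
print(mismatches)  # an empty dict means the extraction agrees with the rule
```

Branch labels would need a hidden-stem table as well, which this sketch leaves out.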

Stage 2 — Prompt Chain: Interpretation Without Hallucination

Once the JSON is clean, we run a four-prompt chain. Each prompt receives only the JSON output of the previous stage plus a scoped instruction. No prompt sees the raw image again after Stage 1.
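
The isolation rule above (each prompt sees only the previous stage's JSON plus its own instruction) can be sketched as a small runner. `call_model` is a placeholder for whatever client function actually sends a prompt and returns text; it is an assumption here, as are `run_chain` and the step names.

```python
import json
from typing import Callable

def run_chain(chart: dict, steps: list[tuple[str, str]],
              call_model: Callable[[str], str]) -> dict[str, dict]:
    """Run scoped prompts in order; each step gets only the previous step's JSON."""
    outputs: dict[str, dict] = {}
    payload = chart  # Stage 1 JSON; the raw image is never passed along
    for name, instruction in steps:
        prompt = f"{instruction}\n\nINPUT JSON:\n{json.dumps(payload, ensure_ascii=False)}"
        payload = json.loads(call_model(prompt))  # raises on non-JSON output
        outputs[name] = payload
    return outputs
```

Keeping every intermediate output in `outputs` is a design choice for auditability: each step's JSON can be inspected on its own when QA flags a draft.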

Prompt 1 — Structure Summary

Given this chart JSON, summarize the structural profile:

  • Day Master strength (身强/身弱)
  • Dominant element distribution
  • Ten God concentration (how many 官星, 食傷, 財星, 印星, 比劫)
  • Key tensions (충/沖, 합/合, 형/刑 between pillars)

Output: structured summary, no prose yet.

For Kang Daniel: Day Master 辛金 is 身弱 (weak) — three 官星 (正官×2 + 偏官×1), two 食傷 (食神×2), zero 印星 in Heavenly Stems. Fire element at ~40%, Water ~30%, Metal ~20%, Wood ~10%, Earth ~0%. Critical tension: 子午沖 (子 in Year/Month Branch vs 午 in Hour Branch).
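
The element distribution and Ten God concentration in this summary can be tallied directly from the chart. The sketch below counts each visible stem and branch once, with no weighting for hidden stems or seasonal strength. That counting rule is an assumption, so its shares (Fire 3/8, Water 2/8, Metal 2/8, Wood 1/8) are close to, but not identical with, the rounded percentages quoted above, which may come from a weighted method.

```python
from collections import Counter

ELEMENT = {
    "甲": "wood", "乙": "wood", "丙": "fire", "丁": "fire", "戊": "earth",
    "己": "earth", "庚": "metal", "辛": "metal", "壬": "water", "癸": "water",
    "子": "water", "丑": "earth", "寅": "wood", "卯": "wood", "辰": "earth", "巳": "fire",
    "午": "fire", "未": "earth", "申": "metal", "酉": "metal", "戌": "earth", "亥": "water",
}
GOD_GROUP = {
    "比肩": "比劫", "劫財": "比劫", "食神": "食傷", "傷官": "食傷", "偏財": "財星",
    "正財": "財星", "偏官": "官星", "正官": "官星", "偏印": "印星", "正印": "印星",
}

# (stem, branch, stem_god, branch_god) per pillar, from the extracted ground truth.
pillars = {
    "year": ("丙", "子", "正官", "食神"),
    "month": ("庚", "子", "劫財", "食神"),
    "day": ("辛", "巳", "比肩", "正官"),
    "hour": ("甲", "午", "正財", "偏官"),
}

elements = Counter(ELEMENT[ch] for stem, branch, *_ in pillars.values() for ch in (stem, branch))
groups = Counter(GOD_GROUP[g] for *_, sg, bg in pillars.values() for g in (sg, bg))
print(elements)  # counts: fire 3, water 2, metal 2, wood 1 (no earth)
print(groups)    # counts: 官星 3, 食傷 2, 比劫 2, 財星 1 (no 印星)
```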

Prompt 2 — Domain Interpretation

Each domain (career, wealth, relationships, health) gets a separate sub-prompt with the structural summary as context. This prevents cross-domain contamination and makes QA easier — each block is independently auditable.

Prompt 3 — Annual Cycle Overlay

Given the structural summary and domain interpretations,
overlay the 2026 annual cycle (丙午).
Identify: which existing pillars does 午 interact with?
What does doubling of 官星 (正官 + 偏官) mean for a 身弱 Day Master?
Do not introduce chart elements not present in the JSON.

The 2026 overlay is where the most interesting dynamics surface. For this chart, a 丙午 year means:

  • Heavenly Stem 丙 = 正官 (Direct Officer) relative to 辛 Day Master
  • Earthly Branch 午 = 偏官 (Seven Killings) relative to 辛
  • 午 repeats the Hour Branch, reactivating 天乙貴人 — unexpected support signal in a high-pressure year
  • 午 clashes with Year Branch 子 and Month Branch 子 (子午沖×2) — environmental disruption to established routines
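
The two branch-level observations in this list (午 repeating the Hour Branch and clashing with both 子 branches) follow from the six standard branch clashes (六沖), so they can be checked mechanically. A minimal sketch, with `branch_interactions` as a hypothetical helper name:

```python
# The six branch clashes (六沖), stored in both directions.
_CLASH_PAIRS = ["子午", "丑未", "寅申", "卯酉", "辰戌", "巳亥"]
CLASH = {a: b for pair in _CLASH_PAIRS for a, b in (pair, pair[::-1])}

def branch_interactions(natal: dict[str, str], annual_branch: str) -> dict[str, list[str]]:
    """List which natal pillars the annual branch repeats or clashes with."""
    return {
        "repeats": [p for p, b in natal.items() if b == annual_branch],
        "clashes": [p for p, b in natal.items() if CLASH[annual_branch] == b],
    }

natal = {"year": "子", "month": "子", "day": "巳", "hour": "午"}
print(branch_interactions(natal, "午"))
# {'repeats': ['hour'], 'clashes': ['year', 'month']}
```

Combinations (合) and punishments (刑) would need their own lookup tables; only clashes and repeats are covered here.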

Prompt 4 — Channel Formatting

This is where the pipeline branches. The same interpreted content gets reformatted per channel spec. Dev.to gets English technical framing. A YouTube script gets a different cadence. A blog post gets Korean prose. The interpretation JSON is the single source of truth; formatting is downstream.


Stage 3 — Channel Formatter

Each channel has a spec file that the formatter reads:

# Per-channel formatting rules read by the formatter.
CHANNEL_SPECS = {
    "devto": {
        "language": "en",
        "format": "markdown",
        "tone": "technical_analytical",
        "word_target": 1500,
        "forbidden": ["fortune_telling_framing", "certainty_language", "gossip"],
        "required_blocks": [
            "hook", "conclusion_one_liner", "bazi_mechanics",
            "classical_citation", "modern_application",
            "infographic_placeholder", "reversal_insight",
            "summary_three_lines", "cta",
        ],
    },
    "youtube_script": { ... },  # elided here
    "blog_ko": { ... },  # elided here
}

The formatter prompt receives the interpretation JSON plus the channel spec and produces the final draft. Crucially, it does not re-read the chart — it only formats what the interpretation chain already produced.
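
One way to keep the formatter from touching chart data is to build its prompt from exactly those two inputs. The sketch below assumes a spec shaped like the `devto` entry above; `build_formatter_prompt` and the prompt wording are illustrative, not the pipeline's actual prompt.

```python
import json

def build_formatter_prompt(interpretation: dict, spec: dict) -> str:
    """Assemble a formatting-only prompt from the interpretation JSON and a channel spec."""
    rules = [
        f"Language: {spec['language']}",
        f"Format: {spec['format']}",
        f"Tone: {spec['tone']}",
        f"Target length: about {spec['word_target']} words",
        "Avoid: " + ", ".join(spec["forbidden"]),
        "Required blocks: " + ", ".join(spec["required_blocks"]),
        "Use only the content in the JSON below; do not add chart elements.",
    ]
    body = json.dumps(interpretation, ensure_ascii=False, indent=2)
    return "\n".join(rules) + "\n\nINTERPRETATION JSON:\n" + body
```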


[INFO_GRAPHIC]

KANG DANIEL — 辛巳 Day Pillar | 2026 丙午 Annual Cycle

PILLAR GRID
┌──────────┬──────────┬──────────┬──────────┐
│ 時柱 │ 日柱 │ 月柱 │ 年柱 │
│ Hour │ Day │ Month │ Year │
├──────────┼──────────┼──────────┼──────────┤
│ 正財 │ 比肩 │ 劫財 │ 正官 │
│ 甲 Wood │ 辛 Metal│ 庚 Metal│ 丙 Fire │
│ 甲 (갑) │ 辛 (신) │ 庚 (경) │ 丙 (병) │
├──────────┼──────────┼──────────┼──────────┤
│ 偏官 │ 正官 │ 食神 │ 食神 │
│ 午 Fire │ 巳 Fire │ 子 Water│ 子 Water│
│ 午 (오) │ 巳 (사) │ 子 (자) │ 子 (자) │
└──────────┴──────────┴──────────┴──────────┘

STAR MARKERS: 天乙貴人 + 天德貴人 @ 午 (Hour Branch)
KEY TENSION: 子午沖 (Month/Year Branch vs Hour Branch)

2026 OVERLAY: 丙午
Stem 丙 → 正官 (Direct Officer) — opportunity + accountability
Branch 午 → 偏官 (Seven Killings) — pressure layer
午 reactivates 天乙貴人 — hidden support signal
午 triggers 子午沖×2 — routine disruption likely

DAEWOON: 甲辰 | 甲 → 正財 | 辰子 semi-combination (Water+)
BODY STRENGTH: 身弱 | 官星×3 | 食傷×2 | 印星×0 in Stems


Stage 4 — QA Gate

Before any content publishes, it passes through a five-check QA gate:

Check 1 — Character Fidelity
Every Hanja character and Ten God label in the output is compared against the Stage 1 JSON. Any mismatch triggers a block. This is the 正財/偏財 guard at output level.
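
A simple form of this check collects every Ten God label that appears in the draft and blocks if any of them is absent from the Stage 1 JSON. A sketch under that assumption follows; it checks labels only (not which pillar each one is attached to), and `fidelity_violations` is a hypothetical name.

```python
import re

TEN_GOD_PATTERN = re.compile(r"比肩|劫財|食神|傷官|偏財|正財|偏官|正官|偏印|正印")

def fidelity_violations(draft: str, chart: dict) -> set[str]:
    """Return Ten God labels used in the draft that never occur in the chart JSON."""
    allowed = set()
    sections = list(chart["pillars"].values())
    sections += [chart[k] for k in ("current_daewoon", "annual_cycle_2026") if k in chart]
    for section in sections:
        allowed.update(g for g in (section.get("stem_god"), section.get("branch_god")) if g)
    return set(TEN_GOD_PATTERN.findall(draft)) - allowed
```

A non-empty result would block the draft. A stricter version would also tie each label to its pillar, since a label that is valid somewhere in the chart can still be attached to the wrong position.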

Check 2 — Certainty Language Scan
A regex + classifier pass flags phrases like "will definitely," "100%," "certain to," "absolutely." These are replaced with hedged equivalents or flagged for human review.
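
The regex half of this pass might look like the sketch below. The phrase list contains only the four examples named above; the classifier half is not shown, and `flag_certainty` is a hypothetical name.

```python
import re

# Word boundaries keep "certain to" from matching inside "uncertain to".
CERTAINTY = re.compile(
    r"\b(?:will definitely|certain to|absolutely)\b|100\s?%",
    re.IGNORECASE,
)

def flag_certainty(text: str) -> list[tuple[int, str]]:
    """Return (character offset, matched phrase) for each certainty phrase found."""
    return [(m.start(), m.group(0)) for m in CERTAINTY.finditer(text)]

print(flag_certainty("2026 will definitely bring change; success is 100% assured."))
```

Returning offsets rather than rewriting in place leaves the choice between automatic hedging and human review to the caller.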

Check 3 — Gossip / Privacy Filter
Any sentence containing relationship-status assertions, health diagnoses, or financial predictions stated as fact is flagged. The filter is trained on a small dataset of problematic examples from early drafts.

Check 4 — Classical Citation Verification
If the interpretation chain inserted a classical text citation, the QA step verifies it matches a pre-approved citation library. Fabricated citations are a known LLM failure mode in this domain.
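
Verification against a pre-approved library can be as plain as a set lookup on normalized titles. The library contents and the names `CITATION_LIBRARY` and `unapproved_citations` below are placeholders; only the title 子平真詮 is taken from this article.

```python
import unicodedata

# Placeholder library: in practice this would hold every pre-approved source.
CITATION_LIBRARY = {"子平真詮"}

def unapproved_citations(cited_titles: list[str]) -> list[str]:
    """Return cited titles that are not in the approved library, after NFKC normalization."""
    normalized = (unicodedata.normalize("NFKC", t).strip() for t in cited_titles)
    return [t for t in normalized if t not in CITATION_LIBRARY]
```

This checks titles only. Verifying that a quoted passage really appears in the cited work would need the library to store approved passages as well.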

For reference, a structurally relevant citation for this reading: Zi Ping Zhen Quan (子平真詮, Shen Xiaozhan (沈孝瞻), Qing Dynasty) discusses the 身弱 Day Master under heavy 官殺 pressure: the remedy is 印星 support; absent that, energy conservation becomes the primary strategic principle. This maps directly to the 2026 reading's core recommendation.

Check 5 — Word Count and Block Completeness
All nine required blocks must be present. Word count must fall within ±10% of target. Missing blocks or overruns trigger a regeneration request for the specific block, not the full piece.
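
A sketch of this check, assuming the draft arrives as a mapping from block name to text (that shape is an assumption, as is the name `completeness_report`). It reports which blocks to regenerate rather than failing the whole piece.

```python
def completeness_report(blocks: dict[str, str], required: list[str],
                        word_target: int, tolerance: float = 0.10) -> dict:
    """Report missing or empty blocks and whether total word count is within tolerance."""
    missing = [name for name in required if not blocks.get(name, "").strip()]
    # Whitespace splitting suits English drafts; Korean output may need another counter.
    words = sum(len(text.split()) for text in blocks.values())
    low, high = word_target * (1 - tolerance), word_target * (1 + tolerance)
    return {"missing": missing, "words": words, "in_range": low <= words <= high}
```

For the `devto` spec this would be called with `required=CHANNEL_SPECS["devto"]["required_blocks"]` and `word_target=1500`, giving an accepted range of 1350 to 1650 words.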


The Reversal: The Star Marker Nobody Expects

The technically interesting finding in this chart — and the one most likely to be missed by a naive extraction — is that 天乙貴人 (Heavenly Noble) sits at the Hour Branch 午, the same branch that 2026's annual cycle repeats.

In a year where 官星 doubles and 子午沖 fires twice, the conventional read is "high pressure, high cost." The reversal: 午 also reactivates the most powerful support star in the chart. The 2026 annual cycle is not purely adversarial — it also lights up the branch where unexpected assistance and key-moment support are structurally encoded.

This is the kind of nuance that disappears if the Vision extraction is imprecise (wrong branch assignment) or if the interpretation chain doesn't cross-reference star markers against the annual cycle overlay. It's also the kind of insight that makes the content genuinely useful rather than generically cautionary.


Lessons Learned

1. Extraction and interpretation must be strictly separated. Any prompt that does both simultaneously will drift. The model will rationalize chart data to fit an interpretation it's already building.

2. Non-substitution guards are not optional, especially for Ten God labels, where the model has strong priors from training data. Explicit negative constraints outperform positive description alone.

3. Channel formatting is the last step, not the first. Early versions tried to format while interpreting. Output quality dropped significantly. Separation of concerns applies here exactly as it does in software architecture.

4. QA is cheaper than correction. A five-check gate that blocks 15% of drafts saves more time than editing published content that contains a factual error about a public figure's chart.

5. Star markers are high-value, low-extraction-rate data. They appear as small colored badges in the UI. Without explicit Vision prompting for badge-level detail, they are systematically missed — and they often contain the most analytically interesting information.


Summary

  • Automated Saju content pipeline: Vision extraction → prompt chain → channel formatter → QA gate, with strict separation of concerns at each stage
  • The critical engineering challenge is not generation quality but extraction fidelity — wrong Hanja or wrong Ten God label at Stage 1 propagates through every downstream step
  • The 2026 reading for this case study surfaces a structurally meaningful reversal (天乙貴人 reactivation in a high-pressure 官星 year) that only appears when star marker extraction is precise and cross-referenced against the annual cycle overlay

Explore the full Saju content platform and automated reading tools at runartree.com


This article uses publicly available birth information and a published Saju reading as a technical case study for content pipeline automation. All interpretations reflect classical Bazi analytical frameworks and are presented as structural analysis, not predictive certainty. Individual outcomes vary and no reading should be treated as a definitive forecast.


Project link

This article is based on an automated content workflow for a Korean Saju platform.

The key lesson is simple: generation alone is not enough. A useful publishing pipeline also needs formatting, QA, tracking links, and channel-specific editorial rules.


Bazi interpretation. Not medical, legal, or investment advice.
