Kozo-KI

I batch-processed 20 meeting minutes with Power Automate + LDX hub. It took 2 days and 8 HTTP actions.

This is Part 4 of a series documenting a non-engineer CEO's attempts to connect Copilot Studio and Power Automate to LDX hub's StructFlow API.
Part 1 — It didn't work yet. Part 2 — REST API via Power Automate, finally working. Part 3 — MCP direct connection, 2 hours.

In Part 3, I connected LDX hub directly to Copilot Studio via MCP. One record at a time, in a chat interface. It worked great.

But then I asked the obvious question: what about 20 files? Batch processing 20 Word documents from SharePoint, extracting structured data from each, and synthesizing them into a single company-wide dashboard?

That's not a job for MCP. That's a job for Power Automate.

This is the story of building that pipeline — every error, every detour, and the moment it finally worked.

What I built:

  • Microsoft Power Automate flow
  • 20 Word files in SharePoint
  • LDX hub ExtractDoc + StructFlow (REST API, not MCP)
  • Output: HTML management dashboard saved to SharePoint

Time required: ~2 days


Architecture

SharePoint (20 Word files)
  ↓ Get files (properties only)
  ↓ Initialize array variable: results[]
  ↓ Apply to each file:
    ├─ Get file content (by path)
    ├─ POST /uploads → file_id (upload session)
    ├─ PUT /uploads/{file_id} → upload binary (base64)
    ├─ POST /extractdoc/jobs → job_id
    ├─ Do until status = completed (poll GET /extractdoc/jobs/{job_id})
    ├─ GET /files/{output_file_id}/content → extracted text
    ├─ POST /structflow/jobs → job_id
    └─ Do until status = completed (poll GET /structflow/jobs/{job_id})
        → append body to results[]
  ↓ POST /structflow/jobs (cross-dept analysis)
  ↓ Do until status = completed
  ↓ Compose HTML dashboard
  ↓ Create file in SharePoint

8 HTTP actions per file. 20 files. Sequential processing.
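
One flow setting matters here: "Apply to each" runs sequentially by default, and this flow relies on that. Concurrency can be switched on under the loop's settings, but as far as I can tell AppendToArrayVariable is not safe under parallel iterations (flow variables are global, so concurrent appends can clobber each other):

Apply to each → Settings
  Concurrency control: Off   (the default; keep it off while appending to a shared variable)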


The errors, in order

Error 1: Wrong upload endpoint

I started with POST /api/v1/uploads. Got 404.

The correct endpoint (without the /api/v1 prefix) is:

POST https://gw.ldxhub.io/uploads

Lesson: check the API docs directly. The base URL doesn't always include a version prefix.

Error 2: File content — multipart/form-data nightmare

POST /files requires multipart/form-data. Power Automate's HTTP connector doesn't handle this cleanly.

The workaround: use the chunk upload flow instead.

  1. POST /uploads — creates an upload session, returns file_id
  2. PUT /uploads/{file_id} — sends the file content as base64 JSON
{
  "data": "@{base64(body('パスによるファイル_コンテンツの取得'))}"
}

This is the JSON-based chunk upload designed for MCP clients, but it works perfectly from Power Automate too. (The Japanese names inside these expressions are the localized display names of flow actions: パスによるファイル_コンテンツの取得 is "Get file content by path", and それぞれに適用する, which appears below, is "Apply to each".)

Error 3: File not found (SharePoint path)

Getting file content by ID didn't work. The fix: use "Get file content by path" instead of "Get file content".

The correct path format:

concat('/Shared Documents/General/LDXhubtest/', items('それぞれに適用する')?['{FilenameWithExtension}'])

The field name is {FilenameWithExtension} (with curly braces) — found by inspecting the raw output of the "Get files" action.

Error 4: ExtractDoc engine name

"engine": "docx" returned an error. The correct engine ID:

{
  "engine": "ki/extract"
}

Check available engines with GET /extractdoc/engines first.
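
That lookup, assuming the same base URL and Bearer auth as every other call in this flow:

URI: https://gw.ldxhub.io/extractdoc/engines
Method: GET
Headers:
  Authorization: Bearer {API_KEY}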

Error 5: Do until condition syntax

Power Automate's new designer is strict about condition expressions. This fails:

@{body('HTTP_3')?['status']}  equals  completed

This works (in advanced mode):

@equals(body('HTTP_3')?['status'],'completed')
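
Two more Do until details worth knowing: the loop has its own limits under "Change limits" (the defaults are 60 iterations and a one-hour timeout), and a Delay action inside the loop keeps it from polling the API nonstop. A sketch of a polling loop with both, assuming job_id comes back from the job-creation call (HTTP_2 in this flow) as in the architecture above:

Do until: @equals(body('HTTP_3')?['status'],'completed')
  ├─ Delay: 10 seconds
  └─ HTTP_3: GET https://gw.ldxhub.io/extractdoc/jobs/@{body('HTTP_2')?['job_id']}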

Error 6: ExtractDoc doesn't return text directly

I assumed ExtractDoc would return the extracted text in the response body. It doesn't.

The response contains output_file_id. You then need:

GET /files/{output_file_id}/content

to download the actual text. This requires an extra HTTP action between ExtractDoc polling and StructFlow job creation.

Error 7: Array variable append — null value

AppendToArrayVariable with body('HTTP_5')?['results'] failed with a null-value error.

Fix: append body('HTTP_5') (the entire response), not just the results field.
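
In action terms, that's:

Append to array variable
  Name: results
  Value: @{body('HTTP_5')}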

Error 8: Cross-scope reference error

When I tried to reference loop-scoped actions from outside the loop (for the cross-department analysis step), Power Automate threw:

The action 'HTTP_5' is nested in a foreach scope of multiple levels. 
Referencing repetition actions from outside the scope is not supported.

The solution: accumulate everything into the results array variable inside the loop, then pass variables('results') to the final analysis step outside the loop.
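
So the final analysis call consumes the accumulated variable instead of any loop-scoped body(...) reference. A sketch of that request body, reusing the per-file StructFlow shape shown below (the prompt wording and the "departments" key are illustrative):

{
  "model": "anthropic/claude-sonnet-4-6",
  "system_prompt": "Analyze tasks, risks, and cross-department dependencies across the following department summaries...",
  "inputs": [{"id": "0", "data": {"departments": "@{variables('results')}"}}]
}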


The working flow — key settings

File upload (HTTP)

URI: https://gw.ldxhub.io/uploads
Method: POST
Headers:
  Content-Type: application/json
  Authorization: Bearer {API_KEY}
Body:
{
  "filename": "@{items('それぞれに適用する')?['{FilenameWithExtension}']}"
}

File content upload (HTTP 1)

URI: https://gw.ldxhub.io/uploads/@{body('HTTP')?['file_id']}
Method: PUT
Body:
{
  "data": "@{base64(body('パスによるファイル_コンテンツの取得'))}"
}

ExtractDoc job (HTTP 2)

URI: https://gw.ldxhub.io/extractdoc/jobs
Method: POST
Body:
{
  "engine": "ki/extract",
  "file_id": "@{body('HTTP')?['file_id']}",
  "output_format": "text"
}
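
Poll ExtractDoc job (HTTP 3)

This one didn't make the list above, so here is a reconstruction, assuming the job-creation response returns job_id as in the architecture sketch:

URI: https://gw.ldxhub.io/extractdoc/jobs/@{body('HTTP_2')?['job_id']}
Method: GET
Headers:
  Authorization: Bearer {API_KEY}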

Download extracted text (HTTP 8, after polling)

URI: https://gw.ldxhub.io/files/@{body('HTTP_3')?['output_file_id']}/content
Method: GET

StructFlow job (HTTP 4)

{
  "model": "anthropic/claude-sonnet-4-6",
  "system_prompt": "以下の会議議事録から構造化データを抽出してください...",
  "example_output": { ... },
  "inputs": [{"id": "0", "data": {"minutes": "@{body('HTTP_8')}"}}]
}

The result

After 2 days of iteration:

Metric                                 Result
Departments processed                  20 / 20
StructFlow jobs completed              20 / 20
Total tasks extracted                  100
High-severity risks identified         21
Cross-department dependency entries    60+

The HTML dashboard shows:

  • Company-wide task list (all 100, with assignee, deadline, related dept)
  • Risk cards by severity (color-coded)
  • Cross-department dependency map
  • Per-department summary cards

Key insight on architecture: LDX hub handles all the intelligence — text extraction (ExtractDoc) and structured data generation (StructFlow). The HTML template I wrote just renders the JSON. The processing engine and presentation layer are fully separated.
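
To make that separation concrete, the rendering side can be as simple as a Select action that maps the final job's output rows to HTML, joined into one page. A sketch, assuming the final StructFlow output exposes a tasks array with task/assignee/deadline fields matching the dashboard columns above (the action name HTTP_final and the field paths are hypothetical):

Select
  From: body('HTTP_final')?['tasks']
  Map:  concat('<tr><td>', item()?['task'], '</td><td>', item()?['assignee'], '</td><td>', item()?['deadline'], '</td></tr>')

Compose
  concat('<table>', join(body('Select'), ''), '</table>')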


MCP vs REST API — the actual comparison

Now that I've done both, here's the honest breakdown:

                     MCP (Part 3)                REST API — Power Automate (Part 4)
Setup time           ~2 hours                    ~2 days
Errors               2                           8+
Best for             Single record, interactive  Batch processing
20-file batch        ❌ Not practical            ✅ Right tool
Polling complexity   Handled by agent            Manual Do until loops
File upload          Via MCP chunk API           Via REST chunk upload

MCP wins on simplicity for conversational use cases. REST API wins for scheduled batch jobs.


What I'd do differently

  1. Test with 1 file before 20. I wasted hours debugging a flow that was running on all 20 files.
  2. Check the API docs before assuming endpoint paths. The /api/v1/ prefix doesn't exist on all endpoints.
  3. Verify Do until conditions in advanced mode. The GUI condition builder generates subtly wrong expressions.
  4. Add error handling. The current flow times out silently if an API call fails mid-loop; a minimal fix is sketched below.
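
That fix, using the standard Scope plus "Configure run after" pattern (a sketch of the pattern, not something in the current flow):

Scope: Try
  └─ (all the per-file HTTP actions)
Scope: Catch   (Configure run after "Try": has failed, has timed out)
  └─ Append to array variable: errors, value @{result('Try')}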

What's Next

Phase 2: A quality comparison between two approaches to dashboard generation:

  • Structured data route: StructFlow extracts JSON → HTML renders JSON (what we built)
  • Unstructured data route: raw meeting text passed directly to an LLM → HTML rendered from prose output

The hypothesis: structured data produces more consistent, queryable, and accurate dashboards. But how much better, exactly? And at what cost difference? That's the next experiment.


Kawamura International is a translation and localization company documenting its AI process experiments in public. StructFlow, RefineLoop, RenderOCR — and whatever comes next.
