This is Part 4 of a series documenting a non-engineer CEO's attempts to connect Copilot Studio and Power Automate to LDX hub's StructFlow API.
Part 1 — It didn't work yet. Part 2 — REST API via Power Automate, finally working. Part 3 — MCP direct connection, 2 hours.
In Part 3, I connected LDX hub directly to Copilot Studio via MCP. One record at a time, in a chat interface. It worked great.
But then I asked the obvious question: what about 20 files? Batch processing 20 Word documents from SharePoint, extracting structured data from each, and synthesizing them into a single company-wide dashboard?
That's not a job for MCP. That's a job for Power Automate.
This is the story of building that pipeline — every error, every detour, and the moment it finally worked.
What I built:
- Microsoft Power Automate flow
- 20 Word files in SharePoint
- LDX hub ExtractDoc + StructFlow (REST API, not MCP)
- Output: HTML management dashboard saved to SharePoint
Time required: ~2 days
Architecture
SharePoint (20 Word files)
↓ Get files (properties only)
↓ Initialize array variable: results[]
↓ Apply to each file:
├─ Get file content (by path)
├─ POST /uploads → file_id (upload session)
├─ PUT /uploads/{file_id} → upload binary (base64)
├─ POST /extractdoc/jobs → job_id
├─ Do until status = completed (poll GET /extractdoc/jobs/{job_id})
├─ GET /files/{output_file_id}/content → extracted text
├─ POST /structflow/jobs → job_id
└─ Do until status = completed (poll GET /structflow/jobs/{job_id})
→ append body to results[]
↓ POST /structflow/jobs (cross-dept analysis)
↓ Do until status = completed
↓ Compose HTML dashboard
↓ Create file in SharePoint
8 HTTP actions per file. 20 files. Sequential processing.
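A note on "Sequential processing": the Apply to each loop has a concurrency control setting that can run iterations in parallel, but because every iteration appends to the one shared results[] variable, parallel runs risk interleaving their writes. Keeping the loop sequential trades speed for predictability.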
The errors, in order
Error 1: Wrong upload endpoint
I started with POST /api/v1/uploads. Got 404.
The correct endpoint (without the /api/v1 prefix) is:
POST https://gw.ldxhub.io/uploads
Lesson: check the API docs directly. The base URL doesn't always include a version prefix.
Error 2: File content — multipart/form-data nightmare
POST /files requires multipart/form-data. Power Automate's HTTP connector doesn't handle this cleanly.
The workaround: use the chunk upload flow instead.
- POST /uploads creates an upload session and returns file_id
- PUT /uploads/{file_id} sends the file content as base64 JSON:
{
"data": "@{base64(body('パスによるファイル_コンテンツの取得'))}"
}
This is the JSON-based chunk upload designed for MCP clients, but it works perfectly from Power Automate too. (The Japanese token in the expression, パスによるファイル_コンテンツの取得, is just the display name of the flow's "Get file content by path" action.)
Error 3: File not found (SharePoint path)
Getting file content by ID didn't work. The fix: use "Get file content by path" instead of "Get file content".
The correct path format:
concat('/Shared Documents/General/LDXhubtest/', items('それぞれに適用する')?['{FilenameWithExtension}'])
items('それぞれに適用する') refers to the Apply to each loop; the field name is {FilenameWithExtension} (with curly braces), found by inspecting the raw output of the "Get files" action.
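For orientation, this is roughly where that expression lives in the SharePoint action (the site address is a placeholder of mine; the rest follows the flow above):
Get file content by path (SharePoint)
Site Address: https://<your-tenant>.sharepoint.com/sites/<your-site>
File Path: @{concat('/Shared Documents/General/LDXhubtest/', items('それぞれに適用する')?['{FilenameWithExtension}'])}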
Error 4: ExtractDoc engine name
"engine": "docx" returned an error. The correct engine ID:
{
"engine": "ki/extract"
}
Check available engines with GET /extractdoc/engines first.
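That check is a plain GET with the same auth header as every other call, assuming it sits on the same gateway host as the jobs endpoint:
URI: https://gw.ldxhub.io/extractdoc/engines
Method: GET
Headers:
Authorization: Bearer {API_KEY}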
Error 5: Do until condition syntax
Power Automate's new designer is strict about condition expressions. This fails:
@{body('HTTP_3')?['status']} equals completed
This works (in advanced mode):
@equals(body('HTTP_3')?['status'],'completed')
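Put together, the ExtractDoc polling loop looks roughly like this. The 10-second delay and the loop limits are my choices, not anything the API mandates:
Do until: @equals(body('HTTP_3')?['status'],'completed')
Limits: Count 60, Timeout PT30M
├─ Delay: 10 seconds
└─ HTTP_3: GET https://gw.ldxhub.io/extractdoc/jobs/@{body('HTTP_2')?['job_id']}
     Headers: Authorization: Bearer {API_KEY}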
Error 6: ExtractDoc doesn't return text directly
I assumed ExtractDoc would return the extracted text in the response body. It doesn't.
The response contains output_file_id. You then need:
GET /files/{output_file_id}/content
to download the actual text. This requires an extra HTTP action between ExtractDoc polling and StructFlow job creation.
Error 7: Array variable append — null value
AppendToArrayVariable with body('HTTP_5')?['results'] returned a null error.
Fix: append body('HTTP_5') (the entire response), not just the results field.
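In action terms, the working setting is simply:
Append to array variable
Name: results
Value: @{body('HTTP_5')}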
Error 8: Cross-scope reference error
When I tried to reference loop-scoped actions from outside the loop (for the cross-department analysis step), Power Automate threw:
The action 'HTTP_5' is nested in a foreach scope of multiple levels.
Referencing repetition actions from outside the scope is not supported.
The solution: accumulate everything into the results array variable inside the loop, then pass variables('results') to the final analysis step outside the loop.
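I won't reproduce my exact cross-department prompt here, but the shape of that final StructFlow request is the important part: its only input is the accumulated variable, never a loop-scoped action. The prompt wording and the "departments" key below are illustrative:
{
"model": "anthropic/claude-sonnet-4-6",
"system_prompt": "Synthesize the following per-department results into a company-wide summary...",
"example_output": { ... },
"inputs": [{"id": "0", "data": {"departments": "@{string(variables('results'))}"}}]
}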
The working flow — key settings
File upload (HTTP)
URI: https://gw.ldxhub.io/uploads
Method: POST
Headers:
Content-Type: application/json
Authorization: Bearer {API_KEY}
Body:
{
"filename": "@{items('それぞれに適用する')?['{FilenameWithExtension}']}"
}
File content upload (HTTP 1)
URI: https://gw.ldxhub.io/uploads/@{body('HTTP')?['file_id']}
Method: PUT
Body:
{
"data": "@{base64(body('パスによるファイル_コンテンツの取得'))}"
}
ExtractDoc job (HTTP 2)
URI: https://gw.ldxhub.io/extractdoc/jobs
Method: POST
Body:
{
"engine": "ki/extract",
"file_id": "@{body('HTTP')?['file_id']}",
"output_format": "text"
}
Download extracted text (HTTP 8, after polling)
URI: https://gw.ldxhub.io/files/@{body('HTTP_3')?['output_file_id']}/content
Method: GET
StructFlow job (HTTP 4)
{
"model": "anthropic/claude-sonnet-4-6",
"system_prompt": "以下の会議議事録から構造化データを抽出してください...",
"example_output": { ... },
"inputs": [{"id": "0", "data": {"minutes": "@{body('HTTP_8')}"}}]
}
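The last two steps are standard actions rather than LDX hub calls, so I'll just sketch them; the Compose action name and the file-name pattern are mine:
Compose_HTML: the HTML template string, with @{...} tokens pulling values from the final StructFlow response
Create file (SharePoint)
Site Address: https://<your-tenant>.sharepoint.com/sites/<your-site>
Folder Path: /Shared Documents/General/LDXhubtest
File Name: dashboard_@{formatDateTime(utcNow(),'yyyyMMdd_HHmm')}.html
File Content: @{outputs('Compose_HTML')}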
The result
After 2 days of iteration:
| Metric | Result |
|---|---|
| Departments processed | 20 / 20 |
| StructFlow jobs completed | 20 / 20 |
| Total tasks extracted | 100 |
| High-severity risks identified | 21 |
| Cross-department dependency entries | 60+ |
The HTML dashboard shows:
- Company-wide task list (all 100, with assignee, deadline, related dept)
- Risk cards by severity (color-coded)
- Cross-department dependency map
- Per-department summary cards
Key insight on architecture: LDX hub handles all the intelligence — text extraction (ExtractDoc) and structured data generation (StructFlow). The HTML template I wrote just renders the JSON. The processing engine and presentation layer are fully separated.
MCP vs REST API — the actual comparison
Now that I've done both, here's the honest breakdown:
| | MCP (Part 3) | REST API — Power Automate (Part 4) |
|---|---|---|
| Setup time | ~2 hours | ~2 days |
| Errors | 2 | 8+ |
| Best for | Single record, interactive | Batch processing |
| 20-file batch | ❌ Not practical | ✅ Right tool |
| Polling complexity | Handled by agent | Manual Do until loops |
| File upload | Via MCP chunk API | Via REST chunk upload |
MCP wins on simplicity for conversational use cases. REST API wins for scheduled batch jobs.
What I'd do differently
- Test with 1 file before 20. I wasted hours debugging a flow that was running on all 20 files.
- Check the API docs before assuming endpoint paths. The /api/v1/ prefix doesn't exist on all endpoints.
- Verify Do until conditions in advanced mode. The GUI condition builder generates subtly wrong expressions.
- Add error handling. The current flow times out silently if an API call fails mid-loop.
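On that last point, the standard Power Automate pattern would be to wrap the work in a Try scope and add a Catch scope whose "Configure run after" setting fires on failure or timeout, something like:
Scope: Try
└─ all of the per-file HTTP actions and loops
Scope: Catch (Configure run after: "has failed" / "has timed out")
└─ send a notification (Teams post or email) so a silent failure becomes visible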
What's Next
Phase 2: A quality comparison between two approaches to dashboard generation:
- Structured data route: StructFlow extracts JSON → HTML renders JSON (what we built)
- Unstructured data route: raw meeting text passed directly to an LLM → HTML rendered from prose output
The hypothesis: structured data produces more consistent, queryable, and accurate dashboards. But how much better, exactly? And at what cost difference? That's the next experiment.
Kawamura International is a translation and localization company documenting its AI process experiments in public. StructFlow, RefineLoop, RenderOCR — and whatever comes next.