Iteration Layer

Originally published at iterationlayer.com

Why We Built Iteration Layer

Content Processing Is a Mess

If you've built anything that touches documents or images, you know the drill. You need to extract data from PDFs, so you duct-tape together an OCR library and a regex parser. You need thumbnails, so you spin up ImageMagick in a Docker container. You need to generate reports or ebooks, so you wrestle with PDF libraries that treat a simple table like a research problem.

Each tool solves one narrow problem. Each one breaks in its own way. And the glue code connecting them — the format conversions, the error handling, the retry logic — that's where the real complexity lives. Not in the business logic you actually care about.

We've been on both sides of this. Before Iteration Layer, we built an AI-driven book publishing company. That meant building the entire content pipeline from scratch: parsing manuscripts, generating book covers programmatically, processing product images for Amazon, rendering marketing graphics for launches. Every piece worked. Every piece also broke in its own creative way at 2 AM before a release.

We maintained the Sharp pipeline that worked perfectly until someone uploaded a CMYK TIFF. We wrote the template engine that couldn't handle a font weight it hadn't seen before. We built the manuscript-to-EPUB converter that choked on tables. And every time we fixed one thing, we thought: this should just be an API call.

So we built it.

One Pipeline, Composable APIs

Content processing follows a natural lifecycle: you ingest raw content, you transform it, and you generate new output from it. Instead of building one monolithic platform that tries to do everything, we built focused APIs that map to this lifecycle.

Document Extraction handles structured ingestion. Give it a PDF, a Word document, a scanned image, or another supported file — it gives you structured JSON. You define a schema describing the fields you want, and the parser extracts them with confidence scores so you know when to trust the result and when to flag it for review. No OCR setup, no template configuration, no regex. It handles field types like text, numbers, dates, addresses, IBANs, currencies, arrays, and more out of the box.
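As a sketch, an extraction request pairs a document with a schema naming the fields you want. The endpoint shape, field names, and type strings below are illustrative assumptions, not the documented API contract:

```python
# Illustrative sketch of a Document Extraction request payload.
# The payload shape, field names, and type strings are assumptions
# for illustration -- check the Iteration Layer docs for the real contract.

def build_extraction_request(file_url: str, fields: dict) -> dict:
    """Pair a document with a schema describing the fields to extract."""
    return {
        "document": {"url": file_url},
        "schema": {
            "fields": [
                {"name": name, "type": ftype} for name, ftype in fields.items()
            ]
        },
    }

payload = build_extraction_request(
    "https://example.com/invoice.pdf",
    {
        "invoice_number": "text",
        "total": "currency",
        "due_date": "date",
        "iban": "iban",
    },
)
```

The response would carry the extracted values alongside per-field confidence scores, so low-confidence fields can be routed to human review.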

Document to Markdown handles full-text ingestion. When you need clean markdown for RAG, summarization, search, or agent context, it converts documents, images, and public pages into markdown with tables, headings, links, and image context preserved.

Website Extraction handles public-page ingestion. Send a URL and a schema, and it returns typed JSON from pricing pages, product pages, job listings, public documentation, or other web pages, with confidence scores and citations.

Image Transformation covers the middle of the pipeline — the part where you need to upscale, resize, crop, convert formats, adjust quality, or chain multiple operations together. Define the transformation sequence in a single request: upscale, resize, convert to WebP, compress to a target quality, smart-crop around the detected subject. One API call instead of a Sharp pipeline you have to host and scale yourself. Need an image under a target file size for email? Tell the API the target and it figures out the quality and dimension tradeoffs.
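A chained transformation might be described as an ordered list of operations in a single payload. Again, the operation names and parameters here are hypothetical stand-ins for the real request format:

```python
# Illustrative sketch of chaining image operations in one request.
# Operation names and parameters are assumptions for illustration.

def build_transform_chain(image_url: str, operations: list[dict]) -> dict:
    """Describe a sequence of transformations applied in order."""
    return {"image": {"url": image_url}, "operations": operations}

payload = build_transform_chain(
    "https://example.com/photo.tiff",
    [
        {"op": "upscale", "factor": 2},
        {"op": "smart_crop", "width": 1200, "height": 630},
        {"op": "convert", "format": "webp"},
        {"op": "compress", "max_bytes": 100_000},  # e.g. a target size for email
    ],
)
```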

Image Generation turns structured layer definitions into pixels. Define a canvas, stack text, images, gradients, QR codes, barcodes, and layout layers, and get back a generated image. Social cards, certificates, OG images, report graphics. Anything you'd design in Figma and then manually export repeatedly, you can now render programmatically.
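A layer stack for a social card might look something like the sketch below. Layer types, coordinates, and field names are assumptions for illustration only:

```python
# Illustrative layer-stack definition for a social card.
# Layer types and their fields are assumptions, not the documented shape.

def social_card(title: str, qr_data: str) -> dict:
    """Build a canvas plus a bottom-to-top stack of layers."""
    return {
        "canvas": {"width": 1200, "height": 630, "background": "#0f172a"},
        "layers": [
            {"type": "gradient", "from": "#0f172a", "to": "#1e3a8a"},
            {"type": "text", "content": title, "size": 64, "x": 80, "y": 240},
            {"type": "qr_code", "data": qr_data, "x": 1020, "y": 450, "size": 120},
        ],
    }

card = social_card("Why We Built Iteration Layer", "https://iterationlayer.com")
```

The same definition, re-rendered with different data, replaces the Figma-export-by-hand loop.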

Document Generation closes the loop on the output side. Feed it structured data and get back a polished PDF, DOCX, EPUB, or PPTX. Contracts, reports, ebooks, slide decks — generated from templates, populated with your data, ready to ship.
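The request shape here would combine a template reference, the data to populate it with, and an output format. The template id and data fields below are hypothetical:

```python
# Illustrative sketch of a Document Generation request:
# a template reference, the data that fills it, and an output format.
# All names here are assumptions for illustration.

def build_document_request(template_id: str, data: dict, fmt: str = "pdf") -> dict:
    return {"template": template_id, "data": data, "output": {"format": fmt}}

payload = build_document_request(
    "quarterly-report",
    {"quarter": "Q3", "revenue": 125000, "highlights": ["Launched MCP servers"]},
    fmt="docx",
)
```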

Sheet Generation handles tabular output. Feed it rows and column definitions, and get back XLSX, CSV, or Markdown spreadsheets without hand-building Excel files.
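Rows and column definitions might be expressed like this (column types and the payload layout are assumptions, not the actual contract):

```python
# Illustrative sketch of a Sheet Generation request:
# typed column definitions plus row data, with a chosen output format.
# Field names and type strings are assumptions for illustration.

def build_sheet_request(columns: list[dict], rows: list[list], fmt: str = "xlsx") -> dict:
    return {"columns": columns, "rows": rows, "output": {"format": fmt}}

payload = build_sheet_request(
    columns=[
        {"name": "SKU", "type": "text"},
        {"name": "Price", "type": "currency"},
    ],
    rows=[
        ["ABC-123", 19.99],
        ["DEF-456", 4.50],
    ],
    fmt="csv",
)
```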

Each API works independently. You can use Document Extraction without ever touching image processing. You can generate documents without parsing a single one first. But the real value comes when you chain them together.

Composable by Design

The output of one API flows naturally into the input of the next. Parse a supplier catalog, feed each product into Image Generation, and you have marketplace-ready listing images — no glue code, no format juggling. Parse manuscripts, transform the cover art, generate the final EPUB. Extract article text, transform the hero image, generate a social card.
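That chaining can be sketched as ordinary function composition: each step takes the previous step's output as its input. The step bodies below are local stand-ins, not real API calls:

```python
# A sketch of composing pipeline steps: each step is a function from one
# step's output to the next step's input. The step implementations here
# are stand-ins for real API calls, for illustration only.
from functools import reduce

def chain(*steps):
    """Pipe the output of each step into the next, left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)

# Stand-ins for: parse a supplier catalog -> build listing-image payloads.
def parse_catalog(pdf_path: str) -> list[dict]:
    return [{"name": "Widget", "price": 9.99}]  # pretend extraction result

def to_card_payloads(products: list[dict]) -> list[dict]:
    return [
        {"layers": [{"type": "text", "content": p["name"]}]} for p in products
    ]

catalog_to_cards = chain(parse_catalog, to_card_payloads)
cards = catalog_to_cards("catalog.pdf")
```

Because every step consumes and produces plain JSON-shaped data, inserting, removing, or swapping a step never ripples through the rest of the pipeline.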

This is the Unix philosophy applied to content processing: small, focused tools that compose into workflows you haven't imagined yet.

That composability is a deliberate choice, and it cuts against the grain. Monolithic platforms lock you into one vendor's idea of a workflow. If their OCR is good but their image processing is mediocre, tough luck — you're stuck with both or neither.

We think APIs should work like building blocks. Snap them together however you want. Combine ours with your own services or third-party tools. The output is always standard JSON or image data — nothing proprietary, nothing locked in.

This also means you pay for the workflow you actually run. One credit pool covers the APIs, so credits are not stranded in a document bucket when the next project needs image processing or spreadsheet generation. Start with one API, add another when the need arises, swap one out without touching the rest.

MCP: APIs as Agent Tools

Every Iteration Layer API ships as an MCP server from day one. If you work with Claude, Cursor, or any MCP-compatible client, our APIs show up as tools your agent can call directly.

That changes the workflow fundamentally. Instead of writing integration code, you describe what you want. "Parse this supplier catalog and generate a product card for each item." The agent discovers Document Extraction, calls it with the right schema, takes the structured output, and feeds it into Image Generation — all without you specifying the sequence.

We've been using this internally while building the platform, and it's the kind of thing that feels like a gimmick until you try it. Once you've watched an agent chain three API calls to solve a task you were about to spend an hour scripting, it's hard to go back.

Building Blocks for What's Next

The APIs we ship cover the core of the content lifecycle, but we see them as the foundation, not the finish line.

Our vision is a library of focused, composable APIs that cover every step of content processing — from raw input to polished output. Each one does one thing well. Each one connects to everything else. The same way Unix pipes let you chain grep | sort | uniq into something no single tool could do, we want you to be able to chain Ingest, Transform, and Generate steps into workflows that solve your specific problem.

That means more APIs, more integrations, more field types, more output formats. But always following the same principles — focused scope, standard I/O, and the ability to snap into any pipeline.

We're building infrastructure that disappears. You shouldn't have to think about OCR libraries, image processing servers, or headless browser farms. You should think about your product, your users, and the content workflow that connects them. The plumbing is our problem.

Get Started

Pick the API that solves your most immediate problem — Document Extraction, Document to Markdown, Website Extraction, Image Transformation, Image Generation, Document Generation, or Sheet Generation. We also publish SDKs and an MCP server, so your next integration can be a direct API call or an agent tool.

Sign up for a free account — no credit card required. As new needs come up, add another API to the chain. They all use the same authentication, the same error format, and the same response structure, so adding a second or third takes minutes.

We built this because we needed it. We think you might too.
