DEV Community

# dataengineering

Posts

The real problem with ingesting MongoDB into Delta Lake (and how I built a library to fix it)

2
Comments 4
5 min read
How We Accidentally Built a Customer Data Platform

2
Comments
8 min read
Beyond the Model: Why Document Intelligence Is the Next AI Infrastructure Layer

Comments
4 min read
Types of Data Analytics: The Complete Guide With Examples, Use Cases & Career Path

1
Comments
4 min read
Parsing Bank Statement PDFs: 5 Tools Compared for Developers (2026)

1
Comments
6 min read
Rethinking Data Engineering: Why ETL Pipelines Still Take Too Long — and a New Way Forward

1
Comments 1
3 min read
Stop Losing Your Health Data! Build a Lifelong Electronic Health Record (EHR) System with Neo4j and GraphRAG 🏥💻

Comments
3 min read
Building Your First Airflow DAG: Extracting Stock Data with Massive

2
Comments
4 min read
How I scrape and de-dupe Meta ads for 1000 brands

5
Comments
6 min read
I Built a Market Intelligence Agent That Operates a Data Platform Autonomously via MCP

1
Comments
2 min read
Building a simple data pipeline with GCS and Databricks Autoloader

Comments
1 min read
Implementing a Centralized Metrics Layer with dbt and a Semantic Layer

Comments
9 min read
Rebuilding the national narrative with AI and Docker

1
Comments
3 min read
Columnar Databases (ClickHouse/Snowflake)

1
Comments
8 min read
ClickHouse Native JSON Support in 2026: A PR-by-PR Analysis

6
Comments 1
24 min read