DEV Community

# dataengineering

Posts

The State of Apache Iceberg, Polaris, and Arrow: October–November 2025

2
Comments
7 min read
From 30 Minutes to 5: Solving Data Pipeline Deployment Bottlenecks with Git Sparse Checkout

Comments
5 min read
Real-Time Cryptocurrency Data Pipeline

Comments
12 min read
Azure Data Factory for ETL

Comments
5 min read
FlightPath Server Has Landed

Comments
1 min read
1 billion JSON records, 1-second query response: Apache Doris vs. ClickHouse, Elasticsearch, and PostgreSQL

5
Comments
7 min read
Building a Sales Database in PostgreSQL — Schema, Data & JOIN Examples

3
Comments
6 min read
How to Avoid Common Data Management Pitfalls in Enterprise RAG Systems: A Guide to Effective Governance and Observability

Comments
5 min read
SQL: is there a better way to code this?

Comments
1 min read
Building Self-Healing, Reliable Data Pipelines That Think

Comments
4 min read
From Postgres to Iceberg

1
Comments
11 min read
How to Data Engineer the ETLFunnel Way

Comments
5 min read
How to Data Engineer the ETLFunnel Way

Comments
3 min read
How to Data Engineer the ETLFunnel Way

Comments
5 min read
A Dashboard About Scammers, Telemarketers, My Cellphone, and Who Annoys Me Most

Comments
3 min read
isql

Comments
1 min read
The data lakehouse evolution

Comments
11 min read
Digital Vaults: Unlocking the Future of Data Archiving with Cutting-Edge Databases

Comments
2 min read
OSPP Project Outcome: Supporting Flink Engine CDC Source Schema Evolution

Comments
6 min read
Beyond the Browser: Crafting a Robust Web Scraping Pipeline for Dynamic Sports Data

1
Comments
3 min read
From Dashboards to Decisions: Building Scalable Self-Service BI for Real Impact

Comments
2 min read
Real-World Strategies for Scaling AI in Large Organizations

Comments
3 min read
Introducing ReelTrust: What if data engineering could solve our AI deepfakes problem?

Comments
5 min read
Who Needs Real-Time Streaming? Use Cases & Architecture Across Industries

Comments
8 min read
Real-Time Data Streaming Platform: How We Built a Self-Hosted Platform with 90% Cost Reduction vs AWS Managed Services

Comments
6 min read