This course equips data engineers with the skills to build scalable, efficient data pipelines using Scala and Spark. It covers best practices for development, testing, and deployment in cloud environments, with a focus on optimizing performance and ensuring data quality, and provides the tools needed to transform raw data into actionable insights.

Data Engineering with Scala and Spark

What you'll learn
Set up a development environment for building data pipelines in Scala
Use Spark DataFrames, Datasets, and SQL with Scala for data processing
Profile and clean data using Deequ for improved data quality
Details to know

March 2026
13 assignments

There are 13 modules in this course
In this section, we explore functional programming, higher-order functions, polymorphic functions, and pattern matching in Scala for data engineering applications.
What's included
2 videos, 6 readings, 1 assignment
In this section, we explore cloud-based and local environments for data engineering pipelines, focusing on setup processes, trade-offs, and practical applications.
What's included
1 video, 5 readings, 1 assignment
In this section, we explore Apache Spark's APIs, focusing on DataFrame and Dataset for distributed data processing.
What's included
1 video, 3 readings, 1 assignment
In this section, we explore using the Spark JDBC API for database access, designing database interfaces, and performing operations with configuration loading.
What's included
1 video, 3 readings, 1 assignment
In this section, we explore object stores, data lakes, and lakehouses, focusing on their roles in managing large-scale data workflows efficiently.
What's included
1 video, 6 readings, 1 assignment
In this section, we explore Spark transformations, aggregations, joins, and window functions to enhance data processing for BI and analytics. Key concepts include efficient data manipulation and pipeline development.
What's included
1 video, 4 readings, 1 assignment
In this section, we explore Deequ for implementing data quality checks, analyzing completeness and accuracy, and defining constraints to ensure reliable data pipelines.
What's included
1 video, 3 readings, 1 assignment
In this section, we explore test-driven development, static code analysis, and linting to improve code quality, maintainability, and consistency in data engineering projects.
What's included
1 video, 4 readings, 1 assignment
In this section, we explore CI/CD practices with GitHub to automate Scala data pipeline workflows, focusing on GitHub Actions, version control, and reliable deployment processes.
What's included
1 video, 4 readings, 1 assignment
In this section, we explore data pipeline orchestration using tools like Airflow, Argo, Databricks, and Azure Data Factory. We focus on workflow design, task management, and real-world implementation strategies.
What's included
1 video, 6 readings, 1 assignment
In this section, we analyze Spark UI metrics to identify performance issues, optimize data shuffling, and right-size compute resources for efficient data processing.
What's included
1 video, 4 readings, 1 assignment
In this section, we explore building batch pipelines using Spark and Scala, focusing on medallion architecture, data ingestion, transformation, and orchestration for scalable data processing.
What's included
1 video, 5 readings, 1 assignment
In this section, we explore building real-time data pipelines using Spark, Scala, and Kafka for IoT applications. Key concepts include data ingestion, transformation, and serving layer design.
What's included
1 video, 4 readings, 1 assignment
Frequently asked questions
Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.
If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You’ll be able to submit assignments once the session starts.
Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You’ll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.

