Spark Execution Mode
Lesson objectives
In this lesson, we will explain the following topics:
- Learn about the different execution modes in Spark, including cluster mode, client mode, and local mode.
- Understand the differences and use cases for each execution mode.
- Explore how to configure and execute Spark applications in various modes.
Execution Modes
Execution Modes Overview
- Execution modes determine where the Spark driver and executor processes are physically located when an application runs.
- Three modes available:
- Cluster mode
- Client mode
- Local mode
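The mode is typically chosen at submission time through `spark-submit`. As a sketch, two standard options control it (the host, class, and JAR names below are placeholders):

```shell
# Generic spark-submit form. The execution mode is controlled by two options:
#   --master       : where resources come from (a cluster manager URL, yarn, or local[*])
#   --deploy-mode  : where the driver runs ("cluster" or "client")
spark-submit \
  --master spark://cluster-manager-host:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  my-app.jar
```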
Cluster Manager Components
Figure 1: A cluster driver and worker (no Spark Application yet).
Cluster Mode
- Most common mode for running Spark Applications.
- User submits a pre-compiled JAR, Python script, or R script to a cluster manager.
- The cluster manager then launches the driver process on a worker node inside the cluster.
- Executor processes also launched within the cluster.
- The cluster manager is responsible for maintaining all Spark Application–related processes.
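A cluster-mode submission might look like the following sketch (the YARN master, class name, and JAR are placeholder assumptions; the flags shown are standard `spark-submit` options):

```shell
# Cluster mode: the driver itself is launched on a worker node inside the
# cluster, so the submitting machine can disconnect after submission.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --num-executors 4 \
  --executor-memory 2g \
  my-app.jar
```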
Spark Cluster Mode
Figure 2: Spark’s cluster mode.
Client Mode
- Similar to cluster mode, but the Spark driver remains on the client machine that submitted the application.
- Client machine is responsible for maintaining the Spark driver process.
- Cluster manager maintains executor processes.
- Commonly used with gateway machines or edge nodes.
- The driver runs on a machine outside the cluster, while the executors run on machines inside the cluster.
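A client-mode submission from an edge node might look like this sketch (the YARN master and script name are placeholder assumptions):

```shell
# Client mode: the driver runs inside this spark-submit process on the
# edge/gateway machine; only the executors run inside the cluster.
spark-submit \
  --master yarn \
  --deploy-mode client \
  my_script.py
```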
Spark Client Mode
Figure 3: Spark’s client mode.
Local Mode
- Runs the entire application on a single machine.
- Parallelism achieved through threads on the same machine.
- Ideal for learning, testing, or local development.
- Not recommended for production use.
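A local-mode run can be sketched as follows (the script name is a placeholder; `local[*]` is the standard master URL for local execution):

```shell
# Local mode: everything runs on this one machine. local[*] uses one worker
# thread per CPU core; local[4] would use exactly four threads.
spark-submit \
  --master "local[*]" \
  my_script.py
```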
Watch on YouTube
Watch on our Servers
You can download the video by right-clicking the link and choosing "Save link as": Download Video