Demo: Running Spark on Linux Ubuntu
Lesson objectives
In this lesson, we will explain the following topics:
- Demonstrate the process of installing and running Spark on Linux Ubuntu.
- Understand the configuration steps required for Spark installation on Ubuntu.
- Explore the execution of Spark applications on a Linux environment.
Apache Spark Installation on Ubuntu
Docker For testing
docker pull ubuntu:24.04
docker run -it --name spark-ubuntu-container ubuntu:24.04
apt-get update
apt-get install curl wget
1. Update the package list
sudo apt-get update
2. Install Java
Apache Spark requires Java. Install OpenJDK:
apt-get install openjdk-11-jdk
Verify the installation:
java -version
3. Download Apache Spark
Go to the Apache Spark download page and copy the link to the latest release. Use wget
to download it:
wget https://archive.apache.org/dist/spark/spark-3.4.3/spark-3.4.3-bin-hadoop3.tgz
4. Extract the Spark tar file
tar xvf spark-3.4.3-bin-hadoop3.tgz
5. Move Spark to the installation directory
mv spark-3.4.3-bin-hadoop3 /opt/spark
6. Set up environment variables
Open the .bashrc
file:
vi ~/.bashrc
Add the following lines at the end:
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
Save and close the file. Then, apply the changes:
source ~/.bashrc
Watch on Youtube
Watch on our Servers
You can download the videog the link and chose save link as: Download Video