Demo: RDD Text Manipulation

Lesson objectives

In this lesson, we will:

  • Demonstrate text manipulation using RDDs in Spark.
  • Learn how to apply transformations and actions on text data.
  • Explore practical examples of RDD operations for text processing.

DEMO

Example: Text Manipulation RDD

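from pyspark import SparkContext

# Reuse the notebook's SparkContext if one exists; otherwise create one.
sc = SparkContext.getOrCreate()

# Build an RDD from a small in-memory list of lines.
text = ["Hello Spark", "Hello Scala", "Hello World"]
text_rdd = sc.parallelize(text)
print(f"Original Text RDD result: {text_rdd.take(10)}")

# flatMap splits each line and flattens the pieces into a single RDD of words.
words_rdd = text_rdd.flatMap(lambda line: line.split(" "))
print(f"Words RDD result: {words_rdd.take(10)}")

# map transforms each word one-to-one, here to upper case.
upper_words_rdd = words_rdd.map(lambda word: word.upper())
print(f"Upper Words RDD result: {upper_words_rdd.take(10)}")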
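
To see what the flatMap and map steps in the demo compute, it can help to mimic them on an ordinary Python list. The sketch below is only an illustration and does not use Spark: the names mapped, flat_mapped, and upper_words are hypothetical, and list comprehensions stand in for the RDD transformations. It assumes the same three input lines as the demo.

```python
from itertools import chain

text = ["Hello Spark", "Hello Scala", "Hello World"]

# A map-style step keeps one output per input line, so splitting gives a list of lists.
mapped = [line.split(" ") for line in text]

# A flatMap-style step applies the same split, then flattens the pieces into one sequence.
flat_mapped = list(chain.from_iterable(line.split(" ") for line in text))

# A second map-style step upper-cases each word, mirroring upper_words_rdd in the demo.
upper_words = [word.upper() for word in flat_mapped]

print(mapped)
print(flat_mapped)
print(upper_words)
```

The contrast between mapped and flat_mapped is the reason the demo uses flatMap for splitting: a plain map would leave each line as its own nested list of words rather than producing a single collection of words.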

Watch on YouTube

Watch on our Servers

You can download the video by right-clicking the link and choosing "Save link as": Download Video

Download the code

You can download the Jupyter notebook, Databricks Notebook, or the Python source code using the following links: