Demo: RDD Text Manipulation
Lesson objectives
In this lesson, we will:
- Demonstrate text manipulation using RDDs in Spark.
- Learn how to apply transformations and actions on text data.
- Explore practical examples of RDD operations for text processing.
DEMO
Example: Text Manipulation RDD
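from pyspark import SparkContext

# Reuse the notebook's SparkContext if one exists; otherwise create one.
sc = SparkContext.getOrCreate()

text = ["Hello Spark", "Hello Scala", "Hello World"]
text_rdd = sc.parallelize(text)
print(f"Original Text RDD result: {text_rdd.take(10)}")
# flatMap splits each line into words and flattens the results into a single RDD.
words_rdd = text_rdd.flatMap(lambda line: line.split(" "))
print(f"Words RDD result: {words_rdd.take(10)}")
# map applies upper() to every word, producing one output element per input element.
upper_words_rdd = words_rdd.map(lambda word: word.upper())
print(f"Upper Words RDD result: {upper_words_rdd.take(10)}")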
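To make the difference between flatMap and map concrete, here is a minimal plain-Python sketch of what each step in the demo computes. It does not use Spark at all: the names `mapped`, `flat`, and `upper` are hypothetical and chosen only for illustration, and ordinary lists stand in for RDDs under that assumption.

```python
from itertools import chain

text = ["Hello Spark", "Hello Scala", "Hello World"]

# map-style: one output per input line, so the result is a list of lists.
mapped = [line.split(" ") for line in text]

# flatMap-style: split each line, then flatten into a single list of words.
flat = list(chain.from_iterable(line.split(" ") for line in text))

# map-style again: one uppercase word per input word.
upper = [word.upper() for word in flat]

print(mapped)  # [['Hello', 'Spark'], ['Hello', 'Scala'], ['Hello', 'World']]
print(flat)    # ['Hello', 'Spark', 'Hello', 'Scala', 'Hello', 'World']
print(upper)   # ['HELLO', 'SPARK', 'HELLO', 'SCALA', 'HELLO', 'WORLD']
```

The nested result in `mapped` is what the demo would produce if it used map instead of flatMap for the split step, which is why flatMap is the usual choice for tokenizing lines into words.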
Watch on YouTube
Watch on our Servers
You can download the video by right-clicking the link and choosing "Save link as": Download Video
Download the code
You can download the Jupyter notebook, Databricks notebook, or Python source code using the following links: