Sumit Kothari Apache Spark and Scala Practical 17
_______________
Faculty in-charge
Apache Spark and Scala Practical
Name: Sumit Kothari
Roll number: 17
Practical 1
Aim: Write steps to install Scala
If the JDK is not already installed, download the latest version suited to your computer's requirements from oracle.com and complete the installation.
Downloading Scala: Before starting the installation process, you need to download Scala. All versions of Scala for Windows are available on scala-lang.org. Download Scala and follow the instructions below to install it.
Beginning with the Installation:
1. Getting Started
2. Move on to Installing
3. Installation Process
4. Finished Installation
After completing the installation, any IDE or text editor can be used to write Scala code and run it from the IDE or the command prompt using the commands:
scalac file_name.scala
scala class_name
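For example, a minimal program (the file and object name HelloScala are only illustrative) can be compiled and run like this:

// HelloScala.scala
object HelloScala {
  def main(args: Array[String]): Unit = {
    println("Hello, Scala!")
  }
}

Compile and run from the command prompt:
scalac HelloScala.scala
scala HelloScala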
Name: Sumit Kothari
Roll number: 17
Practical 2
Aim: Write steps to setup and configure Apache Spark
Prerequisites:
• Java Development Kit (JDK): Ensure JDK 8 or later is installed.
• Apache Spark: Download the latest stable version from the official website
(https://spark.apache.org/downloads.html).
• Hadoop: If you're using Hadoop for distributed processing, download and install it.
• Scala (Optional): If you plan to write Spark applications in Scala, install Scala.
Setup and Configuration:
1. Unzip Spark: Extract the downloaded Spark distribution to a desired location.
2. Set Environment Variables:
o SPARK_HOME: Set this variable to the extracted Spark directory.
o JAVA_HOME: Set this variable to the directory where your JDK is installed.
o HADOOP_HOME: If using Hadoop, set this variable to the Hadoop
installation directory.
o PATH: Add the bin directory of Spark and Hadoop (if applicable) to your
system's PATH.
3. Run Spark:
o Local Mode: To run Spark locally on your machine, open a terminal and
navigate to the bin directory of Spark. Execute the following command:
./spark-shell
o Standalone Mode: For a standalone cluster, configure spark-env.sh in the conf
directory of Spark. Set the necessary environment variables (e.g., SPARK_MASTER_HOST,
SPARK_WORKER_CORES, SPARK_WORKER_MEMORY) and start the master and worker
nodes with the scripts in the sbin directory.
o YARN Mode: If using YARN, configure spark-defaults.conf and submit
applications with spark-submit using --master yarn.
o Mesos Mode: If using Mesos, configure spark-defaults.conf and submit
applications with spark-submit using a mesos:// master URL.
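As a quick check that the configuration above works, a small job can be run in the shell started in Local Mode. A minimal sketch; the numbers are only illustrative:

// inside spark-shell, the SparkContext is available as sc
val nums = sc.parallelize(1 to 100)   // distribute the numbers 1 to 100 as an RDD
println(nums.reduce(_ + _))           // prints 5050 if Spark is set up correctly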
Name: Sumit Kothari
Roll number: 17
Practical 3
Aim: Write a scala program to perform basic mathematical operations
Source code & Output:
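A minimal sketch of such a program; the operand values are only illustrative:

object MathOps {
  def main(args: Array[String]): Unit = {
    val a = 15
    val b = 4
    println(s"Sum:        ${a + b}")
    println(s"Difference: ${a - b}")
    println(s"Product:    ${a * b}")
    println(s"Quotient:   ${a / b}")
    println(s"Remainder:  ${a % b}")
  }
}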
Name: Sumit Kothari
Roll number: 17
Practical 4
4.1
Aim: Write a scala program to compute the sum of the two given integer values. If the two
values are the same, then return triple their sum.
Source code & Output:
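A minimal sketch of one way to write this:

object TripleSum {
  // returns a + b, or triple the sum when both values are equal
  def sum(a: Int, b: Int): Int =
    if (a == b) (a + b) * 3 else a + b

  def main(args: Array[String]): Unit = {
    println(sum(2, 3))  // 5
    println(sum(4, 4))  // 24
  }
}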
4.2
Aim: Write a scala program to compute the sum of the two given integer values. If the two
values are the same, then return triple their sum.
Source code & Output:
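A sketch of a variation that reads the two integers from the console (the use of readLine for input is an assumption):

object TripleSumInput {
  def main(args: Array[String]): Unit = {
    print("Enter the first integer: ")
    val a = scala.io.StdIn.readLine().trim.toInt
    print("Enter the second integer: ")
    val b = scala.io.StdIn.readLine().trim.toInt
    val result = if (a == b) (a + b) * 3 else a + b
    println(s"Result: $result")
  }
}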
4.3
Aim: Write a scala program to print the table of a number.
Source code & Output:
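A minimal sketch, printing the table of 7 (the number is only illustrative):

object MultiplicationTable {
  def main(args: Array[String]): Unit = {
    val n = 7
    for (i <- 1 to 10)
      println(s"$n x $i = ${n * i}")
  }
}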
Name: Sumit Kothari
Roll number: 17
Practical 5
5.1
Aim: Write a program to greet the user
Source code & Output:
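A minimal sketch that reads the user's name from the console:

object Greet {
  def main(args: Array[String]): Unit = {
    print("Enter your name: ")
    val name = scala.io.StdIn.readLine()
    println(s"Hello, $name! Welcome to Scala.")
  }
}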
5.2
Aim: Write a recursive function that calculates the factorial
Source code & Output:
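A minimal sketch of a recursive factorial function:

object Factorial {
  // recursive definition: n! = n * (n - 1)!, with 0! = 1! = 1
  def factorial(n: Int): Long =
    if (n <= 1) 1L else n * factorial(n - 1)

  def main(args: Array[String]): Unit = {
    println(factorial(5))  // 120
  }
}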
5.3
Aim: Write a program to print a List
Source code & Output:
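A minimal sketch; the list contents are only illustrative:

object PrintList {
  def main(args: Array[String]): Unit = {
    val nums = List(10, 20, 30, 40, 50)
    nums.foreach(println)          // one element per line
    println(nums.mkString(", "))   // the whole list on one line
  }
}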
5.4
Aim: Write a program to add two numbers
Source code & Output:
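A minimal sketch; the operands are only illustrative:

object AddTwoNumbers {
  def main(args: Array[String]): Unit = {
    val a = 12
    val b = 8
    println(s"Sum of $a and $b is ${a + b}")
  }
}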
5.5
Aim: Write a higher order function to apply functions to a list
Source code & Output:
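A minimal sketch in which applyToAll is a higher-order function (the function and value names are only illustrative):

object HigherOrder {
  // higher-order function: takes another function f and applies it to every element
  def applyToAll(xs: List[Int], f: Int => Int): List[Int] = xs.map(f)

  def main(args: Array[String]): Unit = {
    val nums = List(1, 2, 3, 4)
    println(applyToAll(nums, x => x * x))  // List(1, 4, 9, 16)
    println(applyToAll(nums, _ + 10))      // List(11, 12, 13, 14)
  }
}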
5.6
Aim: Write an anonymous function to filter even numbers
Source code & Output:
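A minimal sketch using an anonymous function as the filter predicate:

object FilterEven {
  def main(args: Array[String]): Unit = {
    val nums = List(1, 2, 3, 4, 5, 6, 7, 8)
    val evens = nums.filter(n => n % 2 == 0)  // anonymous function
    println(evens)                            // List(2, 4, 6, 8)
  }
}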
Name: Sumit Kothari
Roll number: 17
Practical 6
Aim: To implement a basic word count program using Spark RDDs.
Steps:
1. Create a text file containing repeated words and save it in the C: drive. This file will
be used for the Word Count program.
2. Open Windows PowerShell, change the directory to the Spark bin folder using the cd
command, and then execute the command spark-shell to launch the Spark REPL.
3. Next, load your text file into Spark by typing the following command in the Spark
shell (all the commands used in these steps are sketched after this list).
4. Next, execute the command text.collect() to display the contents of the file loaded into
Spark.
5. Use flatMap to split each line of the loaded text into words, creating a new RDD (named counts here).
6. Next, run counts.collect() to display the list of words after splitting each line of text.
7. Use the command given below to map each word to a key-value pair where the word
is the key, and the value is 1.
8. Type in the next command to retrieve and display the results of the mapped RDD
from Spark to the driver program as a list.
9. The command written below is used to aggregate the values of each key in the RDD
mapf by summing them up, creating a new RDD reducef with the total counts for each unique
key.
10. Use reducef.collect() to retrieve and display the aggregated results from the reducef
RDD.
11. Enter the command mentioned below in order to save the aggregated results from the
reducef RDD to the "spark_output" folder in the C drive. This folder will be created
automatically. One does not need to create it manually.
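The commands referred to in the steps above follow this general pattern. This is a sketch: the input file name sample.txt is an assumption, while the RDD names text, counts, mapf and reducef and the spark_output folder match the steps.

// Practical 6: word count in spark-shell
val text = sc.textFile("C:/sample.txt")            // step 3: load the text file
text.collect()                                     // step 4: show the file contents
val counts = text.flatMap(line => line.split(" ")) // step 5: split each line into words
counts.collect()                                   // step 6: show the words
val mapf = counts.map(word => (word, 1))           // step 7: map each word to (word, 1)
mapf.collect()                                     // step 8: show the mapped pairs
val reducef = mapf.reduceByKey(_ + _)              // step 9: sum the counts for each word
reducef.collect()                                  // step 10: show the aggregated counts
reducef.saveAsTextFile("C:/spark_output")          // step 11: save the results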
Output: