Big Data Analytics Project Report
on
BIG DATA STOCK ANALYSIS
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ENGINEERING
by
S.JAYANTH
(227R1A67H4)
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
CMR TECHNICAL CAMPUS
An UGC Autonomous Institute
Accredited by NBA & NAAC with A Grade
(Approved by AICTE, Affiliated to JNTU, Hyderabad)
Kandlakoya (V), Medchal (M), Hyderabad-501 401
(2024-2025)
CERTIFICATE
This is to certify that the presentation entitled "BIG DATA STOCK ANALYSIS" was submitted by S. JAYANTH (227R1A67H4).
Subject Faculty
Mrs. B. Sangamithra
1. ABSTRACT
In today’s fast-paced financial world, analyzing stock market trends is essential for making
informed investment decisions. The vast and complex nature of stock market data, including
historical prices, real-time transactions, and market sentiment, demands advanced tools and
frameworks for effective processing and analysis. Big Data technologies offer a promising
solution to tackle this challenge.
This paper presents a system for Big Data Stock Analysis using Hadoop, a powerful open-
source framework designed for distributed storage and processing of massive datasets. By
leveraging Hadoop’s core components—HDFS (Hadoop Distributed File System) for data
storage and MapReduce for parallel data processing—we develop an efficient architecture to
manage and analyze extensive stock market datasets.
The system processes historical and real-time stock data to generate insights, such as
predicting trends, detecting anomalies, and identifying profitable investment opportunities.
Additional integration with tools like Apache Hive and Apache Spark facilitates querying,
data visualization, and enhanced analytics. The framework also incorporates sentiment
analysis by processing social media data and news articles, thereby correlating market
sentiments with stock performance.
The proposed solution demonstrates scalability, fault tolerance, and efficiency, making it
suitable for handling the dynamic and high-volume nature of financial data. Our experiments
show that the Hadoop-based approach significantly reduces data processing time and
enhances prediction accuracy compared to traditional methods. This work highlights the
potential of Big Data technologies in transforming stock market analytics and improving
decision-making in the financial domain.
Keywords: Big Data, Hadoop, Stock Analysis, HDFS, MapReduce, Financial Analytics,
Sentiment Analysis
2. INTRODUCTION
Big data exceeds the reach of commonly used hardware environments and software tools to capture, manage, and process it within a tolerable elapsed time for its user population [1]. Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze [2]. Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools [3]. Big Data encompasses everything from clickstream data from the web to genomic and proteomic data from biological research and medicine. Big Data is a heterogeneous mix of structured data (traditional datasets in rows and columns, such as DBMS tables, CSV and XLS files) and unstructured data, such as e-mail attachments, manuals, images, PDF documents, medical records (x-ray, ECG, and MRI images), forms, rich media (graphics, video, and audio), and contacts. Businesses are primarily concerned with managing unstructured data, because over 80 percent of enterprise data is unstructured and requires significant storage space and effort to manage. Big data has the following characteristics [3]: Volume, the first important characteristic of big data, is the size of the data, which determines whether it can be considered Big Data at all; the name 'Big Data' itself indicates that the data is huge.
3. PURPOSE
4. OBJECTIVES
1. Efficient Stock Data Handling: Develop a scalable and distributed system to process
and analyze large volumes of stock market data using Hadoop.
2. Real-Time Data Processing: Implement mechanisms to handle real-time streaming
stock data for timely insights.
3. Comprehensive Data Analysis: Perform historical and trend analysis on stock data
to identify patterns and predict future market movements.
4. Scalability and Performance Optimization: Utilize Hadoop's distributed file system
(HDFS) and MapReduce framework to ensure fast processing of vast datasets while
maintaining high performance.
5. Data Storage and Retrieval: Efficiently store and retrieve structured and
unstructured stock data across a distributed environment.
6. Visualization and Insights: Generate user-friendly reports and visualizations for
stock performance metrics, enabling better decision-making.
7. Integration with Analytical Tools: Integrate Hadoop with data analysis tools (e.g.,
Hive, Pig, Spark) for enhanced querying and machine learning capabilities.
8. Data Security and Reliability: Ensure the security and reliability of sensitive stock
market data throughout the processing pipeline.
9. Automation of Analysis Pipelines: Develop automated workflows for data ingestion,
cleaning, processing, and visualization.
10. Cost-Effective Solution: Leverage Hadoop's open-source framework to provide a
cost-efficient system for handling large-scale stock analysis.
5. APACHE HADOOP
Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are commonplace and thus should be automatically handled in software by the framework [6]. The core of Apache Hadoop consists of a storage part (the Hadoop Distributed File System, HDFS) and a processing part (MapReduce). Hadoop splits files into large blocks and distributes them among the nodes in the cluster. To process the data, Hadoop MapReduce ships packaged code to the nodes, which then process their share of the data in parallel. This approach takes advantage of data locality – nodes manipulating the data they have on hand – to process the data faster and more efficiently than in a more conventional supercomputer architecture that relies on a parallel file system, where computation and data are connected via high-speed networking.
The base Apache Hadoop framework is composed of the following modules: Hadoop Common – libraries and utilities needed by the other Hadoop modules; Hadoop Distributed File System (HDFS) – a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster; Hadoop YARN – a resource-management platform responsible for managing compute resources in clusters and scheduling users' applications; and Hadoop MapReduce – a programming model for large-scale data processing. The term "Hadoop" has come to refer not just to the base modules above, but also to the collection of additional software packages that run on top of it, such as Apache Pig, Apache Hive, and others.
6. MAPREDUCE
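As described in the previous section, MapReduce processes data as <key, value> pairs in two phases: a map phase that emits intermediate pairs and a reduce phase that aggregates all values sharing the same key, with a shuffle-and-sort step in between. The following minimal, pure-Python sketch (the sample records and symbols are hypothetical) imitates this flow by computing the average closing price per stock symbol:

from itertools import groupby
from operator import itemgetter

# Hypothetical input records: (symbol, close_price)
records = [("INFY", 1500.0), ("TCS", 3500.0), ("INFY", 1520.0), ("TCS", 3450.0)]

# Map phase: emit <key, value> pairs (a real mapper would parse raw input lines)
mapped = [(symbol, price) for symbol, price in records]

# Shuffle-and-sort phase: bring all values for the same key together
mapped.sort(key=itemgetter(0))

# Reduce phase: aggregate the values for each key
for symbol, pairs in groupby(mapped, key=itemgetter(0)):
    prices = [price for _, price in pairs]
    print(f"{symbol}\t{sum(prices) / len(prices):.2f}")

On a real cluster the map and reduce phases run in parallel across the nodes that hold the data blocks, which is exactly the data-locality advantage described above.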
7. PIG
Apache Pig is a platform for analyzing Big Data that consists of a high-level language for expressing data analysis programs, along with infrastructure for evaluating those programs. Pig's architecture includes a compiler that produces sequences of MapReduce programs, for which a large-scale parallel implementation already exists. With Pig, data workers can write complex data transformations without any knowledge of Java. Pig's simple SQL-like scripting language is called Pig Latin, and it is easily understood by developers who are familiar with scripting languages and SQL. Pig is complete, so all required data manipulations can be done in Apache Hadoop with Pig. Through the User Defined Functions (UDFs) available in Pig, it can invoke code in many languages, such as JRuby, Jython, and Java, and Pig scripts can in turn be embedded in other languages. The advantage of Pig is that it can be used as a building block for larger and more complex applications that handle real business problems. Pig works with data from many sources and stores the results in HDFS. Important features of Pig include ease of programming: it is easy to achieve parallel execution of data analysis tasks, and complex tasks consisting of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
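Since, as noted above, Pig scripts can be embedded in other languages, the following sketch drives a small Pig Latin script from Python. It is a minimal illustration only: the file names and field layout are assumptions, and it presumes a local Pig installation on the PATH.

import subprocess

# Hypothetical Pig Latin script: average closing price per symbol
pig_script = """
prices = LOAD 'stock_data.csv' USING PigStorage(',')
         AS (date:chararray, symbol:chararray, open:float, close:float, volume:long);
grouped = GROUP prices BY symbol;
avg_close = FOREACH grouped GENERATE group AS symbol, AVG(prices.close) AS avg_close;
STORE avg_close INTO 'avg_close_out';
"""

with open("avg_close.pig", "w") as f:
    f.write(pig_script)

# Run in local mode for testing; omit "-x local" to run on the cluster
subprocess.run(["pig", "-x", "local", "avg_close.pig"], check=True)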
8. HIVE
The Apache Hive data warehouse infrastructure, built on top of Hadoop, facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query it using a SQL-like language called HiveQL. It also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express the logic in HiveQL. Hadoop was built to organize and store massive amounts of data of various shapes, sizes, and formats. Because of its "schema on read" architecture, a Hadoop cluster is a perfect reservoir of heterogeneous data – structured and unstructured – from a multitude of sources. Data analysts use Hive to explore, structure, and analyze that data, then turn it into business insight. Hive looks similar to traditional database code with SQL access. However, because Hive is based on Hadoop and MapReduce operations, there are several key differences. The first is that Hadoop is intended for long sequential scans, so Hive queries may have very high latency (many minutes); Hive therefore cannot be used for applications that need fast response times. Finally, Hive is read-oriented and therefore not appropriate for applications that require a high percentage of write operations.
The tables in Hive are similar to tables in a relational database, and data units are organized in a taxonomy from larger to more granular units. Databases consist of tables, which are made up of partitions. Data can be accessed via a simple query language, and Hive supports overwriting or appending data. Within a particular database, data in the tables is serialized, and each table has a corresponding Hadoop Distributed File System (HDFS) directory.
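To make the "schema on read" idea concrete, the following sketch projects a table structure onto stock files already sitting in HDFS and runs a HiveQL aggregation from Python through the Hive CLI. The table layout and the /stockdata path are assumptions, and the hive binary is presumed to be on the PATH.

import subprocess

# Hypothetical schema projected onto CSV files under /stockdata ("schema on read")
hiveql = """
CREATE EXTERNAL TABLE IF NOT EXISTS stocks (
    trade_date STRING,
    symbol     STRING,
    open       FLOAT,
    close      FLOAT,
    volume     BIGINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/stockdata';

SELECT symbol, AVG(close) AS avg_close
FROM stocks
GROUP BY symbol;
"""

# hive -e executes the quoted HiveQL and prints the result to stdout
subprocess.run(["hive", "-e", hiveql], check=True)

Because the table is EXTERNAL, dropping it removes only the metadata; the underlying HDFS files are left untouched.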
9. METHODOLOGY
1. Data Collection
Collect stock data from sources like Yahoo Finance, Google Finance, or APIs such as Alpha Vantage, Quandl, or Bloomberg.
Data types include historical prices, intraday trading data, and news sentiment.
2. Data Preprocessing
Transformation: Normalize prices, convert timestamps, and format the data for compatibility with Hadoop (a preprocessing sketch follows this list).
3. Data Storage
Use HDFS (Hadoop Distributed File System) to store large volumes of stock data, in formats such as:
o CSV
o JSON
4. Processing Framework
MapReduce:
o Use the Mapper for parallel processing of stock data (e.g., calculating moving averages).
o Use the Reducer for aggregation tasks (e.g., computing total trading volumes).
Apache Hive:
o Set up tables in Hive for querying stock data with SQL-like syntax.
Apache Pig:
o Write Pig Latin scripts for complex data transformations.
5. Data Analysis
Descriptive Analytics:
Volume Analysis:
Predictive Analytics:
Sentiment Analysis:
o Combine Hadoop with NLP libraries to assess the impact of news on stock trends.
6. Visualization
Export processed data from Hadoop to visualization tools like Tableau or Power BI.
Use libraries such as Matplotlib or D3.js for custom charts and graphs.
7. Workflow Automation
8. Performance Optimization
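A minimal sketch of the preprocessing step above, using pandas on a single machine before the data is pushed into HDFS; the column layout (Date, Symbol, Open, Close, Volume) is an assumption chosen to match the field positions used in the sample code later:

import pandas as pd

# Assumed raw layout: Date, Symbol, Open, Close, Volume
df = pd.read_csv("stock_data.csv")

df["Date"] = pd.to_datetime(df["Date"])   # convert timestamps
df = df.dropna(subset=["Close"])          # drop rows with missing prices
df["Close"] = df["Close"].astype(float)

# One possible normalization: min-max scale the close price per symbol
df["CloseNorm"] = df.groupby("Symbol")["Close"].transform(
    lambda s: (s - s.min()) / (s.max() - s.min()))

# Write a clean, Hadoop-friendly CSV (then: hdfs dfs -put stock_data_clean.csv /stockdata)
df.to_csv("stock_data_clean.csv", index=False)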
10. TECHNICAL INDICATORS
1. Moving Averages
a. Simple Moving Average (SMA):
Description: Calculates the average stock price over a fixed number of periods.
Implementation in Hadoop:
o For each stock, the mapper processes the price data, and the reducer calculates the SMA for each time window.
3. Bollinger Bands:
Description: Composed of a moving average (middle band) and two standard deviations above and below it (upper and lower bands).
Implementation:
o Use Hive or Spark to compute the SMA and standard deviation for the desired window size.
4. MACD (Moving Average Convergence Divergence):
Implementation:
o Subtract the two EMAs (typically the 12-day and 26-day) to find the MACD line, then calculate the signal line as a 9-day EMA of the MACD line.
5. VWAP (Volume-Weighted Average Price):
Implementation:
o Use MapReduce or Spark to sum (Price × Volume) and Volume, and then divide the two.
6. ATR (Average True Range):
Implementation:
o Use Hive queries or Spark functions to calculate the true range for each day and then average it over a time window.
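The indicator formulas above can be prototyped on a single machine before being ported to Hive or Spark. The following pandas sketch computes the SMA, Bollinger Bands, MACD, and VWAP for one stock; the file name, column names, symbol, and window sizes are assumptions:

import pandas as pd

df = pd.read_csv("stock_data_clean.csv", parse_dates=["Date"]).sort_values("Date")
df = df[df["Symbol"] == "INFY"]  # hypothetical symbol; indicators are computed per stock

# Simple Moving Average over a 20-day window
df["SMA20"] = df["Close"].rolling(window=20).mean()

# Bollinger Bands: SMA plus/minus two standard deviations
std20 = df["Close"].rolling(window=20).std()
df["BollUpper"] = df["SMA20"] + 2 * std20
df["BollLower"] = df["SMA20"] - 2 * std20

# MACD: 12-day EMA minus 26-day EMA, with a 9-day EMA signal line
ema12 = df["Close"].ewm(span=12, adjust=False).mean()
ema26 = df["Close"].ewm(span=26, adjust=False).mean()
df["MACD"] = ema12 - ema26
df["Signal"] = df["MACD"].ewm(span=9, adjust=False).mean()

# VWAP: cumulative (Price × Volume) divided by cumulative Volume
df["VWAP"] = (df["Close"] * df["Volume"]).cumsum() / df["Volume"].cumsum()

print(df[["Date", "Close", "SMA20", "MACD", "VWAP"]].tail())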
11. SAMPLE CODE
Steps to Execute in Hadoop:
1. Upload Input Data to HDFS:
hdfs dfs -mkdir -p /stockdata
hdfs dfs -put stock_data.csv /stockdata/
2. Run the Hadoop Streaming Job (the streaming jar path below is the usual default and may differ by installation):
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
-files mapper.py,reducer.py \
-mapper mapper.py \
-reducer reducer.py \
-input /stockdata/stock_data.csv \
-output /stockdata/output
3. View Results:
hdfs dfs -cat /stockdata/output/part-*
#!/usr/bin/env python3
# mapper.py – emits <stock_symbol, close_price> for every data row
import sys

for line in sys.stdin:
    # Skip header
    if line.startswith("Date"):
        continue
    try:
        fields = line.strip().split(",")
        stock_symbol = fields[1]
        close_price = float(fields[3])
        print(f"{stock_symbol}\t{close_price}")
    except Exception:
        # Ignore malformed rows
        continue
#!/usr/bin/env python3
# reducer.py – averages the close prices per stock symbol
# (Hadoop Streaming delivers mapper output sorted by key)
import sys

current_symbol = None
sum_price = 0.0
count = 0

for line in sys.stdin:
    try:
        stock_symbol, close_price = line.strip().split("\t")
        close_price = float(close_price)
        if stock_symbol == current_symbol:
            sum_price += close_price
            count += 1
        else:
            if current_symbol:
                avg_price = sum_price / count
                print(f"{current_symbol}\t{avg_price:.2f}")
            current_symbol = stock_symbol
            sum_price = close_price
            count = 1
    except Exception:
        continue

# Emit the average for the last symbol
if current_symbol:
    avg_price = sum_price / count
    print(f"{current_symbol}\t{avg_price:.2f}")
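Hadoop Streaming sorts the mapper output by key during the shuffle phase, so each reducer receives all records for a given stock symbol contiguously; that is why the reducer above can compute every average in a single pass, emitting a result whenever the symbol changes and once more at the end of the input.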
12. OUTPUT
Using such queries, visualizations are generated for various attributes of the stock data. One such graph shows the price change over the last 7 days, the last 30 days, and the last 6 months, together with the moving average of the previous year (Fig. 5).
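A minimal Matplotlib sketch of such a chart, assuming the processed daily close prices have been exported from HDFS to a local CSV (the file name and columns are assumptions):

import pandas as pd
import matplotlib.pyplot as plt

# Assumed export: one row per trading day with Date and Close columns
df = pd.read_csv("stock_prices.csv", parse_dates=["Date"]).sort_values("Date")

# Moving average over the previous year (roughly 252 trading days)
df["MA252"] = df["Close"].rolling(window=252).mean()

plt.plot(df["Date"], df["Close"], label="Close price", alpha=0.5)
plt.plot(df["Date"], df["MA252"], label="1-year moving average")
plt.plot(df["Date"].tail(30), df["Close"].tail(30), label="Last 30 days")

plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
plt.title("Price change and moving average")
plt.show()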
13. FUTURE SCOPE
Big Data Stock Analysis using Hadoop has immense potential for growth and evolution.
Some key areas of future development include:
1. Advanced Predictive Analytics: Leveraging machine learning and AI with Hadoop for
more accurate stock price predictions and market trend analysis. This can provide
better decision-making tools for investors.
2. Real-Time Stock Analysis: Integration of Hadoop with real-time data streaming tools
like Apache Kafka to perform instantaneous analysis, offering immediate insights into
market movements.
14. CONCLUSION
Big Data Stock Analysis using Hadoop marks a significant advancement in the field of
financial analytics, offering a powerful and scalable solution for processing the immense
volume of data generated by stock markets. Traditional systems often struggle with the sheer
size, velocity, and variety of data in the financial domain, but Hadoop's distributed computing
model overcomes these challenges efficiently.
This approach enables the seamless integration of structured and unstructured data, making it
possible to derive actionable insights from diverse sources such as market feeds, social
media, and financial reports. The project showcases Hadoop’s potential to deliver high-speed
data processing, predictive analytics, and visualization, helping traders, investors, and
analysts make informed decisions.
The success of this methodology emphasizes the transformative role of Big Data in modern
finance, highlighting its potential to redefine stock market analysis. As financial markets
continue to grow in complexity and data size, solutions like Hadoop will be indispensable for
driving innovation, improving decision-making, and ensuring a more data-centric approach to
investment strategies.
This project not only underlines the current capabilities of Big Data tools but also paves the
way for future advancements in financial technologies.
15. REFERENCES
1. Chen, M., Mao, S., & Liu, Y. (2014). Big Data: A Survey. Mobile Networks and Applications, 19(2), 171–209.
2. Jain, V., & Reddy, K. (2017). Big Data and Predictive Analytics in Stock Market Decision Making. International Journal of Computer Applications, 162(7), 34–38.
3. McKinsey Global Institute (2011). Big Data: The Next Frontier for Innovation, Competition, and Productivity.