Hello folks, today we will explore some basic but important things about Apache Spark. This post focuses only on the introduction to Apache Spark. I am very excited to add this post to the blog, and I hope you find it useful.
For each topic, we will follow the strategy below to understand it:
Q1. What is it?
Q2. Why is it needed?
So let’s start...
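Q. What is Apache Spark?
Apache Spark is a fast, general-purpose data processing engine. Spark is mainly used for computation-intensive algorithms that run across a cluster. It can run on top of the Apache Hadoop platform (for example on YARN, reading data from HDFS), but it can also run on its own in standalone mode. It is one of the best-known projects in the Hadoop ecosystem. The Spark project has claimed that it can run some workloads up to 100 times faster than Hadoop MapReduce when the data is held in memory, and up to 10 times faster when the data is on disk.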
Spark provides APIs in several programming languages for data processing: Java, Python, Scala, and R.
Q. Why is it needed?
As we know, we cannot analyze a huge amount of data on a single machine, because one machine has limited memory, storage, and processing power. If we have a huge amount of data and want to run some computation on it, we need to use cluster computing. Before going forward, let us discuss cluster computing a little.
Cluster: A cluster is simply a network of machines, often commodity hardware.
Cluster Computing: Cluster computing uses a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.
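To make the idea of "each node performs the same task, scheduled by software" a little more concrete, here is a minimal sketch in plain Python. It is not Spark code and it runs on a single machine: a thread pool stands in for the nodes of a cluster, and the function names (split_into_partitions, count_words, cluster_word_count) are made up for this illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(items, num_partitions):
    """Split a list into roughly equal chunks, one per worker."""
    size = max(1, -(-len(items) // num_partitions))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """The task that every worker runs on its own partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def cluster_word_count(lines, num_workers=3):
    """Run count_words on each partition, then merge the partial results."""
    partitions = split_into_partitions(lines, num_workers)
    # The pool plays the role of the scheduler: it hands one partition
    # to each worker (a thread here, a machine in a real cluster).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


lines = [
    "spark runs on a cluster",
    "a cluster is a network of machines",
    "spark splits the work across the cluster",
]
result = cluster_word_count(lines)
print(result["cluster"])  # 3
print(result["spark"])    # 2
```

The pattern is the important part: split the data, run the same function on every piece, then combine the partial answers. An engine like Spark applies the same pattern across many machines and also takes care of things this sketch ignores, such as moving data between nodes and recovering from failures.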
Apache Spark is a powerful open-source engine that provides real-time stream processing, interactive processing, graph processing, and in-memory processing, as well as batch processing, with high speed, ease of use, and a standard interface.
Who Uses Spark?
There are two kinds of people who use Spark:
1) Data Engineers
2) Data Scientists
Data Scientists analyze data on top of big data platforms and want to extract value from it by using machine learning algorithms.
Data Engineers process application data to meet specific requirements.