Hello, data lovers! Today we are going to discuss one of the most popular
data flow ecosystems: Apache NiFi, whose name is short for NiagaraFiles. If you
come from a data science background, you have probably heard of it at least once.
I have discussed Apache NiFi with many people, and most of them told me it is a
very high-level and quite complex technology, but I am going to prove them wrong
through some demos. So, without wasting time, let us start.
Before starting with Apache NiFi, we need to be clear about
some basic concepts: Data Flow, Data Pipeline, and ETL.
Let us discuss them one by one.
Data Flow:
· Moving data/content from a source to a destination.
· Data can be CSV, JSON, XML, HTTP data, images, videos, telemetry data, etc.
Data Pipeline:
· Movement and transformation of data/content from a source to a destination.
ETL:
· E: stands for Extract.
· T: stands for Transform.
· L: stands for Load.
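To make the three ETL steps concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how NiFi works internally: the sample CSV, the column names (city, temp_c) and the Celsius-to-Fahrenheit conversion are all made up for this example.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: convert temperatures from Celsius to Fahrenheit."""
    return [
        {"city": row["city"], "temp_f": float(row["temp_c"]) * 9 / 5 + 32}
        for row in rows
    ]


def load(records):
    """Load: serialize the records as JSON for the destination."""
    return json.dumps(records)


# Hypothetical sample data, used only for illustration.
SAMPLE_CSV = "city,temp_c\nPune,30\nOslo,-5\n"

if __name__ == "__main__":
    print(load(transform(extract(SAMPLE_CSV))))
```

A data flow without the transform step would simply pass the extracted rows straight to load.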
Installation:
We will not go into detail about the installation process, as it
is quite simple: just download the tar file and extract it into the desired directory
(Windows/Linux/Unix). Once decompressed, it will look like the screenshot below.
How to start Apache NiFi?
In my case, I installed Apache NiFi on a Windows machine. To
start NiFi, go to the \bin directory. There you can see
6 files; some are .bat files and the others are .sh files. As we know,
.bat files are for Windows and .sh files are for Linux/Unix environments.
For Windows machines:
Path: ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\bin
Command: run-nifi
For Linux/Unix:
Path: ApacheNifi/nifi-1.10.0-bin/nifi-1.10.0/bin
Command: ./nifi.sh start
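If you want to start NiFi from a script instead of typing the command, the choice between the two launchers can be sketched in Python as below. This is only a sketch: the function name nifi_start_command is my own, and it assumes the launcher scripts in \bin are named run-nifi.bat and nifi.sh, as in the NiFi 1.10.0 download.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def nifi_start_command(nifi_home, system=None):
    """Return the command (as an argument list) that starts NiFi.

    nifi_home is the extracted NiFi directory; system defaults to the
    operating system this script is running on.
    """
    system = system or platform.system()
    if system == "Windows":
        # On Windows the .bat launcher is used.
        return [str(PureWindowsPath(nifi_home) / "bin" / "run-nifi.bat")]
    # On Linux/Unix the shell script is called with the "start" argument.
    return [str(PurePosixPath(nifi_home) / "bin" / "nifi.sh"), "start"]
```

The returned list can be passed to subprocess.run(...) from the Python standard library to actually launch NiFi.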
Apache NiFi Console Overview:
I do not want to make this article too theoretical, which is why I
am not going to describe all the components here. Click to read more about it.
Now we have enough information to create our first data flow.
I am so excited about it. 😊
Lab1: In this activity, we will get real-time data
from a weather website and save it in the desired directory.
Here we are using three processors for this activity:
1) InvokeHTTP 1.10.0: fetches the data from the website's API.
2) UpdateAttribute 1.10.0: sets attributes on the FlowFile, such as its file name,
rather than changing the content; here I use it so that the files are stored as JSON files.
3) PutFile 1.10.0: saves the data into the destination directory.
Note: Every processor has four configuration tabs: Settings,
Scheduling, Properties, and Comments. You can configure the processor
according to your requirements.
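To see what the three processors in Lab1 do together, here is a rough Python equivalent of the flow. It is only a sketch to explain the idea, not NiFi code: the function names mirror the processor names, the weather_<timestamp>.json naming pattern is my own assumption, and the URL of the weather API is left for you to supply.

```python
import urllib.request
from datetime import datetime, timezone
from pathlib import Path


def invoke_http(url, timeout=10):
    """Like InvokeHTTP: fetch the raw response body from an HTTP API."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def update_attribute(attributes, now=None):
    """Like UpdateAttribute: change metadata only, here a .json file name."""
    now = now or datetime.now(timezone.utc)
    updated = dict(attributes)
    # Hypothetical naming pattern, chosen only for this example.
    updated["filename"] = now.strftime("weather_%Y%m%dT%H%M%S.json")
    return updated


def put_file(content, attributes, directory):
    """Like PutFile: write the content into the destination directory."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / attributes["filename"]
    target.write_bytes(content)
    return target


def run_flow(url, directory, fetch=invoke_http):
    """Chain the three steps in the same order as the NiFi flow."""
    content = fetch(url)
    attributes = update_attribute({})
    return put_file(content, attributes, directory)
```

Note that update_attribute never touches the content itself. The response is already JSON when it arrives from the API, and the attribute step only decides what the saved file is called.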
Lab2: Customize the Logo in Apache NiFi:
Depending on the organization or project, we can make changes to Apache NiFi,
and logo customization is one of them. Here I am going to use my photo as the logo.
Steps:
1) Resize the logo picture to 61x90 pixels using Paint and save it as a PNG file.
2) Go to the path \ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib
and find nifi-framework-nar-1.x.x.nar.
3) Use 7-Zip to open nifi-framework-nar-1.x.x.nar;
it will look as shown below.
4) Go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib\nifi-framework-nar-1.10.0.nar\META-INF\bundled-dependencies\,
find the nifi-web-ui-1.10.0.war file, and open it.
5) Go to the image directory and paste the logo PNG file.
6) Now go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib\nifi-framework-nar-1.10.0.nar\META-INF\bundled-dependencies\nifi-web-ui-1.10.0.war\css\
and find nf-canvas-all.css.gz. In it, locate the logo image path, replace the original
NiFi logo with your customized logo PNG file, and save it.
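The reason 7-Zip can open these files is that .nar and .war files are ZIP-format archives. If you prefer to look inside them without a GUI, the following read-only Python sketch locates the web UI WAR inside the NAR and lists matching entries in it. The function names are my own, and the code does not modify anything.

```python
import io
import zipfile


def find_web_ui_war(nar_path):
    """Return the archive entry name of the nifi-web-ui WAR inside a NAR."""
    with zipfile.ZipFile(nar_path) as nar:
        for name in nar.namelist():
            base = name.rsplit("/", 1)[-1]
            if base.startswith("nifi-web-ui-") and base.endswith(".war"):
                return name
    return None


def list_war_entries(nar_path, war_entry, keyword):
    """List entries in the nested WAR whose path contains the keyword."""
    with zipfile.ZipFile(nar_path) as nar:
        war_bytes = nar.read(war_entry)
    # The WAR is itself a ZIP archive, so it can be opened from memory.
    with zipfile.ZipFile(io.BytesIO(war_bytes)) as war:
        return [name for name in war.namelist() if keyword in name]
```

For example, calling list_war_entries(nar_path, find_web_ui_war(nar_path), "nf-canvas-all") should show where the stylesheet from step 6 lives before you edit it.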
Lab3: Customize the Title in Apache NiFi:
Steps:
1) Go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\conf
2) Edit the flow.xml file (NiFi stores it compressed as flow.xml.gz).
3) Find the text Flow-Nifi, replace it with your own
custom text, and save the file.
4) Restart Apache NiFi.
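The edit in steps 2 and 3 can also be scripted. Below is a minimal Python sketch, assuming the flow file is the gzip-compressed flow.xml.gz in the conf directory and that NiFi is stopped while you edit it. The function name replace_in_flow is my own, and the text to search for is passed in as a parameter, so use whatever title your own file contains.

```python
import gzip
import shutil


def replace_in_flow(flow_gz_path, old_text, new_text):
    """Replace old_text with new_text inside a gzip-compressed flow file.

    Returns the number of replacements made. A .bak copy of the original
    file is kept next to it whenever something is changed.
    """
    with gzip.open(flow_gz_path, "rt", encoding="utf-8") as handle:
        xml_text = handle.read()
    count = xml_text.count(old_text)
    if count:
        shutil.copyfile(flow_gz_path, str(flow_gz_path) + ".bak")
        with gzip.open(flow_gz_path, "wt", encoding="utf-8") as handle:
            handle.write(xml_text.replace(old_text, new_text))
    return count
```

This is a plain text replacement, so keep the new title free of XML special characters such as < and &, and check the returned count to confirm that only the title was changed.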
Data Provenance Page Overview:
Data provenance lets us track data flows from beginning to end. You
can find the Data Provenance page in the top right corner.
Security in NiFi:
Apache NiFi supports SSL, SSH, HTTPS, encrypted content, and
more. It also provides multi-tenant authorization and internal policy management. You
can enable security in Apache NiFi according to your requirements. We will discuss
security in a separate post because it is an important topic.
Happy Learning!