Monday 8 June 2020

Understanding Apache NiFi Through a Practical Approach


Hello, data lovers! Today we are going to discuss one of the most popular tools in the data ecosystem: Apache NiFi, originally known as NiagaraFiles. If you come from a data science background, you have probably heard of it at least once. I have discussed Apache NiFi with many people, and most of them told me it is a very high-level and quite complex technology, but I am going to prove them wrong through some demos. So, without wasting time, let us start.


Before starting with Apache NiFi, we need to be clear about some basic concepts: Data Flow, Data Pipeline, and ETL.
Let us discuss them one by one.
Data Flow:
·         Moving data/content from a source to a destination.
·         The data can be CSV, JSON, XML, HTTP data, images, videos, telemetry data, etc.
Data Pipeline:
·         Movement and transformation of data/content from a source to a destination.
ETL:
·         E: Extract.
·         T: Transform.
·         L: Load.
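To make the three ETL steps concrete, here is a minimal Python sketch. It is not NiFi code; the function names, the sample CSV text, and the column names are all made up for illustration.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: convert the temperature column from text to a number."""
    return [{"city": r["city"], "temp_c": float(r["temp_c"])} for r in rows]


def load(rows):
    """Load: serialise the rows as JSON (a stand-in for writing to a target system)."""
    return json.dumps(rows)


# Hypothetical sample data, only for this illustration.
source = "city,temp_c\nPune,31.5\nDelhi,40.0\n"
result = load(transform(extract(source)))
print(result)
```

A data flow would be only the "move" part of this chain, while a data pipeline also includes the transform step in the middle.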


Installation:
We will not go into detail about the installation process, as it is quite simple. You just need to download the tar file and extract it into the desired directory (Windows/Linux/Unix). After decompression, it will look like the screenshot below.

How to start Apache NiFi?
In my case, I installed Apache NiFi on a Windows machine. To start NiFi, go to the \bin directory. There you will see 6 files: some are .bat files and the others are .sh files. The .bat files are for Windows and the .sh files are for Linux/Unix environments.

For a Windows machine:
Path: ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\bin
Command: run-nifi
For Linux/Unix:
Path: ApacheNifi/nifi-1.10.0-bin/nifi-1.10.0/bin
Command: ./nifi.sh start
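If you start NiFi from a script, the choice between the two commands can be sketched in Python as below. The helper name nifi_start_command is hypothetical, and the sketch assumes the scripts are named run-nifi.bat and nifi.sh inside the bin directory; check the file names in your own installation.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def nifi_start_command(nifi_home, system=None):
    """Build the start command for the given OS (defaults to the current one)."""
    system = system or platform.system()
    if system == "Windows":
        # Assumed script name: bin\run-nifi.bat
        return [str(PureWindowsPath(nifi_home) / "bin" / "run-nifi.bat")]
    # Assumed script name: bin/nifi.sh, with the "start" argument
    return [str(PurePosixPath(nifi_home) / "bin" / "nifi.sh"), "start"]


print(nifi_start_command(r"ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0", "Windows"))
print(nifi_start_command("ApacheNifi/nifi-1.10.0-bin/nifi-1.10.0", "Linux"))
```

The returned list can be passed to subprocess.run to actually launch NiFi.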


Apache NiFi Console Overview:
I do not want to make this article too theoretical, so I am not going to describe all the components here. Click to read more about it.



Now we have enough information to create our first data flow. I am so excited about it. 😊
Lab1: In this activity, we will fetch real-time data from a weather website and save it to the desired directory.
Here we are using three processors for this activity:
1)      InvokeHTTP 1.10.0: This fetches the data from the website's API.
2)      UpdateAttribute 1.10.0: This sets attributes on the flow file, such as its filename; here I use it so that the files are stored in JSON format.
3)      PutFile 1.10.0: This saves the data into the destination directory.
Note: Every processor has four configuration tabs: Settings, Scheduling, Properties, and Comments. You can configure the processor according to your requirements.
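To see what the three processors do together, here is a rough Python imitation of the flow. This is only a sketch and not how NiFi works internally: the functions invoke_http, update_attribute, and put_file are stand-ins I made up, and fake_weather_api returns a hypothetical response instead of calling a real weather API.

```python
import json
import tempfile
from pathlib import Path


def invoke_http(fetch):
    """Stand-in for InvokeHTTP: call the API and wrap the response as a flow file."""
    return {"attributes": {}, "content": fetch()}


def update_attribute(flow_file, **attributes):
    """Stand-in for UpdateAttribute: set attributes without touching the content."""
    flow_file["attributes"].update(attributes)
    return flow_file


def put_file(flow_file, directory):
    """Stand-in for PutFile: write the content to directory/<filename>."""
    target = Path(directory) / flow_file["attributes"]["filename"]
    target.write_bytes(flow_file["content"])
    return target


def fake_weather_api():
    # Hypothetical response; a real flow would call the weather site's API here.
    return json.dumps({"city": "Pune", "temp_c": 31.5}).encode("utf-8")


out_dir = tempfile.mkdtemp()  # temporary destination directory for the demo
flow_file = invoke_http(fake_weather_api)
flow_file = update_attribute(flow_file, filename="weather.json")
saved = put_file(flow_file, out_dir)
print(saved.name)
```

Note that the middle step changes only the attributes (here the filename), not the content, which is already JSON when it arrives from the API.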


Below is a screenshot of the dataflow.




Lab2: Customize the Logo in Apache NiFi:

Depending on the organization or project, we can customize Apache NiFi, and logo customization is one example. Here I am going to use my photo as the logo.

Steps:

1)      Resize the logo picture to 61x90 pixels using Paint and save it as a PNG file.

2)      Go to the path \ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib and find nifi-framework-nar-1.x.x.nar.

3)      Use 7-Zip to open nifi-framework-nar-1.x.x.nar; it will look like the screenshot below.









4)      Go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib\nifi-framework-nar-1.10.0.nar\META-INF\bundled-dependencies\, find the nifi-web-ui-1.10.0.war file, and open it.

5)      Go to the image directory and paste the logo PNG file.

6)      Now go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\lib\nifi-framework-nar-1.10.0.nar\META-INF\bundled-dependencies\nifi-web-ui-1.10.0.war\css\ and find nf-canvas-all.css.gz. In it, find the logo image path, replace the original NiFi logo with your custom logo PNG file, and save it.

7)      Restart Apache NiFi and you will be able to see the customized logo.
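The .nar and .war files are zip archives, which is why 7-Zip can open them. If you prefer a script over 7-Zip, the core operation (swap one entry inside a zip archive) can be sketched in Python as below. The function name replace_zip_entry and the entry names in the demo are hypothetical; the sketch works on a small in-memory archive and does not handle the war-inside-nar nesting for you.

```python
import io
import zipfile


def replace_zip_entry(archive_bytes, entry_name, new_data):
    """Return a copy of a zip archive with one entry replaced (or added)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            # Copy every entry except the one being replaced.
            if item.filename != entry_name:
                dst.writestr(item, src.read(item.filename))
        dst.writestr(entry_name, new_data)
    return out.getvalue()


# Demo archive with made-up entry names, only for illustration.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("images/logo.png", b"old logo")
    z.writestr("css/style.css", b"body {}")

patched = replace_zip_entry(demo.getvalue(), "images/logo.png", b"new logo")
with zipfile.ZipFile(io.BytesIO(patched)) as z:
    print(z.read("images/logo.png"))
```

Whichever tool you use, keep a backup copy of the original .nar file before changing it.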


Lab3: Customize the Title in Apache NiFi:

Steps:

1)      Go to C:\ApacheNifi\nifi-1.10.0-bin\nifi-1.10.0\conf

2)      Edit the flow.xml file (in a default install it is stored compressed as flow.xml.gz).

3)      Find the text Flow-Nifi, replace it with your own custom text, and save the file.

4)      Restart Apache NiFi.
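The find-and-replace in steps 2 and 3 can also be scripted. The Python sketch below assumes the flow file is gzip-compressed (flow.xml.gz) and that the title appears in it as plain text. The helper name replace_in_gzip_text is made up, and the XML in the demo is a tiny invented stand-in, not the real flow.xml structure.

```python
import gzip
import os
import tempfile


def replace_in_gzip_text(path, old, new):
    """Rewrite a gzip-compressed text file with `old` replaced by `new`; return the count."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        text = f.read()
    count = text.count(old)
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(text.replace(old, new))
    return count


# Demo on a temporary file with invented content, not a real NiFi flow.
demo_path = os.path.join(tempfile.mkdtemp(), "flow.xml.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as f:
    f.write("<rootGroup><name>Flow-Nifi</name></rootGroup>")

changed = replace_in_gzip_text(demo_path, "Flow-Nifi", "My Team Flow")
print(changed)
```

Stop NiFi and back up the original file before running anything like this against a real installation.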





Data Provenance Page Overview:
With data provenance, we can track data flows from beginning to end. You can find the Data Provenance page in the top right corner.




Security in NiFi:


Apache NiFi supports SSL, SSH, HTTPS, encrypted content, and more. It also provides multi-tenant authorization and internal policy management. You can enable security in Apache NiFi according to your requirements. We will discuss security in a separate post because it is an important topic.
Happy Learning!

