Using Hadoop to Analyse a Retail WiFi Log File

For a long time we have been providing Big Data Hadoop training in Gurgaon to aspirants seeking a career in this domain. So here, our Hadoop experts share a big data Hadoop case study. Think of the wider perspective: all kinds of sensors produce data. Considering a real store, we listed out these sensors: free WiFi access points, customer frequency counters located at the doors, smell sensors, the cashier system, temperature, background music, and video capture.




While most of these sensors require additional hardware and software, a few options are already in place. Our experts found that WiFi access points provide the most useful sensor data without any additional hardware or software, since many visitors carry Wi-Fi-enabled smartphones. From these WiFi log files, we can easily work out the following:


  • Unique visits
  • Average duration of customer visit
  • Total number of customer visits
  • New visitors

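Under the hood, these questions reduce to simple aggregations over association records. As a minimal, hedged sketch in Python (the record layout, the MAC addresses, and the helper name `visit_metrics` are assumptions for illustration, not the study's actual log format), the metrics could be computed like this:

```python
from datetime import datetime

# Hypothetical, simplified access-point records: one
# (mac_address, connect_time, disconnect_time) tuple per visit.
# Real syslog output from a WiFi point would need parsing first.
records = [
    ("aa:bb:cc:00:00:01", "2016-01-04 10:00:00", "2016-01-04 10:25:00"),
    ("aa:bb:cc:00:00:02", "2016-01-04 10:05:00", "2016-01-04 10:15:00"),
    ("aa:bb:cc:00:00:01", "2016-01-05 11:00:00", "2016-01-05 11:30:00"),
]

FMT = "%Y-%m-%d %H:%M:%S"

def visit_metrics(records):
    """Compute the four metrics listed above from association records."""
    total_visits = len(records)
    unique_visitors = {mac for mac, _, _ in records}
    # Visit duration in minutes, per record
    durations = [
        (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds() / 60
        for _, start, end in records
    ]
    avg_duration_min = sum(durations) / len(durations) if durations else 0.0
    # Here a "new visitor" is a MAC address seen for the first time
    # within this log window -- a simplifying assumption.
    return {
        "total_visits": total_visits,
        "unique_visits": len(unique_visitors),
        "avg_duration_min": avg_duration_min,
        "new_visitors": len(unique_visitors),
    }

print(visit_metrics(records))
```

In the actual pipeline this aggregation happens at scale in Hive and Impala rather than in a single Python process, but the logic is the same.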

We know that getting answers through the Big Data methodology involves a new role: the data scientist. At a high level, though, the architecture is quite straightforward: a data management system is required that can ingest, store, and process the log data.


With the Hadoop approach, a data scientist can answer the above questions without writing any low-level program code, using graphical and SQL-like tools instead.

The following setup was used:

  • 2 WiFi access points to simulate 2 different stores
  • A virtual machine acting as a central daemon collecting all the log messages
  • Flume to transfer all WiFi log messages to HDFS
  • A 5-node CDH4 cluster of virtual machines (2 GB RAM, 200 GB HDD, CentOS each)
  • Pentaho Data Integration (a graphical designer) for filtering, parsing, and transforming the data, and for loading it into the warehouse
  • Hive as a data warehouse system over Hadoop to define the structure of the data
  • Impala for querying the data in Hive
  • Excel to visualize the results for better understanding
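As a hedged illustration of the log-collection step, the Flume agent that ships WiFi syslog messages into HDFS might be configured along these lines. The agent, source, channel, and sink names, the listening port, and the HDFS path below are assumptions made for the sketch, not details from the case study:

```properties
# Sketch of a Flume 1.x agent: syslog in, HDFS out.
# All names, the port, and the path are illustrative assumptions.
agent.sources  = wifi-syslog
agent.channels = mem
agent.sinks    = to-hdfs

# Access points forward their syslog messages to this UDP port
agent.sources.wifi-syslog.type = syslogudp
agent.sources.wifi-syslog.host = 0.0.0.0
agent.sources.wifi-syslog.port = 5140
agent.sources.wifi-syslog.channels = mem

# Buffer events in memory between source and sink
agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

# Write raw log lines into HDFS, partitioned by day
agent.sinks.to-hdfs.type = hdfs
agent.sinks.to-hdfs.channel = mem
agent.sinks.to-hdfs.hdfs.path = /flume/wifi-logs/%Y-%m-%d
agent.sinks.to-hdfs.hdfs.fileType = DataStream
agent.sinks.to-hdfs.hdfs.useLocalTimeStamp = true
```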


To collect sample data, we ran the two Wi-Fi routers for a period of five days. Once the CREATE TABLE statement has been executed so that Impala can query the data, a data scientist has access to all of it from their analysis, reporting, and BI tools. You can get more details of the Hadoop architecture by enrolling in a reputed Big Data Hadoop training institute in Pune, and work through various use cases to understand big data management in Hadoop.
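That CREATE TABLE step might look roughly like the Hive/Impala DDL below. The table name, columns, delimiter, and HDFS location are assumptions made for illustration, since the case study does not show its actual schema:

```sql
-- Hypothetical schema over the files Flume wrote to HDFS;
-- column names, delimiter, and path are illustrative assumptions.
CREATE EXTERNAL TABLE IF NOT EXISTS wifi_visits (
  mac_address     STRING,
  store_id        STRING,
  connect_time    TIMESTAMP,
  disconnect_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/flume/wifi-logs/';

-- With such a table in place, "unique visits" is one query away:
SELECT COUNT(DISTINCT mac_address) AS unique_visits
FROM wifi_visits;
```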


Hadoop is one of the most widely used big data solutions today, and a good knowledge of it means better career opportunities knocking at your doorstep. Go get enrolled, analyse case studies, and sharpen your skills!


Interested in a career as a Data Analyst?

To learn more about Machine Learning Using Python and Spark – click here.
To learn more about Data Analyst with Advanced Excel course – click here.
To learn more about Data Analyst with SAS Course – click here.
To learn more about Data Analyst with R Course – click here.
To learn more about Big Data Course – click here.

January 12, 2016 10:13 am


Call us to know more