Hive organizes data using partitions. A partition groups the rows of a table into related parts based on the values of the partition columns, such as Country or Department. A query that filters on a partition column only needs to read the matching parts, which makes it easier and cheaper to query a portion of the data.
Partitions are defined with the PARTITIONED BY clause at the time of table creation.
We can partition a table on more than one column. For example, we can partition on Country and State.
CREATE [EXTERNAL] TABLE table_name (col_name_1 data_type_1, ...)
PARTITIONED BY (col_name_n data_type_n, ...);
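As a minimal sketch of the syntax above, the following creates a table partitioned on Country and State. The table name `customers`, its columns, and the sample rows are hypothetical and chosen only for illustration; the `INSERT ... VALUES` form assumes Hive 0.14 or later. Note that partition columns are declared only in PARTITIONED BY and are not repeated in the regular column list.

```sql
-- Hypothetical table: partition columns appear only in PARTITIONED BY.
CREATE TABLE customers (
  customer_id BIGINT,
  name        STRING
)
PARTITIONED BY (country STRING, state STRING);

-- Static partition insert: the target partition is named explicitly.
INSERT INTO TABLE customers PARTITION (country = 'US', state = 'CA')
VALUES (1, 'Alice'), (2, 'Bob');

-- List the partitions that now exist for the table.
SHOW PARTITIONS customers;

-- Filtering on the partition columns lets Hive read only the matching partition.
SELECT customer_id, name
FROM customers
WHERE country = 'US' AND state = 'CA';
```

Each partition is stored as its own subdirectory under the table location, named after the partition values (for example `country=US/state=CA`), which is why a filter on those columns can skip the rest of the data.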
Partitioning is useful for log analysis: we can segregate the records by timestamp or date value to see the results day-wise or month-wise.
Another use case is sales records partitioned by product type, country, and month.
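The log analysis case might look roughly like the sketch below. The tables `web_logs` and `raw_logs`, their columns, and the date range are assumptions made for illustration. The example uses dynamic partitioning, so Hive derives the partition value for each row from the last column of the SELECT.

```sql
-- Hypothetical log table partitioned by day.
CREATE TABLE web_logs (
  event_time TIMESTAMP,
  log_level  STRING,
  message    STRING
)
PARTITIONED BY (log_date STRING);

-- Allow Hive to create partitions from the data itself.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- raw_logs is an assumed unpartitioned staging table.
-- The partition column must come last in the SELECT list.
INSERT INTO TABLE web_logs PARTITION (log_date)
SELECT event_time,
       log_level,
       message,
       CAST(to_date(event_time) AS STRING) AS log_date
FROM raw_logs;

-- Day-wise counts for one month; only that month's partitions are scanned.
SELECT log_date, COUNT(*) AS events
FROM web_logs
WHERE log_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY log_date;
```

Partitioning by day keeps day-wise queries cheap, and a month-wise view is still possible by filtering on a range of days, as in the last query.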
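A sketch of the sales use case, this time with an external table, could look like this. The table name `sales`, its columns, the location `/data/sales`, and the partition values are all hypothetical.

```sql
-- Hypothetical external table partitioned on three columns.
CREATE EXTERNAL TABLE sales (
  order_id BIGINT,
  amount   DECIMAL(10,2)
)
PARTITIONED BY (product_type STRING, country STRING, sale_month STRING)
LOCATION '/data/sales';

-- Register a partition for data that already exists (or will be placed)
-- under the table location; no rows are written by this statement.
ALTER TABLE sales ADD IF NOT EXISTS
  PARTITION (product_type = 'laptop', country = 'IN', sale_month = '2024-01');

-- Total sales for one product type, country, and month.
SELECT SUM(amount) AS total_amount
FROM sales
WHERE product_type = 'laptop'
  AND country      = 'IN'
  AND sale_month   = '2024-01';
```

The order of the columns in PARTITIONED BY determines the nesting of the partition directories, so it usually makes sense to list the columns in the order they are most often filtered on.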