Hive: Partitioned table structure and data replication
Summary: Hive, Shell
Copying a table in Hive falls into two cases: non-partitioned tables and partitioned tables.
For a non-partitioned table, a complete copy into another table needs only a CREATE TABLE ... AS statement. The same statement can also copy a subset, for example two fields of one table and their values, into another table.
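As a minimal sketch of the statement described above, the helper below builds a CREATE TABLE ... AS SELECT statement. The function name ctas_sql, the table names db.t_src and db.t_copy, and the columns id and name are hypothetical, since the original does not show them.

```shell
# Build a CREATE TABLE ... AS SELECT statement.
# $1 = source table, $2 = new table, $3 = column list to copy
ctas_sql() {
  printf 'CREATE TABLE %s AS SELECT %s FROM %s;\n' "$2" "$3" "$1"
}

# To run it (requires the hive CLI on PATH):
#   hive -e "$(ctas_sql db.t_src db.t_copy 'id, name')"
```

Passing '*' as the column list copies every field, which is the full-table case.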
For a partitioned table, CREATE TABLE ... AS runs without error and copies every field and all the data, but the partitioning is lost in the new table.
Take a partitioned table that uses the dt field as its partition column, and copy the whole table with CREATE TABLE ... AS.
The table structure and data volume of the copy check out. Checking its partitions, however, fails with an error saying that the table is not a partitioned table, even though the original partition field dt is present in the table structure. The dt field no longer acts as a partition column in the copy; it is an ordinary column, and its data is retained.
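The checks described above could look roughly like the sketch below, which prints three statements for a given table. The function name verify_copy_sql and the table name db.t_copy are hypothetical. On a table created with CREATE TABLE ... AS, the first two statements are expected to succeed and SHOW PARTITIONS is the one that fails.

```shell
# Print the statements used to inspect a copied table.
# $1 = table to inspect
verify_copy_sql() {
  printf 'DESCRIBE %s;\n' "$1"              # table structure, dt appears as a plain column
  printf 'SELECT COUNT(*) FROM %s;\n' "$1"  # data volume
  printf 'SHOW PARTITIONS %s;\n' "$1"       # errors if the table is not partitioned
}

# To run it (requires the hive CLI on PATH):
#   hive -e "$(verify_copy_sql db.t_copy)"
```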
To copy a partitioned table in full, partitions included, use the LIKE statement to carry over the partition definition. The steps are as follows.
The first step is to create an empty copy of the table, which has the table structure and partition definition of the original table but no data.
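A minimal sketch of this first step, assuming a source table db.t_src and a new table db.t_new (both names, and the helper like_sql, are hypothetical):

```shell
# Build a CREATE TABLE ... LIKE statement, which copies the table
# definition (columns and partition columns) but none of the data.
# $1 = source table, $2 = new table
like_sql() {
  printf 'CREATE TABLE %s LIKE %s;\n' "$2" "$1"
}

# To run it (requires the hive CLI on PATH):
#   hive -e "$(like_sql db.t_src db.t_new)"
```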
The next step is to use the hdfs command to copy the contents of the original table's storage path in HDFS to the path of the new table. A table's storage path is a directory whose subdirectories each represent one partition. Each partition directory holds the data files, whose names in this example start with part. Data within the same partition is split across these files, according to Hive's bucketing strategy when the table is bucketed.
The copy statement uses the * wildcard to copy everything under the original table's directory to the new table's path. Afterwards, list the data files under the new table's HDFS path to confirm the copy.
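The copy and the follow-up listing might be sketched as below. The helper name copy_cmds and the two warehouse paths in the usage comment are assumptions; the real paths depend on the warehouse directory configured for the cluster and on the database and table names.

```shell
# Print the HDFS commands that copy every partition directory of one
# table into another table's directory and then list the result.
# $1 = source table directory on HDFS, $2 = target table directory on HDFS
copy_cmds() {
  # The glob is quoted so that HDFS, not the local shell, expands it.
  printf "hdfs dfs -cp '%s/*' '%s/'\n" "$1" "$2"
  printf "hdfs dfs -ls -R '%s'\n" "$2"
}

# To run them (requires the hdfs CLI on PATH; paths are hypothetical):
#   copy_cmds /user/hive/warehouse/db.db/t_src /user/hive/warehouse/db.db/t_new | sh
```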
At this point the new table's warehouse directory contains data files, but queries in the Hive client still return nothing and the table appears empty. The reason is that none of the data partitions exist in the new table's metadata. Hive locates data by partition directory, and since the new table has no registered partitions, it cannot find any data.
The next step is to repair the table's partition metadata with the MSCK REPAIR TABLE command.
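A minimal sketch of the repair step, followed by a partition listing to confirm the result. The helper name repair_sql and the table name db.t_new are hypothetical.

```shell
# Print the statements that register missing partitions and list them.
# $1 = table whose partition metadata should be repaired
repair_sql() {
  printf 'MSCK REPAIR TABLE %s;\n' "$1"  # add partitions found on HDFS but missing in metadata
  printf 'SHOW PARTITIONS %s;\n' "$1"    # confirm the partitions are now registered
}

# To run it (requires the hive CLI on PATH):
#   hive -e "$(repair_sql db.t_new)"
```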
The command's output shows that MSCK REPAIR TABLE first checks whether the table's partitions exist in the metadata and then adds the ones that are missing. After the repair, the table can be used normally.
MSCK REPAIR TABLE adds (repairs) all partitions quickly and automatically with a single command. In Hive, if you create a partitioned table first and then copy data into the corresponding HDFS directories to initialize it, the partitions have to be added before the table can be used. With many partitions, adding them by hand with ALTER TABLE ADD PARTITION is extremely inconvenient. The following test checks whether ALTER TABLE ADD PARTITION can also complete a full copy of the partitioned table.
The next step is to manually add a partition dt='20201209'
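Adding a single partition by hand could be sketched as follows. The helper name add_partition_sql and the table name db.t_new are hypothetical; the partition value dt='20201209' is the one named in the text.

```shell
# Build an ALTER TABLE ... ADD PARTITION statement for one dt value.
# $1 = table, $2 = dt value in yyyymmdd form
add_partition_sql() {
  printf "ALTER TABLE %s ADD PARTITION (dt='%s');\n" "$1" "$2"
}

# To run it (requires the hive CLI on PATH):
#   hive -e "$(add_partition_sql db.t_new 20201209)"
```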
This confirms that adding partitions manually also works. MSCK REPAIR TABLE simply scans the partition directories under the table's warehouse directory automatically (here dt='20201209' through dt='20210317') and registers them. A Shell script that adds each partition in turn achieves the same result.
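The original script is not shown, so the following is only a guess at its general shape: one ALTER TABLE ... ADD PARTITION call per day in the date range. The helper names date_range and add_partitions and the table name db.t_new are hypothetical, and the date arithmetic assumes GNU date.

```shell
# Print every date from $1 to $2 inclusive, in yyyymmdd form (needs GNU date).
date_range() {
  d="$1"
  while [ "$d" -le "$2" ]; do
    echo "$d"
    d=$(TZ=UTC date -d "$d +1 day" +%Y%m%d)
  done
}

# Add one partition per date.
# $1 = table, $2 = first dt, $3 = last dt
add_partitions() {
  for dt in $(date_range "$2" "$3"); do
    # Each call launches a separate Hive session.
    hive -e "ALTER TABLE $1 ADD IF NOT EXISTS PARTITION (dt='$dt');"
  done
}

# Usage (requires the hive CLI on PATH):
#   add_partitions db.t_new 20201209 20210317
```

Because every iteration starts its own Hive session, a loop of this kind spends most of its time on startup and shutdown. Writing all the statements to one file and running it once with hive -f would be one way to avoid that.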
Running such a Shell script achieves the same effect, but in this test it took 15 minutes to execute and repeatedly started and shut down the Hive process.