TRUNCATE TABLE Statement (CDH 5.5 or higher only)
Removes the data from an Impala table while leaving the table itself.
Syntax:
TRUNCATE [TABLE] [IF EXISTS] [db_name.]table_name
Statement type: DDL
Usage notes:
Often used to empty tables that are used during ETL cycles, after the data has been copied to another table for the next stage of processing. This statement is a low-overhead alternative to dropping and recreating the table, or using INSERT OVERWRITE to replace the data during the next ETL cycle.
This statement removes all the data and associated data files in the table. It can remove data files from internal tables, external tables, partitioned tables, and tables mapped to HBase or the Amazon Simple Storage Service (S3). The data removal applies to the entire table, including all partitions of a partitioned table.
Any statistics produced by the COMPUTE STATS statement are reset when the data is removed.
Make sure that you are in the correct database before truncating a table, either by issuing a USE statement first or by using a fully qualified name db_name.table_name.
The optional TABLE keyword does not affect the behavior of the statement.
The optional IF EXISTS clause makes the statement succeed whether or not the table exists. If the table does exist, it is truncated; if it does not exist, the statement has no effect. This capability is useful in standardized setup scripts that are might be run both before and after some of the tables exist. This clause is available in CDH 5.7 / Impala 2.5 and higher.
For other tips about managing and reclaiming Impala disk space, see Managing Disk Space for Impala Data.
Amazon S3 considerations:
Although Impala cannot write new data to a table stored in the Amazon S3 filesystem, the TRUNCATE TABLE statement can remove data files from S3. See Using Impala with the Amazon S3 Filesystem for details about working with S3 tables.
Cancellation: Cannot be cancelled.
HDFS permissions:
The user ID that the impalad daemon runs under, typically the impala user, must have write permission for all the files and directories that make up the table.
Kudu considerations:
Currently, the TRUNCATE TABLE statement cannot be used with Kudu tables.
Examples:
The following example shows a table containing some data and with table and column statistics. After the TRUNCATE TABLE statement, the data is removed and the statistics are reset.
CREATE TABLE truncate_demo (x INT); INSERT INTO truncate_demo VALUES (1), (2), (4), (8); SELECT COUNT(*) FROM truncate_demo; +----------+ | count(*) | +----------+ | 4 | +----------+ COMPUTE STATS truncate_demo; +-----------------------------------------+ | summary | +-----------------------------------------+ | Updated 1 partition(s) and 1 column(s). | +-----------------------------------------+ SHOW TABLE STATS truncate_demo; +-------+--------+------+--------------+-------------------+--------+-------------------+ | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | +-------+--------+------+--------------+-------------------+--------+-------------------+ | 4 | 1 | 8B | NOT CACHED | NOT CACHED | TEXT | false | +-------+--------+------+--------------+-------------------+--------+-------------------+ SHOW COLUMN STATS truncate_demo; +--------+------+------------------+--------+----------+----------+ | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size | +--------+------+------------------+--------+----------+----------+ | x | INT | 4 | -1 | 4 | 4 | +--------+------+------------------+--------+----------+----------+ -- After this statement, the data and the table/column stats will be gone. TRUNCATE TABLE truncate_demo; SELECT COUNT(*) FROM truncate_demo; +----------+ | count(*) | +----------+ | 0 | +----------+ SHOW TABLE STATS truncate_demo; +-------+--------+------+--------------+-------------------+--------+-------------------+ | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | +-------+--------+------+--------------+-------------------+--------+-------------------+ | -1 | 0 | 0B | NOT CACHED | NOT CACHED | TEXT | false | +-------+--------+------+--------------+-------------------+--------+-------------------+ SHOW COLUMN STATS truncate_demo; +--------+------+------------------+--------+----------+----------+ | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size | +--------+------+------------------+--------+----------+----------+ | x | INT | -1 | -1 | 4 | 4 | +--------+------+------------------+--------+----------+----------+
The following example shows how the IF EXISTS clause allows the TRUNCATE TABLE statement to be run without error whether or not the table exists:
CREATE TABLE staging_table1 (x INT, s STRING); Fetched 0 row(s) in 0.33s SHOW TABLES LIKE 'staging*'; +----------------+ | name | +----------------+ | staging_table1 | +----------------+ Fetched 1 row(s) in 0.25s -- Our ETL process involves removing all data from several staging tables -- even though some might be already dropped, or not created yet. TRUNCATE TABLE IF EXISTS staging_table1; Fetched 0 row(s) in 5.04s TRUNCATE TABLE IF EXISTS staging_table2; Fetched 0 row(s) in 0.25s TRUNCATE TABLE IF EXISTS staging_table3; Fetched 0 row(s) in 0.25s
Related information:
Overview of Impala Tables, ALTER TABLE Statement, CREATE TABLE Statement, Partitioning for Impala Tables, Internal Tables, External Tables