Checking and Repairing HBase Tables
HBaseFsck (hbck) is a command-line tool that checks for region consistency and table integrity problems and repairs corruption. It works in two basic modes: a read-only mode that identifies inconsistencies and a multi-phase read-write repair mode.
- Read-only inconsistency identification: In this mode, which is the default, a report is generated but no repairs are attempted.
- Read-write repair mode: In this mode, if errors are found, hbck attempts to repair them.
Always run HBase administrative commands such as the HBase Shell, hbck, or bulk-load commands as the HBase user (typically hbase).
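As an illustration of running hbck as the HBase user, the sketch below wraps the invocation in a small shell function. It assumes that the hbase launcher is on the PATH, that the service account is named hbase, and that sudo is how you switch users on your hosts. The function name hbck_as_hbase and the HBCK_EXECUTE switch are hypothetical helpers for this example, not part of HBase. By default the function only prints the command it would run.

```shell
#!/bin/sh
# Sketch only. Assumptions: `hbase` is on PATH, the service account is
# named "hbase", and sudo is available to switch to that account.
HBASE_USER="${HBASE_USER:-hbase}"

# Hypothetical helper: build an hbck command line that runs as $HBASE_USER.
# It prints the command unless HBCK_EXECUTE=1 is set, in which case it runs it.
hbck_as_hbase() {
  set -- sudo -u "$HBASE_USER" hbase hbck "$@"
  if [ "${HBCK_EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

hbck_as_hbase            # default read-only check
hbck_as_hbase -details   # read-only check with a detailed report
```

Set HBCK_EXECUTE=1 in the environment only after confirming that the printed command matches what you intend to run.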
Running hbck Manually
The hbck command is located in the bin directory of the HBase install.
- With no arguments, hbck checks HBase for inconsistencies and prints OK if no inconsistencies are found, or the number of inconsistencies otherwise.
- With the -details argument, hbck checks HBase for inconsistencies and prints a detailed report.
- To limit hbck to only checking specific tables, provide them as a space-separated list: hbck <table1> <table2>
- If region-level inconsistencies are found, use the -fix argument to direct hbck to try to fix them. hbck then performs the following steps in order:
  - The standard check for inconsistencies is run.
  - If needed, repairs are made to tables.
  - If needed, repairs are made to regions. Regions are closed during repair.
- You can also fix individual region-level inconsistencies separately, rather than fixing them automatically with the -fix argument.
  - -fixAssignments repairs unassigned, incorrectly assigned, or multiply assigned regions.
  - -fixMeta removes rows from hbase:meta when their corresponding regions are not present in HDFS and adds new meta rows if regions are present in HDFS but not in hbase:meta.
  - -repairHoles creates HFiles for new empty regions on the filesystem and ensures that the new regions are consistent.
  - -fixHdfsOrphans repairs a region directory that is missing a region metadata file (the .regioninfo file).
  - -fixHdfsOverlaps fixes overlapping regions. You can further tune this argument using the following options:
    - -maxMerge <n> controls the maximum number of regions to merge.
    - -sidelineBigOverlaps attempts to sideline the regions which overlap the largest number of other regions.
    - -maxOverlapsToSideline <n> limits the maximum number of regions to sideline.
- To try to repair all inconsistencies and corruption at once, use the -repair option, which includes all the region and table consistency options.
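One way to apply the options above is to choose the narrowest repair that matches the problem reported by a read-only check, rather than starting with -repair. The sketch below maps a problem type to the corresponding option and prints the resulting command without running it. The function names, the problem-type keywords (assignments, meta, and so on), and the table names orders and customers are hypothetical labels chosen for this example. The flags themselves are the ones listed above, and the sketch assumes the hbase launcher is on the PATH.

```shell
#!/bin/sh
# Sketch only: choose a targeted hbck repair option for a problem type.
# The keywords on the left are hypothetical labels for this example;
# the flags on the right are the hbck options described in the list above.
hbck_repair_flag() {
  case "$1" in
    assignments) echo "-fixAssignments" ;;
    meta)        echo "-fixMeta" ;;
    holes)       echo "-repairHoles" ;;
    orphans)     echo "-fixHdfsOrphans" ;;
    overlaps)    echo "-fixHdfsOverlaps" ;;
    all)         echo "-repair" ;;
    *)           echo "unknown problem type: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the repair command, optionally limited to some tables.
# Usage: hbck_repair_cmd <problem-type> [table ...]
hbck_repair_cmd() {
  flag=$(hbck_repair_flag "$1") || return 1
  shift
  set -- hbase hbck "$flag" "$@"
  echo "$*"
}

hbck_repair_cmd assignments             # repair region assignments everywhere
hbck_repair_cmd meta orders customers   # repair hbase:meta rows for two tables
```

Printing the command first gives you a chance to review it, and to run it as the HBase user, before any regions are closed for repair.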
For more details about the hbck command, see Appendix C of the HBase Reference Guide.