hbase shell

基本概念

HBase是一个分布式的、面向列的开源数据库,源于google的一篇论文《bigtable:一个结构化数据的分布式存储系统》。HBase是Google Bigtable的开源实现,它利用Hadoop HDFS作为其文件存储系统,利用Hadoop MapReduce来处理HBase中的海量数据,利用Zookeeper作为协同服务。

HBase的表结构

HBase以表的形式存储数据。表有行和列组成。列划分为若干个列族/列簇(column family)。

RowKey column-family1 column-family2 column-family3
column1 column2 column1 column2 column3 column1
key1 t1.abc   t4:hi      
t2.efg   t3:hello   t2:hello2  
key2     t2:world   t3:world2  
key3   t2:123       t2:xxx
  t1:456       t1:zzz

如上图所示:

  1. key1,key2,key3是三条记录的唯一的row key值
  2. column-family1,column-family2,column-family3是三个列族,每个列族下又包括几列。
    • 比如column-family1这个列族下包括两列,名字是column1和column2,t1:abc,t2:efg是由row key1和column-family1-column1唯一确定的一个单元cell。这个cell中有两个数据,abc和efg。两个值的时间戳不一样,分别是t1,t2, hbase会返回最新时间的值给请求者。

这些名词的具体含义如下:

Row Key

与nosql数据库们一样,row key是用来检索记录的主键。访问hbase table中的行,只有三种方式:

  1. 通过单个row key访问
  2. 通过row key的range
  3. 全表扫描

Row key行键 (Row key)可以是任意字符串(最大长度是 64KB,实际应用中长度一般为 10-100bytes),在hbase内部,row key保存为字节数组。

存储时,数据按照Row key的字典序(byte order)排序存储。设计key时,要充分排序存储这个特性,将经常一起读取的行存储放到一起。(位置相关性)

注意:

字典序对int排序的结果是1,10,100,11,12,13,14,15,16,17,18,19,2,20,21,…,9,91,92,93,94,95,96,97,98,99。要保持整形的自然序,行键必须用0作左填充。

行的一次读写是原子操作 (不论一次读写多少列)。这个设计决策能够使用户很容易的理解程序在对同一个行进行并发更新操作时的行为。

列族column family

hbase表中的每个列,都归属与某个列族。列族是表的chema的一部分(而列不是),必须在使用表之前定义。列名都以列族作为前缀。例如courses:history,courses:math 都属于 courses 这个列族。

访问控制、磁盘和内存的使用统计都是在列族层面进行的。实际应用中,列族上的控制权限能帮助我们管理不同类型的应用:我们允许一些应用可以添加新的基本数据、一些应用可以读取基本数据并创建继承的列族、一些应用则只允许浏览数据(甚至可能因为隐私的原因不能浏览所有数据)。

单元Cell

HBase中通过row和columns确定的为一个存贮单元称为cell。由{row key, column( =<family> + <label>), version} 唯一确定的单元。cell中的数据是没有类型的,全部是字节码形式存贮。

时间戳timestamp

每个cell都保存着同一份数据的多个版本。版本通过时间戳来索引。时间戳的类型是 64位整型。时间戳可以由hbase(在数据写入时自动 )赋值,此时时间戳是精确到毫秒的当前系统时间。时间戳也可以由客户显式赋值。如果应用程序要避免数据版本冲突,就必须自己生成具有唯一性的时间戳。每个cell中,不同版本的数据按照时间倒序排序,即最新的数据排在最前面。

为了避免数据存在过多版本造成的的管理 (包括存贮和索引)负担,hbase提供了两种数据版本回收方式。一是保存数据的最后n个版本,二是保存最近一段时间内的版本(比如最近七天)。用户可以针对每个列族进行设置。

常用命令

hbase提供了一个shell的终端给用户交互。通过执行 help get 可以看到命令的帮助信息。

hbase shell常用的操作命令有create,describe,disable,drop,list,scan,put,get,delete,deleteall,count,status等,通过help可以看到详细的用法。

HBase的shell操作,一个大概顺序就是操作关键词后跟表名,行名,列名这样的一个顺序,如果有其他条件再用花括号加上。

表的管理
1)查看有哪些表
hbase(main)> list

2)创建表
# 语法:create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}
# 例如:创建表t1,有两个family name:f1,f2,且版本数均为2
hbase(main)> create 't1',{NAME => 'f1', VERSIONS => 2},{NAME => 'f2', VERSIONS => 2}

3)删除表
分两步:首先disable,然后drop
例如:删除表t1
hbase(main)> disable 't1'
hbase(main)> drop 't1'

4)查看表的结构
# 语法:describe <table>
# 例如:查看表t1的结构
hbase(main)> describe 't1'

5)修改表结构
修改表结构必须先disable
# 语法:alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
# 例如:修改表test1的cf的TTL为180天
hbase(main)> disable 'test1'
hbase(main)> alter 'test1',{NAME=>'body',TTL=>'15552000'},{NAME=>'meta', TTL=>'15552000'}
hbase(main)> enable 'test1'

权限管理
1)分配权限
# 语法 : grant <user> <permissions> <table> <column family> <column qualifier> 参数后面用逗号分隔
# 权限用五个字母表示: "RWXCA".
# READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
# 例如,给用户‘test'分配对表t1有读写的权限,
hbase(main)> grant 'test','RW','t1'

2)查看权限
# 语法:user_permission <table>
# 例如,查看表t1的权限列表
hbase(main)> user_permission 't1'

3)收回权限
# 与分配权限类似,语法:revoke <user> <table> <column family> <column qualifier>
# 例如,收回test用户在表t1上的权限
hbase(main)> revoke 'test','t1'

表数据的增删改查
1)添加数据
# 语法:put <table>,<rowkey>,<family:column>,<value>,<timestamp>
# 例如:给表t1的添加一行记录:rowkey是rowkey001,family name:f1,column name:col1,value:value01,timestamp:系统默认
hbase(main)> put 't1','rowkey001','f1:col1','value01'
用法比较单一。

2)查询数据
a)查询某行记录
# 语法:get <table>,<rowkey>,[<family:column>,....]
# 例如:查询表t1,rowkey001中的f1下的col1的值
hbase(main)> get 't1','rowkey001', 'f1:col1'
# 或者:
hbase(main)> get 't1','rowkey001', {COLUMN=>'f1:col1'}
# 查询表t1,rowke002中的f1下的所有列值
hbase(main)> get 't1','rowkey001'

b)扫描表
# 语法:scan <table>, {COLUMNS => [ <family:column>,.... ], LIMIT => num}
# 另外,还可以添加STARTROW、TIMERANGE和FITLER等高级功能
# 例如:扫描表t1的前5条数据
hbase(main)> scan 't1',{LIMIT=>5}

c)查询表中的数据行数
# 语法:count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}
# INTERVAL设置多少行显示一次及对应的rowkey,默认1000;CACHE每次去取的缓存区大小,默认是10,调整该参数可提高查询速度
# 例如,查询表t1中的行数,每100条显示一次,缓存区为500
hbase(main)> count 't1', {INTERVAL => 100, CACHE => 500}

3)删除数据
a )删除行中的某个列值
# 语法:delete <table>, <rowkey>,  <family:column> , <timestamp>,必须指定列名
# 例如:删除表t1,rowkey001中的f1:col1的数据
hbase(main)> delete 't1','rowkey001','f1:col1'
注:将删除改行f1:col1列所有版本的数据

b )删除行
# 语法:deleteall <table>, <rowkey>,  <family:column> , <timestamp>,可以不指定列名,删除整行数据
# 例如:删除表t1,rowk001的数据
hbase(main)> deleteall 't1','rowkey001'

c)删除表中的所有数据
# 语法: truncate <table>
# 其具体过程是:disable table -> drop table -> create table
# 例如:删除表t1的所有数据
hbase(main)> truncate 't1'


* create/put

以网上的一个学生成绩表的例子来演示hbase的用法。

| name   | grad | course | >   |
|--------|------|--------|-----|
| \/     | \/   | math   | art |
| Tom    | 5    | 97     | 87  |
| Jerry  | 4    | 89     | 80  |

这里grad对于表来说是一个列,course对于表来说是一个列族,这个列族由两个列组成math和art,当然我们可以根据我们的需要在course中建立更多的列族,如computer,physics等相应的列添加入course列族。图中需要注意的是,列族下面的列也是可以没有名字的。

{{{class="brush:bash"
# 建立一个表格scores  具有两个列族grad 和courese
create'test_scores','grade', 'course'
list
desc 'test_scores'

scan 'test_scores'
ROW                                      COLUMN+CELL
 Jim                                     column=course:, timestamp=1464446407298, value=80
 Jim                                     column=grade:, timestamp=1464446406438, value=4
 Leo                                     column=course:math, timestamp=1464450250315, value=99
 Leo.Peng                                column=course:math, timestamp=1464451773801, value=70
 Tom                                     column=course:art, timestamp=1464446361576, value=87
 Tom                                     column=course:math, timestamp=1464446360394, value=97
 Tom                                     column=grade:, timestamp=1464446360354, value=5
4 row(s) in 0.0090 seconds

scan 'test_scores',{FILTER=>"SingleColumnValueFilter('course','art',<,'binary:0')"}
ROW                                      COLUMN+CELL
 Jim                                     column=course:, timestamp=1464446407298, value=80
 Jim                                     column=grade:, timestamp=1464446406438, value=4
 Leo                                     column=course:math, timestamp=1464450250315, value=99
 Leo.Peng                                column=course:math, timestamp=1464451773801, value=70
3 row(s) in 0.0080 seconds

scan 'test_scores',{FILTER=>"SingleColumnValueFilter('course','art',<,'binary:0')"}

# 添加数据
# put ‘'t1', 'r1', 'c1', 'value', ts1
# t1指表名,r1指行键名,c1指列名,value指单元格值。ts1指时间戳,一般都省略掉了。
put 'test_scores','Tom','grade:','5'
put 'test_scores','Tom','course:math','97'
put 'test_scores','Tom','course:art','87'

put 'test_scores','Jim','grade','4'
put 'test_scores','Jim','course:','89'
put 'test_scores','Jim','course:','80' 
get 't1', 'r1'
get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
get 't1', 'r1', {COLUMN => 'c1'}
get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
get 't1', 'r1', 'c1'
get 't1', 'r1', 'c1', 'c2'
get 't1', 'r1', ['c1', 'c2'] 

扫描所有数据。

也可以指定一些修饰词:TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,or COLUMNS。没任何修饰词,就是上边例句,就会显示所有数据行。

scan '.META.'
scan '.META.', {COLUMNS => 'info:regioninfo'}
scan 't1', {COLUMNS => ['c1','c2'], LIMIT => 10, STARTROW =>'xyz'}
scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804,1303668904]}
scan 't1', {FILTER => "(PrefixFilter('row2') AND (QualifierFilter(>=,'binary:xyz'))) AND (TimestampsFilter(123,456))"}
scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)} 

help

#HBase Shell, version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
#Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
#Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:
  Group name: general
  Commands: status, table_help, version, whoami

  Group name: ddl
  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, 
            enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters

  Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, 
            list_namespace_tables

  Group name: dml
  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, 
            truncate, truncate_preserve

  Group name: tools
  Commands: assign, balance_switch, balancer, balancer_enabled, 
            catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, 
            close_region, compact, compact_rs, flush, major_compact, merge_region, 
            move, split, trace, unassign, wal_roll, zk_dump

  Group name: replication
  Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, 
            enable_peer, enable_table_replication, list_peers, list_replicated_tables, 
            remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, 
            restore_snapshot, snapshot

  Group name: configuration
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: list_quotas, set_quota

  Group name: security
  Commands: grant, revoke, user_permission

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

SHELL USAGE:
Quote all names in HBase Shell such as table and column names.  Commas delimit
command parameters.  Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces.  Key/values are delimited by the
'=>' character combination.  Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type
'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"
  hbase> get 't1', "key\003\023\011"
  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html

namespace

Command: alter_namespace
Alter namespace properties.

To add/modify a property:

  hbase> alter_namespace 'ns1', {METHOD => 'set', 'PROERTY_NAME' => 'PROPERTY_VALUE'}

To delete a property:

  hbase> alter_namespace 'ns1', {METHOD => 'unset', NAME=>'PROERTY_NAME'}

Command: create_namespace
Create namespace; pass namespace name,
and optionally a dictionary of namespace configuration.
Examples:

  hbase> create_namespace 'ns1'
  hbase> create_namespace 'ns1', {'PROERTY_NAME'=>'PROPERTY_VALUE'}

Command: describe_namespace
Describe the named namespace. For example:
  hbase> describe_namespace 'ns1'

Command: drop_namespace
Drop the named namespace. The namespace must be empty.

Command: list_namespace
List all namespaces in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

  hbase> list_namespace
  hbase> list_namespace 'abc.*'

Command: list_namespace_tables
List all tables that are members of the namespace.
Examples:

  hbase> list_namespace_tables 'ns1'

ddl

hbase(main):025:0> help 'ddl'
Command: alter
Alter a table. If the "hbase.online.schema.update.enable" property is set to
false, then the table must be disabled (see help 'disable'). If the 
"hbase.online.schema.update.enable" property is set to true, tables can be 
altered without disabling them first. Altering enabled tables has caused problems 
in the past, so use caution and test it before using in production. 

You can use the alter command to add, 
modify or delete column families or change table configuration options.
Column families work in a similar way as the 'create' command. The column family
specification can either be a name string, or a dictionary with the NAME attribute.
Dictionaries are described in the output of the 'help' command, with no arguments.

For example, to change or add the 'f1' column family in table 't1' from 
current value to keep a maximum of 5 cell VERSIONS, do:

  hbase> alter 't1', NAME => 'f1', VERSIONS => 5

You can operate on several column families:

  hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}

To delete the 'f1' column family in table 'ns1:t1', use one of:

  hbase> alter 'ns1:t1', NAME => 'f1', METHOD => 'delete'
  hbase> alter 'ns1:t1', 'delete' => 'f1'

You can also change table-scope attributes like MAX_FILESIZE, READONLY, 
MEMSTORE_FLUSHSIZE, DURABILITY, etc. These can be put at the end;
for example, to change the max size of a region to 128MB, do:

  hbase> alter 't1', MAX_FILESIZE => '134217728'

You can add a table coprocessor by setting a table coprocessor attribute:

  hbase> alter 't1',
    'coprocessor'=>'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'

Since you can have multiple coprocessors configured for a table, a
sequence number will be automatically appended to the attribute name
to uniquely identify it.

The coprocessor attribute must match the pattern below in order for
the framework to understand how to load the coprocessor classes:

  [coprocessor jar file location] | class name | [priority] | [arguments]

You can also set configuration settings specific to this table or column family:

  hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}
  hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

You can also remove a table-scope attribute:

  hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'

  hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'

You can also set REGION_REPLICATION:

  hbase> alter 't1', {REGION_REPLICATION => 2}

There could be more than one alteration in one command:

  hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 }, 
   { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' },
   OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' }

Command: alter_async
Alter column family schema, does not wait for all regions to receive the
schema changes. Pass table name and a dictionary specifying new column
family schema. Dictionaries are described on the main help command output.
Dictionary must include name of column family to alter. For example,

To change or add the 'f1' column family in table 't1' from defaults
to instead keep a maximum of 5 cell VERSIONS, do:

  hbase> alter_async 't1', NAME => 'f1', VERSIONS => 5

To delete the 'f1' column family in table 'ns1:t1', do:

  hbase> alter_async 'ns1:t1', NAME => 'f1', METHOD => 'delete'

or a shorter version:

  hbase> alter_async 'ns1:t1', 'delete' => 'f1'

You can also change table-scope attributes like MAX_FILESIZE
MEMSTORE_FLUSHSIZE, READONLY, and DEFERRED_LOG_FLUSH.

For example, to change the max size of a family to 128MB, do:

  hbase> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '134217728'

There could be more than one alteration in one command:

  hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}

To check if all the regions have been updated, use alter_status <table_name>

Command: alter_status
Get the status of the alter command. Indicates the number of regions of the
table that have received the updated schema
Pass table name.

hbase> alter_status 't1'
hbase> alter_status 'ns1:t1'

Command: create
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily 
including NAME attribute. 
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
  
Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', 
  REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.

Command: describe
Describe the named table. For example:
  hbase> describe 't1'
  hbase> describe 'ns1:t1'

Alternatively, you can use the abbreviated 'desc' for the same thing.
  hbase> desc 't1'
  hbase> desc 'ns1:t1'

Command: disable
Start disable of named table:
  hbase> disable 't1'
  hbase> disable 'ns1:t1'

Command: disable_all
Disable all of tables matching the given regex:

hbase> disable_all 't.*'
hbase> disable_all 'ns:t.*'
hbase> disable_all 'ns:.*'

Command: drop
Drop the named table. Table must first be disabled:
  hbase> drop 't1'
  hbase> drop 'ns1:t1'

Command: drop_all
Drop all of the tables matching the given regex:

hbase> drop_all 't.*'
hbase> drop_all 'ns:t.*'
hbase> drop_all 'ns:.*'

Command: enable
Start enable of named table:
  hbase> enable 't1'
  hbase> enable 'ns1:t1'

Command: enable_all
Enable all of the tables matching the given regex:

hbase> enable_all 't.*'
hbase> enable_all 'ns:t.*'
hbase> enable_all 'ns:.*'

Command: exists
Does the named table exist?
  hbase> exists 't1'
  hbase> exists 'ns1:t1'

Command: get_table
Get the given table name and return it as an actual object to
be manipulated by the user. See table.help for more information
on how to use the table.
Eg.

  hbase> t1 = get_table 't1'
  hbase> t1 = get_table 'ns1:t1'

returns the table named 't1' as a table object. You can then do

  hbase> t1.help

which will then print the help for that table.

Command: is_disabled
Is named table disabled? For example:
  hbase> is_disabled 't1'
  hbase> is_disabled 'ns1:t1'

Command: is_enabled
Is named table enabled? For example:
  hbase> is_enabled 't1'
  hbase> is_enabled 'ns1:t1'

Command: list
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

  hbase> list
  hbase> list 'abc.*'
  hbase> list 'ns:abc.*'
  hbase> list 'ns:.*'

Command: show_filters
Show all the filters in hbase. Example:
  hbase> show_filters

  ColumnPrefixFilter
  TimestampsFilter
  PageFilter
  .....
  KeyOnlyFilter

dml

hbase(main):037:0* help 'dml'
Command: append
Appends a cell 'value' at specified table/row/column coordinates.

  hbase> append 't1', 'r1', 'c1', 'value', ATTRIBUTES=>{'mykey'=>'myvalue'}
  hbase> append 't1', 'r1', 'c1', 'value', {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.append 'r1', 'c1', 'value', ATTRIBUTES=>{'mykey'=>'myvalue'}
  hbase> t.append 'r1', 'c1', 'value', {VISIBILITY=>'PRIVATE|SECRET'}

Command: count
Count the number of rows in a table.  Return value is the number of rows.
This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar
hbase.jar rowcount' to run a counting mapreduce job). Current count is shown
every 1000 rows by default. Count interval may be optionally specified. Scan
caching is enabled on count scans by default. Default cache size is 10 rows.
If your rows are small in size, you may want to increase this
parameter. Examples:

 hbase> count 'ns1:t1'
 hbase> count 't1'
 hbase> count 't1', INTERVAL => 100000
 hbase> count 't1', CACHE => 1000
 hbase> count 't1', INTERVAL => 10, CACHE => 1000

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding commands would be:

 hbase> t.count
 hbase> t.count INTERVAL => 100000
 hbase> t.count CACHE => 1000
 hbase> t.count INTERVAL => 10, CACHE => 1000

Command: delete
Put a delete cell value at specified table/row/column and optionally
timestamp coordinates.  Deletes must match the deleted cell's
coordinates exactly.  When scanning, a delete cell suppresses older
versions. To delete a cell from  't1' at row 'r1' under column 'c1'
marked with the time 'ts1', do:

  hbase> delete 'ns1:t1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same command can also be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.delete 'r1', 'c1',  ts1
  hbase> t.delete 'r1', 'c1',  ts1, {VISIBILITY=>'PRIVATE|SECRET'}

Command: deleteall
Delete all cells in a given row; pass a table name, row, and optionally
a column and timestamp. Examples:

  hbase> deleteall 'ns1:t1', 'r1'
  hbase> deleteall 't1', 'r1'
  hbase> deleteall 't1', 'r1', 'c1'
  hbase> deleteall 't1', 'r1', 'c1', ts1
  hbase> deleteall 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.deleteall 'r1'
  hbase> t.deleteall 'r1', 'c1'
  hbase> t.deleteall 'r1', 'c1', ts1
  hbase> t.deleteall 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

Command: get
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:

  hbase> get 'ns1:t1', 'r1'
  hbase> get 't1', 'r1'
  hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> get 't1', 'r1', {COLUMN => 'c1'}
  hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> get 't1', 'r1', 'c1'
  hbase> get 't1', 'r1', 'c1', 'c2'
  hbase> get 't1', 'r1', ['c1', 'c2']
  hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
  hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column.  A user can define a FORMATTER by adding it to the column name in the get
specification.  The FORMATTER can be stipulated: 

 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: 
  hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt',
    'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] } 

Note that you can specify a FORMATTER by column only (cf:qualifier).  You cannot specify
a FORMATTER for all columns of a column family.
    
The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:

  hbase> t.get 'r1'
  hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> t.get 'r1', {COLUMN => 'c1'}
  hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> t.get 'r1', 'c1'
  hbase> t.get 'r1', 'c1', 'c2'
  hbase> t.get 'r1', ['c1', 'c2']
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

Command: get_counter
Return a counter cell value at specified table/row/column coordinates.
A counter cell should be managed with atomic increment functions on HBase
and the data should be binary encoded (as long value). Example:

  hbase> get_counter 'ns1:t1', 'r1', 'c1'
  hbase> get_counter 't1', 'r1', 'c1'

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.get_counter 'r1', 'c1'

Command: get_splits
Get the splits of the named table:
  hbase> get_splits 't1'
  hbase> get_splits 'ns1:t1'

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.get_splits

Command: incr
Increments a cell 'value' at specified table/row/column coordinates.
To increment a cell value in table 'ns1:t1' or 't1' at row 'r1' under column
'c1' by 1 (can be omitted) or 10 do:

  hbase> incr 'ns1:t1', 'r1', 'c1'
  hbase> incr 't1', 'r1', 'c1'
  hbase> incr 't1', 'r1', 'c1', 1
  hbase> incr 't1', 'r1', 'c1', 10
  hbase> incr 't1', 'r1', 'c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> incr 't1', 'r1', 'c1', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> incr 't1', 'r1', 'c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.incr 'r1', 'c1'
  hbase> t.incr 'r1', 'c1', 1
  hbase> t.incr 'r1', 'c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> t.incr 'r1', 'c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}

Command: put
Put a cell 'value' at specified table/row/column and optionally
timestamp coordinates.  To put a cell value into table 'ns1:t1' or 't1'
at row 'r1' under column 'c1' marked with the time 'ts1', do:

  hbase> put 'ns1:t1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value', ts1
  hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

Command: scan
Scan a table; pass table name and optionally a dictionary of scanner
specifications.  Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP,
MAXLENGTH or COLUMNS, CACHE or RAW, VERSIONS

If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.

The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.

Some examples:

  hbase> scan 'hbase:meta'
  hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
  hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
  hbase> scan 't1', {REVERSED => true}
  hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => "
    (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
  hbase> scan 't1', {FILTER =>
    org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
For setting the Operation Attributes 
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false).  By
default it is enabled.  Examples:

  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}

Also for experts, there is an advanced option -- RAW -- which instructs the
scanner to return all cells (including delete markers and uncollected deleted
cells). This option cannot be combined with requesting specific COLUMNS.
Disabled by default.  Example:

  hbase> scan 't1', {RAW => true, VERSIONS => 10}

Besides the default 'toStringBinary' format, 'scan' supports custom formatting
by column.  A user can define a FORMATTER by adding it to the column name in
the scan specification.  The FORMATTER can be stipulated: 

 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: 
  hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
    'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] } 

Note that you can specify a FORMATTER by column only (cf:qualifier).  You cannot
specify a FORMATTER for all columns of a column family.

Scan can also be used directly from a table, by first getting a reference to a
table, like such:

  hbase> t = get_table 't'
  hbase> t.scan

Note in the above situation, you can still provide all the filtering, columns,
options, etc as described above.


Command: truncate
  Disables, drops and recreates the specified table.

Command: truncate_preserve
  Disables, drops and recreates the specified table while still maintaing the previous region boundaries.

tools

Command: assign
Assign a region. Use with caution. If region already assigned,
this command will do a force reassign. For experts only.
Examples:

  hbase> assign 'REGIONNAME'
  hbase> assign 'ENCODED_REGIONNAME'

Command: balance_switch
Enable/Disable balancer. Returns previous balancer state.
Examples:

  hbase> balance_switch true
  hbase> balance_switch false

Command: balancer
Trigger the cluster balancer. Returns true if balancer ran and was able to
tell the region servers to unassign all the regions to balance  (the re-assignment itself is async). 
Otherwise false (Will not run if regions in transition).

Command: balancer_enabled
Query the balancer's state.
Examples:

  hbase> balancer_enabled

Command: catalogjanitor_enabled
Query for the CatalogJanitor state (enabled/disabled?)
Examples:

  hbase> catalogjanitor_enabled

Command: catalogjanitor_run
Catalog janitor command to run the (garbage collection) scan from command line.

  hbase> catalogjanitor_run


Command: catalogjanitor_switch
Enable/Disable CatalogJanitor. Returns previous CatalogJanitor state.
Examples:

  hbase> catalogjanitor_switch true
  hbase> catalogjanitor_switch false

Command: close_region
Close a single region.  Ask the master to close a region out on the cluster
or if 'SERVER_NAME' is supplied, ask the designated hosting regionserver to
close the region directly.  Closing a region, the master expects 'REGIONNAME'
to be a fully qualified region name.  When asking the hosting regionserver to
directly close a region, you pass the regions' encoded name only. A region
name looks like this:
 
 TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.
or
 Namespace:TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.

The trailing period is part of the regionserver name. A region's encoded name
is the hash at the end of a region name; e.g. 527db22f95c8a9e0116f0cc13c680396 
(without the period).  A 'SERVER_NAME' is its host, port plus startcode. For
example: host187.example.com,60020,1289493121758 (find servername in master ui
or when you do detailed status in shell).  This command will end up running
close on the region hosting regionserver.  The close is done without the
master's involvement (It will not know of the close).  Once closed, region will
stay closed.  Use assign to reopen/reassign.  Use unassign or move to assign
the region elsewhere on cluster. Use with caution.  For experts only.
Examples:

  hbase> close_region 'REGIONNAME'
  hbase> close_region 'REGIONNAME', 'SERVER_NAME'
  hbase> close_region 'ENCODED_REGIONNAME'
  hbase> close_region 'ENCODED_REGIONNAME', 'SERVER_NAME'

Command: compact
          Compact all regions in passed table or pass a region row
          to compact an individual region. You can also compact a single column
          family within a region.
          Examples:
          Compact all regions in a table:
          hbase> compact 'ns1:t1'
          hbase> compact 't1'
          Compact an entire region:
          hbase> compact 'r1'
          Compact only a column family within a region:
          hbase> compact 'r1', 'c1'
          Compact a column family within a table:
          hbase> compact 't1', 'c1'

Command: compact_rs
          Compact all regions on passed regionserver.
          Examples:
          Compact all regions on a regionserver:
          hbase> compact_rs 'host187.example.com,60020'
          or
          hbase> compact_rs 'host187.example.com,60020,1289493121758'
          Major compact all regions on a regionserver:
          hbase> compact_rs 'host187.example.com,60020,1289493121758', true

Command: flush
Flush all regions in passed table or pass a region row to
flush an individual region.  For example:

  hbase> flush 'TABLENAME'
  hbase> flush 'REGIONNAME'
  hbase> flush 'ENCODED_REGIONNAME'

Command: major_compact
          Run major compaction on passed table or pass a region row
          to major compact an individual region. To compact a single
          column family within a region specify the region name
          followed by the column family name.
          Examples:
          Compact all regions in a table:
          hbase> major_compact 't1'
          hbase> major_compact 'ns1:t1'
          Compact an entire region:
          hbase> major_compact 'r1'
          Compact a single column family within a region:
          hbase> major_compact 'r1', 'c1'
          Compact a single column family within a table:
          hbase> major_compact 't1', 'c1'

Command: merge_region
Merge two regions. Passing 'true' as the optional third parameter will force
a merge ('force' merges regardless else merge will fail unless passed
adjacent regions. 'force' is for expert use only).

NOTE: You must pass the encoded region name, not the full region name so
this command is a little different from other region operations.  The encoded
region name is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then
the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396

Examples:

  hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
  hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true

Command: move
Move a region.  Optionally specify target regionserver else we choose one
at random.  NOTE: You pass the encoded region name, not the region name so
this command is a little different to the others.  The encoded region name
is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then
the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396
A server name is its host, port plus startcode. For example:
host187.example.com,60020,1289493121758
Examples:

  hbase> move 'ENCODED_REGIONNAME'
  hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'

Command: split
Split entire table or pass a region to split individual region.  With the 
second parameter, you can specify an explicit split key for the region.  
Examples:
    split 'tableName'
    split 'namespace:tableName'
    split 'regionName' # format: 'tableName,startKey,id'
    split 'tableName', 'splitKey'
    split 'regionName', 'splitKey'

Command: trace
Start or Stop tracing using HTrace.
Always returns true if tracing is running, otherwise false.
If the first argument is 'start', new span is started.
If the first argument is 'stop', current running span is stopped.
('stop' returns false on success.)
If the first argument is 'status', just returns if or not tracing is running.
On 'start'-ing, you can optionally pass the name of span as the second argument.
The default name of span is 'HBaseShell'.
Repeating 'start' does not start nested span.

Examples:

  hbase> trace 'start'
  hbase> trace 'status'
  hbase> trace 'stop'

  hbase> trace 'start', 'MySpanName'
  hbase> trace 'stop'


Command: unassign
Unassign a region. Unassign will close region in current location and then
reopen it again.  Pass 'true' to force the unassignment ('force' will clear
all in-memory state in master before the reassign. If results in
double assignment use hbck -fix to resolve. To be used by experts).
Use with caution.  For expert use only.  Examples:

  hbase> unassign 'REGIONNAME'
  hbase> unassign 'REGIONNAME', true
  hbase> unassign 'ENCODED_REGIONNAME'
  hbase> unassign 'ENCODED_REGIONNAME', true

Command: wal_roll
Roll the log writer. That is, start writing log messages to a new file.
The name of the regionserver should be given as the parameter.  A
'server_name' is the host, port plus startcode of a regionserver. For
example: host187.example.com,60020,1289493121758 (find servername in
master ui or when you do detailed status in shell)

Command: zk_dump
Dump status of HBase cluster as seen by ZooKeeper.

--------------------------------------------------------------------------------

WARNING: Above commands are for 'experts'-only as misuse can damage an install

replication

Command: add_peer
A peer can either be another HBase cluster or a custom replication endpoint. In either case an id
must be specified to identify the peer.

For a HBase cluster peer, a cluster key must be provided and is composed like this:
hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
This gives a full path for HBase to connect to another HBase cluster. An optional parameter for
table column families identifies which column families will be replicated to the peer cluster.
Examples:

  hbase> add_peer '1', "server1.cie.com:2181:/hbase"
  hbase> add_peer '2', "zk1,zk2,zk3:2182:/hbase-prod"
  hbase> add_peer '3', "zk4,zk5,zk6:11000:/hbase-test", "table1; table2:cf1; table3:cf1,cf2"
  hbase> add_peer '4', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
  hbase> add_peer '5', CLUSTER_KEY => "server1.cie.com:2181:/hbase",
    TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }

For a custom replication endpoint, the ENDPOINT_CLASSNAME can be provided. Two optional arguments
are DATA and CONFIG which can be specified to set different either the peer_data or configuration
for the custom replication endpoint. Table column families is optional and can be specified with
the key TABLE_CFS.

  hbase> add_peer '6', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint'
  hbase> add_peer '7', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
    DATA => { "key1" => 1 }
  hbase> add_peer '8', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
    CONFIG => { "config1" => "value1", "config2" => "value2" }
  hbase> add_peer '9', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
    DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" },
  hbase> add_peer '10', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
    TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }
  hbase> add_peer '11', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
    DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" },
    TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }

Note: Either CLUSTER_KEY or ENDPOINT_CLASSNAME must be specified but not both.

Command: append_peer_tableCFs
Append a replicable table-cf config for the specified peer
Examples:

  # append a table / table-cf to be replicable for a peer
  hbase> append_peer_tableCFs '2', "table4:cfA,cfB"


Command: disable_peer
Stops the replication stream to the specified cluster, but still
keeps track of new edits to replicate.

Examples:

  hbase> disable_peer '1'

Command: disable_table_replication
Disable a table's replication switch.

Examples:

  hbase> disable_table_replication 'table_name'

Command: enable_peer
Restarts the replication to the specified peer cluster,
continuing from where it was disabled.

Examples:

  hbase> enable_peer '1'

Command: enable_table_replication
Enable a table's replication switch.

Examples:

  hbase> enable_table_replication 'table_name'

Command: list_peers
List all replication peer clusters.

  hbase> list_peers

Command: list_replicated_tables
List all the tables and column families replicated from this cluster

  hbase> list_replicated_tables
  hbase> list_replicated_tables 'abc.*'

Command: remove_peer
Stops the specified replication stream and deletes all the meta
information kept about it. Examples:

  hbase> remove_peer '1'

Command: remove_peer_tableCFs
Remove a table / table-cf from the table-cfs config for the specified peer
Examples:

  # Remove a table / table-cf from the replicable table-cfs for a peer
  hbase> remove_peer_tableCFs '2', "table1"
  hbase> remove_peer_tableCFs '2', "table1:cf1"


Command: set_peer_tableCFs
  Set the replicable table-cf config for the specified peer
  Examples:

    # set all tables to be replicable for a peer
    hbase> set_peer_tableCFs '1', ""
    hbase> set_peer_tableCFs '1'
    # set table / table-cf to be replicable for a peer, for a table without
    # an explicit column-family list, all replicable column-families (with
    # replication_scope == 1) will be replicated
    hbase> set_peer_tableCFs '2', "table1; table2:cf1,cf2; table3:cfA,cfB"


Command: show_peer_tableCFs
  Show replicable table-cf config for the specified peer.

    hbase> show_peer_tableCFs

--------------------------------------------------------------------------------

In order to use these tools, hbase.replication must be true.

snapshots

Command: clone_snapshot
Create a new table by cloning the snapshot content. 
There are no copies of data involved.
And writing on the newly created table will not influence the snapshot data.

Examples:
  hbase> clone_snapshot 'snapshotName', 'tableName'
  hbase> clone_snapshot 'snapshotName', 'namespace:tableName'

Command: delete_all_snapshot
Delete all of the snapshots matching the given regex. Examples:

  hbase> delete_all_snapshot 's.*'


Command: delete_snapshot
Delete a specified snapshot. Examples:

  hbase> delete_snapshot 'snapshotName',

Command: list_snapshots
List all snapshots taken (by printing the names and relative information).
Optional regular expression parameter could be used to filter the output
by snapshot name.

Examples:
  hbase> list_snapshots
  hbase> list_snapshots 'abc.*'

Command: restore_snapshot
Restore a specified snapshot.
The restore will replace the content of the original table,
bringing back the content to the snapshot state.
The table must be disabled.

Examples:
  hbase> restore_snapshot 'snapshotName'

Command: snapshot
Take a snapshot of specified table. Examples:

  hbase> snapshot 'sourceTable', 'snapshotName'
  hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}

configuration

Command: update_all_config
Reload a subset of configuration on all servers in the cluster.  See
http://hbase.apache.org/book.html?dyn_config for more details. Here is how
you would run the command in the hbase shell:
  hbase> update_all_config

Command: update_config
Reload a subset of configuration on server 'servername' where servername is
host, port plus startcode. For example: host187.example.com,60020,1289493121758
See http://hbase.apache.org/book.html?dyn_config for more details. Here is how
you would run the command in the hbase shell:
  hbase> update_config 'servername'

quotas

Command: list_quotas
List the quota settings added to the system.
You can filter the result based on USER, TABLE, or NAMESPACE.

For example:

    hbase> list_quotas
    hbase> list_quotas USER => 'bob.*'
    hbase> list_quotas USER => 'bob.*', TABLE => 't1'
    hbase> list_quotas USER => 'bob.*', NAMESPACE => 'ns.*'
    hbase> list_quotas TABLE => 'myTable'
    hbase> list_quotas NAMESPACE => 'ns.*'

Command: set_quota
Set a quota for a user, table, or namespace.
Syntax : set_quota TYPE => <type>, <args>

TYPE => THROTTLE
The request limit can be expressed using the form 100req/sec, 100req/min
and the size limit can be expressed using the form 100k/sec, 100M/min
with (B, K, M, G, T, P) as valid size unit and (sec, min, hour, day) as valid time unit.
Currently the throttle limit is per machine - a limit of 100req/min
means that each machine can execute 100req/min.

For example:

    hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'
    hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10M/sec'
    hbase> set_quota TYPE => THROTTLE, USER => 'u1', TABLE => 't2', LIMIT => '5K/min'
    hbase> set_quota TYPE => THROTTLE, USER => 'u1', NAMESPACE => 'ns2', LIMIT => NONE
    hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '10req/sec'
    hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '10M/sec'
    hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => NONE
    hbase> set_quota USER => 'u1', GLOBAL_BYPASS => true

security

Command: grant
Grant users specific rights.
Syntax : grant <user> <permissions> [<@namespace> [<table> [<column family> [<column qualifier>]]]

permissions is either zero or more letters from the set "RWXCA".
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')

Note: Groups and users are granted access in the same way, but groups are prefixed with an '@' 
      character. In the same way, tables and namespaces are specified, but namespaces are 
      prefixed with an '@' character.

For example:

    hbase> grant 'bobsmith', 'RWXCA'
    hbase> grant '@admins', 'RWXCA'
    hbase> grant 'bobsmith', 'RWXCA', '@ns1'
    hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
    hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1'

Command: revoke
Revoke a user's access rights.
Syntax : revoke <user> [<@namespace> [<table> [<column family> [<column qualifier>]]]]

Note: Groups and users access are revoked in the same way, but groups are prefixed with an '@' 
      character. In the same way, tables and namespaces are specified, but namespaces are 
      prefixed with an '@' character.

For example:

    hbase> revoke 'bobsmith'
    hbase> revoke '@admins'
    hbase> revoke 'bobsmith', '@ns1'
    hbase> revoke 'bobsmith', 't1', 'f1', 'col1'
    hbase> revoke 'bobsmith', 'ns1:t1', 'f1', 'col1'

Command: user_permission
Show all permissions for the particular user.
Syntax : user_permission <table>

Note: A namespace must always precede with '@' character.

For example:

    hbase> user_permission
    hbase> user_permission '@ns1'
    hbase> user_permission 'table1'
    hbase> user_permission 'namespace1:table1'
    hbase> user_permission '.*'
    hbase> user_permission '^[A-C].*'

--------------------------------------------------------------------------------

NOTE: Above commands are only applicable if running with the AccessController coprocessor

visibility labels

Command: add_labels
Add a set of visibility labels.
Syntax : add_labels [label1, label2]

For example:

    hbase> add_labels ['SECRET','PRIVATE']

Command: clear_auths
Clear visibility labels from a user or group
Syntax : clear_auths 'user',[label1, label2]

For example:

    hbase> clear_auths 'user1', ['SECRET','PRIVATE']
    hbase> clear_auths '@group1', ['SECRET','PRIVATE']

Command: get_auths
Get the visibility labels set for a particular user or group
Syntax : get_auths 'user'

For example:

    hbase> get_auths 'user1'
    hbase> get_auths '@group1'

Command: list_labels
List the visibility labels defined in the system.
Optional regular expression parameter could be used to filter the labels being returned.
Syntax : list_labels

For example:

    hbase> list_labels 'secret.*'
    hbase> list_labels

Command: set_auths
Add a set of visibility labels for a user or group
Syntax : set_auths 'user',[label1, label2]

For example:

    hbase> set_auths 'user1', ['SECRET','PRIVATE']
    hbase> set_auths '@group1', ['SECRET','PRIVATE']

Command: set_visibility
Set the visibility expression on one or more existing cells.

Pass table name, visibility expression, and a dictionary containing
scanner specifications.  Scanner specifications may include one or more
of: TIMERANGE, FILTER, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP, or COLUMNS

If no columns are specified, all columns will be included.
To include all members of a column family, leave the qualifier empty as in
'col_family:'.

The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.

Examples:

    hbase> set_visibility 't1', 'A|B', {COLUMNS => ['c1', 'c2']}
    hbase> set_visibility 't1', '(A&B)|C', {COLUMNS => 'c1',
        TIMERANGE => [1303668804, 1303668904]}
    hbase> set_visibility 't1', 'A&B&C', {ROWPREFIXFILTER => 'row2',
        FILTER => "(QualifierFilter (>=, 'binary:xyz')) AND
        (TimestampsFilter ( 123, 456))"}

This command will only affect existing cells and is expected to be mainly
useful for feature testing and functional verification.

--------------------------------------------------------------------------------

NOTE: Above commands are only applicable if running with the VisibilityController coprocessor

HBase内置过滤器

3. KeyOnlyFilter:这个过滤器唯一的功能就是只返回每行的行键,值全部为空,这对于只关注于行键的应用场景来说非常合适,这样忽略掉其值就可以减少传递到客户端的数据量,能起到一定的优化作用: KeyOnlyFilter: 只要key,不要value

Filter kof = new KeyOnlyFilter(); // OK 返回所有的行,但值全是空

6. FirstKeyOnlyFilter:如果你只想返回的结果集中只包含第一列的数据,那么这个过滤器能够满足你的要求。它在找到每行的第一列之后会停止扫描,从而使扫描的性能也得到了一定的提升: FirstKeyOnlyFilter: 一个rowkey可以有多个version,同一个rowkey的同一个column也会有多个的值, 只拿出key中的第一个column的第一个version

Filter fkof = new FirstKeyOnlyFilter(); // OK 筛选出第一个每个第一个单元格

scan 'toplist_ware_ios_1009_201231',{COLUMNS=>'info',FILTER=>"(FirstKeyOnlyFilter())"}
scan 'test1', FILTER=>"FirstKeyOnlyFilter() AND ValueFilter(=,'binary:sku188') AND KeyOnlyFilter()"
scan 'test1',{FILTER=>"FirstKeyOnlyFilter() AND KeyOnlyFilter()"}
count 'table_name',{ INTERVAL =>1}
get "LOCATION","FOTON_LMGFE1G88D1022014_\x7F\xFF\xFE\xAD\x1B\xFBn\x0F","d:direction"
get "LOCATION","FOTON_LMGFE1G88D1022014_\x7F\xFF\xFE\xAD\x1B\xFBn\x0F","d:direction:toDouble"