hbase shell
-
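Basic concepts
HBase is a distributed, column-oriented open-source database modeled on Google's paper "Bigtable: A Distributed Storage System for Structured Data". It is an open-source implementation of Google Bigtable: it uses Hadoop HDFS as its file storage system, Hadoop MapReduce to process the large volumes of data held in HBase, and ZooKeeper as its coordination service.
HBase table structure
HBase stores data in tables. A table consists of rows and columns, and the columns are grouped into column families.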
RowKey | column-family1 | column-family2 | column-family3
       | column1, column2 | column1, column2, column3 | column1
key1 | t1:abc | t4:hi
     | t2:efg | t3:hello | t2:hello2
key2 | t2:world | t3:world2
key3 | t2:123 | t2:xxx
     | t1:456 | t1:zzz
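As the table above shows:
- key1, key2, and key3 are the unique row keys of three records.
- column-family1, column-family2, and column-family3 are three column families, each containing several columns.
- For example, column-family1 contains two columns named column1 and column2. t1:abc and t2:efg belong to the single cell identified by row key1 and column-family1:column1. That cell holds two values, abc and efg, with different timestamps t1 and t2; HBase returns the value with the latest timestamp to the requester.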
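These terms are defined as follows:
Row Key
As in other NoSQL databases, the row key is the primary key used to retrieve records. There are only three ways to access rows in an HBase table:
- by a single row key
- by a range of row keys
- by a full table scan
A row key can be any string (the maximum length is 64 KB; in practice it is usually 10 to 100 bytes). Internally, HBase stores the row key as a byte array.
Data is stored sorted by row key in lexicographic (byte) order. When designing keys, take full advantage of this sorted storage and place rows that are often read together next to each other (locality).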
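Because rows are kept sorted by key, a range read only needs to locate the first matching key and walk forward. The following is a minimal Python sketch of that idea over an in-memory sorted list; the `range_scan` helper and the sample keys are illustrative assumptions, not HBase code.

```python
import bisect

def range_scan(sorted_keys, start, stop):
    """Return keys k with start <= k < stop from a list already sorted in byte order."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

# Hypothetical row keys; a shared prefix keeps related rows adjacent after sorting.
keys = sorted([
    b"user#0003", b"user#0001", b"order#0002", b"user#0002", b"order#0001",
])

print(range_scan(keys, b"user#", b"user#\xff"))
```

Keys that share the prefix `user#` end up next to each other, which is the locality the key design is meant to exploit.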
Note:
Sorting integers lexicographically gives 1,10,100,11,12,13,14,15,16,17,18,19,2,20,21,…,9,91,92,93,94,95,96,97,98,99. To preserve the natural order of integers, row keys must be left-padded with zeros.
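The effect of zero padding can be checked with a short Python sketch. The `padded_key` helper and the fixed width of 3 are assumptions made for illustration; a real key design would pick a width that covers the largest expected value.

```python
def padded_key(n, width=3):
    """Left-pad a non-negative integer with zeros so string order matches numeric order."""
    return str(n).zfill(width)

numbers = [1, 2, 10, 11, 100, 9, 91]

plain = sorted(str(n) for n in numbers)          # lexicographic order of unpadded keys
padded = sorted(padded_key(n) for n in numbers)  # lexicographic order of padded keys

print(plain)
print(padded)
```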
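A single read or write of a row is atomic, no matter how many columns it touches. This design decision makes it easy to reason about a program's behavior when the same row is updated concurrently.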
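Column Family
Every column in an HBase table belongs to a column family. Column families are part of the table schema (columns are not) and must be defined before the table is used. Column names are prefixed with their column family; for example, courses:history and courses:math both belong to the courses family.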
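The family:qualifier naming convention can be illustrated with a small Python sketch. `split_column` and `group_by_family` are hypothetical helpers written for this note, not part of any HBase client.

```python
def split_column(name):
    """Split 'family:qualifier' into its parts; a missing qualifier becomes ''."""
    family, _, qualifier = name.partition(":")
    return family, qualifier

def group_by_family(columns):
    """Group full column names by their column family prefix."""
    groups = {}
    for name in columns:
        family, qualifier = split_column(name)
        groups.setdefault(family, []).append(qualifier)
    return groups

print(group_by_family(["courses:history", "courses:math", "grade:"]))
```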
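Access control and disk and memory usage accounting are all done at the column family level. In practice, per-family permissions help manage different kinds of applications: some applications may add new base data, some may read base data and create derived column families, and some may only browse data (and, for privacy reasons, perhaps not all of it).
Cell
A cell is the storage unit identified by a row and a column, uniquely determined by {row key, column (= <family> + <label>), version}. Cell data is untyped and stored as raw bytes.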
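Timestamp
Each cell holds multiple versions of the same data, indexed by timestamp. A timestamp is a 64-bit integer. It can be assigned by HBase automatically when data is written, in which case it is the current system time in milliseconds, or it can be set explicitly by the client. An application that needs to avoid version conflicts must generate unique timestamps itself. Within a cell, versions are sorted by timestamp in descending order, so the newest data comes first.
To limit the management burden (storage and indexing) of keeping too many versions, HBase offers two ways to reclaim old versions: keep the last n versions of the data, or keep the versions from a recent period (for example, the last seven days). Both can be configured per column family.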
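The two retention policies can be sketched in Python by modelling a cell as a mapping from timestamp to value. This is only an illustration under that assumption; `latest` and `prune` are hypothetical names and do not reflect how HBase stores or compacts data internally.

```python
def latest(versions):
    """versions maps timestamp -> value; return the value with the newest timestamp."""
    return versions[max(versions)]

def prune(versions, max_versions=None, ttl=None, now=0):
    """Keep only the newest max_versions entries and/or entries younger than ttl."""
    kept = sorted(versions.items(), reverse=True)  # newest first
    if ttl is not None:
        kept = [(ts, value) for ts, value in kept if now - ts < ttl]
    if max_versions is not None:
        kept = kept[:max_versions]
    return dict(kept)

# A cell with three versions; timestamps are small integers for readability.
cell = {1: "abc", 2: "efg", 5: "xyz"}

print(latest(cell))
print(prune(cell, max_versions=2))
print(prune(cell, ttl=4, now=6))
```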
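Common commands
HBase provides a shell for interactive use. Running help 'get' shows the help for a single command (here, get).
Commonly used shell commands include create, describe, disable, drop, list, scan, put, get, delete, deleteall, count, and status; help shows detailed usage for each.
A shell command generally takes the form of the operation keyword followed by the table name, row key, and column name, in that order; any further options go in curly braces.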
Table management
1) List tables
hbase(main)> list
2) Create a table
# Syntax: create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}
# Example: create table t1 with two column families, f1 and f2, each keeping 2 versions
hbase(main)> create 't1',{NAME => 'f1', VERSIONS => 2},{NAME => 'f2', VERSIONS => 2}
3) Drop a table
Two steps: disable first, then drop
# Example: drop table t1
hbase(main)> disable 't1'
hbase(main)> drop 't1'
4) Show the table structure
# Syntax: describe <table>
# Example: show the structure of table t1
hbase(main)> describe 't1'
5) Alter the table structure
Disable the table before altering it (required when hbase.online.schema.update.enable is false; see help 'alter')
# Syntax: alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
# Example: set the TTL of the column families of table test1 to 180 days
hbase(main)> disable 'test1'
hbase(main)> alter 'test1',{NAME=>'body',TTL=>'15552000'},{NAME=>'meta', TTL=>'15552000'}
hbase(main)> enable 'test1'
Permission management
1) Grant permissions
# Syntax: grant <user> <permissions> <table> <column family> <column qualifier>, with the arguments separated by commas
# Permissions are expressed with five letters: "RWXCA"
# READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
# Example: grant user 'test' read and write permission on table t1
hbase(main)> grant 'test','RW','t1'
2) View permissions
# Syntax: user_permission <table>
# Example: list the permissions on table t1
hbase(main)> user_permission 't1'
3) Revoke permissions
# Similar to grant. Syntax: revoke <user> <table> <column family> <column qualifier>
# Example: revoke the permissions of user test on table t1
hbase(main)> revoke 'test','t1'
Creating, reading, updating, and deleting table data
1) Add data
# Syntax: put <table>,<rowkey>,<family:column>,<value>,<timestamp>
# Example: add a record to table t1 with rowkey rowkey001, family name f1, column name col1, value value01, and the system default timestamp
hbase(main)> put 't1','rowkey001','f1:col1','value01'
put has only this one form of usage.
2) Query data
a) Query a row
# Syntax: get <table>,<rowkey>,[<family:column>,....]
# Example: query the value of col1 under f1 in row rowkey001 of table t1
hbase(main)> get 't1','rowkey001', 'f1:col1'
# Or:
hbase(main)> get 't1','rowkey001', {COLUMN=>'f1:col1'}
# Query all column values in row rowkey001 of table t1
hbase(main)> get 't1','rowkey001'
b) Scan a table
# Syntax: scan <table>, {COLUMNS => [ <family:column>,.... ], LIMIT => num}
# Advanced options such as STARTROW, TIMERANGE, and FILTER can also be added
# Example: scan the first 5 rows of table t1
hbase(main)> scan 't1',{LIMIT=>5}
c) Count the rows in a table
# Syntax: count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}
# INTERVAL sets how many rows pass between progress lines (each showing the current rowkey), default 1000; CACHE is the scanner cache size per fetch, default 10, and raising it can speed up the count
# Example: count the rows in table t1, reporting every 100 rows with a cache of 500
hbase(main)> count 't1', {INTERVAL => 100, CACHE => 500}
3) Delete data
a) Delete a column value in a row
# Syntax: delete <table>, <rowkey>, <family:column> , <timestamp>; the column name is required
# Example: delete the data of f1:col1 in row rowkey001 of table t1
hbase(main)> delete 't1','rowkey001','f1:col1'
Note: this deletes all versions of column f1:col1 in that row
b) Delete a row
# Syntax: deleteall <table>, <rowkey>, <family:column> , <timestamp>; the column name may be omitted to delete the whole row
# Example: delete the data of row rowkey001 in table t1
hbase(main)> deleteall 't1','rowkey001'
c) Delete all data in a table
# Syntax: truncate <table>
# Internally this runs: disable table -> drop table -> create table
# Example: delete all data in table t1
hbase(main)> truncate 't1'
* create/put
The following student score table, an example taken from the web, demonstrates HBase usage.
| name   | grade | course | >   |
|--------|-------|--------|-----|
| \/     | \/    | math   | art |
| Tom    | 5     | 97     | 87  |
| Jerry  | 4     | 89     | 80  |
Here grade is, from the table's point of view, a single column, while course is a column family made up of two columns, math and art. More columns, such as computer or physics, can be added to the course family as needed. Note that a column under a column family may also have no name (an empty qualifier).
{{{class="brush:bash"
# Create table test_scores with two column families, grade and course
create 'test_scores','grade', 'course'
list
desc 'test_scores'
scan 'test_scores'
ROW COLUMN+CELL
 Jim column=course:, timestamp=1464446407298, value=80
 Jim column=grade:, timestamp=1464446406438, value=4
 Leo column=course:math, timestamp=1464450250315, value=99
 Leo.Peng column=course:math, timestamp=1464451773801, value=70
 Tom column=course:art, timestamp=1464446361576, value=87
 Tom column=course:math, timestamp=1464446360394, value=97
 Tom column=grade:, timestamp=1464446360354, value=5
4 row(s) in 0.0090 seconds
scan 'test_scores',{FILTER=>"SingleColumnValueFilter('course','art',<,'binary:0')"}
ROW COLUMN+CELL
 Jim column=course:, timestamp=1464446407298, value=80
 Jim column=grade:, timestamp=1464446406438, value=4
 Leo column=course:math, timestamp=1464450250315, value=99
 Leo.Peng column=course:math, timestamp=1464451773801, value=70
3 row(s) in 0.0080 seconds
# Add data
# put 't1', 'r1', 'c1', 'value', ts1
# t1 is the table name, r1 the row key, c1 the column name, and value the cell value. ts1 is the timestamp and is usually omitted.
put 'test_scores','Tom','grade:','5'
put 'test_scores','Tom','course:math','97'
put 'test_scores','Tom','course:art','87'
put 'test_scores','Jim','grade','4'
put 'test_scores','Jim','course:','89'
put 'test_scores','Jim','course:','80'
}}}
- get
get 't1', 'r1'
get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
get 't1', 'r1', {COLUMN => 'c1'}
get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
get 't1', 'r1', 'c1'
get 't1', 'r1', 'c1', 'c2'
get 't1', 'r1', ['c1', 'c2']
- scan
Scans a table.
Optional modifiers can be given: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS. With no modifiers, as in the scan 'test_scores' example above, every row is displayed.
scan '.META.'
scan '.META.', {COLUMNS => 'info:regioninfo'}
scan 't1', {COLUMNS => ['c1','c2'], LIMIT => 10, STARTROW =>'xyz'}
scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804,1303668904]}
scan 't1', {FILTER => "(PrefixFilter('row2') AND (QualifierFilter(>=,'binary:xyz'))) AND (TimestampsFilter(123,456))"}
scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
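How STARTROW, STOPROW, and LIMIT narrow a scan can be sketched in Python over an in-memory dictionary. This sketch assumes the usual semantics of an inclusive start row and an exclusive stop row; the `scan` function and the sample rows are hypothetical and only imitate the shell options by name.

```python
def scan(rows, startrow=None, stoprow=None, limit=None):
    """Walk rows in key order, honouring an inclusive start, exclusive stop, and row limit."""
    out = []
    for key in sorted(rows):
        if startrow is not None and key < startrow:
            continue
        if stoprow is not None and key >= stoprow:
            break
        if limit is not None and len(out) >= limit:
            break
        out.append((key, rows[key]))
    return out

# Hypothetical table contents: row key -> {column: value}
rows = {
    "abc": {"c1": "1"},
    "xyz": {"c1": "2"},
    "xyz1": {"c1": "3"},
    "zzz": {"c1": "4"},
}

print(scan(rows, startrow="xyz", limit=2))
print(scan(rows, startrow="xyz", stoprow="zzz"))
```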
- Converting a text string to bytes for a query
scan 'VEH_REPAIR_HISTORY_ONE', {FILTER => org.apache.hadoop.hbase.filter.PrefixFilter.new(org.apache.hadoop.hbase.util.Bytes.toBytes('LFV2A1150B3666808'))}
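What the command above does can be imitated in Python: the text is turned into bytes, then row keys are matched by byte prefix. The sketch assumes UTF-8 encoding for the string-to-bytes step, and the row keys other than the prefix itself are invented for illustration.

```python
def to_bytes(text):
    """Encode text as UTF-8 bytes (assumed equivalent of converting a String to bytes)."""
    return text.encode("utf-8")

def prefix_filter(row_keys, prefix):
    """Keep only the row keys that start with the given byte prefix."""
    return [key for key in row_keys if key.startswith(prefix)]

# Hypothetical row keys; only the prefix value comes from the example above.
row_keys = [
    b"LFV2A1150B3666808_001",
    b"LFV2A1150B3666808_002",
    b"LFV3B2250C4777909_001",
]

prefix = to_bytes("LFV2A1150B3666808")
matches = prefix_filter(row_keys, prefix)
print(matches)
```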
help
#HBase Shell, version 1.1.2, rcc2b70cf03e3378800661ec5cab11eb43fafe0fc, Wed Aug 26 20:11:27 PDT 2015
#Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
#Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:
  Group name: general
  Commands: status, table_help, version, whoami

  Group name: ddl
  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters

  Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

  Group name: dml
  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

  Group name: tools
  Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump

  Group name: replication
  Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot

  Group name: configuration
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: list_quotas, set_quota

  Group name: security
  Commands: grant, revoke, user_permission

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces. Key/values are delimited by the '=>' character combination. Usually keys are predefined constants such as NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type 'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"
  hbase> get 't1', "key\003\023\011"
  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
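The escaped binary keys in the help text above can be explored in Python, whose bytes literals accept the same `\xNN` and octal escapes. The `to_string_binary` function is a rough imitation, written for this note, of how the shell prints non-printable bytes; the exact set of characters the real formatter leaves unescaped is an assumption here.

```python
def to_string_binary(data):
    """Render bytes with printable ASCII kept as-is and everything else as \\xNN."""
    out = []
    for b in data:
        if 32 <= b <= 126 and b != ord("\\"):
            out.append(chr(b))
        else:
            out.append("\\x%02X" % b)
    return "".join(out)

hex_key = b"key\x03\x3f\xcd"    # hexadecimal escapes
octal_key = b"key\003\023\011"  # octal escapes: bytes 0x03, 0x13, 0x09

print(list(hex_key))
print(to_string_binary(hex_key))
print(to_string_binary(octal_key))
```

Note that `\x3f` is the printable character `?`, so it shows up unescaped in the rendered form.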
namespace
Command: alter_namespace
Alter namespace properties.
To add/modify a property:
  hbase> alter_namespace 'ns1', {METHOD => 'set', 'PROERTY_NAME' => 'PROPERTY_VALUE'}
To delete a property:
  hbase> alter_namespace 'ns1', {METHOD => 'unset', NAME=>'PROERTY_NAME'}

Command: create_namespace
Create namespace; pass namespace name, and optionally a dictionary of namespace configuration.
Examples:
  hbase> create_namespace 'ns1'
  hbase> create_namespace 'ns1', {'PROERTY_NAME'=>'PROPERTY_VALUE'}

Command: describe_namespace
Describe the named namespace. For example:
  hbase> describe_namespace 'ns1'

Command: drop_namespace
Drop the named namespace. The namespace must be empty.

Command: list_namespace
List all namespaces in hbase. Optional regular expression parameter could be used to filter the output. Examples:
  hbase> list_namespace
  hbase> list_namespace 'abc.*'

Command: list_namespace_tables
List all tables that are members of the namespace.
Examples:
  hbase> list_namespace_tables 'ns1'
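Table names in the shell may carry a namespace prefix such as 'ns1:t1', while a bare name like 't1' lives in the default namespace. A small Python sketch of that naming rule follows; `split_table_name` is a hypothetical helper, not an HBase API.

```python
def split_table_name(name, default_namespace="default"):
    """Split 'namespace:qualifier' into its parts; bare names use the default namespace."""
    namespace, sep, qualifier = name.partition(":")
    if not sep:
        return default_namespace, name
    return namespace, qualifier

def tables_in_namespace(table_names, namespace):
    """Return the table qualifiers that belong to the given namespace."""
    return [q for ns, q in map(split_table_name, table_names) if ns == namespace]

print(split_table_name("ns1:t1"))
print(split_table_name("t1"))
print(tables_in_namespace(["ns1:t1", "t1", "ns1:t2"], "ns1"))
```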
ddl
hbase(main):025:0> help 'ddl'
Command: alter
Alter a table. If the "hbase.online.schema.update.enable" property is set to false, then the table must be disabled (see help 'disable'). If the "hbase.online.schema.update.enable" property is set to true, tables can be altered without disabling them first. Altering enabled tables has caused problems in the past, so use caution and test it before using in production.

You can use the alter command to add, modify or delete column families or change table configuration options. Column families work in a similar way as the 'create' command. The column family specification can either be a name string, or a dictionary with the NAME attribute. Dictionaries are described in the output of the 'help' command, with no arguments.

For example, to change or add the 'f1' column family in table 't1' from current value to keep a maximum of 5 cell VERSIONS, do:
  hbase> alter 't1', NAME => 'f1', VERSIONS => 5
You can operate on several column families:
  hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}
To delete the 'f1' column family in table 'ns1:t1', use one of:
  hbase> alter 'ns1:t1', NAME => 'f1', METHOD => 'delete'
  hbase> alter 'ns1:t1', 'delete' => 'f1'
You can also change table-scope attributes like MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DURABILITY, etc. These can be put at the end; for example, to change the max size of a region to 128MB, do:
  hbase> alter 't1', MAX_FILESIZE => '134217728'
You can add a table coprocessor by setting a table coprocessor attribute:
  hbase> alter 't1', 'coprocessor'=>'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'
Since you can have multiple coprocessors configured for a table, a sequence number will be automatically appended to the attribute name to uniquely identify it.
The coprocessor attribute must match the pattern below in order for the framework to understand how to load the coprocessor classes:
  [coprocessor jar file location] | class name | [priority] | [arguments]
You can also set configuration settings specific to this table or column family:
  hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}
  hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
You can also remove a table-scope attribute:
  hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
  hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
You can also set REGION_REPLICATION:
  hbase> alter 't1', {REGION_REPLICATION => 2}
There could be more than one alteration in one command:
  hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 }, { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' }, OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' }

Command: alter_async
Alter column family schema, does not wait for all regions to receive the schema changes. Pass table name and a dictionary specifying new column family schema. Dictionaries are described on the main help command output. Dictionary must include name of column family to alter.
For example,
To change or add the 'f1' column family in table 't1' from defaults to instead keep a maximum of 5 cell VERSIONS, do:
  hbase> alter_async 't1', NAME => 'f1', VERSIONS => 5
To delete the 'f1' column family in table 'ns1:t1', do:
  hbase> alter_async 'ns1:t1', NAME => 'f1', METHOD => 'delete'
or a shorter version:
  hbase> alter_async 'ns1:t1', 'delete' => 'f1'
You can also change table-scope attributes like MAX_FILESIZE MEMSTORE_FLUSHSIZE, READONLY, and DEFERRED_LOG_FLUSH.
For example, to change the max size of a family to 128MB, do:
  hbase> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '134217728'
There could be more than one alteration in one command:
  hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
To check if all the regions have been updated, use alter_status <table_name>

Command: alter_status
Get the status of the alter command. Indicates the number of regions of the table that have received the updated schema
Pass table name.
  hbase> alter_status 't1'
  hbase> alter_status 'ns1:t1'

Command: create
Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute.
Examples:
Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
Table configuration options can be put at the end.
Examples:
  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
You can also keep around a reference to the created table:
  hbase> t1 = create 't1', 'f1'
Which gives you a reference to the table named 't1', on which you can then call methods.

Command: describe
Describe the named table. For example:
  hbase> describe 't1'
  hbase> describe 'ns1:t1'
Alternatively, you can use the abbreviated 'desc' for the same thing.
  hbase> desc 't1'
  hbase> desc 'ns1:t1'

Command: disable
Start disable of named table:
  hbase> disable 't1'
  hbase> disable 'ns1:t1'

Command: disable_all
Disable all of tables matching the given regex:
  hbase> disable_all 't.*'
  hbase> disable_all 'ns:t.*'
  hbase> disable_all 'ns:.*'

Command: drop
Drop the named table. Table must first be disabled:
  hbase> drop 't1'
  hbase> drop 'ns1:t1'

Command: drop_all
Drop all of the tables matching the given regex:
  hbase> drop_all 't.*'
  hbase> drop_all 'ns:t.*'
  hbase> drop_all 'ns:.*'

Command: enable
Start enable of named table:
  hbase> enable 't1'
  hbase> enable 'ns1:t1'

Command: enable_all
Enable all of the tables matching the given regex:
  hbase> enable_all 't.*'
  hbase> enable_all 'ns:t.*'
  hbase> enable_all 'ns:.*'

Command: exists
Does the named table exist?
  hbase> exists 't1'
  hbase> exists 'ns1:t1'

Command: get_table
Get the given table name and return it as an actual object to be manipulated by the user.
See table.help for more information on how to use the table.
Eg.
  hbase> t1 = get_table 't1'
  hbase> t1 = get_table 'ns1:t1'
returns the table named 't1' as a table object. You can then do
  hbase> t1.help
which will then print the help for that table.

Command: is_disabled
Is named table disabled? For example:
  hbase> is_disabled 't1'
  hbase> is_disabled 'ns1:t1'

Command: is_enabled
Is named table enabled? For example:
  hbase> is_enabled 't1'
  hbase> is_enabled 'ns1:t1'

Command: list
List all tables in hbase. Optional regular expression parameter could be used to filter the output. Examples:
  hbase> list
  hbase> list 'abc.*'
  hbase> list 'ns:abc.*'
  hbase> list 'ns:.*'

Command: show_filters
Show all the filters in hbase. Example:
  hbase> show_filters
  ColumnPrefixFilter
  TimestampsFilter
  PageFilter
  .....
  KeyOnlyFilter
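The numeric literals used for TTL and MAX_FILESIZE in the examples are easier to read once the units are spelled out. The short Python check below assumes TTL values are given in seconds and file sizes in bytes, which matches the "180 days" and "128MB" figures quoted alongside them; the helper names are made up for this note.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def days_to_ttl(days):
    """Convert a retention period in days to a TTL value in seconds."""
    return days * SECONDS_PER_DAY

def megabytes_to_bytes(mb):
    """Convert a size in MB (1 MB = 1024 * 1024 bytes) to bytes."""
    return mb * 1024 * 1024

print(days_to_ttl(180))         # TTL used in the alter 'test1' example
print(days_to_ttl(30))          # TTL used in the create 't1' example
print(megabytes_to_bytes(128))  # MAX_FILESIZE used in the alter examples
```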
dml
hbase(main):037:0* help 'dml'
Command: append
Appends a cell 'value' at specified table/row/column coordinates.
  hbase> append 't1', 'r1', 'c1', 'value', ATTRIBUTES=>{'mykey'=>'myvalue'}
  hbase> append 't1', 'r1', 'c1', 'value', {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.append 'r1', 'c1', 'value', ATTRIBUTES=>{'mykey'=>'myvalue'}
  hbase> t.append 'r1', 'c1', 'value', {VISIBILITY=>'PRIVATE|SECRET'}

Command: count
Count the number of rows in a table. Return value is the number of rows. This operation may take a LONG time (Run '$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount' to run a counting mapreduce job). Current count is shown every 1000 rows by default. Count interval may be optionally specified. Scan caching is enabled on count scans by default. Default cache size is 10 rows. If your rows are small in size, you may want to increase this parameter. Examples:
  hbase> count 'ns1:t1'
  hbase> count 't1'
  hbase> count 't1', INTERVAL => 100000
  hbase> count 't1', CACHE => 1000
  hbase> count 't1', INTERVAL => 10, CACHE => 1000
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding commands would be:
  hbase> t.count
  hbase> t.count INTERVAL => 100000
  hbase> t.count CACHE => 1000
  hbase> t.count INTERVAL => 10, CACHE => 1000

Command: delete
Put a delete cell value at specified table/row/column and optionally timestamp coordinates. Deletes must match the deleted cell's coordinates exactly. When scanning, a delete cell suppresses older versions. To delete a cell from 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
  hbase> delete 'ns1:t1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same command can also be run on a table reference.
Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.delete 'r1', 'c1', ts1
  hbase> t.delete 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

Command: deleteall
Delete all cells in a given row; pass a table name, row, and optionally a column and timestamp. Examples:
  hbase> deleteall 'ns1:t1', 'r1'
  hbase> deleteall 't1', 'r1'
  hbase> deleteall 't1', 'r1', 'c1'
  hbase> deleteall 't1', 'r1', 'c1', ts1
  hbase> deleteall 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.deleteall 'r1'
  hbase> t.deleteall 'r1', 'c1'
  hbase> t.deleteall 'r1', 'c1', ts1
  hbase> t.deleteall 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

Command: get
Get row or cell contents; pass table name, row, and optionally a dictionary of column(s), timestamp, timerange and versions. Examples:
  hbase> get 'ns1:t1', 'r1'
  hbase> get 't1', 'r1'
  hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> get 't1', 'r1', {COLUMN => 'c1'}
  hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> get 't1', 'r1', 'c1'
  hbase> get 't1', 'r1', 'c1', 'c2'
  hbase> get 't1', 'r1', ['c1', 'c2']
  hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
  hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the get specification.
The FORMATTER can be stipulated:
 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  hbase> get 't1', 'r1', {COLUMN => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot specify a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via get_table or create_table). Suppose you had a reference t to table 't1', the corresponding commands would be:
  hbase> t.get 'r1'
  hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> t.get 'r1', {COLUMN => 'c1'}
  hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> t.get 'r1', 'c1'
  hbase> t.get 'r1', 'c1', 'c2'
  hbase> t.get 'r1', ['c1', 'c2']
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

Command: get_counter
Return a counter cell value at specified table/row/column coordinates. A counter cell should be managed with atomic increment functions on HBase and the data should be binary encoded (as long value). Example:
  hbase> get_counter 'ns1:t1', 'r1', 'c1'
  hbase> get_counter 't1', 'r1', 'c1'
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.get_counter 'r1', 'c1'

Command: get_splits
Get the splits of the named table:
  hbase> get_splits 't1'
  hbase> get_splits 'ns1:t1'
The same commands also can be run on a table reference.
Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.get_splits

Command: incr
Increments a cell 'value' at specified table/row/column coordinates. To increment a cell value in table 'ns1:t1' or 't1' at row 'r1' under column 'c1' by 1 (can be omitted) or 10 do:
  hbase> incr 'ns1:t1', 'r1', 'c1'
  hbase> incr 't1', 'r1', 'c1'
  hbase> incr 't1', 'r1', 'c1', 1
  hbase> incr 't1', 'r1', 'c1', 10
  hbase> incr 't1', 'r1', 'c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> incr 't1', 'r1', 'c1', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> incr 't1', 'r1', 'c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.incr 'r1', 'c1'
  hbase> t.incr 'r1', 'c1', 1
  hbase> t.incr 'r1', 'c1', 10, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> t.incr 'r1', 'c1', 10, {VISIBILITY=>'PRIVATE|SECRET'}

Command: put
Put a cell 'value' at specified table/row/column and optionally timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
  hbase> put 'ns1:t1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value', ts1
  hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
  hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

Command: scan
Scan a table; pass table name and optionally a dictionary of scanner specifications.
Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP, MAXLENGTH or COLUMNS, CACHE or RAW, VERSIONS
If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty as in 'col_family:'.
The filter can be specified in two ways:
 1. Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
 2. Using the entire package name of the filter.
Some examples:
  hbase> scan 'hbase:meta'
  hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
  hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
  hbase> scan 't1', {REVERSED => true}
  hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => " (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
  hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
For setting the Operation Attributes
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which switches block caching for the scanner on (true) or off (false). By default it is enabled. Examples:
  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Also for experts, there is an advanced option -- RAW -- which instructs the scanner to return all cells (including delete markers and uncollected deleted cells). This option cannot be combined with requesting specific COLUMNS. Disabled by default.
Example:
  hbase> scan 't1', {RAW => true, VERSIONS => 10}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the scan specification. The FORMATTER can be stipulated:
 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot specify a FORMATTER for all columns of a column family.
Scan can also be used directly from a table, by first getting a reference to a table, like such:
  hbase> t = get_table 't'
  hbase> t.scan
Note in the above situation, you can still provide all the filtering, columns, options, etc as described above.

Command: truncate
Disables, drops and recreates the specified table.

Command: truncate_preserve
Disables, drops and recreates the specified table while still maintaing the previous region boundaries.
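The help for get_counter says counter cells are binary encoded as a long value. A Python sketch of that encoding follows, assuming the usual 8-byte big-endian signed layout; the `incr` function here only imitates the arithmetic of the shell command on a single in-memory value and is not atomic or connected to HBase.

```python
import struct

def encode_counter(value):
    """Pack an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)

def decode_counter(raw):
    """Unpack an 8-byte big-endian signed long back to an integer."""
    return struct.unpack(">q", raw)[0]

def incr(raw, amount=1):
    """Return the encoded counter after adding amount; empty bytes count as 0."""
    current = decode_counter(raw) if raw else 0
    return encode_counter(current + amount)

cell = b""               # no counter written yet
cell = incr(cell)        # like: incr 't1', 'r1', 'c1'
cell = incr(cell, 10)    # like: incr 't1', 'r1', 'c1', 10
print(cell, decode_counter(cell))
```

This also suggests why a counter cell looks like unreadable bytes when fetched with a plain get, and why a formatter or get_counter is needed to see the number.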
tools
Command: assign
Assign a region. Use with caution. If region already assigned, this command will do a force reassign. For experts only.
Examples:
  hbase> assign 'REGIONNAME'
  hbase> assign 'ENCODED_REGIONNAME'

Command: balance_switch
Enable/Disable balancer. Returns previous balancer state.
Examples:
  hbase> balance_switch true
  hbase> balance_switch false

Command: balancer
Trigger the cluster balancer. Returns true if balancer ran and was able to tell the region servers to unassign all the regions to balance (the re-assignment itself is async). Otherwise false (Will not run if regions in transition).

Command: balancer_enabled
Query the balancer's state.
Examples:
  hbase> balancer_enabled

Command: catalogjanitor_enabled
Query for the CatalogJanitor state (enabled/disabled?)
Examples:
  hbase> catalogjanitor_enabled

Command: catalogjanitor_run
Catalog janitor command to run the (garbage collection) scan from command line.
  hbase> catalogjanitor_run

Command: catalogjanitor_switch
Enable/Disable CatalogJanitor. Returns previous CatalogJanitor state.
Examples:
  hbase> catalogjanitor_switch true
  hbase> catalogjanitor_switch false

Command: close_region
Close a single region. Ask the master to close a region out on the cluster or if 'SERVER_NAME' is supplied, ask the designated hosting regionserver to close the region directly. Closing a region, the master expects 'REGIONNAME' to be a fully qualified region name. When asking the hosting regionserver to directly close a region, you pass the regions' encoded name only. A region name looks like this:
  TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.
or
  Namespace:TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.
The trailing period is part of the regionserver name. A region's encoded name is the hash at the end of a region name; e.g. 527db22f95c8a9e0116f0cc13c680396 (without the period). A 'SERVER_NAME' is its host, port plus startcode.
For example: host187.example.com,60020,1289493121758 (find servername in master ui or when you do detailed status in shell). This command will end up running close on the region hosting regionserver. The close is done without the master's involvement (It will not know of the close). Once closed, region will stay closed. Use assign to reopen/reassign. Use unassign or move to assign the region elsewhere on cluster. Use with caution. For experts only. Examples: hbase> close_region 'REGIONNAME' hbase> close_region 'REGIONNAME', 'SERVER_NAME' hbase> close_region 'ENCODED_REGIONNAME' hbase> close_region 'ENCODED_REGIONNAME', 'SERVER_NAME' Command: compact Compact all regions in passed table or pass a region row to compact an individual region. You can also compact a single column family within a region. Examples: Compact all regions in a table: hbase> compact 'ns1:t1' hbase> compact 't1' Compact an entire region: hbase> compact 'r1' Compact only a column family within a region: hbase> compact 'r1', 'c1' Compact a column family within a table: hbase> compact 't1', 'c1' Command: compact_rs Compact all regions on passed regionserver. Examples: Compact all regions on a regionserver: hbase> compact_rs 'host187.example.com,60020' or hbase> compact_rs 'host187.example.com,60020,1289493121758' Major compact all regions on a regionserver: hbase> compact_rs 'host187.example.com,60020,1289493121758', true Command: flush Flush all regions in passed table or pass a region row to flush an individual region. For example: hbase> flush 'TABLENAME' hbase> flush 'REGIONNAME' hbase> flush 'ENCODED_REGIONNAME' Command: major_compact Run major compaction on passed table or pass a region row to major compact an individual region. To compact a single column family within a region specify the region name followed by the column family name. 
Examples: Compact all regions in a table: hbase> major_compact 't1' hbase> major_compact 'ns1:t1' Compact an entire region: hbase> major_compact 'r1' Compact a single column family within a region: hbase> major_compact 'r1', 'c1' Compact a single column family within a table: hbase> major_compact 't1', 'c1' Command: merge_region Merge two regions. Passing 'true' as the optional third parameter will force a merge ('force' merges regardless else merge will fail unless passed adjacent regions. 'force' is for expert use only). NOTE: You must pass the encoded region name, not the full region name so this command is a little different from other region operations. The encoded region name is the hash suffix on region names: e.g. if the region name were TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396 Examples: hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME' hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true Command: move Move a region. Optionally specify target regionserver else we choose one at random. NOTE: You pass the encoded region name, not the region name so this command is a little different to the others. The encoded region name is the hash suffix on region names: e.g. if the region name were TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396 A server name is its host, port plus startcode. For example: host187.example.com,60020,1289493121758 Examples: hbase> move 'ENCODED_REGIONNAME' hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME' Command: split Split entire table or pass a region to split individual region. With the second parameter, you can specify an explicit split key for the region. 
Examples: split 'tableName' split 'namespace:tableName' split 'regionName' # format: 'tableName,startKey,id' split 'tableName', 'splitKey' split 'regionName', 'splitKey' Command: trace Start or Stop tracing using HTrace. Always returns true if tracing is running, otherwise false. If the first argument is 'start', new span is started. If the first argument is 'stop', current running span is stopped. ('stop' returns false on success.) If the first argument is 'status', just returns if or not tracing is running. On 'start'-ing, you can optionally pass the name of span as the second argument. The default name of span is 'HBaseShell'. Repeating 'start' does not start nested span. Examples: hbase> trace 'start' hbase> trace 'status' hbase> trace 'stop' hbase> trace 'start', 'MySpanName' hbase> trace 'stop' Command: unassign Unassign a region. Unassign will close region in current location and then reopen it again. Pass 'true' to force the unassignment ('force' will clear all in-memory state in master before the reassign. If results in double assignment use hbck -fix to resolve. To be used by experts). Use with caution. For expert use only. Examples: hbase> unassign 'REGIONNAME' hbase> unassign 'REGIONNAME', true hbase> unassign 'ENCODED_REGIONNAME' hbase> unassign 'ENCODED_REGIONNAME', true Command: wal_roll Roll the log writer. That is, start writing log messages to a new file. The name of the regionserver should be given as the parameter. A 'server_name' is the host, port plus startcode of a regionserver. For example: host187.example.com,60020,1289493121758 (find servername in master ui or when you do detailed status in shell) Command: zk_dump Dump status of HBase cluster as seen by ZooKeeper. -------------------------------------------------------------------------------- WARNING: Above commands are for 'experts'-only as misuse can damage an install
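Several of the commands above (close_region, merge_region, move) expect the encoded region name or a server name rather than the full region name. As a rough illustration of how those strings break down, here is a minimal Python sketch. `parse_region_name` and `parse_server_name` are hypothetical helpers written for this note, not HBase APIs, and the sketch assumes the table name itself contains no comma.

```python
from typing import NamedTuple


class RegionName(NamedTuple):
    table: str         # may include a namespace prefix, e.g. "Namespace:TestTable"
    start_key: str
    region_id: str     # the numeric id between the last comma and the hash
    encoded_name: str  # the hash suffix, without the trailing period


class ServerName(NamedTuple):
    host: str
    port: int
    startcode: int


def parse_region_name(full_name: str) -> RegionName:
    """Split 'table,startKey,id.hash.' into its parts (hypothetical helper)."""
    if not full_name.endswith("."):
        raise ValueError("a full region name ends with a period")
    # Drop the trailing period, then cut the hash off from the right.
    body, encoded = full_name[:-1].rsplit(".", 1)
    # The table name is everything before the first comma; the region id is
    # everything after the last one. The start key may itself contain commas.
    table, rest = body.split(",", 1)
    start_key, region_id = rest.rsplit(",", 1)
    return RegionName(table, start_key, region_id, encoded)


def parse_server_name(name: str) -> ServerName:
    """Split 'host,port,startcode' into its parts (hypothetical helper)."""
    host, port, startcode = name.split(",")
    return ServerName(host, int(port), int(startcode))


region = parse_region_name(
    "TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396."
)
print(region.encoded_name)
print(parse_server_name("host187.example.com,60020,1289493121758"))
```

The `encoded_name` field is the value that would be passed to move or merge_region; the full string is what close_region expects when going through the master.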
replication
Command: add_peer
A peer can either be another HBase cluster or a custom replication endpoint. In either case an id must be specified to identify the peer.
For an HBase cluster peer, a cluster key must be provided and is composed like this:
  hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
This gives a full path for HBase to connect to another HBase cluster. An optional parameter for table column families identifies which column families will be replicated to the peer cluster.
Examples:
  hbase> add_peer '1', "server1.cie.com:2181:/hbase"
  hbase> add_peer '2', "zk1,zk2,zk3:2182:/hbase-prod"
  hbase> add_peer '3', "zk4,zk5,zk6:11000:/hbase-test", "table1; table2:cf1; table3:cf1,cf2"
  hbase> add_peer '4', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
  hbase> add_peer '5', CLUSTER_KEY => "server1.cie.com:2181:/hbase", TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }
For a custom replication endpoint, the ENDPOINT_CLASSNAME can be provided. Two optional arguments are DATA and CONFIG, which set the peer_data and the configuration, respectively, for the custom replication endpoint. Table column families are optional and can be specified with the key TABLE_CFS.
  hbase> add_peer '6', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint'
  hbase> add_peer '7', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 }
  hbase> add_peer '8', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', CONFIG => { "config1" => "value1", "config2" => "value2" }
  hbase> add_peer '9', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" }
  hbase> add_peer '10', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }
  hbase> add_peer '11', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" }, TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }
Note: Either CLUSTER_KEY or ENDPOINT_CLASSNAME must be specified, but not both.

Command: append_peer_tableCFs
Append a replicable table-cf config for the specified peer.
Examples:
  # append a table / table-cf to be replicable for a peer
  hbase> append_peer_tableCFs '2', "table4:cfA,cfB"

Command: disable_peer
Stops the replication stream to the specified cluster, but still keeps track of new edits to replicate.
Examples:
  hbase> disable_peer '1'

Command: disable_table_replication
Disable a table's replication switch.
Examples:
  hbase> disable_table_replication 'table_name'

Command: enable_peer
Restarts the replication to the specified peer cluster, continuing from where it was disabled.
Examples:
  hbase> enable_peer '1'

Command: enable_table_replication
Enable a table's replication switch.
Examples:
  hbase> enable_table_replication 'table_name'

Command: list_peers
List all replication peer clusters.
  hbase> list_peers

Command: list_replicated_tables
List all the tables and column families replicated from this cluster.
  hbase> list_replicated_tables
  hbase> list_replicated_tables 'abc.*'

Command: remove_peer
Stops the specified replication stream and deletes all the meta information kept about it.
Examples:
  hbase> remove_peer '1'

Command: remove_peer_tableCFs
Remove a table / table-cf from the table-cfs config for the specified peer.
Examples:
  # Remove a table / table-cf from the replicable table-cfs for a peer
  hbase> remove_peer_tableCFs '2', "table1"
  hbase> remove_peer_tableCFs '2', "table1:cf1"

Command: set_peer_tableCFs
Set the replicable table-cf config for the specified peer.
Examples:
  # set all tables to be replicable for a peer
  hbase> set_peer_tableCFs '1', ""
  hbase> set_peer_tableCFs '1'
  # set table / table-cf to be replicable for a peer, for a table without
  # an explicit column-family list, all replicable column-families (with
  # replication_scope == 1) will be replicated
  hbase> set_peer_tableCFs '2', "table1; table2:cf1,cf2; table3:cfA,cfB"

Command: show_peer_tableCFs
Show the replicable table-cf config for the specified peer.
  hbase> show_peer_tableCFs '2'

In order to use these tools, hbase.replication must be true.
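The cluster key and the table-cf string used by add_peer and set_peer_tableCFs are both small structured strings. The Python sketch below shows one way to read them, mainly to make the formats concrete. `parse_cluster_key` and `parse_table_cfs` are hypothetical names, not part of HBase, and the table-cf parser assumes table names without a namespace prefix (a `ns:table:cf` entry would be ambiguous here).

```python
from typing import Dict, List, Tuple


def parse_cluster_key(key: str) -> Tuple[List[str], int, str]:
    """Split 'quorum:clientPort:znodeParent' (hypothetical helper).

    The quorum part is a comma-separated list of ZooKeeper hosts.
    """
    quorum, port, znode_parent = key.split(":", 2)
    return quorum.split(","), int(port), znode_parent


def parse_table_cfs(spec: str) -> Dict[str, List[str]]:
    """Turn 'table1; table2:cf1; table3:cf1,cf2' into a dict (hypothetical helper).

    An empty list means no explicit column-family list was given for the table.
    """
    result: Dict[str, List[str]] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        table, _, cfs = entry.partition(":")
        result[table.strip()] = [cf.strip() for cf in cfs.split(",") if cf.strip()]
    return result


print(parse_cluster_key("zk4,zk5,zk6:11000:/hbase-test"))
print(parse_table_cfs("table1; table2:cf1; table3:cf1,cf2"))
```

The dictionary produced by `parse_table_cfs` has the same shape as the TABLE_CFS hash in the add_peer examples, which is why the string form and the hash form can describe the same replication scope.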
snapshots
Command: clone_snapshot
Create a new table by cloning the snapshot content. No data is copied, and writes to the newly created table do not affect the snapshot data.
Examples:
  hbase> clone_snapshot 'snapshotName', 'tableName'
  hbase> clone_snapshot 'snapshotName', 'namespace:tableName'

Command: delete_all_snapshot
Delete all of the snapshots matching the given regex.
Examples:
  hbase> delete_all_snapshot 's.*'

Command: delete_snapshot
Delete a specified snapshot.
Examples:
  hbase> delete_snapshot 'snapshotName'

Command: list_snapshots
List all snapshots taken (by printing the names and relative information). An optional regular expression parameter can be used to filter the output by snapshot name.
Examples:
  hbase> list_snapshots
  hbase> list_snapshots 'abc.*'

Command: restore_snapshot
Restore a specified snapshot. The restore will replace the content of the original table, bringing the content back to the snapshot state. The table must be disabled.
Examples:
  hbase> restore_snapshot 'snapshotName'

Command: snapshot
Take a snapshot of the specified table.
Examples:
  hbase> snapshot 'sourceTable', 'snapshotName'
  hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}
configuration
Command: update_all_config
Reload a subset of configuration on all servers in the cluster. See http://hbase.apache.org/book.html?dyn_config for more details. Here is how you would run the command in the hbase shell:
  hbase> update_all_config

Command: update_config
Reload a subset of configuration on server 'servername' where servername is host, port plus startcode. For example: host187.example.com,60020,1289493121758
See http://hbase.apache.org/book.html?dyn_config for more details. Here is how you would run the command in the hbase shell:
  hbase> update_config 'servername'
quotas
Command: list_quotas
List the quota settings added to the system. You can filter the result based on USER, TABLE, or NAMESPACE.
For example:
  hbase> list_quotas
  hbase> list_quotas USER => 'bob.*'
  hbase> list_quotas USER => 'bob.*', TABLE => 't1'
  hbase> list_quotas USER => 'bob.*', NAMESPACE => 'ns.*'
  hbase> list_quotas TABLE => 'myTable'
  hbase> list_quotas NAMESPACE => 'ns.*'

Command: set_quota
Set a quota for a user, table, or namespace.
Syntax : set_quota TYPE => <type>, <args>
TYPE => THROTTLE
The request limit can be expressed using the form 100req/sec, 100req/min and the size limit can be expressed using the form 100k/sec, 100M/min with (B, K, M, G, T, P) as valid size units and (sec, min, hour, day) as valid time units. Currently the throttle limit is per machine: a limit of 100req/min means that each machine can execute 100req/min.
For example:
  hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'
  hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10M/sec'
  hbase> set_quota TYPE => THROTTLE, USER => 'u1', TABLE => 't2', LIMIT => '5K/min'
  hbase> set_quota TYPE => THROTTLE, USER => 'u1', NAMESPACE => 'ns2', LIMIT => NONE
  hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '10req/sec'
  hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '10M/sec'
  hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => NONE
  hbase> set_quota USER => 'u1', GLOBAL_BYPASS => true
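To make the LIMIT grammar described for set_quota easier to picture, the following Python sketch parses a limit string into a kind, an amount and a period in seconds. `parse_throttle_limit` is a hypothetical helper, not an HBase function. Treating K, M, G, T and P as powers of 1024 is an assumption of this sketch, since the help text does not state the multiplier.

```python
import re
from typing import Tuple

# <number><req | size unit>/<time unit>, e.g. "10req/sec" or "5K/min"
_LIMIT = re.compile(r"^(\d+)(req|[BKMGTP])/(sec|min|hour|day)$", re.IGNORECASE)

_PERIOD_SECONDS = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}

# Assumption: size units are binary multiples (K = 2**10, M = 2**20, ...).
_SIZE_SHIFT = {"B": 0, "K": 10, "M": 20, "G": 30, "T": 40, "P": 50}


def parse_throttle_limit(limit: str) -> Tuple[str, int, int]:
    """Return (kind, amount, period_in_seconds) for a throttle LIMIT string."""
    match = _LIMIT.match(limit.strip())
    if match is None:
        raise ValueError(f"unrecognised throttle limit: {limit!r}")
    amount = int(match.group(1))
    unit = match.group(2)
    period = _PERIOD_SECONDS[match.group(3).lower()]
    if unit.lower() == "req":
        return "requests", amount, period
    return "bytes", amount << _SIZE_SHIFT[unit.upper()], period


for text in ("10req/sec", "10M/sec", "5K/min"):
    print(text, "->", parse_throttle_limit(text))
```

Because the limit is applied per machine, the figure parsed here is what a single server enforces; the cluster-wide ceiling would scale with the number of servers.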
security
Command: grant
Grant users specific rights.
Syntax : grant <user> <permissions> [<@namespace> [<table> [<column family> [<column qualifier>]]]]
permissions is zero or more letters from the set "RWXCA": READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
Note: Groups and users are granted access in the same way, but groups are prefixed with an '@' character. Tables and namespaces are likewise specified in the same way, but namespaces are prefixed with an '@' character.
For example:
  hbase> grant 'bobsmith', 'RWXCA'
  hbase> grant '@admins', 'RWXCA'
  hbase> grant 'bobsmith', 'RWXCA', '@ns1'
  hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
  hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1'

Command: revoke
Revoke a user's access rights.
Syntax : revoke <user> [<@namespace> [<table> [<column family> [<column qualifier>]]]]
Note: Access for groups and users is revoked in the same way, but groups are prefixed with an '@' character. Tables and namespaces are likewise specified in the same way, but namespaces are prefixed with an '@' character.
For example:
  hbase> revoke 'bobsmith'
  hbase> revoke '@admins'
  hbase> revoke 'bobsmith', '@ns1'
  hbase> revoke 'bobsmith', 't1', 'f1', 'col1'
  hbase> revoke 'bobsmith', 'ns1:t1', 'f1', 'col1'

Command: user_permission
Show all permissions for the particular user.
Syntax : user_permission <table>
Note: A namespace must always be preceded by the '@' character.
For example:
  hbase> user_permission
  hbase> user_permission '@ns1'
  hbase> user_permission 'table1'
  hbase> user_permission 'namespace1:table1'
  hbase> user_permission '.*'
  hbase> user_permission '^[A-C].*'

NOTE: The above commands are only applicable if running with the AccessController coprocessor.
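The '@' prefix means "group" in the first argument of grant but "namespace" in the scope argument, which is easy to mix up. The short Python sketch below spells out both conventions along with the permission letters. All three function names are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

# Permission letters as listed in the grant help text.
PERMISSIONS = {"R": "READ", "W": "WRITE", "X": "EXEC", "C": "CREATE", "A": "ADMIN"}


def decode_permissions(letters: str) -> List[str]:
    """Expand a string such as 'RW' into permission names."""
    unknown = sorted(set(letters) - set(PERMISSIONS))
    if unknown:
        raise ValueError(f"unknown permission letters: {unknown}")
    return [PERMISSIONS[letter] for letter in letters]


def classify_principal(name: str) -> Tuple[str, str]:
    """First argument of grant/revoke: '@admins' is a group, 'bobsmith' a user."""
    return ("group", name[1:]) if name.startswith("@") else ("user", name)


def classify_scope(name: str) -> Tuple[str, str]:
    """Scope argument of grant/revoke: '@ns1' is a namespace, 'ns1:t1' a table."""
    return ("namespace", name[1:]) if name.startswith("@") else ("table", name)


print(decode_permissions("RWXCA"))
print(classify_principal("@admins"), classify_scope("@ns1"))
```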
visibility labels
Command: add_labels
Add a set of visibility labels.
Syntax : add_labels [label1, label2]
For example:
  hbase> add_labels ['SECRET','PRIVATE']

Command: clear_auths
Clear visibility labels from a user or group.
Syntax : clear_auths 'user',[label1, label2]
For example:
  hbase> clear_auths 'user1', ['SECRET','PRIVATE']
  hbase> clear_auths '@group1', ['SECRET','PRIVATE']

Command: get_auths
Get the visibility labels set for a particular user or group.
Syntax : get_auths 'user'
For example:
  hbase> get_auths 'user1'
  hbase> get_auths '@group1'

Command: list_labels
List the visibility labels defined in the system. An optional regular expression parameter can be used to filter the labels being returned.
Syntax : list_labels
For example:
  hbase> list_labels 'secret.*'
  hbase> list_labels

Command: set_auths
Add a set of visibility labels for a user or group.
Syntax : set_auths 'user',[label1, label2]
For example:
  hbase> set_auths 'user1', ['SECRET','PRIVATE']
  hbase> set_auths '@group1', ['SECRET','PRIVATE']

Command: set_visibility
Set the visibility expression on one or more existing cells. Pass the table name, the visibility expression, and a dictionary containing scanner specifications. Scanner specifications may include one or more of: TIMERANGE, FILTER, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP, or COLUMNS.
If no columns are specified, all columns will be included. To include all members of a column family, leave the qualifier empty as in 'col_family:'.
The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.
Examples:
  hbase> set_visibility 't1', 'A|B', {COLUMNS => ['c1', 'c2']}
  hbase> set_visibility 't1', '(A&B)|C', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
  hbase> set_visibility 't1', 'A&B&C', {ROWPREFIXFILTER => 'row2', FILTER => "(QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
This command will only affect existing cells and is expected to be mainly useful for feature testing and functional verification.

NOTE: The above commands are only applicable if running with the VisibilityController coprocessor.
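A visibility expression such as '(A&B)|C' is a boolean formula over labels, and a cell is readable only when the formula holds for the labels granted with set_auths. The Python sketch below evaluates such an expression against a set of labels to show the idea. `is_visible` is a hypothetical function, not the VisibilityController's implementation, and it makes two assumptions: labels consist of letters, digits and underscores, and operators bind in the conventional order ('!' tightest, then '&', then '|').

```python
import re
from typing import List, Optional, Set

_TOKEN = re.compile(r"\s*([()&|!]|[A-Za-z0-9_]+)")
_OPERATORS = {")", "&", "|", "!"}


def tokenize(expression: str) -> List[str]:
    """Split an expression into labels, operators and parentheses."""
    text = expression.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expression: str, auths: Set[str]) -> bool:
    """Evaluate the expression, treating a label as true if it is in auths."""
    tokens = tokenize(expression)
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def take() -> Optional[str]:
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or() -> bool:
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side to consume its tokens
            value = value or rhs
        return value

    def parse_and() -> bool:
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not() -> bool:
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token is None or token in _OPERATORS:
            raise ValueError("expected a label")
        return token in auths

    result = parse_or()
    if peek() is not None:
        raise ValueError("unexpected trailing input")
    return result


print(is_visible("(A&B)|C", {"A", "B"}))
print(is_visible("(A&B)|C", {"A"}))
```

With the labels {"A", "B"} the first expression from the examples above is satisfied through the `A&B` branch, while {"A"} alone satisfies neither branch.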
HBase built-in filters
- Original article: HBase内置过滤器的一些总结: http://blog.csdn.net/cnweike/article/details/42920547
- Author: cnweike
- A scan that returns only the row key
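3. KeyOnlyFilter: The only thing this filter does is return keys: every cell comes back with its row key and column, but with an empty value. This suits applications that only care about row keys, because dropping the values reduces the amount of data sent to the client and so works as a modest optimization. In short, KeyOnlyFilter keeps the key and discards the value.
Filter kof = new KeyOnlyFilter(); // OK: returns all rows, but every value is empty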
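6. FirstKeyOnlyFilter: If you want the result set to contain only the first column of each row, this filter does that. It stops scanning a row as soon as it has found the row's first column, which also improves scan performance somewhat. A row can hold several columns, and a single column can hold several versions of its value; FirstKeyOnlyFilter returns only the first version of the first column for each row key.
Filter fkof = new FirstKeyOnlyFilter(); // OK: selects the first cell of each row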
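The effect of the two filters can be pictured on a small in-memory table. The Python sketch below only imitates their behaviour on plain dictionaries; it does not talk to HBase, and `key_only` and `first_key_only` are hypothetical stand-ins for KeyOnlyFilter and FirstKeyOnlyFilter. Versions are left out, so each column carries a single value.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, bytes]       # (column as "family:qualifier", value)
Table = Dict[str, List[Cell]]  # row key -> cells, already sorted by column


def key_only(table: Table) -> Table:
    """Imitate KeyOnlyFilter: keep every cell's key, blank out its value."""
    return {row: [(col, b"") for col, _ in cells] for row, cells in table.items()}


def first_key_only(table: Table) -> Table:
    """Imitate FirstKeyOnlyFilter: keep only the first cell of each row."""
    return {row: cells[:1] for row, cells in table.items() if cells}


sample: Table = {
    "key1": [("cf1:a", b"abc"), ("cf1:b", b"hi")],
    "key2": [("cf2:a", b"world")],
}

print(key_only(sample))
print(first_key_only(sample))
# Chaining both leaves one value-less cell per row, the cheapest way to
# enumerate or count row keys.
print(first_key_only(key_only(sample)))
```

Chaining the two corresponds to combining the filters with AND in a shell scan: one cell per row, with no value payload.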
scan 'toplist_ware_ios_1009_201231',{COLUMNS=>'info',FILTER=>"(FirstKeyOnlyFilter())"}
scan 'test1', FILTER=>"FirstKeyOnlyFilter() AND ValueFilter(=,'binary:sku188') AND KeyOnlyFilter()"
scan 'test1',{FILTER=>"FirstKeyOnlyFilter() AND KeyOnlyFilter()"}
count 'table_name',{ INTERVAL =>1}
get "LOCATION","FOTON_LMGFE1G88D1022014_\x7F\xFF\xFE\xAD\x1B\xFBn\x0F","d:direction"
get "LOCATION","FOTON_LMGFE1G88D1022014_\x7F\xFF\xFE\xAD\x1B\xFBn\x0F","d:direction:toDouble"
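The last two get commands differ only in the `:toDouble` suffix, which asks the shell to decode the stored bytes instead of printing them in the default escaped form. The Python sketch below mimics that decoding to show why the suffix matters. It relies on the fact that the `Bytes` utility reads numbers in big-endian order; the function names are hypothetical, and `to_string_binary` is only an approximation of the shell's default output (printable ASCII kept, everything else shown as `\xNN`).

```python
import struct


def to_int(raw: bytes) -> int:
    """Decode 4 bytes as a big-endian signed integer, like the toInt formatter."""
    if len(raw) != 4:
        raise ValueError("toInt expects exactly 4 bytes")
    return struct.unpack(">i", raw)[0]


def to_double(raw: bytes) -> float:
    """Decode 8 bytes as a big-endian IEEE 754 double, like toDouble."""
    if len(raw) != 8:
        raise ValueError("toDouble expects exactly 8 bytes")
    return struct.unpack(">d", raw)[0]


def to_string_binary(raw: bytes) -> str:
    """Approximate the default display: printable ASCII as-is, the rest escaped."""
    parts = []
    for byte in raw:
        if 0x20 <= byte <= 0x7E and byte != 0x5C:  # 0x5C is the backslash
            parts.append(chr(byte))
        else:
            parts.append("\\x%02X" % byte)
    return "".join(parts)


stored = struct.pack(">d", 1.5)  # what a client would have written for 1.5
print(to_string_binary(stored))  # unreadable without a formatter
print(to_double(stored))         # readable with one
```

The same escaped notation appears in the row key of the get examples, where the bytes after `FOTON_LMGFE1G88D1022014_` are binary rather than text.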