How many regions can an HBase RegionServer manage?

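Notes on resolving an alert raised in an Ambari cluster.

While checking the Ambari management console, I found that the HBase Ave Load metric had turned red and raised an alert.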



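To find out why the value 253.44 was flagged red, I first had to understand what the number means. The HBase monitoring UI shows that it is the number of regions per RegionServer (averaged across the cluster), which raises the question of how many regions a RegionServer can reasonably manage. I went on to read the documentation.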




In production scenarios, where you have a lot of data, you are normally concerned with the maximum number of regions you can have per server. The "too many regions" discussion in the HBase reference guide covers the subject in technical detail. Basically, the maximum number of regions is mostly determined by memstore memory usage. Each region has its own memstores; these grow up to a configurable size, usually in the 128-256 MB range (see hbase.hregion.memstore.flush.size). One memstore exists per column family (so there is only one per region if there is one CF in the table). The RS dedicates some fraction of total memory to its memstores (see hbase.regionserver.global.memstore.size). If this memory is exceeded (too much memstore usage), it can cause undesirable consequences such as an unresponsive server or compaction storms. A good starting point for the number of regions per RS (assuming one table) is:

<code>((RS memory) * (total memstore fraction)) / ((memstore size) * (# column families))</code>

This formula is pseudo-code. Here are two formulas using the actual tunable parameters, first for HBase 0.98+ and second for HBase 0.94.x.

HBase 0.98.x

<code>((RS Xmx) * hbase.regionserver.global.memstore.size) / (hbase.hregion.memstore.flush.size * (# column families))</code>

HBase 0.94.x

<code>((RS Xmx) * hbase.regionserver.global.memstore.upperLimit) / (hbase.hregion.memstore.flush.size * (# column families))</code>

If a given RegionServer has 16 GB of RAM, with default settings and one column family, the formula works out to 16384 * 0.4 / 128 ≈ 51 regions per RS as a starting point. The formula can be extended to multiple tables; if they all have the same configuration, just use the total number of families.
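The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula, not part of HBase; the function name `regions_per_rs` and its parameter names are made up for this example, and the inputs are the values quoted in the text (16 GB heap, memstore fraction 0.4, 128 MB flush size, one column family).

```python
def regions_per_rs(heap_mb, memstore_fraction, flush_size_mb, column_families):
    """Starting-point region count per RegionServer.

    heap_mb           -- RegionServer heap (Xmx) in MB
    memstore_fraction -- share of the heap reserved for all memstores
    flush_size_mb     -- per-memstore flush size in MB
    column_families   -- column families per region (one memstore each)
    """
    if flush_size_mb <= 0 or column_families <= 0:
        raise ValueError("flush size and column family count must be positive")
    return (heap_mb * memstore_fraction) / (flush_size_mb * column_families)


# Values from the text: 16 GB heap, default fraction and flush size, one CF.
print(regions_per_rs(16384, 0.4, 128, 1))
```

With three column families instead of one, the same call returns a third of that value, which is why the family count matters as much as the heap size.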

This number can be adjusted; the formula above assumes all your regions are filled at approximately the same rate. If only a fraction of your regions are going to be actively written to, you can divide the result by that fraction to get a larger region count. Then, even if all regions are written to, all region memstores are not filled evenly, and eventually jitter appears even if they are (due to limited number of concurrent flushes). Thus, one can have as many as 2-3 times more regions than the starting point; however, increased numbers carry increased risk.
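The adjustment for partially active regions can be sketched the same way. The helper below is hypothetical (the name `adjusted_region_count` and the 25% figure are assumptions chosen for illustration); it simply divides the starting point by the fraction of regions that are actively written to, as the paragraph describes.

```python
def adjusted_region_count(starting_point, active_fraction):
    """Scale the starting-point region count when only some regions take writes.

    active_fraction is the share of regions actively written to (0 < f <= 1).
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    return starting_point / active_fraction


# Illustrative assumption: only a quarter of the regions receive writes.
starting_point = 16384 * 0.4 / 128  # the 16 GB example from the text
print(adjusted_region_count(starting_point, 0.25))
```

The result is an upper estimate, not a guarantee: as the text notes, going well beyond the starting point carries increased risk.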

For write-heavy workload, memstore fraction can be increased in configuration at the expense of block cache; this will also allow one to have more regions.

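Roughly, this means:

Upper limit on the number of regions

The number of regions a RegionServer can host depends on memstore memory usage. Each region has its own set of memstores: there is one memstore per HStore, and one HStore per column family specified when the table is created, so the number of memstores per region equals the number of column families in the table. The memstore size is configurable and is usually set between 128 MB and 256 MB.

The RegionServer allocates a fraction of its memory to all the memstores it hosts (the fraction is set with hbase.regionserver.global.memstore.upperLimit in HBase 0.94.x, or hbase.regionserver.global.memstore.size in 0.98+). If this memory is exceeded (too much memstore usage), the consequences can be serious, such as an unresponsive server or compaction storms. A good formula for estimating the number of regions per RS (assuming one table) is: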

((RS memory) * (total memstore fraction)) / ((memstore size)*(# column families))

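For example: if a RegionServer is configured with 16 GB of memory and uses the default settings (by default HBase gives 0.4 of the RegionServer's memory to memstores, and each memstore takes up to 128 MB), with one CF, the number of regions on that RegionServer is about 16384 * 0.4 / (128 * 1) ≈ 51. In practice, exceeding this number by a factor of two or three does not cause major problems. Since an HBase table consists of one or more regions, the upper limit on the number of tables can be estimated in the same way.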

