Apache Accumulo

Friday March 20, 2015

Balancing Groups of Tablets

This post was moved to the Accumulo project site.

Accumulo has a pluggable tablet balancer that decides where tablets should be placed.  Accumulo's default configuration spreads each tables tablets evenly and randomly across the tablet servers.  Each table can configure a custom balancer that does something different.

For some applications to perform optimally, sub-ranges of a table need to be spread evenly across the cluster.  Over the years I have run into multiple use cases for this situation.  The latest use case was bad performance on the Fluo Stress Test.  This test stores a tree in an Accumulo table and creates multiple tablets for each level in the tree.  In parallel, the test reads data from one level and writes it up to the next level.  Figure 1 below shows an example of tablet servers hosting tablets for different levels of the tree.  Under this scenario if many threads are reading data from level 2 and writing up to level 1, only Tserver 1 and Tserver 2 will be utilized.  So in this scenario 50% of the tablet servers are idle.

Figure 1

Figure 1

ACCUMULO-3439 remedied this situation with the introduction of the GroupBalancer and RegexGroupBalancer which will be available in Accumulo 1.7.0.  These balancers allow a user to arbitrarily group tablets.  Each group defined by the user will be evenly spread across the tablet servers.  Also, the total number of groups on each tablet server is minimized.   As tablets are added or removed from the table, the balancer will migrate tablets to satisfy these goals.  Much of the complexity in the GroupBalancer code comes from trying to minimize the number of migrations needed to reach a good state.

A GroupBalancer could be configured for the table in figure 1 in such a way that it grouped tablets by level.  If this were done, the result may look like Figure 2 below.  With this tablet to tablet server mapping, many threads reading from level 2 and writing data up to level 1 would utilize all of the tablet servers yielding better performance.

Figure 2

Figure 2

README.rgbalancer provides a good example of configuring and using the RegexGroupBalancer.  If a regular expression can not accomplish the needed grouping, then a grouping function can be written in Java.  Extend GroupBalancer to write a grouping function in java.  RegexGroupBalancer provides a good example of how to do this.

When using a GroupBalancer, how Accumulo automatically splits tablets must be kept in mind.  When Accumulo decides to split a tablet, it chooses the shortest possible row prefix from the tablet data that yields a good split point. Therefore its possible that a split point that is shorter than what is expected by a GroupBalancer could be chosen.  The best way to avoid this situation is to pre-split the table such that it precludes this possibility.

The Fluo Stress test is a very abstract use case.  A more concrete use case for the group balancer would be using it to ensure tablets storing geographic data were spread out evenly.  For example consider GeoWave's Accumulo Persistence Model.  Tablets could be balanced such that bins related to different regions are spread out evenly.  For example tablets related to each continent could be assigned a group ensuring data related to each continent is evenly spread across the cluster.  Alternatively, each Tier could spread evenly across the cluster.  

Comments:

thanks for posting this information, keep updating like this

Posted by snaptube on May 20, 2016 at 05:20 AM UTC #

Wow , you really post a very informative idea, i thik this will helpful for many people

Posted by showbox on June 29, 2016 at 08:58 AM UTC #

thanks for the information.

Posted by ravi gupta on July 06, 2017 at 05:04 AM UTC #

The SSLC is an examination which is held in many states like Kerala, Karnataka and Tamil Nadu. Kerala SSLC exams will be held according to Kerala State Education Board (KSEB) officials declared in March they will conduct public exams.

Posted by www.results.itschool.gov.in on April 18, 2018 at 06:33 AM UTC #

Spotify Premium APK will be available in the market to download and complete the installation process. This application can make you to stream free music as per your wish.

Posted by Spotify Premium APK on May 03, 2018 at 08:05 AM UTC #

Spotify Premium APK will be available in the market to download and complete the installation process. This application can make you to stream free music as per your wish.

Posted by Spotify Premium apk on May 03, 2018 at 01:29 PM UTC #

My Lowe’s Life is a representative login entrance that can be utilized by the worker and previous representative of the organization.

Posted by myloweslife on June 22, 2018 at 11:57 AM UTC #

Walmart Credit Card is a store card which can be just utilized at Walmart retail locations and its site through the Walmart MasterCard

Posted by credit on June 22, 2018 at 01:35 PM UTC #

Post a Comment:
  • HTML Syntax: NOT allowed

Calendar

Search

Hot Blogs (today's hits)

Tag Cloud

Categories

Feeds

Links

Navigation