Apache HBase

Thursday October 08, 2015

Imgur Notifications: From MySQL to HBase

This is the third in a series of posts on "Why We Use Apache HBase", in which we let HBase users and developers borrow our blog so they can showcase their successful HBase use cases, talk about why they use HBase, and discuss what worked and what didn't.

Carlos J. Espinoza is an engineer at Imgur.

An earlier version of the discussion in this post was published here on the Imgur Engineering blog.

- Andrew Purtell

Imgur is a heavy user of MySQL. It has been a part of our stack since our beginning. However, with our scale it has become increasingly difficult to throw more features at it. For our latest feature upgrade, we re-implemented our notifications system and migrated it over from MySQL to HBase. In this post, I will talk about how HBase solved our use case and the features we exploited.

To add some context, previously we supported two types of notifications: messages and comment replies, all stored in MySQL. For this upgrade, we decided to support several additional notification types. We also introduced rules around when a notification should be delivered. This change in spec made it challenging to continue with our previous model, so we started from scratch.

Early in the design phase, we envisioned a world where MySQL remained the primary store. We put together some schemas and some queries and, fortunately, stopped ourselves from making a huge mistake. We would have had to create a couple of columns for every type of notification. Creating a new notification type afterwards would have meant a schema change. Our select queries would have required joins against other application tables. We had designed an architecture that could work, but it would have sacrificed decoupling, simplicity, scalability, and extensibility.

Some of our notifications should be delivered to the user only once a milestone is crossed. For instance, if a post reaches 100 points, Imgur will notify the user at the 100-point mark. We don’t want to bother users with 99 notifications in between. So, at scale, how do we know that the post has reached 100 points?

A notification could have multiple milestones. We only want to deliver once the milestone is hit.

Considering MySQL is our primary store for posts, one way to do it is to increment the points counter in the posts table, then execute another query fetching the row and check if the points reached the threshold of 100. This approach has a few issues. Increment and then fetch is a race condition. Two clients could think they reached 100 points, delivering two notifications for the same event. Another problem is the extra query. For every increment, we must now fetch the volatile column, adding more stress to MySQL.
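The race described above can be reproduced with a deterministic interleaving. The following is a minimal sketch that uses an in-memory dictionary as a stand-in for the posts table; the `increment` and `fetch` helpers and the `post:1` key are hypothetical and do not talk to MySQL.

```python
# In-memory stand-in for the posts table; this is not a MySQL client.
points = {"post:1": 98}

THRESHOLD = 100


def increment(post_id):
    # Step 1: the equivalent of UPDATE ... SET points = points + 1.
    # The new value is not returned to the caller.
    points[post_id] += 1


def fetch(post_id):
    # Step 2: a separate read of the current value.
    return points[post_id]


delivered = []

# Clients A and B vote at nearly the same time. Both increments land
# before either client runs its follow-up read.
increment("post:1")  # client A: 98 -> 99
increment("post:1")  # client B: 99 -> 100

for client in ("A", "B"):
    # Each client checks the threshold using its own later read.
    if fetch("post:1") == THRESHOLD:
        delivered.append(client)

print(delivered)  # ['A', 'B']: the same milestone is delivered twice
```

Because the read is decoupled from the write, neither client can tell which increment actually moved the counter to 100.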

Though it is technically possible to do this in MySQL using transactions and consistent read locks, lock contention could make it very expensive, since voting is one of the most frequent operations on our site. Seeing as we already use HBase in other parts of our stack, we switched gears and built our system on top of it. Here is how we use it to power notifications in real time and at scale.

Sparse Columns

At Imgur, each notification is composed of one or more events. The following image of our notifications dropdown illustrates how our notifications are modeled.

A notification maps to a row in HBase, and each event maps to multiple columns, one of which is a counter. This model makes our columns very sparse, as different types of notifications have different types of events. In MySQL, this would mean a lot of NULL values. Because HBase does not tie us to a strict table schema, we can easily add other notification types using the same model.
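To make the sparse layout concrete, here is a small sketch that models rows as maps from column name to value. The row keys, the `e:` column family, and the event names are all made up for illustration and are not Imgur's actual schema.

```python
# In-memory stand-in for an HBase table: row key -> {column: value}.
# All row keys and column names below are hypothetical.
table = {}


def put(row_key, columns):
    """Write only the columns that this notification actually has."""
    table.setdefault(row_key, {}).update(columns)


# A post notification made of two events: points and comments.
put("user42:post9001", {
    "e:points:count": 100,
    "e:comments:count": 12,
    "e:comments:last_author": "alice",
})

# A reply notification has a single event with different columns.
put("user42:comment77", {
    "e:replies:count": 3,
    "e:replies:last_author": "bob",
})

# In a fixed relational schema, every row would carry every column,
# so each missing column would have to be stored as NULL.
all_columns = sorted({c for row in table.values() for c in row})
nulls_in_fixed_schema = sum(len(all_columns) - len(row) for row in table.values())

print(len(all_columns), nulls_in_fixed_schema)  # 5 5
```

Adding a new notification type in this model only means writing new column names; existing rows are untouched.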

Atomic Increments

HBase has an atomic increment operation that returns the new value in the same call. It’s a simple feature, but our implementation depends on it. This keeps our notification delivery logic on the client lightweight: increment, and deliver the notification if and only if a milestone is crossed. No extra calls. In some cases, this means we now keep two counters: for instance, the points count in the MySQL table for posts and the points count in HBase for notifications. Technically, they could get out of sync, but this is an edge case we optimize for.
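The increment-then-decide pattern can be sketched as follows. This is an in-memory stand-in, not HBase client code: the `CounterTable` class, the `MILESTONES` values, and the row key are assumptions for illustration. In the HBase Java client, the corresponding call is `Table.incrementColumnValue`, which returns the new value.

```python
import threading


class CounterTable:
    """In-memory stand-in for an atomic increment that returns the new value."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cells = {}

    def increment(self, row, column, amount=1):
        # The lock makes read-modify-write a single atomic step.
        with self._lock:
            new_value = self._cells.get((row, column), 0) + amount
            self._cells[(row, column)] = new_value
            return new_value


MILESTONES = (100, 1000)  # hypothetical thresholds


def crossed_milestone(new_value, amount=1):
    """Return the milestone this increment crossed, or None."""
    previous = new_value - amount
    return next((m for m in MILESTONES if previous < m <= new_value), None)


def record_vote(table, row, delivered):
    # One call both updates the counter and tells us where it landed.
    new_value = table.increment(row, "points")
    milestone = crossed_milestone(new_value)
    if milestone is not None:
        delivered.append(milestone)


table = CounterTable()
delivered = []
threads = [
    threading.Thread(target=record_vote, args=(table, "user42:post9001", delivered))
    for _ in range(250)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(delivered)  # [100]
```

Each caller receives a distinct return value, so exactly one of the 250 concurrent votes observes the value 100 and delivers the notification.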

Another benefit of doing increments in HBase is that it allows us to decouple the notifications logic from the application logic. All we need to know to deliver a notification is whether its counter has crossed a pre-defined threshold. For instance, we no longer need to know how to fetch a post and get its point count. HBase has allowed us to solve our problem in a more generic fashion.

Fast Table Scans

We also maintain a secondary order table. It stores notification references ordered by when they were last delivered using reversed timestamps. When users open their notifications dropdown, we fetch their most recent notifications by performing a table scan limited by their user ID. We can also support scanning for different types of notifications by using scan filters.
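A reversed-timestamp row key makes the newest entries sort first under lexicographic ordering, which is how HBase orders rows. The sketch below shows the idea with a sorted in-memory list. The key layout, the `MAX_TS` bound, and the `OrderTable` class are assumptions for illustration, not the actual Imgur schema.

```python
import bisect

# Hypothetical upper bound for millisecond timestamps (13 digits).
MAX_TS = 10**13 - 1


def order_key(user_id, delivered_ms):
    """Build a row key of zero-padded user ID plus reversed timestamp."""
    return f"{user_id:010d}:{MAX_TS - delivered_ms:013d}"


class OrderTable:
    """In-memory stand-in for a table whose rows are sorted by key."""

    def __init__(self):
        self._keys = []
        self._refs = {}

    def put(self, key, ref):
        if key not in self._refs:
            bisect.insort(self._keys, key)
        self._refs[key] = ref

    def scan_prefix(self, prefix, limit):
        """Return up to `limit` values whose keys start with `prefix`."""
        start = bisect.bisect_left(self._keys, prefix)
        results = []
        for key in self._keys[start:]:
            if not key.startswith(prefix) or len(results) == limit:
                break
            results.append(self._refs[key])
        return results


order = OrderTable()
order.put(order_key(42, 1_444_300_000_000), "notif-old")
order.put(order_key(42, 1_444_300_500_000), "notif-new")
order.put(order_key(42, 1_444_300_250_000), "notif-mid")
order.put(order_key(7, 1_444_300_900_000), "other-user")

# Scan only user 42's rows; the newest notifications come back first.
recent = order.scan_prefix(f"{42:010d}:", limit=2)
print(recent)  # ['notif-new', 'notif-mid']
```

Because the scan starts at the user's prefix and stops after a small limit, it reads only the rows needed to fill the dropdown.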

Other Benefits

With HBase we gain many other benefits, like linear scalability and replication. We can forward data to another cluster and run batch jobs on that data. We also get strong consistency. It is extremely important for notifications to be delivered exactly once when a milestone is crossed. We can make that guarantee knowing that HBase has strong row-level consistency. It’s likely that we’ll use versioning in the future, but even without it, HBase is a great choice for our use case.

Imgur notifications is a young project, and we’ll continue to make improvements to it. As it matures and we learn more from it, I hope to share what we’ve built with the open source community.


Nice post! Clear explanation of use cases and well fit into HBase. Good luck Imgur! p.s. Have you tried to use snapshots for data analysis on the same cluster? As I read "We can forward data to another cluster and run batch jobs on that data".

Posted by Wojciech on October 08, 2015 at 09:20 AM GMT #

Thanks for the post. It would be great if you can share the code bits as well to get more on how this is achieved.

Posted by Buntu on October 22, 2015 at 07:19 PM GMT #
