Entries tagged [sql]

Thursday May 31, 2018

Apache Ignite 2.5: Scaling to 1000s Nodes Clusters

Apache Ignite has always been appreciated by its users for two primary things it delivers: scalability and performance. It has now grown to the point where the community decided to revisit its discovery subsystem, which influences how well and how far the database scales out. The goal was pretty clear: Ignite has to scale to 1000s of nodes as well as it scales to 100s now. Check what we did to solve the challenge.


Wednesday November 01, 2017

Apache Ignite 2.3 - More SQL and Persistence Capabilities

Putting aside the regular bug fixes and performance optimizations, the Apache Ignite 2.3 release brings new SQL capabilities and Ignite persistence improvements that are worth mentioning.


Let's start with SQL first.

Apache Ignite users have consistently told us that despite all of Ignite’s SQL capabilities, it’s been at times challenging trying to figure out how to start using Ignite as an SQL database.

This was mostly caused by scattered documentation pages and a lack of “getting started” guides and tutorials. We’ve remedied this oversight! All related SQL knowledge has been curated in a single documentation domain.

Are you curious about the SQL scope? Go to the new SQL Reference Overview section!

Cannot wait to learn how the Ignite SQL engine runs internally? We’ve prepared an Architectural Overview section for you.

Simply need to know how to connect to an Ignite cluster from an SQL tool? Here is a tooling section for you.

Let’s take a look at some specific SQL features released in Ignite 2.3.

First, we’re proud to deliver support for the ALTER TABLE command. Presently, the command allows adding new columns to an SQL schema at runtime -- avoiding any cluster restarts. Once a new column is added, it can be indexed. Again, at runtime. No restarts!
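As a rough illustration of the flow just described, here is a minimal Java sketch that adds a column and then indexes it over plain JDBC. The `City` table, the `population` column, and the index name are hypothetical, the table is assumed to exist already, and the connection URL in the comment assumes the Ignite thin JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical example: the City table, the population column, and the
// index name are illustrative and are not taken from the release notes.
class AlterTableSketch {
    /** DDL that adds a column to a live table and then indexes it. */
    static List<String> ddlStatements() {
        return Arrays.asList(
            "ALTER TABLE City ADD COLUMN population INT",
            "CREATE INDEX idx_city_population ON City (population)");
    }

    /** Runs each statement in order over an already opened JDBC connection. */
    static void apply(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String ddl : ddlStatements()) {
                stmt.executeUpdate(ddl);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No cluster address given: only print what would be executed.
            ddlStatements().forEach(System.out::println);
            return;
        }
        // args[0] is a JDBC URL, for example jdbc:ignite:thin://127.0.0.1/
        try (Connection conn = DriverManager.getConnection(args[0])) {
            apply(conn);
        }
    }
}
```

Run without arguments, the sketch only prints the two statements; pass a JDBC URL to execute them against a running cluster.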

Another significant addition in Ignite 2.3 is the integration with the SQLLine tool, which is bundled with every Apache Ignite release and can be used as the default command-line tool for SQL-based interactions.

To show how simple it is to work with Ignite as an SQL database using the tool, we recorded a short screencast for you:


Ignite Persistence

Ignite native persistence keeps getting more attention and installs -- which is why the community released a feature requested by at least a dozen users: the ability to enable persistence for specific data sets. Before Ignite 2.3, persistence could only be enabled globally.

Now, it's up to you to decide which data to persist and which to store in RAM only. The persistence can be configured via data regions as shown below:
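A minimal sketch of such a configuration with the Java API, assuming Ignite 2.3 or later (`ignite-core`) on the classpath. The region name `500MB_Region` and the class name are hypothetical choices made for illustration.

```java
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Configuration fragment only: it builds the settings but does not start a node.
class PersistentRegionSketch {
    static IgniteConfiguration configuration() {
        DataRegionConfiguration region = new DataRegionConfiguration();
        region.setName("500MB_Region");          // hypothetical region name
        region.setMaxSize(500L * 1024 * 1024);   // cap RAM usage at 500 MB
        region.setPersistenceEnabled(true);      // persist only this region

        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.setDataRegionConfigurations(region);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storage);
        return cfg;
    }
}
```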


This data region will consume up to 500 MB of RAM and will store a superset of the data on disk, ensuring that no data is lost in case of a crash or even when there is no more space left in RAM.

Anything else?

Flip through our release notes to see all the changes and improvements available in Apache Ignite 2.3 -- and, of course, download and use this version in production.

Questions, comments? Let us know!

Thursday July 27, 2017

Apache Ignite 2.1 - A Leap from In-Memory to Memory-Centric Architecture

The power and beauty of in-memory computing projects is that they truly do what they state -- deliver outstanding performance improvements by moving data closer to the CPU, using RAM as storage, and spreading the data sets out across a cluster of machines relying on horizontal scalability.

However, there is an unspoken side of the story. No matter how fast a platform is, we do not want to lose data when we encounter cluster restarts or other outages. To guarantee this, we need to somehow make data persistent on disk.

Most in-memory computing projects address the persistence dilemma by giving a way to sync data back to a relational database (RDBMS). That sounds reasonable and undoubtedly works pretty well in practice, but if you dig deeper, you’ll likely encounter the following limitations:

  • RDBMS is a bottleneck. No matter how fast your in-memory technology is, you can accelerate only read operations, because every write has to be persisted to disk -- which is usually a single machine running an RDBMS instance.
  • RDBMS is a single point of failure. Your distributed in-memory cluster usually consists of dozens or even hundreds of nodes, which means you can safely lose this node here or drop that node there without worrying about data consistency and availability. However, if the RDBMS used by the cluster fails, then what? The answer is obvious -- the cluster can no longer be utilized because the RAM and disk parts go out of sync.
  • SQL over RAM only. Apache® Ignite™ provides SQL database capabilities; however, you can only leverage them if all of the data and indexes are located in RAM. If a single piece of data exists only as a disk copy in the RDBMS, then an Ignite-only SQL query will return an incomplete result set.
  • Required RAM warmup. When your cluster goes down, you have to restart it and preload all of the data from the RDBMS to RAM. That’s essential if you use SQL or similar advanced querying languages. This dramatically increases the overall downtime and can cost a lot of money.

If you use either Apache Ignite 1.x or 2.0 along with an RDBMS for disk storage, you will hit these limitations. It’s just the way in-memory architectures are integrated with the disk.

However, these limitations are no longer relevant for Apache Ignite 2.1! This version made a leap from an in-memory to a memory-centric architecture that:

  • Keeps using RAM as a first memory tier for data and indexes -- giving all of the benefits you had before.

  • Supports durability criteria by treating disk as a secondary and larger tier that works in a distributed fashion and seamlessly integrates with the whole memory architecture.

  • Supports instantaneous cluster restarts -- once your cluster is up and running, there is no reason to wait for RAM to warm up. Go ahead and turn your applications back on; they can safely execute all operations, including SQL. The data and indexes will be taken from disk.

Curious about how Ignite achieved these huge advantages? Let’s lift the curtain.

    Durable Memory Architecture

    The Apache Ignite memory-centric platform is based on the durable memory architecture that allows storing and processing data and indexes both in-memory and on disk when the Ignite Persistent Store is enabled. The memory architecture helps to achieve in-memory performance with the durability of the disk using all of the resources available in the cluster.

    The durable memory is built and operates in a way similar to the virtual memory of operating systems such as Linux. However, one significant difference between the two is that the durable memory always keeps the whole data set and indexes on disk -- if the Ignite Persistent Store is enabled -- while virtual memory uses the disk for swapping purposes only.

    Ignite Persistent Store

    Persistent Store is a distributed ACID and SQL-compliant disk store that transparently integrates with the durable memory as an optional disk layer (SSD, Flash, 3D XPoint). With the store enabled, you no longer need to keep all of the data in memory or warm up the RAM after a whole cluster restart. The persistent store will keep the superset of data and all the SQL indexes on disk, making Ignite fully operational from the disk.

    The Upshot

    Tired of hooking up Ignite with an RDBMS? Go ahead and download Apache Ignite 2.1, enable Ignite Persistent Store, and launch your first durable Ignite cluster that distributes data sets and workloads relying on the performance of RAM and durability of the disk!

    Finally, Apache Ignite 2.1 can boast about other achievements in .NET, C++, SQL and Machine Learning. Go ahead and discover them!

    Friday May 05, 2017

    Apache Ignite 2.0: Redesigned Off-heap Memory, DDL and Machine Learning

    We released the long-awaited Apache Ignite version 2.0 on May 5. The community spent almost a year incorporating tremendous changes to the legacy Apache Ignite 1.x architecture. And all of that effort paid off. Our collective blood, sweat (and perhaps even a few tears) opened up new and exciting opportunities for the Apache Ignite project.

    Have I piqued your interest about this new release yet? Let's walk through some of the main new features that have appeared under the hood of Apache Ignite 2.0.

    Reengineered Off-Heap Memory Architecture.

    The platform’s entire memory architecture was reengineered from scratch. In a nutshell, all of the data and indexes are now stored in a completely new manageable off-heap memory that has no issues with memory fragmentation, accelerates SQL Grid significantly and helps your application easily tolerate Java GC pauses.

    Take a peek at the illustration below and try to guess what’s changed. Afterward, please read this documentation to see if your eye caught everything that’s new.

    Here’s something extremely noteworthy: the architecture now integrates seamlessly with disk drives. Why do we care about this? Stay tuned!

    Data Definition Language.

    This release introduces support for Data Definition Language (DDL) as a part of its SQL Grid functionality. Now you can define -- and, more importantly, alter -- indexes at runtime without the need to restart your cluster. Apache Ignite users have long awaited this feature! Even more exciting news: users can leverage this with standard SQL commands like CREATE INDEX or DROP INDEX. This is only the beginning! Go to this page to learn more about current DDL support.
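    To make the index DDL concrete, here is a small Java sketch that builds CREATE INDEX and DROP INDEX statements and runs them over a JDBC connection. The helper class, the `Person` table, and the index name used in the note below are hypothetical; how the connection to the cluster is obtained is left as an assumption.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: identifiers are concatenated as-is, so they are
// assumed to come from trusted code rather than from user input.
class IndexDdlSketch {
    /** Builds a statement that creates an index unless it already exists. */
    static String createIndex(String index, String table, String column) {
        return "CREATE INDEX IF NOT EXISTS " + index
            + " ON " + table + " (" + column + ")";
    }

    /** Builds a statement that drops an index if it is present. */
    static String dropIndex(String index) {
        return "DROP INDEX IF EXISTS " + index;
    }

    /** Executes one DDL statement on a connection the caller has opened. */
    static void execute(Connection conn, String ddl) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(ddl);
        }
    }
}
```

    For example, `IndexDdlSketch.createIndex("idx_person_name", "Person", "name")` yields a statement that can be passed to `execute` while the cluster keeps running.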

    Machine Learning Grid Beta - Distributed Algebra.

    Apache Ignite is about more than in-memory storage. And it’s not just one more product for distributed computations or real-time streaming. It's much, much more than that. It's a hot blend of well-integrated distributed and highly concurrent modules that turned Apache Ignite into what it is today: a robust data fabric and framework with the goal of making your application thrive and outperform even the best of expectations.

    But there was one thing missing until now. Drumroll, please: machine-learning support!

    With Apache Ignite 2.0 you can check out the project’s own distributed algebra implementation. The distributed algebra is the foundation of the entire component. And soon you can expect to get distributed versions of widely used regression algorithms, decision trees and more.

    Spring Data Integration.

    Spring Data integration allows interacting with an Apache Ignite cluster using the well-known and widely adopted Spring Data framework. You can connect to the cluster by means of Spring Data repositories and start executing distributed SQL queries as well as simple CRUD operations.

    RocketMQ

    Are you using RocketMQ in your project and need to push data from RocketMQ to Ignite? Here is an easy solution.

    Hibernate 5.

    Hibernate L2 cache users have been anticipating support of Hibernate 5 on Apache Ignite for quite a long time. Apache Ignite 2.0 grants this desire. The integration now supports Hibernate 5 and contains a number of bug fixes and improvements.


    Ignite.NET has been enhanced with the addition of a plugin system that allows writing 3rd party .NET components and embedding them into Ignite.NET.


    The Ignite.C++ part of the community finally came up with a way to execute arbitrary C++ code on remote cluster machines.

    This approach was initially tested for continuous queries. You can now register continuous queries' remote filters on any cluster node you like. Going forward you can expect support for the Ignite.C++ compute grid and more.

    Want to learn more? Please join me June 7 for a webinar titled, “Apache® Ignite™: What’s New in Version 2.0.” I hope to see you there!

    P.S. Just in case you can’t wait until June…  here's a full list of the changes inside Apache Ignite 2.0.


