Apache Ignite 2.11: Stabilization First
Apache Ignite 2.11 was released on September 17, 2021. It is largely a stabilization release that paid down technical debt in the internal architecture and fixed a number of bugs. Of more than 200 completed tasks, 120 are bug fixes. Still, the release includes some valuable improvements, so let's take a quick look at them.
Partition awareness is enabled by default in the 2.11 release and allows thin clients to send query requests directly to the node that owns the queried data. Without partition awareness, an application executes all queries and operations via a single server node that acts as a proxy for the incoming requests.
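To make the routing idea concrete, here is a minimal, self-contained Java sketch. It is not the Ignite thin client code: the class name, the node names, and the plain hash-modulo partition mapping are illustrative assumptions (in a real cluster, the key-to-partition and partition-to-node mappings come from the cache's affinity function).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of partition-aware routing; NOT the Ignite thin client implementation. */
final class PartitionAwareRouter {
    /** owners.get(p) is the node assumed to own partition p (hypothetical mapping). */
    private final List<String> owners;

    PartitionAwareRouter(List<String> owners) {
        this.owners = new ArrayList<>(owners);
    }

    /** Maps a key to a partition (assumption: hash code modulo partition count). */
    int partition(Object key) {
        return Math.floorMod(key.hashCode(), owners.size());
    }

    /** With partition awareness: the request goes straight to the owning node. */
    String routeDirect(Object key) {
        return owners.get(partition(key));
    }

    /** Without it: every request goes to one proxy node, whatever the key is. */
    String routeViaProxy(String proxyNode, Object key) {
        return proxyNode;
    }
}
```

In this model, routeDirect saves the extra network hop that routeViaProxy implies whenever the proxy node is not the owner of the key.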
Apache Ignite internally has a so-called switch process (a part of Partition Map Exchange) that is used to execute cluster-wide operations atomically and to move the cluster from one consistent state to another. Examples include cache creation or destruction, node JOIN/LEFT/FAIL events, and snapshot creation. During the switch, all user transactions are parked for a short period of time, which in turn increases the average latency and reduces the throughput of the overall cluster.
Splitting the cluster into virtual cells containing 4-8 nodes may increase the total cluster performance and minimize the influence of one cell on another in case of node failure events. This technique also significantly increases the recovery speed of transactions on cells not affected by failing nodes. The time during which transactions are parked also decreases on unaffected cells, which in turn decreases the worst-case latency of cluster operations overall.
From now on, you can use RendezvousAffinityFunction with ClusterNodeAttributeColocatedBackupFilter to group nodes into virtual cells. Since the node baseline attributes are used as cell markers, the corresponding BASELINE_NODE_ATTRIBUTES system view was added.
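The idea of keeping backups inside the primary node's cell can be sketched in plain Java. This is a simplified model and not the code of ClusterNodeAttributeColocatedBackupFilter: the class, the method, and the cellOf map (node id to cell marker) are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of attribute-colocated backup selection; NOT Ignite's filter implementation. */
final class CellBackupSketch {
    /**
     * Picks up to 'backups' backup nodes for a primary node, accepting only
     * candidates whose cell marker equals the primary's cell marker.
     */
    static List<String> pickBackups(String primary, List<String> candidates,
                                    Map<String, String> cellOf, int backups) {
        List<String> result = new ArrayList<>();
        String cell = cellOf.get(primary);

        for (String node : candidates) {
            if (result.size() == backups)
                break;

            // Same cell marker and not the primary itself: the node may hold a backup.
            if (!node.equals(primary) && cell != null && cell.equals(cellOf.get(node)))
                result.add(node);
        }

        return result;
    }
}
```

Because every copy of a partition stays within one cell in this model, a node failure in cell A leaves the partitions of cell B untouched, which is the isolation effect described above.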
See benchmarks below that represent the worst (max) latency, which happens in case of node left/failure/timeout events on broken and alive cells.
New Page Replacement Policies
When Native Persistence is on and the amount of data that Ignite stores on disk is bigger than the off-heap memory allocated for the data region, loading a page from disk into completely full off-heap memory requires evicting another page from off-heap memory to disk. This process is called page replacement. Previously, Apache Ignite used the Random-LRU page replacement algorithm, which has a low maintenance cost but many disadvantages, and it greatly affects performance once page replacement starts. On some deployments, administrators even force a cluster restart periodically to avoid page replacement. Two new algorithms are available from now on:
- Segmented-LRU Algorithm
- CLOCK Algorithm
The page replacement algorithm can be configured with the PageReplacementMode property of DataRegionConfiguration. By default, the CLOCK algorithm is now used. See the Replacement Policies section of the documentation for more details.
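For readers unfamiliar with CLOCK, the following is a textbook second-chance sketch in plain Java. It only illustrates the general algorithm and makes no claim about how Ignite implements it internally; the class name, the frame arrays, and the use of -1 as the "nothing evicted" marker are assumptions of this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Textbook CLOCK (second-chance) page replacement; a sketch, NOT Ignite's implementation. */
final class ClockPageCache {
    private final long[] pages;     // page id held by each frame
    private final boolean[] used;   // is the frame occupied?
    private final boolean[] ref;    // "recently used" bit per frame
    private final Map<Long, Integer> frameOf = new HashMap<>();
    private int hand;               // clock hand: next frame to inspect

    ClockPageCache(int frames) {
        pages = new long[frames];
        used = new boolean[frames];
        ref = new boolean[frames];
    }

    /** Touches a page. Returns the evicted page id, or -1 if nothing was evicted. */
    long access(long pageId) {
        Integer frame = frameOf.get(pageId);

        if (frame != null) {        // hit: only mark the page as recently used
            ref[frame] = true;
            return -1;
        }

        // Miss: sweep until a frame is free or has a cleared reference bit.
        while (used[hand] && ref[hand]) {
            ref[hand] = false;      // give the page a second chance
            hand = (hand + 1) % pages.length;
        }

        long evicted = -1;

        if (used[hand]) {
            evicted = pages[hand];
            frameOf.remove(evicted);
        }

        pages[hand] = pageId;
        used[hand] = true;
        ref[hand] = true;
        frameOf.put(pageId, hand);
        hand = (hand + 1) % pages.length;

        return evicted;
    }

    boolean contains(long pageId) {
        return frameOf.containsKey(pageId);
    }
}
```

The only per-access bookkeeping on a hit is setting one bit, which is why CLOCK is cheap to maintain while still approximating LRU behavior.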
Snapshot Restore And Check Commands
All snapshots are fully consistent in terms of concurrent cluster-wide operations as well as ongoing changes in Ignite. However, in some cases, and for your own peace of mind, it may be necessary to check a snapshot for completeness and data consistency. Apache Ignite now ships with built-in snapshot consistency check commands that let you verify internal data consistency, calculate data partition hashes and page checksums, and print the result if a problem is found. The check command also compares the hashes calculated over the keys of primary partitions with those of the corresponding backup partitions and reports any differences.
# This procedure does not require the cluster to be in the idle state.
control.(sh|bat) --snapshot check snapshot_name
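The primary-versus-backup comparison can be pictured with a small Java sketch. It is a deliberately simplified model and not the algorithm used by the check command: the class name, the string key/value types, and the order-independent sum of entry hashes are assumptions chosen for illustration. It also assumes both lists describe the same partitions in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy primary/backup partition comparison; NOT the snapshot check command's algorithm. */
final class PartitionHashCheck {
    /** Order-independent hash of a partition (assumption: sum of per-entry hashes). */
    static long partitionHash(Map<String, String> entries) {
        long hash = 0;

        for (Map.Entry<String, String> e : entries.entrySet())
            hash += 31L * e.getKey().hashCode() + e.getValue().hashCode();

        return hash;
    }

    /** Returns the ids of partitions whose primary and backup copies hash differently. */
    static List<Integer> findConflicts(List<Map<String, String>> primary,
                                       List<Map<String, String>> backup) {
        List<Integer> conflicts = new ArrayList<>();

        for (int p = 0; p < primary.size(); p++) {
            if (partitionHash(primary.get(p)) != partitionHash(backup.get(p)))
                conflicts.add(p);
        }

        return conflicts;
    }
}
```

Comparing one hash per partition instead of every entry keeps the check cheap, at the cost of only learning which partition diverged rather than which key.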
Previously, only the manual snapshot restore procedure was available by fully copying persistence data files from the snapshot directory to the Apache Ignite work directory. The automatic restore procedure allows you to restore cache groups from a snapshot on an active cluster by using the Java API or command line script (using CLI is recommended). Currently, the restore procedure has several limitations, so please check the documentation pages for details.
# Start restoring all user-created cache groups from the snapshot "snapshot_09062021".
control.(sh|bat) --snapshot restore snapshot_09062021 --start
# Start restoring only "cache-group1" and "cache-group2" from the snapshot "snapshot_09062021".
control.(sh|bat) --snapshot restore snapshot_09062021 --start cache-group1,cache-group2
# Get the status of the restore operation for "snapshot_09062021".
control.(sh|bat) --snapshot restore snapshot_09062021 --status
# Cancel the restore operation for "snapshot_09062021".
control.(sh|bat) --snapshot restore snapshot_09062021 --cancel
Posted at 04:53PM Sep 20, 2021 by Maxim Muzafarov in General
Apache Ignite Momentum: Highlights from 2020-2021
When Apache Ignite entered the Apache Software Foundation (ASF) Incubator in 2014, it took less than a year for the project and its community to graduate from the Incubator and become a top-level project for the ASF. Since then, Ignite has experienced a significant and steady growth in popularity, and it has been used by thousands of application developers and architects to create high-performance and scalable applications used by millions of people daily. In this article, we’ll recap the achievements of Ignite in 2020-2021.
Ignite is Ranked as a Top 5 Project
The ASF has ranked Apache Ignite as a Top 5 project in various categories since 2017. That year, Ignite was in the Top 5 of Apache Project Repositories by Commits and of the most active Apache mailing lists. Today, the momentum continues, and Ignite is still ranked as a Top 5 project in multiple categories: second on the Top 5 big data user lists, third on the Top 5 big data dev lists, second on the Top 5 of all user lists, and third on the Top 5 repos by size.
Of greatest significance, the continued Top 5 ranking on the “dev list” reflects an active community of contributors who are committed to keeping the code base growing, while the Top 5 ranking on the “user list” means that more and more Ignite application developers come to the community to ask questions – indicating continued growth in adoption.
The Worldwide Ignite community is Engaged
This broad and growing interest in Apache Ignite has continued over the last year and a half. However, faced with the pandemic and shelter-in-place orders around the world, the community sought ways to stay in touch and continue sharing experiences. The community naturally turned to a virtual format and established two new successful programs.
The first was a series of Ignite Virtual Meetups, where Apache Ignite users, developers, committers, contributors and architects worldwide could share experiences on a wide range of topics, ask questions, and help drive the project forward. Since these virtual meetups began, the community has already held 17 events, which were attended by hundreds of community members and developers.
The second new program was launched this May with the virtual Ignite Summit, the first global conference designed for the entire Ignite community. Twenty-five speakers from industry-leading companies including finance, biotech, health & fitness, construction and cloud computing led 15 hours of discussion about how Apache Ignite delivers the performance and scale required to address the world’s most challenging computational and hybrid transactional/analytical processing requirements. The Summit had attendees from North America, Latin America, EMEA and APAC. Remarkably, attendees spent an average of nearly 5 hours at the event!
Innovation Continues at a Rapid Pace
Over the last year and a half, the community has released five new versions of Ignite 2.x. The releases introduce numerous improvements and optimizations, including major features, such as new monitoring and profiling frameworks, cluster snapshots, encoding keys rotation for transparent data encryption, and more.
The community also put significant effort into contributing and releasing new documentation, which is now hosted on the Ignite website. Since the new documentation was posted, it has become the most visited resource on the website – a clear indication that it is helping Ignite developers make faster, easier progress on their Ignite development and optimization tasks.
Further, Igniters have begun working on the next major release, Ignite 3.0, which introduces significant usability improvements, a new SQL engine based on Apache Calcite, a Raft-based consistency protocol, and many other improvements. Users can already try the first two Alpha versions:
The payoff – Ignite Downloads Continue to Soar
The inherent benefits of Apache Ignite, combined with all the effort of a dedicated community, have resulted in a popular project that continues to see increasing adoption. Ignite Maven monthly downloads are skyrocketing, and we have seen a 65% year-over-year growth in downloads so far in 2021, resulting in hundreds of thousands of downloads each month.
We eagerly look forward to the full release of Apache Ignite 3.0 and fully expect downloads, adoption and community enthusiasm to continue to soar. Good luck to the Ignite community!
Posted at 06:40PM Sep 14, 2021 by Denis Magda in General
Apache Ignite 2.10: Thin Client Expansion
Thin Clients
Thin clients now support several important features that were previously available only on thick clients. Thin clients are always backward and forward compatible with the server nodes of the cluster, so if the lack of these features kept you from switching to thin clients, the cluster upgrade process will now be more convenient.
Thin clients now support the following:
- Service invocations
- Continuous Queries
- SQL API
- Cluster API
- Cache Async API
- Kubernetes Discovery (ThinClientKubernetesAddressFinder)
Cluster Monitoring
The Apache Ignite self-monitoring and cluster health check subsystems have also been extended with additional SQL views and command-line scripts.
New control-script Commands
Query any of the available system views.
control.sh --system-view views
Command [SYSTEM-VIEW] started
--------------------------------------------------------------------------------
name                        schema    description
SQL_QUERIES_HISTORY         SYS       SQL queries history.
INDEXES                     SYS       SQL indexes
BASELINE_NODES              SYS       Baseline topology nodes
STRIPED_THREADPOOL_QUEUE    SYS       Striped thread pool task queue
SCAN_QUERIES                SYS       Scan queries
PARTITION_STATES            SYS       Distribution of cache group partitions across cluster nodes
Command [SYSTEM-VIEW] finished with code: 0
--------------------------------------------------------------------------------
Query any of the available system metrics.
control.sh --metric sys.CurrentThreadCpuTime
Command [METRIC] started
--------------------------------------------------------------------------------
metric                      value
sys.CurrentThreadCpuTime    17270000
Command [METRIC] finished with code: 0
--------------------------------------------------------------------------------
Managing Ignite System Properties
In addition to basic cluster configuration settings, you can perform some low-level cluster configuration and tuning via Ignite system properties. Run the command below to see the list of all available system properties for configuration:
$ ./ignite.sh -systemProps
--------------------------------------------------------------------------------
IGNITE_AFFINITY_HISTORY_SIZE - [Integer] Maximum size for affinity assignment history. Default is 25.
IGNITE_ALLOW_ATOMIC_OPS_IN_TX - [Boolean] Allows atomic operations inside transactions. Default is true.
IGNITE_ALLOW_START_CACHES_IN_PARALLEL - [Boolean] Allows to start multiple caches in parallel. Default is true.
...
--------------------------------------------------------------------------------
Cluster Profiling
From now on, Apache Ignite is delivered with a cluster profiling tool. This tool collects and processes internal cluster information about queries, compute tasks, cache operations, checkpoint and WAL statistics, and so on for problem detection and cluster self-tuning purposes. Each cluster node collects performance statistics into a special binary file that is placed under the [IGNITE_WORK_DIR]/perf_stat/ directory with the filename template node-[nodeId]-[index].prf. All these files are consumed by an offline tool that builds the report in a human-readable format.
Transparent Data Encryption - Cache Key Rotation
The Payment Card Industry Data Security Standard (PCI DSS) requires that key-management procedures include a predefined crypto period for each key in use and define a process for key changes at the end of that crypto period. An expired key should not be used to encrypt new data, but it can be used for archived data; such keys should be strongly protected (sections 3.5 - 3.6 of PCI DSS Requirements and Security Assessment Procedures).
Apache Ignite now supports full PCI DSS requirements:
- Transparent Data Encryption available since the 2.7 release.
- Master Key Rotation procedure available since the 2.9 release.
- Cache Key Rotation procedure available since the 2.10 release.
Posted at 10:20AM Mar 18, 2021 by Maxim Muzafarov in General
Apache Ignite 2.9 Released: Cluster snapshots and tracing
As of October 23, 2020, Apache Ignite 2.9 is available. Like every other Ignite release, release 2.9 includes many changes. Let's take a look at the major features of release 2.9.
Ignite 2.9 provides the ability to create full cluster snapshots for deployments that use Ignite Persistence. Snapshots can be taken online, when the cluster is active and accessible to users. An Ignite snapshot includes a cluster-wide copy of all data records that exist at the moment the snapshot is started. All snapshots are consistent — in terms of concurrent, cluster-wide operations as well as in terms of ongoing changes in Ignite Persistence data, index, schema, binary metadata, marshaller, and other files on nodes. See Ignite documentation to learn about this feature.
The Ignite monitoring system continues to improve. In Ignite 2.9, a new tracing subsystem became available. Tracing provides information that is useful for debugging — that helps with both regular, daily, basic system monitoring and with incident analysis. You can collect distributed traces of tasks that are executed in your cluster and use this information to diagnose latency problems. In the 2.9 release, the following Ignite components are instrumented for tracing:
See the documentation for more information.
In addition to snapshots and tracing, Ignite 2.9 provides the following new features:
- Cluster discovery, cluster API, compute API, and service invocation support for thin clients (Java and .Net)
- Cluster-wide, read-only mode
- Ability to run user-defined code inside the Ignite sandbox
- Transparent data encryption: master key rotation
- Management tools to cancel user tasks and queries
- Platform cache (.Net)
See the release notes to learn about all of the new features.
Ignite contributors and committers
Posted at 11:10PM Nov 05, 2020 by Denis Magda in General
Ignite 2.8 Released: Less Stress in Production and Advances in Machine Learning
With thousands of changes contributed to Apache Ignite 2.8 that enhanced almost all the components of the platform, it’s possible to overlook some of the improvements that can convince you to upgrade to this version sooner rather than later. While a quick check of the release notes will help you discover anticipated bug fixes, this article aims to guide you through the enhancements every Ignite developer should be aware of.
New Subsystem for Production Monitoring and Tracing
Several months of constant work on IEP-35: Monitoring & Profiling have resulted in the creation of a robust and elastic subsystem for production monitoring and diagnostics (a.k.a. profiling). This was influenced by the needs of many developers who deployed Ignite in critical environments and were asking for a foundation that can be integrated with many external monitoring tools and be expanded easily.
The new subsystem consists of several registries that group individual metrics related to a specific Ignite component. For instance, you will find registries for cache, compute, or service grid APIs. Since the registries are designed to be generic, specific exporters can observe the state of Ignite via a myriad of tools supporting various protocols. By default, Ignite 2.8 introduces exporters for monitoring interfaces such as log files, JMX and SQL views, and contemporary ones such as OpenCensus.
Presently, this new subsystem is released in an experimental mode only to give Ignite users some time to check the new API and suggest any improvements. Since the developer community is already impatient to remove the experimental flag, don’t delay!
Advances in Ignite Machine Learning
Machine Learning (ML) capabilities of Ignite 2.8 are so drastically different from previous versions that if you’ve been waiting for the best moment to use the API, then the time has come. Let’s scratch the surface here and learn more details from the updated documentation pages.
Model training is usually a multi-step process that includes preprocessing, training, and evaluation/validation phases. A new pipelining API puts things in order by combining all the phases in a single workflow.
In addition to the pipelining APIs, Ignite 2.8 introduced ensemble methods, which allow combining several machine learning techniques into one predictive model to decrease variance (bagging) and bias (boosting), or improve predictions (stacking).
Furthermore, you can now import Apache Spark or XGBoost models into Ignite for further inference, pipelining, or other tasks. Feel free to keep training a model with your favorite framework and convert it to the Ignite representation once the model needs to be deployed in production and executed at scale.
Beyond Java: Partition-Awareness and Other Changes
Even though Ignite is a Java middleware, it functions as a cross-platform database and compute platform that is used for applications developed in C#, C++, Python, and other programming languages.
Thin client protocol is a real enabler for other programming languages support, and with Ignite 2.8, it got a significant performance optimization by supporting partition-awareness. The latter allows thin clients to send query requests directly to nodes that own the queried data. Without partition awareness, an application that is connected to the cluster via a thin client executes all queries and operations via a single server node that acts as a proxy for the incoming requests.
Check the detailed blog post by Pavel Tupitsyn, Ignite committer and PMC, who elaborates on the partition-awareness feature and introduces other .NET-specific enhancements.
Less Stress in Production
This section lists top improvements that might not have striking or catchy names but can bring relief by automating and optimizing things, and by avoiding data inconsistencies when you are already in production.
The stop-the-world pauses triggered by Java garbage collectors impact the performance, responsiveness, and throughput of Java applications. Apache Ignite has a partition-map-exchange (PME) process that, like Java garbage collectors, has some phases that put all running operations on hold for the sake of cluster-wide consistency. For most Ignite usage scenarios, these phases complete promptly and go unnoticed. However, some low-latency or high-throughput use cases can detect a decline that might impact some business operations for a moment in time. This wiki page lists all the conditions that can trigger a distributed PME, and with Ignite 2.8, some of them were taken off the list -- the blocking PME no longer happens if a node belonging to the current baseline topology leaves the cluster or a thick client connects to it.
Next, we all know that things break, and what really matters is how a system handles failures. With Ignite 2.8, we revisited the way the cluster handles crash recoveries on restarts while replaying write-ahead logs (check IGNITE-7196 and IGNITE-9420). Also, the read-repair feature was added to manage data inconsistencies between primary and backup copies in the cluster on the fly.
Furthermore, it’s worth mentioning that Ignite 2.8 became more prudent about disk space consumption by supporting the compaction of data files and write-ahead-logs of the native persistence. By sacrificing a bit more CPU cycles for the needs of compaction algorithms, you can save a lot on the storage end.
Last but not least is the auto-baseline feature, which changes the cluster topology for deployments with Ignite native persistence without the need for your intervention in many scenarios. Check this documentation page for more details.
Reach out to us on the community user list for more questions, details, and feedback.
Ignite contributors and committers
Posted at 12:00AM Mar 11, 2020 by Denis Magda in General
Apache Ignite 2.7: Deep Learning and Extended Languages Support
Deep Learning With TensorFlow
Even though it was natural to provide machine learning algorithms in Ignite out of the box, another direction was taken for deep learning capabilities, primarily because machine learning approaches have already been adopted in businesses from big to small, while deep learning is still being used for narrow and specific use cases.
Thus, Ignite 2.7 can boast about an official integration with TensorFlow deep learning framework that gives a way to use Ignite as a distributed storage for TensorFlow calculations. With Ignite, data scientists can store unlimited data sets across a cluster, gain performance improvements and rely on fault-tolerance of both products if an algorithm fails in the middle of an execution.
Extended Languages Support - Node.JS, Python, PHP
Java, .NET and C++ have been extensively supported by Ignite for a while now. But until now, when it came to other languages, developers had to fall back to REST, JDBC/ODBC calls. To address the limitation of missing native APIs for programming languages different from the three above, the community released a low-level binary protocol used to build thin clients. A thin client is a lightweight Ignite client that connects to the cluster via a standard socket connection.
Based on this protocol, Ignite 2.7 adds support for Node.JS, Python and PHP. As for Java, .NET and C++, you can take advantage of thin clients as well if the regular clients are not suitable for some reason.
Transparent Data Encryption
For those of you who are using Ignite persistence in production, this functionality brings peace of mind. Whether you store any sensitive information -- or an entire data set has to be encrypted due to regulations -- this feature is what you need. Check this page for more details.
Transactional SQL Beta
Last, but probably the most anticipated addition to Ignite, is fully transactional SQL. You're no longer limited to key-value APIs if an application needs to run ACID-compliant distributed transactions. Prefer SQL? Use SQL! Yes, it's still in beta and might not yet be the best fit for mission-critical deployments, but definitely try it in your development cycles and share your feedback. It took us several years to reach this milestone and before GA release comes out, we want to hear what you think.
Finally, I have no more paper left to cover other optimizations and improvements. So, go ahead and check out our release notes.
Apache Ignite 2.5: Scaling to 1000s Nodes Clusters
Apache Ignite has always been appreciated by its users for two primary things it delivers: scalability and performance. Now it has grown to the point where the community decided to revisit its discovery subsystem, which influences how well and how far the database scales out. The goal was pretty clear: Ignite has to scale to 1000s of nodes as well as it scales to 100s now. Check what we did to solve the challenge.
Apache Ignite 2.4 Brings Advanced Machine Learning and Spark DataFrames Capabilities
Usually, the Ignite community rolls out a new version every three months, but we had to make an exception for Apache Ignite 2.4, which took five months in total. We could easily blame the Thanksgiving, Christmas and New Year holidays for the delay and would be forgiven, but, in fact, we were forging a release you can't simply pass by.
Let's dive in and search for a big fish.
Machine Learning General Availability
Eight months ago, at the time of Apache Ignite 2.0, we put out the first APIs that formed the foundation of the Ignite's machine learning component of today. Since that time, Ignite machine learning experts and enthusiasts have been moving the library to the general availability condition meticulously. And Ignite 2.4 became a milestone that let us consider the ML Grid to be production ready.
The component gained a variety of algorithms that can solve a myriad of regression and classification tasks, gained the ability to train models without ETL from Ignite to other systems, and paved the way for deep learning usage scenarios. All that now empowers Ignite users with the tools for dealing with fraud detection, predictive analytics, and for building recommendation systems...if you want. Note, ETL is optional, and the whole memory-centric cluster is at your service!
Moreover, Machine Learning Grid welcomed a software donation by NetMillennium, Inc. in the form of genetic algorithms that solve optimization problems by simulating the process of biological evolution. The algorithms did not make it into Ignite 2.4 and are waiting in the master branch for a future release. Once you get them, you can apply the biological evolution simulation to real-world applications including automotive design, computer gaming, robotics, investments, traffic/shipment routing and more.
It's not a joke or misprint. Spark users, DataFrames are now officially supported for you! Many of you have been anticipating them for years and, thanks to Nikolay Izhikov, who was "promoted" to an Ignite committer for the contribution, you can now take advantage of them.
No need to be wordy here. Just go ahead and start with DataFrames in Ignite.
Expanding Ignite ecosystem
It was unfair that only Java, C#, and C++ developers could utilize the breadth and depth of Ignite APIs in their applications. Ignite 2.4 solved the injustice with its new low-level binary client protocol. The protocol communicates with an existing Ignite cluster without starting a full-fledged Ignite node. An application can connect to the cluster through a raw TCP socket from any programming language you like.
The beauty of the protocol is that you can develop a so-called Ignite thin client, a lightweight client that connects to the cluster and interacts with it using key-value, SQL, and other APIs. The .NET thin client is already at your service, and Node.JS, Python, PHP, and Java thin clients are in the works for the next releases.
RPM repository and much more
So, now Apache Ignite can also be installed from the official RPM repository. Debian users, the packages for your operating system will be assembled soon.
Overall, if I were to list all the features and benefits Ignite 2.4 brings, only two people would read the article to the end: me and my dear mom. Thus, I'll let you discover the rest from the release notes.
Meltdown and Spectre patches show negligible impact to Apache Ignite performance
As promised in my initial blog post on this matter, the Apache Ignite community applied security patches against the notorious Meltdown and Spectre vulnerabilities and completed performance testing of general operations and workloads that are typical for Ignite deployments.
The security patches were applied only for the CVE-2017-5754 (Meltdown) and CVE-2017-5753 (Spectre Variant 1) vulnerabilities. The patches for CVE-2017-5715 (Spectre Variant 2) for the hardware the community used for testing are not stable yet and can cause system reboot issues or other unpredictable behavior.
The applied patches have shown that the performance implications are negligible - the performance drop is just in the 0 - 7% range as the figure shows:
Thus, the Apache Ignite community strongly recommends that its customers and partners consider applying the security patches for CVE-2017-5754 (Meltdown) and CVE-2017-5753 (Spectre Variant 1) in their deployment environments and contact us on the user list if they run into a larger performance drop in their use case.
At the same time, we're keeping an eye on Intel announcements and will validate the performance implications of Spectre Variant 2 once a solution is released by the hardware vendor.
Just for your reference, the benchmarks were executed in the following environment and configuration.
- 4 servers and 8 client nodes
- Apache Ignite version: 2.4.0
- Huawei RH2288 V3, CPU - 2x Xeon E5-2609 v4, 1.7GHz, RAM - 96Gb, SSD - 3x800Gb RAID0 2.4Tb, Network - 10Gb/s
- Dell R610, CPU - 2x Xeon X5570, RAM - 96Gb, SSD - 512Gb, HDD - 2048GB, Network - 10Gb/s
- OS CentOS Linux release 7.4.1708 (Core)
- Kernel - Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64
Protecting Apache Ignite from 'Meltdown' and 'Spectre' vulnerabilities
The world was rocked after the recent disclosure of the Meltdown and Spectre vulnerabilities that literally affect almost all software ever developed. Both issues are related to the way all modern CPUs are designed and this is why they have opened unprecedented security breaches -- making the software, including Apache Ignite, vulnerable to hacker attacks.
The vulnerabilities are registered in the National Vulnerability Database under the following CVEs:
How to protect Apache Ignite deployments?
First, the vulnerabilities can be fixed only at the operating system (OS) or hardware level. All OS and hardware vendors are working on and releasing patches to close the security holes. Depending on the type of your Apache Ignite deployment, make sure to do the following:
- On-premise deployments - apply the patches prepared by your OS and hardware vendors. Consult with them to find out additional steps to act on. This page is a good place to start with.
- Cloud deployments - major cloud providers such as Amazon and Microsoft are in a process of patching their cloud computing services. Consider a cloud provider's security announcements and recommendations or follow up with a representative for suggestions.
Second, an Apache Ignite cluster becomes vulnerable to the attacks only if someone gets unauthorized access to cluster machines (in either on-premise or cloud deployments) and executes a malicious shell script, or connects to the cluster directly and executes a Java, .NET or C++ computation there.
Do the following to prevent this from happening:
- Make sure the cluster machines are secured with a hard-to-guess or hard-to-calculate password.
- Consider using 3rd party security components provided by enterprise vendors (such as this one) to strengthen a security shield of your deployments.
Finally, researchers who discovered Meltdown and Spectre have said that the first issue can be fixed with software patches while the second can be fully addressed only with hardware upgrades/replacement. Luckily, it's much more difficult for hackers to exploit Spectre. Thus, if the two recommendations given above are taken seriously, the chances that you will be impacted by Spectre are low.
What is the performance impact of security patches?
Many security patches are rolled out with a warning that some applications can see up to a 30% performance degradation. The Apache Ignite community is planning to measure the impact on general usage scenarios and will follow up with the results in a subsequent post.
This general performance testing might not cover your use case. Therefore, it's highly recommended that you assess and test a possible performance drop of your Apache Ignite deployments before applying the patches in production. If the drop is significant, then contact us on the dev list.
Apache Ignite Essentials: 2-part Webinar Series for Architects and Java Developers
We finally made this happen! I’m happy to invite all of the software architects and engineers out there to a series of webinars that will introduce you to the fundamental capabilities of in-memory computing platforms such as Apache Ignite.
There will also be a mix of theory and practice. A lot of code examples are waiting to be shown so that you can apply the theory in practice right away.
The series consists of two parts.
- Cluster configuration and deployment.
- Distributed database internals (partitioning, replication).
- Data processing with key-value APIs.
- Affinity Collocation.
- Data processing with SQL.
- Collocated processing.
- Collocated processing for distributed computations.
- Collocated processing for SQL (distributed joins and more).
- Machine Learning.
- Memory Architecture.
Book your seat!
Apache Ignite 2.3 - More SQL and Persistence Capabilities
Putting aside the regular bug fixes and performance optimizations, the Apache Ignite 2.3 release brings new SQL capabilities and Ignite persistence improvements that are worth mentioning.
Let's start with SQL first.
Apache Ignite users have consistently told us that despite all of Ignite’s SQL capabilities, it’s been at times challenging trying to figure out how to start using Ignite as an SQL database.
This was mostly caused by scattered documentation pages and a lack of “getting started” guides and tutorials. We’ve remedied this oversight! All related SQL knowledge has been curated in a single documentation domain.
Are you curious about the SQL scope? Go to the new SQL Reference Overview section!
Cannot wait to learn how the Ignite SQL engine runs internally? We’ve prepared an Architectural Overview section for you.
Simply need to know how to connect to an Ignite cluster from an SQL tool? Here is a tooling section for you.
Let’s take a look at some specific SQL features released in Ignite 2.3.
First, we’re proud to deliver support for the ALTER TABLE command. Presently, the command allows adding new columns to an SQL schema at runtime -- avoiding any cluster restarts. Once a new column is added, it can be indexed. Again, at runtime. No restarts!
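To make this concrete, here is a minimal sketch in Java of adding a column and then indexing it over JDBC. The table name (Person), column name (city) and index name are hypothetical. The connection URL assumes the Ignite JDBC thin driver is on the classpath and a node is listening on localhost. Treat it as an illustration under those assumptions rather than a definitive recipe.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

final class AlterTableSketch {
    /**
     * Builds the two DDL statements: add a column, then index it.
     * All names passed in are hypothetical and used only for illustration.
     */
    static List<String> schemaChanges(String table, String column, String type, String indexName) {
        return Arrays.asList(
            "ALTER TABLE " + table + " ADD COLUMN " + column + " " + type,
            "CREATE INDEX " + indexName + " ON " + table + " (" + column + ")");
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: a local node reachable through the JDBC thin driver.
        String url = args.length > 0 ? args[0] : "jdbc:ignite:thin://127.0.0.1/";

        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // Both statements run against the live cluster; no restart is involved.
            for (String sql : schemaChanges("Person", "city", "VARCHAR", "idx_person_city")) {
                stmt.executeUpdate(sql);
            }
        }
    }
}
```

The same two statements can also be typed directly into any SQL tool connected to the cluster.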
Another significant addition in Ignite 2.3 is the integration with the SQLLine tool, which is bundled with every Apache Ignite release and can be used as a default command line tool for SQL-based interactions.
To prove that it's fairly simple to work with Ignite as with an SQL database using the tool, we recorded a short screencast for you:
Ignite native persistence keeps getting more attention and installs -- which is why the community released a feature requested by at least a dozen users. The feature allows enabling the persistence for specific data sets. Before Ignite version 2.3, the persistence could be enabled globally only.
Now, it's up to you to decide which data to persist and which to store in RAM only. The persistence can be configured via data regions as shown below:
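A minimal sketch of such a configuration in Java follows. It assumes ignite-core 2.3 is on the classpath. The region name and cache name are hypothetical and chosen only for illustration.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

final class PersistentRegionSketch {
    static IgniteConfiguration configuration() {
        // A data region capped at 500 MB of RAM with persistence switched on.
        DataRegionConfiguration region = new DataRegionConfiguration();
        region.setName("500MB_Region");          // hypothetical region name
        region.setMaxSize(500L * 1024 * 1024);   // 500 MB, expressed in bytes
        region.setPersistenceEnabled(true);

        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.setDataRegionConfigurations(region);

        // A cache bound to the persistent region; caches in other regions stay RAM-only.
        CacheConfiguration<Long, String> cache = new CacheConfiguration<>("persistedCache"); // hypothetical cache name
        cache.setDataRegionName("500MB_Region");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storage);
        cfg.setCacheConfiguration(cache);
        return cfg;
    }
}
```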
This data region will consume up to 500 MB of RAM and will store a superset of data on disk ensuring that no data loss happens in case of a crash or even if there is no more space left in RAM.
Flip through our release notes to see all the changes and improvements available in Apache Ignite 2.3 -- and, for sure, download and use this version in production.
Questions, comments? Let us know!
Apache Ignite Community News (Issue 3)
by Tom Diederich
This is our third community update – there’s a lot going on, so let's get started.
Apache Ignite experts have already spoken at two meetups this month, both in Silicon Valley, but there are several more scheduled this month around the world.
On Sept. 9 Apache Ignite PMC chair Denis Magda was the featured presenter at the Big Data and Cloud Meetup in Santa Clara, Calif. His talk, titled "Apache Spark and Apache Ignite: Where Fast Data Meets the IoT," was highly rated and we’re planning a hands-on workshop with meetup organizers for November.
On Sept. 13 Denis also spoke at the SF Big Analytics Meetup in Mountain View, Calif. Again, to a packed room. The topic of his talk was "Better Machine Learning with Apache Ignite."
But Denis isn’t having all the fun – next Monday (Sept. 18), technology evangelist Akmal Chaudhri will speak at the Cambridge .NET User Group. The title of his talk: "Scale Out and Conquer: Apache Ignite for .NET Users."
We’re still looking for more meetups to speak at this month, so if you’re an organizer or would like us to speak at one you’re a member of, just let me know. In the meantime, here are the meetups planned for the remainder of September:
- Bay Area In-Memory Computing Meetup, Wednesday, Sept. 20 – Denis will present, "Apache Spark, Ignite and Flink: Where Fast Data Meets the IoT."
- Internet of Things (IoT) New York Meetup, Monday, Sept. 25 – Akmal will present, "Apache Spark and Apache Ignite: Where Fast Data Meets the IoT."
- NYC In-Memory Computing Meetup, Tuesday, Sept. 26 – Akmal will present, "Powering Up Banks and Financial Institutions with Distributed Systems."
- New York Kubernetes Meetup, Wednesday, Sept. 27 – Akmal will provide a DevOps perspective on the orchestration of distributed databases and Apache Ignite.
See? I told you Denis wasn’t having all the fun! Akmal is definitely on the road again. :)
- Implementing In-Memory Computing for Financial Services Use Cases with Apache Ignite Sept 12 (recording available)
- Better Machine Learning with Apache Ignite, Wednesday, Sept. 27
On Sept. 5, Akmal published “Using Java and .NET apps to connect to an Apache Ignite cluster,” which details how to create an Apache Ignite cluster that can support the reading and writing of user-defined objects in a common storage format. This is particularly useful when applications need to work with objects that will be accessed by different programming languages and frameworks.
On Sept. 7, Dmitriy Setrakyan published “Apache Ignite - In Memory Performance with Durability of Disk.”
Next up, on Sept. 12, was Akmal, who published “Kubernetes and Apache® Ignite™ Deployment on AWS.” That post walked through the steps required to get Kubernetes and Apache Ignite deployed on Amazon Web Services (AWS).
And then on Sept. 13 Dmitriy published “What is Apache Ignite.” I think the headline of that one is self-explanatory. :)
In the news
Nikita Ivanov is also an InfoWorld contributor. Read the first in his series on in-memory computing, “Ensuring big data and fast data performance with in-memory computing.”
- Stack Overflow. Stack Overflow is a question and answer site for professional and enthusiast programmers.
- Habrahabr. Habrahabr (also "Habr") (Russian: Хабрахабр, Хабр) is a Russian collaborative blog with elements of social network about IT, Computer science and anything related to the Internet, owned by Thematic Media.
- In-Memory Computing Planet (blogs and events). Add your blog feed!
- “Meetup in a Box.” If you would like to speak at a meetup, start or support a meetup, or have questions about meetups in general – let me know! I can help get you up and running with everything you’ll need.
Please share any resources I've excluded in the comments section and I'll include them in the next edition.
Apache Ignite Community Update (August 2017 Issue)
by Tom Diederich
Igniters, here are some community highlights from the last couple of weeks. If I missed anything, please share it here. Meetups! Did you know that Apache Ignite experts are available to speak at your meetup? And we also have spots open for YOU to speak at the following meetups that some of us co-organize:
- Apache Ignite London
- Bay Area In-Memory Computing Meetup
- NYC In-Memory Computing Meetup
- Moscow Apache Ignite Meetup
Meanwhile, here’s where to catch some great talks about Apache Ignite! We have 19 newly scheduled meetup talks on the books since the last update. All upcoming Ignite events can be found here. Let’s take a closer look at some of them….
Scheduled speaking engagements
* Sept. 9: Big Data and Cloud Meetup (Santa Clara, Calif.)
Apache Ignite PMC chair Denis Magda will be speaking at the Big Data and Cloud Meetup September 9 from 10 a.m. to noon. His talk is titled: "Apache Spark and Apache Ignite: Where Fast Data Meets the IoT".
* Sept. 13: SF Big Analytics Meetup
Denis Magda will be the featured speaker at the SF Big Analytics Meetup on Sept. 13. Denis' talk is titled: "Apache Ignite: the in-memory hammer in your data science toolkit."
* Sept. 18: Meetup: Cambridge .NET User Group
Apache Ignite evangelist Akmal Chaudhri will speak at the Cambridge .NET User Group on Sept. 18. The title of his talk: "Scale Out and Conquer: Apache Ignite for .NET Users."
* Sept. 27: New York Kubernetes Meetup
Apache Ignite evangelist Akmal Chaudhri will focus on a DevOps perspective on the orchestration of distributed databases such as Apache Ignite. Akmal will speak on node auto-discovery, automated horizontal scalability, availability, and utilization of RAM and disk with Apache Ignite.
* Oct. 4: Openstack & Ceph User Group Amsterdam
Apache Ignite evangelist Akmal Chaudhri will show attendees how to build a Fast Data solution that will receive endless streams from the IoT side and will be capable of processing the streams in real-time using Apache Ignite's cluster resources.
* Oct. 13: Big Data Week London 2017: A Festival of Data (conference)
Akmal Chaudhri will be speaking at the Big Data Week conference Oct. 13 in London. His talk, titled "Powering up banks and financial institutions with distributed systems," will educate attendees about important Apache Ignite features for financial applications -- such as ACID compliance, SQL compatibility, persistence, replication, security, fault tolerance and more.
* Oct. 18: Silicon Valley Java User Group
Apache Ignite PMC Chair Denis Magda will introduce the many components of the open-source Apache Ignite. His talk, titled “Catch an intro to Apache Ignite and skyrocket Java applications,” will teach attendees how to solve some of the most demanding scalability and performance challenges. He will also cover a few typical use cases and work through some code examples.
* Oct. 19: Eurostaff Big Data London
Apache Ignite evangelist Akmal Chaudhri will show attendees how to build a Fast Data solution that will receive endless streams from the IoT side and will be capable of processing the streams in real-time using Apache Ignite's cluster resources.
* Oct. 24: Spark Summit Europe 2017 (conference)
Akmal Chaudhri will be presenting at the Spark Summit Europe conference, Oct. 24-26 at the Convention Centre Dublin in Ireland. His session is titled: "How to Share State Across Multiple Spark Jobs using Apache Ignite."
* Nov. 2: Byte-Academy-FinTech-Python-Blockchain-Education Meetup (London)
In his talk, titled "Powering up banks and financial institutions with distributed systems," Apache Ignite technical evangelist Akmal Chaudhri will explain important Apache Ignite features for financial applications -- such as ACID compliance, SQL compatibility, persistence, replication, security, fault tolerance and more. A customer case study will also be presented.
- Scale-up vs. scale-out architectures
- Cloud Wars: Apache Ignite – Getting started with AWS for Beginners (Part I)
- Apache Ignite Tip: Peer Class Loading Deployment Magic
- The Future of In-Memory Computing
- Sept. 27: Better Machine Learning with Apache Ignite, with technical evangelist Akmal B. Chaudhri.
- Oct. 4: Postgres with Apache Ignite: Faster Transactions and Analytics, with GridGain senior solution architect Fotios Filacouris.
Past webinars (recordings available!)
Deploy like a Boss: Using Kubernetes and Apache Ignite, with GridGain solution architect Dani Traphagen.
Apache Ignite 2.1 - A Leap from In-Memory to Memory-Centric Architecture
The power and beauty of in-memory computing projects is that they truly do what they state -- deliver outstanding performance improvements by moving data closer to the CPU, using RAM as storage and spreading the data sets out across a cluster of machines relying on horizontal scalability.
However, there is an unspoken side of the story. No matter how fast a platform is, we do not want to lose data when cluster restarts or other outages occur. To guarantee this, we need to somehow make the data persistent on disk.
Most in-memory computing projects address the persistence dilemma by giving a way to sync data back to a relational database (RDBMS). That sounds reasonable and undoubtedly works pretty well in practice, but if you dig deeper, you’ll likely encounter the following limitations:
If you use either Apache Ignite 1.x or 2.0 along with the RDBMS for disk storage, then you will hit these limitations. It’s just the way in-memory architectures are integrated with the disk.
However, the limitations are no longer relevant for Apache Ignite 2.1! This version made a leap from in-memory to a memory-centric architecture that:
Curious about how Ignite achieved these huge advantages? Lifting the curtain….
Durable Memory Architecture
The Apache Ignite memory-centric platform is based on the durable memory architecture that allows storing and processing data and indexes both in-memory and on disk when the Ignite Persistent Store is enabled. The memory architecture helps to achieve in-memory performance with the durability of the disk using all of the resources available in the cluster.
The durable memory is built and operates in a way similar to the virtual memory of operating systems such as Linux. However, the one significant difference between these two types of architectures is that the durable memory one always keeps the whole data set and indexes on the disk -- if the Ignite Persistent Store is enabled -- while the virtual memory uses the disk for swapping purposes only.
Ignite Persistent Store
Persistent Store is a distributed ACID and SQL-compliant disk store that transparently integrates with the durable memory as an optional disk layer (SSD, Flash, 3D XPoint). With the store enabled, you no longer need to keep all of the data in memory or warm up the RAM after a whole cluster restart. The persistent store keeps the superset of data and all the SQL indexes on disk, making Ignite fully operational from the disk.
Tired of hooking up Ignite with an RDBMS? Go ahead and download Apache Ignite 2.1, enable Ignite Persistent Store, and launch your first durable Ignite cluster that distributes data sets and workloads relying on the performance of RAM and durability of the disk!
Finally, Apache Ignite 2.1 can boast of other achievements in .NET, C++, SQL and Machine Learning. Go ahead and discover them!