The Apache Software Foundation Blog
The Apache Software Foundation Announces Apache® SystemML™ as a Top-Level Project
Open Source Big Data machine learning platform in use at Cadent Technology and IBM Watson Health, among other organizations.
Forest Hill, MD – 31 May 2017 – The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® SystemML™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
Apache SystemML is a machine learning platform optimal for Big Data that provides declarative, large-scale machine learning and deep learning. SystemML can be run on top of Apache Spark, where it scales automatically with the data, determining line by line whether code should be run on the driver or on an Apache Spark cluster.
"Today, the machine learning revolution is leading to thousands of life-altering innovations such as self-driving cars and computers that detect cancer," said Deron Eriksson, Vice President of Apache SystemML. "Apache SystemML enables and simplifies this process by executing optimized high-level algorithms on Big Data using proven technologies such as Apache Spark and Apache Hadoop MapReduce."
The core of Apache SystemML has been created from the ground up with the following design principles in mind:
- Performance and Scalability, as SystemML scales up on single nodes, and scales out on large clusters using Apache Spark or Apache Hadoop;
- "Designed for data scientists", enabling data scientists to develop algorithms in a system with a strong foundation in linear algebra and statistical functions; and
- Cost-based optimization for scalable execution plans, which significantly shortens and simplifies the development and deployment cycle of algorithms across varying data characteristics and system configurations.
Using Apache SystemML, data scientists are able to implement algorithms using high-level language concepts without knowledge of distributed programming. Depending on data characteristics such as data size/shape and data sparsity (dense/sparse), and cluster characteristics such as cluster size and memory configurations, SystemML's cost-based optimizing compiler automatically generates hybrid runtime execution plans that are composed of single-node and distributed operations on Apache Spark or Apache Hadoop clusters for best performance.
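The idea of a hybrid execution plan can be illustrated with a small toy sketch. This is not SystemML's actual cost model or API: the `MatrixStats` class, the byte-size estimates, the `choose_backend` and `plan` functions, and the 2 GB driver budget are all hypothetical names and numbers chosen for illustration. The sketch only shows the general principle of deciding, operation by operation, whether the estimated memory footprint fits on a single node or calls for distributed execution.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class MatrixStats:
    """Data characteristics named in the text: shape and sparsity (hypothetical type)."""
    rows: int
    cols: int
    sparsity: float  # fraction of non-zero cells, between 0 and 1


def estimate_bytes(m: MatrixStats) -> int:
    """Rough in-memory size estimate (an assumed, simplified cost model).

    Dense: 8 bytes per cell. Sparse: 12 bytes per non-zero cell
    (8-byte value plus 4-byte column index). The smaller one is used.
    """
    cells = m.rows * m.cols
    dense = 8 * cells
    sparse = int(12 * cells * m.sparsity)
    return min(dense, sparse)


def choose_backend(inputs: Sequence[MatrixStats], output: MatrixStats,
                   driver_budget_bytes: int) -> str:
    """Run on a single node if inputs and output fit in the budget, else distribute."""
    needed = sum(estimate_bytes(m) for m in inputs) + estimate_bytes(output)
    return "single-node" if needed <= driver_budget_bytes else "distributed"


Op = Tuple[str, Sequence[MatrixStats], MatrixStats]


def plan(ops: Sequence[Op], driver_budget_bytes: int) -> List[Tuple[str, str]]:
    """Decide a backend for each operation independently, one line at a time."""
    return [(name, choose_backend(ins, out, driver_budget_bytes))
            for name, ins, out in ops]


if __name__ == "__main__":
    # Linear regression via the normal equations on a tall, dense matrix.
    X = MatrixStats(rows=1_000_000, cols=1_000, sparsity=1.0)   # about 8 GB dense
    y = MatrixStats(rows=1_000_000, cols=1, sparsity=1.0)
    XtX = MatrixStats(rows=1_000, cols=1_000, sparsity=1.0)     # about 8 MB dense
    Xty = MatrixStats(rows=1_000, cols=1, sparsity=1.0)
    beta = MatrixStats(rows=1_000, cols=1, sparsity=1.0)

    # Operation names are R-like labels only; nothing is executed.
    ops = [
        ("A = t(X) %*% X", [X], XtX),
        ("b = t(X) %*% y", [X, y], Xty),
        ("beta = solve(A, b)", [XtX, Xty], beta),
    ]
    for name, backend in plan(ops, driver_budget_bytes=2 * 10**9):
        print(f"{name:20s} -> {backend}")
```

Under these assumed numbers, the two products involving the large matrix `X` are marked as distributed, while the final solve on the small 1,000 x 1,000 system is marked as single-node, giving a plan that mixes both kinds of operation. A real optimizer would weigh far more than this (for example, intermediate results, I/O, and cluster configuration).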
"SystemML allows Cadent to implement advanced numerical programming methods in Apache Spark, empowering us to leverage specialized algorithms in our predictive analysis software," said Michael Zargham, Chief Scientist at Cadent Technology.
"SystemML is like SQL for Machine Learning, it enables Data Scientists to concentrate on the problem at hand, working in a high-level script language like R, and all the optimizations and rewrites are handled by the very powerful SystemML optimizer that considers data and available resources to produce the best execution plan for the application," said Luciano Resende, Architect at the IBM Spark Technology Center and Apache SystemML Incubator Mentor.
"IBM Watson Health VBC is using Apache SystemML on Apache Spark to build risk models on a very large EHR data set to predict emergency department visits," said Steve Beier, Vice President of Value Based Care Platform and Analytics at IBM Watson Health. "The models identify high-risk patients so that they can be targeted with preemptive strategies, thus potentially reducing care costs while at the same time leading to optimal outcomes for patients."
SystemML originated at IBM Research - Almaden in 2010 and was submitted to the Apache Incubator in November 2015. The project also initiated research on compressed linear algebra, a differentiating feature of SystemML, which received the VLDB 2016 Best Paper award.
"The Apache Incubator is all about open collaboration and communication and was invaluable for everyone involved in SystemML," added Eriksson. "The Apache SystemML community sincerely encourages everyone interested in machine learning and deep learning to help build our community around this revolutionary technology."
Catch Apache SystemML in action at the Big Data Developers Silicon Valley MeetUp on 8 June 2017 in San Francisco, CA.
Availability and Oversight
Apache SystemML software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache SystemML, visit http://systemml.apache.org/ and https://twitter.com/ApacheSystemML
About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit https://www.apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "SystemML", "Apache SystemML", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 11:00AM May 31, 2017 by Sally Khudairi in General