The Apache Software Foundation Blog

Thursday January 13, 2022

Apache Software Foundation statement on White House Open Source Security Summit

The Apache Software Foundation (ASF) participated today in a meeting hosted by the White House to discuss security of open source software, and how to improve the "supply chain" of open source software to better facilitate the rapid adoption of security fixes when necessary.


The virtual summit included representation from a number of companies and U.S. departments and agencies. Three representatives of the ASF participated in the virtual summit, ASF President David Nalley, VP of Security Mark Cox, and ASF board member Sam Ruby.

Securing open source and its supply chain


The ASF produces software for the public good. We are committed to working with the larger community, including industry and government consumers of open source software, to find ways to improve security while adhering to The Apache Way.


This means that we believe the path forward will require upstream collaboration by the companies and organizations that consume and ship open source software. There's no single "silver bullet" to get there, and it will take all of our organizations working together to improve the open source supply chain.


Since its inception more than 20 years ago, the ASF has evolved and adapted to meet the changing needs of its mission: to provide software for the public good by supporting and serving its project communities. To do this, we've refined our governance models, our infrastructure, our recommended best practices, and more over the years.


We expect to continue to evolve and improve over the next 20 years, and helping to improve the security of the open source supply chain is part of that. We are committed to doing the work through our communities to help make that a reality.

Communities thrive on conversation

Those who are familiar with the ASF know that we value community and having a level playing field for contributors. We believe today’s conversation is a good beginning that can help catalyze and direct a wider response to addressing today’s security needs for open source software. 


Many of the organizations represented today are important contributors and consumers of open source, but of course are not all of the important contributors or consumers. We know that it’s important to hear from individual contributors as well as corporations, foundations and government entities. For our part, we’ll strive to make sure that happens.


As always, we welcome participation and contributions in our communities from those who wish to show up and be part of ASF projects. We appreciate the opportunity to participate in today’s conversation, and look forward to the follow-on conversations that this effort inspires.

Monday January 10, 2022

Apache Software Foundation Security Report: 2021

Synopsis: This report explores the state of security across all of The Apache Software Foundation projects for the calendar year 2021. We review key metrics, specific vulnerabilities, and the most common ways users of ASF projects were affected by security issues.


Released: January 2022


Author: Mark Cox, Vice President Security, The Apache Software Foundation

Background

The security committee of The Apache Software Foundation (ASF) oversees and coordinates the handling of vulnerabilities across all of the 350+ Apache projects. Established in 2002 and composed entirely of volunteers, we have a consistent process for how issues are handled, including how our projects must disclose security issues.


Anyone finding a security issue in any Apache project can report it to security@apache.org, where it is recorded and passed on to the relevant dedicated security team or private Project Management Committee (PMC) to handle. The security committee monitors all the issues reported across all the projects and tracks each issue throughout the vulnerability lifecycle.


The security committee is responsible for ensuring that issues are dealt with properly, and it actively reminds projects of their outstanding issues and responsibilities. As a board committee, we have the ability to take action, including blocking a project's future releases or, in the worst case, archiving a project that is unresponsive in handling its security issues. This, along with the Apache License v2.0, is a key part of the ASF’s general oversight of official releases, allowing the ASF to protect individual developers and giving users confidence to deploy and rely on ASF software.


The oversight into all security reports, along with tools we have developed, gives us the ability to easily create metrics on the issues.  Our last report covered the metrics for 2020.

Statistics for 2021

In 2021 our security email addresses received a total of ~18,500 emails. After spam filtering and thread grouping there were 1272 (2020: 946, 2019: 620) non-spam threads. Unfortunately, security reports do sometimes look like spam, especially if they include lots of attachments or large videos, so the security team is careful to review all messages to ensure genuine reports are not missed for too long.


Diagram 1: Breakdown of ASF security email threads for calendar year 2021


Diagram 1 gives the breakdown of those 1272 threads. 359 threads (28%) were from people confused by the Apache License. Because many projects use the Apache License, not just those under the ASF umbrella, people who see the license but don't understand what it is sometimes contact us. This is most common on mobile phones, where licenses are displayed in the settings menu, usually due to the inclusion of software released by Google under the Apache License. We no longer reply to these emails. This is up from the 257 such threads received in 2020.


The next 337 of the 1272 (26%) are email threads with people asking non-security (usually support-type) questions.


The next 135 reports were from researchers reporting issues in an Apache web site. These are almost always false positives: a researcher reports us having directory listings enabled, source code visible, public “.git” directories, and so on. These reports are generally the unfiltered output of some publicly available scanning tool, and the reporter often asks for some sort of monetary reward (bounty) for the report.


That left 441 (2020: 376, 2019: 320) reports of new vulnerabilities in 2021, which spanned 99 of the top level projects.  These 441 reports are a mix of external reporters and internal. For example, where a project has found an issue themselves and followed the ASF process to assign it a CVE (Common Vulnerabilities and Exposures) name and address it, we’d still count it here.  We don’t keep metrics that would give the breakdown of internal vs external reports.
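As a rough sanity check, the thread counts above can be tallied with a small illustrative script (numbers taken directly from this report):

```python
# Breakdown of the 1272 non-spam threads for 2021, as reported above
threads = {
    "license confusion": 359,
    "support questions": 337,
    "website scan reports": 135,
    "vulnerability reports": 441,
}

total = sum(threads.values())  # should match the 1272 non-spam threads
shares = {k: round(100 * v / total) for k, v in threads.items()}

print(total)                         # 1272
print(shares["license confusion"])   # 28 (%), matching the figure in the text
```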


The next step is for the appropriate project to triage the report to see if it's really an issue. Invalid reports, and reports of things that are not actually vulnerabilities, are rejected back to the reporter. The remaining issues are accepted, assigned appropriate CVE names, and eventually fixed in a release.


As of January 1st 2022, 50 of those 441 reports were still under triage and investigation. This is where a project was working on an issue and had not rejected the issue or assigned it a CVE as of the snapshot taken on January 1st 2022.  This number was higher than what we’d normally expect and was due to the large influx of reports that came at the end of December 2021.


The remaining 391 (2020: 341, 2019: 301) reports led to us assigning 183 (2020: 151, 2019: 122) CVE names. Some vulnerability reports may include multiple issues, some reports span multiple projects, and some reports are duplicates where the same issue is found by different reporters, so there isn't an exact one-to-one mapping of accepted reports to CVE names. The Apache Security Committee handles CVE name allocation and is a CVE Numbering Authority (CNA) under MITRE, so all requests for CVE names in any ASF project are routed through us, even if the reporter is unaware of this and contacts MITRE directly, or goes public with an issue before contacting us.

Noteworthy events

During 2021 there were a few events worth discussing; either because they were severe and high risk, they had readily available exploits, or there was media attention. These included:

  • January: A cross-site scripting (XSS) flaw was found in the default error page of Apache Velocity (CVE-2020-13959) which affected a number of public visible websites. Despite a fix being available it then took several months to produce a new release to include the fix, causing the reporter to publicise it early. As a consequence, the security team did a deeper dive through all the outstanding open issues with the affected PMCs to ensure they were all being handled.

  • January: A report was published which showed how malware is still exploiting Apache ActiveMQ instances that have not been patched for over 5 years (CVE-2016-3088)

  • June: The Airflow PMC published a blog about how they handle security issues, how users are sometimes slow to deploy updates (CVE-2020-17526), and how flaws in dependencies can affect Airflow.

  • July: A third-party blog explained how threat actors are exploiting mis-configured Apache Hadoop YARN services

  • August: A researcher discovered a number of issues in HTTP/2 implementations.  The Apache HTTP Server was affected by a moderate vulnerability (CVE-2021-33193)

  • September: A keynote presentation at ApacheCon 2021 discussed the security committee, the US Executive Order on Improving the Nation’s Cybersecurity, and third party security projects such as those under the OpenSSF.

  • September: A flaw in Apache OpenOffice could allow a malicious document to run arbitrary code if opened (CVE-2021-33035)

  • October: A critical issue was found in the Apache HTTP Server. The default configuration protected against this vulnerability, but in custom configurations without those protections, and with CGI support enabled, this could lead to remote code execution (CVE-2021-41773). The issue was fixed in an update 5 days after the issue was reported to the security team, however the fix was quickly found to be insufficient and a further update to fully address it was released 3 days after that (CVE-2021-42013). A MetaSploit exploit exists for this issue.

  • October: The Internet Bug Bounty from HackerOne extended their program to include Apache Airflow, the Apache HTTP Server, and Apache Commons.  Unlike many other programs, this program relies on vulnerability finders following the standard ASF notification process, and allows finders to claim a reward for eligible issues after the fix is available and the issue is public.

  • December: A vulnerability in Log4J 2 (CVE-2021-44228, “Log4Shell”), a popular and common Java logging library, allowed remote attackers to achieve remote code execution in a default and likely installation.  The issue was widely exploited, starting the day before a release with a fix was published.  There is a MetaSploit exploit module for this issue. After the fixed release a few subsequent Log4J vulnerabilities were also fixed, but none had the same impact or default conditions.  

  • December: The ASF was invited to a White House forum on open source security to be held in 2022 (“White House Extends Invitation to Improve Open-Source Security”). We produced a position paper in advance of the meeting.

Timescales

Our security teams and project management committees are all volunteers, so we do not give any formal SLA on the handling of issues. However, we can break down our aims and goals for each part of the process:


Triage: Our aim is to handle incoming mail to the security@apache.org alias within three working days. We do not measure or report on this because we assess the severity of each incoming issue and apply the limited resources we have appropriately. The alias is staffed by a very small number of volunteers drawn from the different project PMCs. After the security team forwards a report to a PMC, the PMC will reply to the reporter. Sometimes reporters attach large PDF files or even movies of exploitation that don’t reach us due to size restrictions on incoming email, so please ensure any follow-ups are simple plain-text emails.


Investigation: Once a report is sent to the private list of the project's management committee, the time taken for triage and investigation varies depending on the project, the availability of resources, and the number of issues to be assessed. As security issues are dealt with in private, we send reports to a private list made up only of the PMC. These reports therefore do not reach every project committer, so a smaller set of people in each project is able to investigate and respond. As a general guideline we try to ensure projects have triaged issues within 90 days of the report. The ASF security team follows up on any untriaged issues over 90 days old.


Fix: Once a security issue is triaged and accepted, the timeline for fixing it depends on the schedules of the projects themselves. Issues of lower severity are most often held for pre-planned releases.


Announcement: Our process allows projects up to a few days between a fix release being pushed and the announcement of the vulnerability.  All vulnerabilities and mitigating software releases are announced via the announce@apache.org list.  We now aim to have them appear in the public CVE project list within a day of that announcement, and even quicker for critical issues.

Conclusion

The Apache Software Foundation projects are highly diverse and independent.  They have different languages, communities, management, and security models.  However one of the things every project has in common is a consistent process for how reported security issues are handled. 


The ASF Security Committee works closely with the project teams, communities, and reporters to ensure that issues get handled quickly and correctly.  This responsible oversight is a principle of The Apache Way and helps ensure Apache software is stable and can be trusted.


This report gave metrics for calendar year 2021, showing that from the ~18,500 emails received we triaged over 390 vulnerability reports relating to ASF projects, leading to 183 fixed issues with assigned CVE names. The number of non-spam threads dealt with was up 34% from 2020, the number of actual vulnerability reports up 17%, and the number of CVEs assigned up 21%.
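The year-over-year percentages quoted in this conclusion follow directly from the counts given in the statistics section; a quick illustrative check:

```python
# (2021, 2020) counts from the statistics section of this report
pairs = {
    "non-spam threads": (1272, 946),
    "vulnerability reports": (441, 376),
    "CVEs assigned": (183, 151),
}

# Percentage growth from 2020 to 2021, rounded to whole percent
growth = {k: round(100 * (new - old) / old) for k, (new, old) in pairs.items()}
print(growth)  # {'non-spam threads': 34, 'vulnerability reports': 17, 'CVEs assigned': 21}
```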


While the ASF often gets updates for critical issues out quickly, reports show that users are being exploited via old issues in ASF software that has not been updated for years, and vendors (and, thus, their users) still make use of end-of-life versions with known unfixed vulnerabilities. This will continue to be a big problem, and we are committed to engaging with this industry-wide issue to figure out what we can do to help.


If you have vulnerability information you would like to share, please contact us; for comments on this report, see the public security-discuss mailing list.

Tuesday December 14, 2021

Apache Log4j CVEs

The Apache Software Foundation project Apache Logging Services has responded to a security vulnerability that is described in two CVEs, CVE-2021-44228 and CVE-2021-45046. In this post we’ll list the CVEs affecting Log4j and keep a list of frequently asked questions. 

The most recent CVE has been addressed in Apache Log4j 2.16.0, released on 13 December. We recommend that users update to 2.16.0 if possible. While the 2.15.0 release addressed the most severe vulnerability, the fix in Log4j 2.15.0 was incomplete in some non-default configurations and could allow an attacker to execute a denial of service (DoS) attack. Users still on Java 7 should upgrade to the Log4j 2.12.2 release. 
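For projects that consume Log4j through Maven, the upgrade recommended above amounts to bumping the version of the Log4j 2 core artifact; a minimal sketch using the standard Log4j 2 Maven coordinates (your build may manage the version elsewhere, e.g. in a parent POM or BOM):

```xml
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <!-- 2.16.0 as recommended above; use 2.12.2 if you are still on Java 7 -->
  <version>2.16.0</version>
</dependency>
```

Remember to also check for transitive copies of log4j-core pulled in by other dependencies (e.g. via `mvn dependency:tree`), since an upgrade of your direct dependency alone may not remove vulnerable copies.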

CVE-2021-44228: Apache Log4j2 JNDI features do not protect against attacker controlled LDAP and other JNDI related endpoints

In Apache Log4j2 versions up to and including 2.14.1, the JNDI features used in configurations, log messages, and parameters do not protect against attacker-controlled LDAP and other JNDI related endpoints. An attacker who can control log messages or log message parameters can execute arbitrary code loaded from LDAP servers when message lookup substitution is enabled.

See the entire description and history on the Apache Logging security page.

CVE-2021-45046: Apache Log4j2 Thread Context Message Pattern and Context Lookup Pattern vulnerable to a denial of service attack

It was found that the fix to address CVE-2021-44228 in Apache Log4j 2.15.0 was incomplete in certain non-default configurations. 

This could allow attackers, in some situations, to craft malicious input data using a JNDI Lookup pattern, resulting in a DoS attack. Log4j 2.15.0 restricts JNDI LDAP lookups to localhost by default. Note that previous configuration-based mitigations, such as setting the system property log4j2.formatMsgNoLookups to true, do NOT mitigate this specific vulnerability.

See the entire description and history on the Apache Logging security page.

CVE-2021-4104: Deserialization of untrusted data in JMSAppender in Apache Log4j 1.2

Apache Log4j 1.x has been end-of-life since August 2015. However, we are aware that it is still a dependency for some applications and in use in some environments. We have found that Log4j 1.2, if used in a non-default configuration with JMSAppender used to perform JNDI requests, is vulnerable to deserialization of untrusted data when the attacker has write access to the Log4j configuration.

This is not the same vulnerability described in the recent Log4j 2.x CVEs, but it could also result in remote code execution (RCE), so we are providing this information to make users aware of the vulnerability and urge them to upgrade to Log4j 2.16.0 or 2.12.2, or to take steps to mitigate the issue by disabling the use of JMSAppender to perform JNDI requests.

Frequently Asked Questions about the Log4j vulnerabilities

In this section we’ll try to address some of the most common questions that our community and press have had about the Log4j vulnerabilities. 

What about systems or applications with Log4j 1.x?

While the Log4j 1.x series is not known to be affected by the two CVEs above, it has reached end of life and is no longer supported. Vulnerabilities reported after August 2015 against Log4j 1.x were not checked and will not be fixed. Users should upgrade to Log4j 2 to obtain security fixes.

How many systems have been impacted or how widespread is the impact of this CVE?

Log4j, like all software distributed by the Apache Software Foundation, is open source. It’s been distributed via a mirror system for many years and then more recently via a Content Delivery Network (CDN) directly to users and developers, and also to organizations who have then shipped it as part of their projects, products or services. 

We know that Log4j is included in a number of ASF projects, other open source projects and a number of products and services. But beyond that any numbers would merely be speculation and most likely wrong by a wide margin.

Are any other Apache projects impacted by the Log4j vulnerabilities?

Yes. The Apache Security Team has compiled a list of projects that are known to be affected with links to updates if available. See the Apache projects affected by log4j CVE-2021-44228 blog post.

Apache Log4j is the only Logging Services subproject affected by this vulnerability. Other projects like Log4net and Log4cxx are not impacted by this.

How can I get help?

If you need help building or configuring Log4j, or other help following the instructions to mitigate the known vulnerabilities listed here, please send your questions to the public Log4j Users mailing list.

If you have encountered an unlisted security vulnerability or other unexpected behavior that has security impact, or if the descriptions here are incomplete, please report them privately to the Log4j Security Team. Thank you.

Friday October 29, 2021

CloudStack Collaboration Conference 2021 - 8-12 November 2021

For the 9th year running, the global Apache CloudStack community will be holding its major event —CloudStack Collaboration Conference— on 8-12 November 2021. Due to the pandemic, the event will take place virtually to enable even more people interested in CloudStack technology to learn about its latest features, capabilities, and integrations. 


The 2021 edition of the CloudStack Collaboration Conference starts with a full hackathon day on 8 November. The next four days come with numerous exciting technical talks, as well as 5 different workshops that will give newcomers an in-depth overview of the power of CloudStack technology. A separate track focused on user success stories is expected to be an engaging draw, where attendees will learn about CloudStack implementations in companies including NTT Data, CloudOps, EWERK, Cloud.ca, and more.


One of the most promising talks at the event is the keynote, "Bringing digital services to 1.3 billion people with CloudStack". In this presentation, attendees will learn about Digital India, the Government of India's flagship program to realize its vision of transforming India into a digitally-empowered society and knowledge economy, powered by Apache CloudStack.


CloudStack Collaboration Conference is a free-to-join event, open to everybody seeking to learn more about one of the most powerful and mature Open Source Cloud management platforms. Registration is now open online and closes the day of the event.


A community-organised event, CloudStack Collaboration Conference is run entirely by volunteers and passionate enthusiasts. Thank you to 2021 CloudStack Collaboration Conference sponsors ShapeBlue, LINBIT, StorPool, XCP-ng, CloudOps, EWERK Group, and Versio for their partnership and commitment to delivering the event to the global CloudStack community.


Apache CloudStack originated in 2008 as a start-up project at Cloud.com, and rapidly evolved through the years to become a favored turnkey solution for Cloud builders, IaaS providers, telcos, and enterprises. CloudStack entered the Apache Incubator in 2012 and graduated as an Apache Top-Level Project in 2013, backed by a vibrant, diverse community of engineers, DevOps, Cloud architects, and C-level leaders united with the aim of advancing the Apache CloudStack project and its community.


# # #

Thursday October 14, 2021

Apache Software Foundation moves to CDN distribution for software

It’s not enough to create and release useful software. As an open source foundation, a major part of the Apache Software Foundation’s (ASF) job is to help get that software into the hands of users.

To do so, we’ve relied for many years on the contributions of individuals and organizations to provide mirror infrastructure to distribute our software. We’re now retiring that system in favor of a content distribution network (CDN), and taking a moment to say thank you to all the individuals and organizations who helped get ASF software into the hands of millions of users.

The history and function of the ASF mirror system

Today if you want to download the source or binaries for an ASF project, you’ll probably have it copied over before you can refill your coffee. But when the Apache Group (precursor to the foundation) first got its start, bandwidth was a lot more limited.

This was true for users and true for the limited resources available to those who wanted to distribute the software. As demand grew it became more than a single site could handle.

To share the load, we began to use a “mirror” system. That is, copies of the artifacts distributed to mirror sites that were closer to the users who wanted the software. Instead of all requests being served by a central server, the mirrors could sync up with the main site and then serve a portion of the audience looking to download Apache software.

The first mirror sites became available in April, 1995. Among the first mirror providers was SunSite, 'a network of Internet servers providing archives of information, software and other publicly available resources.'

In April 1997, Brian Behlendorf invited 66 people already hosting mirrors to join the 'mirror@' Apache mailing list. In June of the same year users could automatically be directed to a local mirror by a CGI script that would select the right mirror based on their country code.

Henk P. Penning joined the mirrors mailing list in 2002, and went on to become a major contributor to the system (among other things at the foundation). A mirror in 2002 would need to allocate a whopping 10 GB of space to handle all the artifacts available for download. Penning contributed to the ASF infrastructure until his passing in 2019.

Penning was joined in improving the mirror system by Gavin McDonald, who helped check for “stale” mirrors with out-of-date copies and sent reminders to the admins to keep them up to date. Eventually the team implemented a checker to do this automatically.

This elides a great deal of work, history, and dedication to providing open source software for the public good. Suffice it to say, the history of the mirror system (which you can read more about here) is the story of open source writ small: many individuals and organizations coming together to chop wood and carry water, laying infrastructure that many more will take for granted.

The present and future for distributing ASF software

Today, that 10 GB has grown to more than 180 GB for a mirror to carry all ASF software.

The industry has changed as well. Technology has advanced, bandwidth costs have dropped, and mirror systems are giving way to content delivery networks (CDNs).

After discussion and deliberation, the ASF’s Infrastructure team has decided to move our download system to a CDN with professional support and a service level appropriate to the foundation’s status in the technology world.

Our new delivery system is part of a global CDN with economies of scale and fast, reliable downloads around the world. We expect ASF users will see faster deployment of software, without any lag that one might usually see with a mirror system while local mirrors sync off the main instance.

ASF projects won’t see any difference in their workflow, just a faster delivery of open source artifacts to their users.

Once again, we’d like to thank all the contributors who’ve helped stand up mirrors over the past 20+ years. Without the mirror system to deliver our software, we would never have made it this far.

Thursday October 07, 2021

The Apache Software Foundation Announces Apache® OpenOffice® 4.1.11

Updates to security and availability of leading Open Source office document productivity suite

Wilmington, DE —7 October 2021— The Apache® Software Foundation (ASF), the world’s largest Open Source foundation, announced today Apache OpenOffice® 4.1.11, the popular Open Source office-document productivity suite.

Used by millions of organizations, institutions, and individuals around the world, Apache OpenOffice delivered 317M+ downloads* and provides more than $25M in value to users per day. Apache OpenOffice supports more than 40 languages, offers hundreds of ready-to-use extensions, and is the productivity suite of choice for governments seeking to meet mandates for using ISO/IEC standard Open Document Format (ODF) files.

"Users worldwide depend on OpenOffice to meet their office productivity needs," said Carl Marcum, Vice President of Apache OpenOffice. "We are proud to offer improved security and availability with our latest release. Businesses of all sizes across numerous industries, educational institutions, non-profits, digitally-inclusive communities, application developers, and countless others rely on Apache OpenOffice to efficiently create, manage, and deliver high-impact, integrated content."

Apache OpenOffice comprises six productivity applications: Writer (word processor), Calc (spreadsheet tool), Impress (presentation editor), Draw (vector graphics drawing editor), Math (mathematical formula editor), and Base (database management program). The OpenOffice suite ships for Windows, macOS, and Linux.

Apache OpenOffice v4.1.11
The 14th release under the auspices of the ASF, OpenOffice v4.1.11 reflects dozens of improvements, features, and bug fixes that include:

  • New Writer Fontworks gallery
  • Updated document types where hyperlink is allowed
  • Updated Windows Installer
  • Increased font size in Help


In addition, the project is mitigating 5 CVE (Common Vulnerabilities and Exposures) reports, three of which will be disclosed on 11 October, in coordination with The Document Foundation.

Apache OpenOffice delivers up to 2.4M downloads per month and is available as a free download to all users at 100% no cost, charge, or fees of any kind.

Apache OpenOffice is available on the Windows 11 Store as of 5 October 2021.

OpenOffice source code is available for anyone who wishes to enhance the applications. The Project welcomes contributions back to the project as well as its code community. Those interested in participating with Apache OpenOffice can learn more at https://openoffice.apache.org/get-involved.html .

* partial count: the number above reflects full-install downloads of Apache OpenOffice via SourceForge as of September 2021.

Tribute
Of special note, Apache OpenOffice 4.1.11 is dedicated to the memory of Dr. Patricia Shanahan, late member of the Apache OpenOffice Project Management Committee, former member of the ASF Board of Directors, former Vice President Apache River, and contributor to Apache Community Development. More information on Patricia can be found at the ASF's memorial page http://apache.org/memorials/patricia_shanahan.html . 

Availability and Oversight
Apache OpenOffice software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. The project strongly recommends that users download OpenOffice only from the official site https://www.openoffice.org/download/ to ensure that they receive the original software in the correct and most recent version.

About Apache OpenOffice
Apache OpenOffice is a leading Open Source office-document productivity suite comprising six productivity applications: Writer, Calc, Impress, Draw, Math, and Base. OpenOffice is based around the OpenDocument Format (ODF), supports 40+ languages, and ships for Windows, macOS, and Linux. OpenOffice originated as "StarOffice", created in 1985 by StarDivision, which was acquired by Sun Microsystems in 1999. The project was open-sourced under the name "OpenOffice.org", and development continued after Oracle Corporation acquired Sun Microsystems in 2010. OpenOffice entered the Apache Incubator in 2011 and graduated as an Apache Top-Level Project in October 2012. Apache OpenOffice delivers up to 2.4 million downloads each month and is the productivity suite of choice for hundreds of educational institutions and government organizations seeking to meet mandates for using ISO/IEC standard Open Document Format (ODF) files. For more information, including documentation and ways to become involved with Apache OpenOffice, visit https://openoffice.apache.org/ and https://twitter.com/ApacheOO .

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Replicated, Reprise Software, Talend, Target, Tencent Cloud, Union Investment, Workday, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF .

© The Apache Software Foundation. "Apache", "OpenOffice", "Apache OpenOffice", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

#  #  #

Tuesday September 21, 2021

Apache Ranger response to incorrect analyst report on Cloud data security

Introduction

A recent industry analyst report by GigaOm, sponsored by Immuta, comparing Apache Ranger to Immuta paints an incorrect picture of the complexity of using Apache Ranger. We believe the report contains a number of errors and inconsistencies. Unfortunately, the Apache Ranger Project Management Committee (PMC) was not contacted by the analyst firm during preparation of the report.


We have attempted to contact the authors and members of the research team several times, requesting the opportunity to review the inaccuracies and have them corrected. Despite our many attempts to rectify the misinformation, no one from the analyst firm responded.


For the benefit of existing and potential users of Apache Ranger, it is important for the Apache Ranger PMC to respond to this report with the facts.


Use cases

Let us now go through the scenarios covered in the report and see how the reported numbers change when Apache Ranger is used appropriately to address the requirements.


  • Scenario 1b: Mask All PII Data

    • Lists 2 policy changes in Immuta vs 5 in Apache Ranger. In fact, only one Apache Ranger policy would be needed to address this requirement.

    • This shows the author's lack of understanding of the Apache Ranger policy model. The series of allow/deny/deny-exception steps listed applies only to an access policy, not to a masking policy. Also, in access policies, allow/deny/deny-exception can be replaced by a switch named denyAllElse, as shown in the image below.

    • With the use of user groups or roles, a time-tested best practice followed universally by access control systems, this requirement can be met by a single Apache Ranger policy, as shown below.
      Masking policy:

Access policy:
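For readers who cannot see the screenshots, the sketch below shows roughly what such a single masking policy looks like as JSON for Ranger's public REST API (POST /service/public/v2/api/policy). The service, database, table, column, and role names are illustrative placeholders, and the exact field layout should be double-checked against the Ranger REST API documentation before use.

```python
import json

# Hedged sketch of a single Ranger masking policy: one policy masks all PII
# columns for everyone, with one exempt role that sees real values.
# Field names follow Ranger's public v2 policy API; verify against the docs.
def pii_masking_policy(service, database, table, pii_columns, exempt_role):
    return {
        "service": service,
        "name": f"mask-pii-{table}",
        "policyType": 1,                      # 0 = access, 1 = masking, 2 = row filter
        "resources": {
            "database": {"values": [database]},
            "table":    {"values": [table]},
            "column":   {"values": pii_columns},
        },
        "dataMaskPolicyItems": [
            {   # evaluated first: the exempt role retains the original values
                "roles":    [exempt_role],
                "accesses": [{"type": "select", "isAllowed": True}],
                "dataMaskInfo": {"dataMaskType": "MASK_NONE"},
            },
            {   # everyone else gets NULL back for the PII columns
                "groups":   ["public"],
                "accesses": [{"type": "select", "isAllowed": True}],
                "dataMaskInfo": {"dataMaskType": "MASK_NULL"},
            },
        ],
    }

policy = pii_masking_policy("dev_hive", "sales", "customers",
                            ["email", "ssn"], "pii_admins")
print(json.dumps(policy, indent=2))
```

One policy object covers all PII columns of the table, which is why a single policy change suffices for this scenario.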


  • Scenario 1c: Allow Email Domains Through the Masking Policy

    • Lists 2 policy changes in Immuta vs 5 in Apache Ranger. In fact, only one Apache Ranger masking policy would be needed to address this requirement, the same as in the previous scenario.

    • Claim: Apache Ranger does not have a regular expression masking policy

    • Truth: instead of building a virtualization layer, which can introduce significant complexity and performance penalties, Apache Ranger uses the native capabilities of the data processing application to perform masking and filtering. Since regular expressions are supported by such applications, it is simple to create a custom expression to suit your needs, such as for email addresses, account numbers, or credit card numbers; importantly, without having to drag in the security software vendor.


  • Scenario 1d: Add Two Users Access to All PII Data

    • Lists 1 policy change in Immuta vs 4 in Apache Ranger. However, the following step suggests that each user must be updated in the Immuta UI to add the necessary attributes. Wouldn't the number of steps then be as large as the number of users?

      • Added the AuthorizedSensitiveData > All attribute to each user in the Immuta UI.

    • Counts 4 policy changes in Apache Ranger, while the only change needed is to add the users (2 or 200 of them!) to a group or role. No policy changes are needed if the time-tested best practice is followed of referencing groups or roles in policies instead of individual users.
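To make the arithmetic concrete, here is a toy model (plain Python, not Ranger code or its APIs) of why referencing a role in policies means that onboarding users costs zero policy changes:

```python
# Toy model: policies reference a role name; role membership is maintained
# separately. Adding users touches only the role, never the policies.
policies = [
    {"name": "pii-access", "allowed_roles": ["pii_admins"]},
    {"name": "pii-masking-exempt", "allowed_roles": ["pii_admins"]},
]
roles = {"pii_admins": {"alice"}}

def grant_pii_access(users):
    """Add users to the role; the policy list is untouched."""
    roles["pii_admins"].update(users)

before = [dict(p) for p in policies]
grant_pii_access(["bob", "carol"])      # 2 users, or 200: same cost
assert policies == before               # 0 policy changes
print(sorted(roles["pii_admins"]))      # ['alice', 'bob', 'carol']
```

The same indirection applies whether the membership lives in a Ranger role, an LDAP group, or any external identity source that Ranger syncs from.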


  • Scenario 2a: Share Data With Managers

    • Lists 1 policy change in Immuta vs 101 in Apache Ranger. With the use of lookup tables, a common practice in enterprises, the requirement can be met with a single row-filter policy in Apache Ranger:

ss_store_sk in (select store_id from store_authorization where user_name=current_user())
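As a sanity check, the lookup-table row filter above can be exercised end to end with a small self-contained sketch. SQLite stands in for the real query engine here; since SQLite has no current_user() function, the session user is bound as a parameter, and the table contents are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE store_sales (ss_store_sk INTEGER, ss_net_paid REAL);
    CREATE TABLE store_authorization (store_id INTEGER, user_name TEXT);
    INSERT INTO store_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO store_authorization VALUES (1, 'alice'), (3, 'alice'), (2, 'bob');
""")

# The single row-filter predicate from the text, with current_user() replaced
# by a bound parameter for this sketch.
row_filter = """
    SELECT * FROM store_sales
    WHERE ss_store_sk IN (SELECT store_id FROM store_authorization
                          WHERE user_name = ?)
"""
rows = con.execute(row_filter, ("alice",)).fetchall()
print(rows)   # alice sees only stores 1 and 3
```

Reassigning a manager then means updating rows in store_authorization, with no policy edits at all, which is the point of the scenarios that follow.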


  • Scenario 2b: Merging Groups

    • Lists 0 policy changes in Immuta vs 1 in Apache Ranger. This is the same as the previous scenario, where the author chose not to follow the common practice of using lookup tables. With the use of a lookup table, as detailed above, no policy changes would be needed in Apache Ranger.


  • Scenario 2c: Share Additional Data With Managers

    • Lists 0 policy changes in Immuta vs 102 in Apache Ranger. Once again, with the use of a lookup table, only 2 policies would be required in Apache Ranger:

table store:
s_store_sk in (select store_id from store_authorization where user_name=current_user())

table store_returns:
sr_store_sk in (select store_id from store_authorization where user_name=current_user())


  • Scenario 2d: Reorganize Managers Into Regions

    • Lists 0 policy changes in Immuta vs 40 in Apache Ranger. As in the previous scenarios, with the use of a lookup table, no policy changes would be needed in Apache Ranger.


  • Scenario 2e: Restrict Data Access to Specific Countries

    • Lists 1 policy change in Immuta vs 71 in Apache Ranger. With the use of a lookup table, only one row-filter policy is needed in Apache Ranger.


  • Scenario 2f: Grant New User Group Access to All Rows by Default

    • Lists 0 policy changes in Immuta vs 30 in Apache Ranger. With the use of a lookup table, no additional policy would be needed in Apache Ranger.


  • Scenario 2g: Apply Policies to a Derived Data Mart

    • Lists 0 policy changes in Immuta vs 140 in Apache Ranger for the addition of 15 tables. With Apache Ranger, new tables can either be added to existing policies, or new policies can be created. This would require 15 policy updates in Apache Ranger, not 140 as claimed by the author. Also, no details on the changes to be made in Immuta (other than ‘0 policy changes’) are provided.


  • Scenario 3a: "AND" logic policy

    • Says "unable to meet requirement" in Apache Ranger, which is incorrect. The author does suggest a good approach to meet this requirement in Apache Ranger: creating a role containing the users who are in both groups, and referencing this role in policies. The point that Apache Ranger does not support policies conditioned on a user belonging to multiple groups is correct; however, this can easily be addressed with a custom condition extension. If there is enough interest from the user community, an enhancement to support this condition out of the box would be considered.


  • Scenario 3b: Conditional Policies

    • Says "unable to meet requirement" in Apache Ranger, which is incorrect. As mentioned earlier, Apache Ranger leverages expressions supported by the underlying data processing engine for masking and row-filtering. The requirement can easily be met with the following expression in the masking policy:

      CASE WHEN ((extract(year FROM current_date()) - birth_year) > 16) THEN {col} ELSE NULL END


There is no need to create views as suggested in the report.
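The conditional masking expression above can be verified with a quick self-contained sketch. SQLite again stands in for the real engine; extract(year FROM current_date()) is engine-specific, so a fixed reference year is bound as a parameter to keep the example deterministic, and the table data is invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (name TEXT, birth_year INTEGER, phone TEXT)")
con.executemany("INSERT INTO users VALUES (?, ?, ?)",
                [("minor", 2015, "555-0100"), ("adult", 1990, "555-0101")])

# The masking expression from the text, applied to the 'phone' column,
# with the current-year subexpression replaced by a bound reference year.
masked = con.execute("""
    SELECT name,
           CASE WHEN (? - birth_year) > 16 THEN phone ELSE NULL END AS phone
    FROM users
""", (2021,)).fetchall()
print(masked)   # [('minor', None), ('adult', '555-0101')]
```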


  • Scenario 3c: Minimization Policies

    • As mentioned in the report, Apache Ranger doesn't support policies that limit the number of records accessed. If there is enough interest from the user community, this enhancement would be considered.


  • Scenario 3d: De-Identification Policies

    • Says “unable to meet requirement” in Apache Ranger, which is incorrect. While Apache Ranger doesn’t address k-anonymity directly, the requirements can be implemented using Apache Ranger data masking policies, by setting up appropriate masking expressions for columns.

      • for columns that require a NULL value to be returned, set up a mask policy with type MASK_NULL

      • for columns that require a constant value, set up a mask policy with type CONSTANT and specify the desired value, such as “NONE”

      • for columns that require a ‘generalized’ value based on the existing value of the column, use custom expressions as shown below. This does require analyzing the table to arrive at the generalized values:
        CASE WHEN {col} < 20 THEN 16
            WHEN {col} BETWEEN 20 AND 29 THEN 26
            WHEN {col} BETWEEN 30 AND 39 THEN 36
            WHEN {col} BETWEEN 40 AND 49 THEN 46
            WHEN {col} BETWEEN 50 AND 59 THEN 56
            WHEN {col} BETWEEN 60 AND 69 THEN 66
            WHEN {col} BETWEEN 70 AND 79 THEN 76
            WHEN {col} BETWEEN 80 AND 89 THEN 86
            WHEN {col} BETWEEN 90 AND 99 THEN 96
            ELSE 106
        END
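The generalization expression can likewise be checked with a small self-contained sketch, with SQLite standing in for the real engine and {col} bound to an invented age column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE patients (age INTEGER)")
con.executemany("INSERT INTO patients VALUES (?)", [(17,), (25,), (83,), (101,)])

# The generalization expression from the text, with {col} replaced by 'age'.
buckets = con.execute("""
    SELECT CASE WHEN age < 20 THEN 16
                WHEN age BETWEEN 20 AND 29 THEN 26
                WHEN age BETWEEN 30 AND 39 THEN 36
                WHEN age BETWEEN 40 AND 49 THEN 46
                WHEN age BETWEEN 50 AND 59 THEN 56
                WHEN age BETWEEN 60 AND 69 THEN 66
                WHEN age BETWEEN 70 AND 79 THEN 76
                WHEN age BETWEEN 80 AND 89 THEN 86
                WHEN age BETWEEN 90 AND 99 THEN 96
                ELSE 106
           END AS age_bucket
    FROM patients
""").fetchall()
print([b[0] for b in buckets])   # [16, 26, 86, 106]
```

Every exact age maps to its decade's representative value, which is the generalization step that k-anonymity-style de-identification relies on.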

 

What the report doesn't talk about

It is important to take note of what the report doesn’t talk about. For example:


Extensibility: Apache Ranger’s open policy model and plugin architecture enable extending access control to other applications, including custom applications within an enterprise.


Adoption: wide acceptance of Apache Ranger by major cloud vendors like AWS, Azure, and GCP, and the availability of support from seasoned industry experts who continue to contribute to Apache Ranger and extend its reach.


Performance: the Apache Ranger policy engine is highly optimized, resulting in only a very small overhead (typically around 1 millisecond) to authorize accesses; importantly, there are no overheads in the data access path.


Security zones: Apache Ranger’s security zones allow different sets of policies to be applied to data in landing, staging, temp, and production zones. A security zone can consist of resources across applications, for example: S3 buckets/paths, Solr collections, Snowflake tables, Presto catalogs/schemas/tables, Trino catalogs/schemas/tables, Apache Kafka topics, and Synapse databases/schemas/tables.



Monday August 30, 2021

The Apache Drill Project Announces Apache® Drill(TM) v1.19 Milestone Release

Open Source, enterprise-grade, schema-free Big Data SQL query engine used by thousands of organizations, including Ant Group, Cisco, Ericsson, Intuit, MicroStrategy, Tableau, TIBCO, TransUnion, Twitter, and more.

Wilmington, DE —30 August 2021— The Apache Drill Project announced the release of Apache® Drill™ v1.19, the schema-free Big Data SQL query engine for Apache Hadoop®, NoSQL, and Cloud storage.

"Drill 1.19 is our biggest release ever," said Charles Givre, Vice President of Apache Drill. "With an already short learning curve, Drill 1.19 makes it even easier for users to quickly query, analyze, and visualize data from disparate sources and complex data sets."

An "SQL-on-Hadoop" engine, Apache Drill is easy to deploy, highly performant, able to quickly process trillions of records, and scalable from a single laptop to a 1000-node cluster. With its schema-free JSON model (the first distributed SQL query engine of its kind), Drill is able to query complex semi-structured data in situ without requiring users to define schemas or transform data. It provides plug-and-play integration with existing Hive and HBase deployments, and is extensible out-of-the-box to access multiple data sources such as Amazon S3, HDFS, Apache HBase, and Apache Hive. Additionally, Drill can directly query data from REST APIs, including platforms like Salesforce and ServiceNow.
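As an illustration of that REST surface, the sketch below builds the JSON body accepted by Drill's HTTP query endpoint (POST /query.json on the drillbit web port, 8047 by default). It only constructs the request, since actually sending it requires a running drillbit; the sample query uses Drill's bundled classpath (`cp`) storage plugin.

```python
import json

# Hedged sketch: build the request body for Drill's REST query endpoint.
def drill_query(sql):
    return {"queryType": "SQL", "query": sql}

body = drill_query("SELECT * FROM cp.`employee.json` LIMIT 3")
print(json.dumps(body))

# To send it (assuming a local drillbit is running):
#   curl -X POST -H 'Content-Type: application/json' \
#        -d '{"queryType":"SQL","query":"SELECT 1"}' \
#        http://localhost:8047/query.json
```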

Drill supports the ANSI SQL:2003 standard syntax as well as dozens of NoSQL databases and file systems, including Apache HBase, MongoDB, Elasticsearch, Cassandra, REST APIs, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, NAS, local files, and more. Drill works with familiar BI tools (such as Apache Superset, Tableau, MicroStrategy, QlikView, and Excel) as well as data virtualization and visualization tools, and runs interactive queries on Hive tables with different Hive metastores.

Apache Drill v1.19
Drill is designed from the ground up to support high-performance analysis on rapidly evolving data on modern Big Data applications. v1.19 reflects more than 100 changes, improvements, and new features that include:

  • New Connectors for Apache Cassandra, Elasticsearch, and Splunk.

  • New Format Reader for XML without schemas

  • Added Avro support for Kafka plugin

  • Integrated password vault for secure credential storage

  • Support for Linux ARM64 systems

  • Added limit pushdowns for file systems, HTTP REST APIs and MongoDB

  • Added streaming for Drill's REST API

  • Integration with Apache Airflow


Developers, analysts, business users, and data scientists use Apache Drill for data exploration and analysis for its enterprise-grade reliability, security, and performance. Drill's flexibility and ease-of-use have attracted thousands of users that include Ant Group, Cardlytics, Cisco, Ericsson, Intuit, MicroStrategy, Qlik, Tableau, TIBCO, TransUnion, Twitter, National University of Singapore, and more.

"Individuals, businesses, and organizations of all types rely on Apache Drill's rich functionality," added Givre. "We invite everyone to participate in our user and developer lists as well as our Slack channel, and contribute to the project to build on our momentum and help improve the future experience for all Drill users."

Catch Apache Drill in action at ApacheCon@Home, taking place online 21-23 September 2021. For more information and to register, visit https://www.apachecon.com/ .

Availability and Oversight
Apache Drill software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases.

About Apache Drill
Apache Drill is the Open Source, schema-free Big Data SQL query engine for Apache Hadoop, NoSQL, and Cloud storage. For more information, including documentation and ways to become involved with Apache Drill, visit http://drill.apache.org/ , https://twitter.com/ApacheDrill , and https://apache-drill.slack.com/ .

© The Apache Software Foundation. "Apache", "Drill", "Apache Drill", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

#  #  #

Monday August 02, 2021

The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project

Open Source distributed real-time Big Data analytics infrastructure in use at Amazon-Eero, Doordash, Factual/FourSquare, LinkedIn, Stripe, Uber, Walmart, Weibo, and WePay, among others.

Wilmington, DE —2 August 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Pinot™ as a Top-Level Project (TLP).

Apache Pinot is a distributed Big Data analytics infrastructure created to deliver scalable real-time analytics at high throughput with low latency. The project was first created at LinkedIn in 2013, open-sourced in 2015, and entered the Apache Incubator in October 2018.

"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Kishore Gopalakrishna, Vice President and original co-creator of Apache Pinot. "Pinot initially pushed the boundaries of real-time analytics by delivering insights to millions of LinkedIn users. Today, as an Apache Top-Level Project, Pinot is in the hands of developers across the globe who are building on it to power user-facing analytical applications and unlock the value of data within their organizations."

Scalable to trillions of records, Apache Pinot’s online analytical processing (OLAP) engine ingests both online and offline data sources, from Apache Kafka, Apache Spark, Apache Hadoop HDFS, flat files, and Cloud storage, in real time. Pinot is able to ingest millions of events and serve thousands of queries per second, providing unified analytics in a distributed, fault-tolerant fashion. Features include:

  • Speed —answers OLAP queries with low latency on real-time data

  • Pluggable indexing —Sorted, Inverted, Text Index, Geospatial Index, JSON Index, Range Index, Bloom filters

  • Smart Materialized Views - Fast Aggregations via star-tree index

  • Supports near real-time ingestion from different stream systems, including Apache Kafka, Confluent Kafka, and Amazon Kinesis, with customizable input formats and out-of-the-box support for Avro and JSON

  • Highly available, horizontally scalable, and fault tolerant

  • Supports lookup joins natively and full joins using PrestoDB/Trino

Apache Pinot is used to power internal and external analytics at Adbeat, Amazon-Eero, Cloud Kitchens, Confluera, Doordash, Factual/FourSquare, Guitar Center, LinkedIn, Publicis Sapient, Razorpay, Scale Unlimited, Startree, Stripe, Traceable, Uber, Walmart, Weibo, WePay, and more.

Examples of how Apache Pinot helps organizations across numerous verticals include: 1) a fintech company uses Pinot to achieve financial data visibility across 500+ terabytes of data and sustain half a million queries per second on financial transactions; 2) a food delivery service leveraged Pinot in the midst of the COVID-19 pandemic to analyze real-time data to provide a socially-distanced pick-up experience for its riders and restaurants; and 3) a large retail chain with geographically distributed franchises and stores uses Pinot for revenue-generating opportunities by analyzing real-time data for internal use cases, as well as real-time cart analysis to increase sales.

"We rely on Apache Pinot for all our real-time analytics needs at LinkedIn," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It's battle-tested at LinkedIn scale for hundreds of our low-latency analytics applications. We believe Apache Pinot is the best tool out there to build site-facing analytics applications and we will continue to contribute heavily and collaborate with the Apache Pinot community. We are very happy to see that it's now a Top-level Apache project."

"We use Apache Pinot in our real-time analytics platform to power external user-facing applications and critical operational dashboards," said Ujwala Tulshigiri, Engineering Manager at Uber. "With Pinot's multi-tenancy support and horizontal scalability, we have scaled to hundreds of use cases that run complex aggregations queries on terabytes of data at millisecond latencies, with the minimal overhead of cluster management."

"We've been using Apache Pinot since last year, and it's been a huge win for our client’s dashboard project," said Ken Krugler, President of Scale Unlimited. "Pinot's ability to rapidly generate aggregation results over billions of records, with modest hardware requirements, was critical for the success of the project. We've also been able to provide patches to add functionality and fix issues, which the Pinot community has quickly integrated and released. There was never any doubt in our minds that Pinot would graduate from the Apache incubator and become a successful top-level project."

"Last year, we started without analytics built into our product," said Pradeep Gopanapalli, technical staff member at Confluera. "By the end of the year, we were using Apache Pinot for real-time analytics in production. Not many of our competitors can even dream of having such results. We are very happy with our choice."

"Pinot is critical to our real-time analytics platform and allowed us to scale without degrading latency," said software engineer Elon Azoulay. "Pinot enables us to onboard large datasets effortlessly, run complex queries which return in milliseconds and is super reliable. We would like to emphasize how helpful and engaged the community is and are certain that we made the right choice with Pinot, it continues to impress us and satisfy our real-time analytics needs."

"We created Pinot at LinkedIn with the goal of tackling the low-latency OLAP problem for site-facing use cases at scale. We evolved it to solve numerous OLAP use cases, and open-sourced it because there aren't many technologies in that domain," said Subbu Subramaniam, member of the Apache Pinot Project Management Committee, and Senior Staff Engineer at LinkedIn. "It is heart-warming to see such a wide adoption and great contributions from the community in improving Pinot over time."

"We are at the beginning of this transformation and we cannot wait to see every software company build real-time applications using Apache Pinot," added Gopalakrishna. "We welcome everyone to join our community Slack channel and contribute to the project."

Catch Apache Pinot in action at ApacheCon Asia online on 7 August 2021. For more information and to register, visit https://www.apachecon.com/acasia2021/

Availability and Oversight
Apache Pinot software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Pinot, visit http://pinot.apache.org/ and https://twitter.com/ApachePinot

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Talend, Tencent, Target, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Pinot", "Apache Pinot", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday July 27, 2021

The Apache Cassandra Project Releases Apache® Cassandra™ v4.0, the Fastest, Most Scalable and Secure Cassandra Yet

Open Source enterprise-grade Big Data distributed database powers mission-critical deployments with improved performance and unparalleled levels of scale in the Cloud

Wilmington, DE —27 July 2021— The Apache Cassandra Project released today v4.0 of Apache® Cassandra™, the Open Source, highly performant, distributed Big Data database management platform.

"A long time coming, Cassandra 4.0 is the most thoroughly tested Cassandra yet," said Nate McCall, Vice President of Apache Cassandra. "The latest version is faster, more scalable, and bolstered with enterprise security features, ready-for-production with unprecedented scale in the Cloud."

As a NoSQL database, Apache Cassandra handles massive amounts of data across load-intensive applications with high availability and no single point of failure. Cassandra’s largest production deployments include Apple (more than 160,000 instances storing over 100 petabytes of data across 1,000+ clusters), Huawei (more than 30,000 instances across 300+ clusters), and Netflix (more than 10,000 instances storing 6 petabytes across 100+ clusters, with over 1 trillion requests per day), among many others. Cassandra originated at Facebook in 2008, entered the Apache Incubator in January 2009, and graduated as an Apache Top-Level Project in February 2010.

Apache Cassandra v4.0
Cassandra v4.0 effortlessly handles unstructured data, with thousands of writes per second. Three years in the making, v4.0 reflects more than 1,000 bug fixes, improvements, and new features that include:

  • Increased speed and scalability – streams data up to 5 times faster during scaling operations and delivers up to 25% faster throughput on reads and writes, yielding a more elastic architecture, particularly in Cloud and Kubernetes deployments.

  • Improved consistency – keeps data replicas in sync to optimize incremental repair for faster, more efficient operation and consistency across data replicas.

  • Enhanced security and observability – audit logging tracks user access and activity with minimal impact on workload performance. New capture-and-replay capability enables analysis of production workloads to help ensure regulatory and security compliance with SOX, PCI, GDPR, or other requirements.

  • New configuration settings – exposed system metrics and configuration settings give operators the flexibility and easy access to the data they need to optimize deployments.

  • Minimized latency – garbage collector pause times are reduced to a few milliseconds with no latency degradation as heap sizes increase.

  • Better compression – improved compression efficiency eases unnecessary strain on disk space and improves read performance.


Cassandra 4.0 is community-hardened and tested by Amazon, Apple, DataStax, Instaclustr, iland, Netflix, and others that routinely run clusters as large as 1,000 nodes and with hundreds of real-world use cases and schemas. 

The Apache Cassandra community deployed several testing and quality assurance (QA) projects and methodologies to deploy the most stable release yet. During the testing and QA period, the community generated reproducible workloads that are as close to real-life as possible, while effectively verifying the cluster state against the model without pausing the workload itself.

"In our experience, nothing beats Apache Cassandra for write scaling, and we're looking forward to the performance and management improvements in the 4.0 release," said Elliott Sims, Senior Systems Administrator at Backblaze. "We rely on Cassandra to manage over one exabyte of customer data and serve over 50 billion files for our customers across 175 countries so optimizing Cassandra's capabilities and performance means a lot to us."

"Since 2016, software engineers at Bloomberg have turned to Apache Cassandra because it’s easy to use, easy to scale, and always available," said Isaac Reath, Software Engineering Team Lead, NoSQL Infrastructure at Bloomberg. "Today, Cassandra is used to support a variety of our applications, from low-latency storage of intraday financial market data to high-throughput storage for fixed income index publication. We serve up more than 20 billion requests per day on a nearly 1 PB dataset across a fleet of 1,700+ Cassandra nodes."

"Netflix uses Apache Cassandra heavily to satisfy its ever-growing persistence needs on its mission to entertain the world. We have been experimenting and partially using the 4.0 beta in our environments and its features like Audit Logging and backpressure," said Vinay Chella, Netflix Engineering Manager and Apache Cassandra Committer. "Apache Cassandra 4.0's improved performance helps us reduce infrastructure costs. 4.0's stability and correctness allow us to focus on building higher-level abstractions on top of data store compositions, which results in increased developer velocity and optimized data store access patterns. Apache Cassandra 4.0 is faster, secure, and enterprise-ready; I highly suggest giving it a try in your environments today."

"Apache Cassandra's contributors have worked hard to deliver Cassandra 4.0 as the project's most stable release yet, ready for deployment to production-critical Cloud services," said Scott Andreas, Apache Cassandra Contributor. "Cassandra 4.0 also brings new features, such as faster host replacements, active data integrity assertions, incremental repair, and better compression. The project's investment in advanced validation tooling means that Cassandra users can expect a smooth upgrade. Once released, Cassandra 4.0 will also provide a stable foundation for development of future features and the database's long-term evolution."

Apache Cassandra is in use at Activision, Apple, Backblaze, BazaarVoice, Best Buy, Bloomberg Engineering, CERN, Constant Contact, Comcast, DoorDash, eBay, Fidelity, GitHub, Hulu, ING, Instagram, Intuit, Macy's, Macquarie Bank, Microsoft, McDonalds, Netflix, New York Times, Monzo, Outbrain, Pearson Education, Sky, Spotify, Target, Uber, Walmart, Yelp, and thousands of other companies that have large, active data sets. In fact, Cassandra is used by 40% of the Fortune 100. Select Apache Cassandra case studies are available at https://cassandra.apache.org/case-studies/ 

In addition to Cassandra 4.0, the Project also announced a shift to a yearly release cycle, with releases to be supported for a three-year term.

Catch Apache Cassandra in action through presentations from the April 2021 Cassandra World Party https://s.apache.org/jjv2d .

Availability and Oversight
Apache Cassandra software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Cassandra, visit https://cassandra.apache.org/ and https://twitter.com/cassandra .

About Apache Cassandra
Apache Cassandra is an Open Source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. Apache Cassandra is used in some of the largest data management deployments in the world, including nearly half of the Fortune 100.

© The Apache Software Foundation. "Apache", "Cassandra", "Apache Cassandra", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

#  #  #

Tuesday May 04, 2021

Media Alert: Apache OpenOffice Recommends upgrade to v4.1.10 to mitigate legacy vulnerability

Wilmington, DE —4 May 2021— 


Who:
Apache OpenOffice, an Open Source office-document productivity suite comprising six productivity applications: Writer, Calc, Impress, Draw, Math, and Base. The OpenOffice suite is based around the OpenDocument Format (ODF), supports 41 languages, and ships for Windows, macOS, Linux 64-bit, and Linux 32-bit. Apache OpenOffice delivers up to 2.4 million downloads each month.

What: A recently reported vulnerability affects all versions of OpenOffice through 4.1.9: the applications open non-http(s) hyperlinks in a way that could lead to untrusted code execution. 

The Apache OpenOffice Project has filed a Common Vulnerabilities and Exposures report with MITRE Corporation’s national vulnerability reporting system:

> CVE-2021-30245: Code execution in Apache OpenOffice via non-http(s) schemes in Hyperlinks
>
> Severity: moderate
>
> Credit: Fabian Bräunlein and Lukas Euler of Positive Security https://positive.security/blog/url-open-rce#open-libreoffice


The complete CVE report is available at https://www.openoffice.org/security/cves/CVE-2021-30245.html

How: Applications of the OpenOffice suite handle non-http(s) hyperlinks in an insecure way, allowing for 1-click code execution on Windows and Xubuntu systems via malicious executable files hosted on Internet-accessible file shares.

Why: The mitigation in Apache OpenOffice 4.1.10 ensures that a security warning is displayed, giving users the choice of whether to continue opening the hyperlink. Best practice dictates caution when opening documents from unknown or unverified sources. 
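The general class of fix can be illustrated with a scheme allowlist: before a hyperlink is dispatched to the operating system, its URI scheme is checked, and anything other than http(s) triggers a warning. The following is a minimal, illustrative sketch of that pattern in Python, not OpenOffice's actual implementation:

```python
from urllib.parse import urlparse

# Schemes considered safe to open without an extra prompt.
# (Illustrative allowlist; not OpenOffice's actual code.)
SAFE_SCHEMES = {"http", "https"}

def needs_warning(hyperlink: str) -> bool:
    """Return True when the link uses a non-http(s) scheme,
    e.g. smb:// or file://, and should trigger a security warning."""
    scheme = urlparse(hyperlink).scheme.lower()
    return scheme not in SAFE_SCHEMES

# A link to an executable on a remote file share is flagged...
assert needs_warning("smb://attacker.example/share/payload.exe")
# ...while an ordinary web link is not.
assert not needs_warning("https://www.openoffice.org/download/")
```

In the attack described above, the flagged smb:// case is exactly the one that previously launched without any prompt.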

When: The vulnerability predates OpenOffice entering the Apache Incubator. During the analysis of this issue, it was discovered that an incorrect bug fix was made by the StarOffice/OpenOffice.org developers preparing OpenOffice 2.0 in 2005, whilst under the auspices of Sun Microsystems. 


Where: Download Apache OpenOffice v4.1.10 at https://www.openoffice.org/download/

Apache OpenOffice Highlights

24 October 2020 — 300 million downloads of Apache OpenOffice
14 October 2020 — 20 year anniversary of OpenOffice
18 October 2016 — 200 million downloads of Apache OpenOffice
17 April 2014 — 100 million downloads of Apache OpenOffice
17 October 2012 — OpenOffice graduated as an Apache Top Level Project (TLP)
13 June 2011 — OpenOffice.org entered the Apache Incubator

[downloads are binary installation files]

For more information, visit https://openoffice.apache.org/ and https://twitter.com/ApacheOO

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with more than 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "OpenOffice", "Apache OpenOffice", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Sunday April 11, 2021

The Apache Software Foundation Welcomes 40 New Members

The Apache Software Foundation (ASF) welcomes the following new Members who were elected during the annual ASF Members' Meeting on 9 and 11 March 2021:

Maxime Beauchemin, Bolke de Bruin, Wei-Chiu Chuang, Jiangjie (Becket), Pablo Estrada, Dave Grove, Madhawa Kasun Gunasekara, Nathan Hartman, Tilman Hausherr, Georg Henzler, Xiangdong Huang, Nikita Ivanov, Yu Li, Geoff Macartney, Denis A. Magda, Carl Marcum, Matteo Merli, Aaron Morton, Aizhamal Nurmamat kyzy, Enrico Olivelli, Jaikiran Pai, Juan Pan, Pranay Pandey, Arun Patidar, Jarek Potiuk, Rodric Rabbah, Katia Rojas, Maruan Sahyoun, Aditya Sharma, Atri Sharma, Ankit Singhal, Michael Adam Sokolov, Simon Steiner, Benoit Tellier, Josh Thompson, Abhishek Tiwari, Sven Vogel, William Guo Wei, Ming Wen, Andrew Wetmore, and Liang Zhang.

The ASF incorporated in 1999 with a core membership of 21 individuals who oversaw the progress of the Apache HTTP Server. This group grew with Committers —developers who contributed code, patches, documentation, and other contributions, and were subsequently granted by the Membership:

  •  the right to "commit" or "write" directly to Apache code repositories as well as make non-code contributions;
  •  the right to vote on community-related decisions; and
  •  the ability to propose an active contributor for Committership.

Those Committers who demonstrate merit in the Foundation's growth, evolution, and progress are nominated for ASF Membership by existing Members.

This election brings the total number of ASF Members to 853 today. Individuals elected as ASF Members legally serve as the "shareholders" of the Foundation https://www.apache.org/foundation/governance/members.html

For more information on how the ASF works, see:

  •  How the ASF works http://www.apache.org/foundation/how-it-works.html
  •  Apache Is Open https://blogs.apache.org/foundation/entry/apache-is-open
  •  Briefing: The Apache Way http://apache.org/theapacheway/

# # #

Thursday March 11, 2021

Announcing New ASF Board of Directors

At The Apache Software Foundation (ASF) Annual Members' Meeting held this week, the following individuals were elected to the ASF Board of Directors:

  • Bertrand Delacretaz (current Director)
  • Roy Fielding (current Director)
  • Sharan Foga (new Director)
  • Justin Mclean (current Director)
  • Craig Russell (current Director)
  • Sam Ruby (current Director)
  • Roman Shaposhnik (former Director)
  • Sander Striker (current Director)
  • Sheng Wu (new Director)


The ASF thanks Shane Curcuru, Patricia Shanahan, and Niclas Hedhman (who resigned from the Board prior to the Members’ Meeting) for their service, and welcomes our new and returning directors.

An overview of the ASF's governance, along with the complete list of ASF Board of Directors, Executive Officers, and Project/Committee Vice Presidents, can be found at http://apache.org/foundation/

For more information on the Foundation's operations and structure, see http://apache.org/foundation/how-it-works.html#structure

# # #

Tuesday February 23, 2021

The Apache® Software Foundation Sustains its Mission of Providing Software for the Public Good through Corporate Sponsorships and Charitable Giving

World's largest Open Source foundation provides more than $22B worth of community-led software at 100% no charge to users worldwide.

Wilmington, DE —23 February 2021— The Apache® Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Corporate Sponsorship and Charitable Giving has enabled the Foundation to sustain its mission of providing software for the public good.

The ASF is the world's largest Open Source foundation. Apache software projects are integral to nearly every end-user computing device and benefit billions of users worldwide, with Web requests received from every Internet-connected country on the planet. Valued conservatively at more than $22B, Apache Open Source software is available to the public-at-large at 100% no cost. No payment of any kind is ever required to use, contribute to, or otherwise participate in Apache projects. The ASF depends on tax-deductible Sponsorships and donations to offset its operations expenses that include infrastructure, marketing and publicity, accounting, and legal services.

"We are proud of our Sponsors, whose generous support helps our volunteer community continue to develop essential software that keeps the world running," said Daniel Ruggeri, ASF Vice President of Fundraising. "ASF Sponsorship allows us to make great strides towards developing and improving our projects, enriching our communities, educating and mentoring newcomers, and encouraging and facilitating participation by under-represented groups. Fiscal support today secures the groundwork to ensure future Apache benefits can be shared by all."

ASF Sponsors include:

Platinum —Amazon Web Services, Facebook, Google, Huawei, Microsoft, Namebase, Pineapple Fund, Tencent, and Verizon Media.

Gold —Anonymous, Baidu, Bloomberg, Cloudera, Confluent, IBM, Indeed, Reprise Software, Union Investment, and Workday.

Silver —Aetna, Alibaba Cloud Computing, Capital One, Comcast, Didi Chuxing, Red Hat, and Target.

Bronze —Bestecasinobonussen.nl, Bookmakers, Casino2k, Cerner, Curity, Gundry MD, GridGain, Host Advice, HotWax Systems, LeoVegas Indian Online Casino, Miro-Kredit AG, Mutuo Kredit AG, Online Holland Casino, ProPrivacy, PureVPN, RX-M, RenaissanceRe, SCAMS.info, SevenJackpots.com, Start a Blog by Ryan Robinson, Talend, The Best VPN, The Blog Starter, The Economic Secretariat, Top10VPN, and Twitter.

In addition to ASF Sponsors, Targeted Sponsors provide in-kind support for select Foundation operations and initiatives that benefit Apache Projects and their communities. They include:

Platinum —Amazon Web Services, CloudBees, DLA Piper, JetBrains, Leaseweb, Microsoft, OSU Open Source Labs, Sonatype, and Verizon Media.

Gold —Atlassian, Datadog, Docker, PhoenixNAP, and Quenda.

Silver —HotWax Systems, Manning Publications, and Rackspace.

Bronze —Bintray, Education Networks of America, Friend of Apache Cordova, Hopsie, Google, No-IP, PagerDuty, Peregrine Computer Consultants Corporation, Sonic.net, SURFnet, and Virtru.

"We deeply appreciate the ongoing support over the course of this unprecedentedly challenging year," said Sally Khudairi, ASF Vice President of Sponsor Relations. "Widespread awareness of the value of The Apache Software Foundation has led organizations and individuals to reach deep and help ensure our day-to-day operations continue without interruption. We are grateful and humbled by the support."

Corporate Contributions
In addition to Sponsorship, a variety of Corporate Giving programs benefit the ASF. They include:

Annual Corporate Giving —organizations such as Bloomberg, IBM, Microsoft, PayPal, Vanguard, and many others offer tax benefits and provide their employees the ability to boost their support of a diverse set of nonprofit organizations that include the ASF.

Matching Gifts and Volunteer Grants —donations to the ASF can be doubled or tripled through a corporate matching gift program. Employers such as American Express, AOL, Bloomberg, IBM, and Microsoft match contributions and volunteer hours made by their employees.

Charitable Gifts and Payroll Giving —as an official charity in Benevity https://www.benevity.com/ , the Blackbaud Giving Fund https://blackbaudgivingfund.org/ , and other philanthropic giving distributors, the ASF benefits from numerous corporate giving initiatives, such as the Microsoft Tech Talent for Good volunteer program and Charles Schwab Charitable, among others.

Individual Donations
Individuals and organizations wishing to support Apache with one-time and recurring tax-deductible donations using a credit or debit card, PayPal, ACH electronic bank transfer, or Apple/Google/Microsoft Pay on their mobile device are invited to do so at https://donate.apache.org/ . Supporting Apache through an online purchase from Amazon, using cryptocurrency, mailing in a check, and other methods are also possible.

For more information, including ways to support the ASF, visit http://apache.org/foundation/contributing.html

Learn about the ASF's commitment to providing software for the public good in "Apache Everywhere" https://s.apache.org/ApacheEverywhere

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF .


© The Apache Software Foundation. "Apache", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday February 16, 2021

The Apache Software Foundation Announces Apache® Gobblin™ as a Top-Level Project

Open Source distributed Big Data integration framework in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Labs, Swisscom, Verizon, and more.

Wilmington, DE —16 February 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Gobblin™ as a Top-Level Project (TLP).

Apache Gobblin is a distributed Big Data integration framework used in both streaming and batch data ecosystems. The project originated at LinkedIn in 2014, was open-sourced in 2015, and entered the Apache Incubator in February 2017.

"We are excited that Gobblin has completed the incubation process and is now an Apache Top-Level Project," said Abhishek Tiwari, Vice President of Apache Gobblin and software engineering manager at LinkedIn. "Since entering the Apache Incubator, we have completed four releases and grown our community the Apache Way to more than 75 contributors from around the world."

Apache Gobblin is used to integrate hundreds of terabytes and thousands of datasets per day by simplifying the ingestion, replication, organization, and lifecycle management processes across numerous execution environments, data velocities, scale, connectors, and more.

"Originally creating this project, seeing it come to life and solve mission-critical problems at many companies has been a very gratifying experience for me and the entire Gobblin team," said Shirshanka Das, Founder and CTO at Acryl Data, and member of the Apache Gobblin Project Management Committee.

As a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems, Apache Gobblin makes the arduous task of creating and maintaining a modern data lake easy. It supports the three main capabilities required by every data team: 

  • Ingestion and export of data from a variety of sources and sinks into and out of the data lake while supporting simple transformations. 
  • Data Organization within the lake (e.g. compaction, partitioning, deduplication).
  • Lifecycle and Compliance Management of data within the lake (e.g. data retention, fine-grain data deletions) driven by metadata.
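The three capabilities above correspond to Gobblin's model of a pipeline of pluggable stages: records are pulled from a source, passed through converters, and handed to a writer, with organization steps such as deduplication applied along the way. As a rough conceptual sketch of that flow (in-memory stand-ins, not the actual Gobblin Java API; all names here are illustrative):

```python
# Conceptual sketch of a Gobblin-style ingestion pipeline:
# extract records from a source, run them through converters,
# then hand them to a writer. Names are illustrative only.

def dedupe(records):
    """Toy 'data organization' step: keep the first record per id."""
    seen, out = set(), []
    for r in records:
        if r["id"] not in seen:
            seen.add(r["id"])
            out.append(r)
    return out

def run_pipeline(source, converters, writer):
    """Pull every record from `source`, apply each converter in order,
    then append the result to `writer` (here, a plain list as sink)."""
    for record in source:
        for convert in converters:
            record = convert(record)
        writer.append(record)

# Tiny end-to-end example with in-memory stand-ins.
raw = [{"id": 1, "v": " a "}, {"id": 1, "v": " a "}, {"id": 2, "v": "b"}]
sink = []
run_pipeline(dedupe(raw), [lambda r: {**r, "v": r["v"].strip()}], sink)
# sink now holds the deduplicated, trimmed records
```

In Gobblin itself these stages are Java classes wired together by job configuration, and the same pipeline shape scales from a single process to thousands of containers, as the next quote notes.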

"Apache Gobblin supports deployment models all the way from a single-process standalone application to thousands of containers running in cloud-native environments, ensuring that your data plane can scale with your company’s growth," added Das.

Apache Gobblin is in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Laboratories, Swisscom, and Verizon, among many others.

"We chose Apache Gobblin as our primary data ingestion tool at Prezi because it proved to scale, and it is a swiss army knife of data ingestion," said Tamas Nemeth, Tech Lead and Manager at Prezi. "Today, we ingest, deduplicate, and compact more than 1200 Apache Kafka topics with its help, and this number is still growing. We are looking forward to continuing to contribute to the project and helping the community enable other companies to use Apache Gobblin."

"Apache Gobblin has been at the center stage of the data management story at LinkedIn. We leverage it for various use-cases ranging from ingestion, replication, compaction, retention, and more," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It is battle-tested and serves us well at exabyte scale. We firmly believe in the data wrangling capabilities that Gobblin has to offer, and we will continue to contribute heavily and collaborate with the Apache Gobblin community. We are happy to see that Gobblin has established itself as an industry standard and is now an Apache Top-Level Project."

"Open community and meritocracy are the key drivers for Apache Gobblin's success," added Tiwari. "We invite everyone interested in the data management space to join us and help shape the future of Gobblin."

Catch Apache Gobblin in action in the upcoming hackathon planned for late Q1 2021. Details will be posted on the Apache Gobblin mailing lists and Twitter feed listed below.

Availability and Oversight
Apache Gobblin software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Gobblin, visit https://gobblin.apache.org/ and https://twitter.com/ApacheGobblin 

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ 

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,000 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Gobblin", "Apache Gobblin", "Hadoop", "Apache Hadoop", "MapReduce", "Apache MapReduce", "Mesos", "Apache Mesos", "YARN", "Apache YARN", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #
