Entries tagged [open]

Monday May 09, 2022

The Apache News Round-up: week ending 6 May 2022

Welcome, May --we're opening the month with another great week. Here's what the Apache community has been up to:

ApacheCon – the ASF's official global conference series, bringing Tomorrow's Technology Today since 1998.
 - CFP open: ApacheCon Asia - 29-31 July (online) https://apachecon.com/acasia2022/cfp.html
 - CFP open: ApacheCon North America - 3-6 October (New Orleans) https://cfp.apachecon.com/
 - Travel Assistance applications open: for ApacheCon North America. Apply today https://apache.org/travel/

ASF Board – management and oversight of the business affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 18 May 2022. Running Board calendar and minutes are available.

ASF Infrastructure – our distributed team on three continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield uptime at 100.00%. Performance checks across 50 different service components spread over more than 250 machines in data centers around the world. View the ASF's Infrastructure Uptime site to see the most recent averages.

Apache Code Snapshot – Over the past week, 298 Apache Committers and 714 contributors changed 2,668,743 lines of code over 3,291 commits. Top 5 contributors, in order, are: Jean-Baptiste Onofré, Gary Gregory, Liang Zhang, Jiajing Lu, and Benoit Tellier. 

Apache Project Announcements – the latest updates by category.

Big Data --
 - Apache NiFi 1.16.1 released
   -- CVE-2022-29265: Improper Restriction of XML External Entity References in Multiple Components 
 - Apache Flink 1.15.0 released 

Content --
 - Apache Tika 1.28.2 and 2.4.0 released 
 - Apache PDFBox 3.0.0-alpha3 released 

Integration --
 - Apache Camel 3.11.7 (LTS) and 3.14.3 (LTS) released

Libraries --
 - Apache Jena CVE-2022-28890: Processing external DTDs

Messaging --
 - Apache ActiveMQ 5.16.5 and 5.17.1 released

Workflow --
 - Apache Airflow 2.3.0 released

Web Frameworks --
 - Apache Wicket 9.10.0 released


Did You Know?

- Did you know that the following Apache projects are celebrating anniversaries this month? Congratulations to Apache Geronimo (18 years); Tomcat (17 years); OpenJPA, POI, TomEE, Turbine (15 years); Libcloud (11 years); Giraph, ManifoldCF (10 years); Phoenix (8 years); Whimsy (7 years); Bahir, TinkerPop, Zeppelin (6 years); SystemDS (5 years); Traffic Control (4 years); Dubbo (3 years); Hudi, Iceberg (2 years). https://projects.apache.org/committees.html?date

- Did you know that the ASF Security team has opened a paid position for Security Response Program Manager? https://blogs.apache.org/security/entry/position-available-security-response-program

- Did you know that Japan's Nara Women's University's Researchers Database webapp is powered by Apache Wicket? https://wicket.apache.org/

Apache Community Notices

 - Apache in 2021 - By The Digits + Video highlights 

 - Watch "Trillions and Trillions Served", the documentary on the ASF 1) full feature [49 min] 2) "Apache Everywhere" [6 min] 3) "Why Apache" [2.5 min] 4) "Apache Innovation" [40 min] 

 - ASF Annual Report: FY2021 (PDF)

 - The Apache Way to Sustainable Open Source Success 

 - Foundation Reports and Statements

 - Presentations from 2021's ApacheCon Asia and ApacheCon@Home are available on the ASF YouTube channel.

 - "Success at Apache" focuses on the people and processes behind why the ASF "just works." 

 - Follow the ASF on social media: @TheASF on Twitter and the ASF page on LinkedIn

 - Follow the Apache Community on Facebook and Twitter

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos.


Stay updated about The ASF

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, Planet Apache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

Monday May 02, 2022

The Apache News Round-up: week ending 29 April 2022

Farewell, April --we're wrapping up the month with another great week. Here are the latest updates on the Apache community's activities:

ApacheCon – the ASF's official global conference series, bringing Tomorrow's Technology Today since 1998.
 - CFP open: ApacheCon Asia - 29-31 July (online) https://apachecon.com/acasia2022/cfp.html
 - CFP open: ApacheCon North America - 3-6 October (New Orleans) https://cfp.apachecon.com/
 - Travel Assistance applications open: for ApacheCon North America. Apply today https://apache.org/travel/

ASF Board – management and oversight of the business affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 18 May 2022. Running Board calendar and minutes are available.

ASF Infrastructure – our distributed team on three continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield uptime at 100.00%. Performance checks across 50 different service components spread over more than 250 machines in data centers around the world. View the ASF's Infrastructure Uptime site to see the most recent averages.

Apache Code Snapshot – Over the past week, 330 Apache Committers and 838 contributors changed 3,654,519 lines of code over 3,500 commits. Top 5 contributors, in order, are: Andi Huber, Liang Zhang, Jean-Baptiste Onofré, Tamas Cservenak, and Tim Allison.    

Apache Project Announcements – the latest updates by category.

Application Servers/Middleware --
 - Apache Karaf 4.3.7 and Karaf runtime 4.4.0 released

Big Data --
 - Apache CouchDB CVE-2022-24706: Remote Code Execution Vulnerability in Packaging 

Libraries --
 - Apache Log4cxx 0.13.0 released

Messaging --
 - Apache Qpid JMS 2.0.0 released 

Observability --
 - Apache SkyWalking Kubernetes Event Exporter 1.0.0 released

Programming Languages --
 - Apache Groovy 4.0.2 released 

Workflow --
 - New Apache Airflow Providers released


Did You Know?

- Did you know that enterprises seeking to meet their growing demand for rapid analytics with terabytes of real-time analytical data use Apache Ignite?

- Did you know that the CFP for Pulsar Summit (18 August/San Francisco) closes on 21 May?

- Did you know that Airflow Summit will be held 23-27 May online and free of charge? 

Monday April 25, 2022

The Apache News Round-up: week ending 22 April 2022

Hello, everyone --let's review the Apache community's activities from over the past week:

ApacheCon – the ASF's official global conference series, bringing Tomorrow's Technology Today since 1998.
 - CFP open: ApacheCon Asia - 29-31 July (online) https://apachecon.com/acasia2022/cfp.html
 - CFP open: ApacheCon North America - 3-6 October (New Orleans) https://cfp.apachecon.com/
 - Travel Assistance applications open: for ApacheCon North America. Apply today https://apache.org/travel/

ASF Board – management and oversight of the business affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 18 May 2022. Running Board calendar and minutes are available.

ASF Infrastructure – our distributed team on three continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield uptime at 99.99%. Performance checks across 50 different service components spread over more than 250 machines in data centers around the world. View the ASF's Infrastructure Uptime site to see the most recent averages.

Apache Code Snapshot – Over the past week, 306 Apache Committers and 772 contributors changed 17,057,661 lines of code over 3,534 commits. Top 5 contributors, in order, are: Michael Osipov, Benoit Tellier, Olivier Lamy, Gary Gregory, and Henrik Krohns.

Apache Project Announcements – the latest updates by category.

APIs --
 - Apache APISIX 2.13.1 released
    -- CVE-2022-29266: apisix/jwt-auth may leak secrets in the error response 
 - Apache ShenYu (Incubating) 2.4.3 released

Big Data --
 - Apache CouchDB 3.2.2 released 
 - Apache Beam 2.38.0 released 
 - Apache Kyuubi (Incubating) 1.5.1-incubating released 

Confidential Computing --
 - Apache Teaclave (incubating) 0.4.0 and Teaclave TrustZone SDK 0.2.0 released 

Content --
 - Apache PDFBox 2.0.26 released 

Messaging --
 - Apache Pulsar 2.9.2 and 2.10.0 released 

Middleware --
 - Apache Linkis 1.1.0 (incubating) released

Workflow --
 - Apache DolphinScheduler 3.0.0-alpha released 

Servers --
 - Apache TomEE 8.0.11 released

Did You Know?

- Did you know that recent projects undergoing development in the Apache Incubator include HugeGraph (graph database), Linkis (computational middleware), and SeaTunnel (Big Data integration)? https://incubator.apache.org/projects/

- Did you know that the ASF is the top-ranked Open Source not-for-profit organization with the most stars on GitHub? Also ranked #4 of all organizations https://gitstar-ranking.com/

- Did you know that the CFP for Ignite Summit (14 June - online) is now open? https://ignite-summit.org/

Monday April 18, 2022

The Apache News Round-up: week ending 15 April 2022

Happy Friday, everyone --here's what the Apache community has been up to over the past week:

ApacheCon – the ASF's official global conference series, bringing Tomorrow's Technology Today since 1998.
 - CFP for ApacheCon North America 2022 (taking place 3-6 October in New Orleans) is now open https://blogs.apache.org/conferences/entry/call-for-presentations-apachecon-north
 - Travel Assistance applications for ApacheCon are open until 1 July https://apache.org/travel/

ASF Board – management and oversight of the business affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 20 April 2022. Running Board calendar and minutes are available.

ASF Infrastructure – our distributed team on three continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield uptime at 99.99%. Performance checks across 50 different service components spread over more than 250 machines in data centers around the world. View the ASF's Infrastructure Uptime site to see the most recent averages.

Apache Code Snapshot – Over the past week, 402 Apache Committers and 1,048 contributors changed 17,760,803 lines of code over 5,646 commits. Top 5 contributors, in order, are: Otavio Rodolfo Piske, Andi Huber, Jinrui.Zhang, Liang Zhang, and Dillon Walls.   

Apache Project Announcements – the latest updates by category.

Incubator --where new Apache projects (aka "podlings") are mentored in the Apache Way of community-led development.
 - Apache brpc (incubating) 1.1.0 released

Attic --provides process and solutions when an Apache project has reached its end of life.
 - Apache River is now retired

Big Data --
 - Apache Bigtop 3.0.1 released
 - Apache ShardingSphere 5.1.1 released

Business Intelligence/Data Visualization --
 - Apache Superset CVE-2022-27479: SQL injection vulnerability in chart data API

Messaging --
 - Apache Qpid ProtonJ2 1.0.0-M5 released

Observability --
 - Apache SkyWalking 9.0.0, Client JS version 0.8.0, and Java Agent 8.10.0 released

Web Frameworks --
 - Apache Struts 2.5.30 released
 - Apache Struts CVE-2021-31805: Forced OGNL evaluation ...
 - Apache Wicket 9.9.1 released

Workflow --
 - New Apache Airflow Providers released

Did You Know?

- Did you know that Apache APISIX Summit Asia will be held online 20-21 May? https://s.apache.org/rhzue 

- Did you know that the next Apache Airflow Community Meetup is taking place on 20 April 2022? https://www.crowdcast.io/e/airflow-meetup-april/register

- Did you know that Apache Syncope identity management artifacts were downloaded 22.5K times over the last month? https://syncope.apache.org/

Tuesday February 08, 2022

Foundation Statement at the 8 February 2022 hearing of the Senate Committee on Homeland Security and Governmental Affairs

“Responding to and Learning from the Log4Shell Vulnerability”

Opening Statement by David Nalley

President, Apache Software Foundation

Senate Committee on Homeland Security and Governmental Affairs

February 8, 2022


    Chairman Peters, Ranking Member Portman, and distinguished members of the Committee: thank you for the invitation to appear this morning.

    My name is David Nalley, and I am the President of the Apache Software Foundation (ASF). The ASF is a non-profit public-benefit charity established in 1999 to facilitate the development of open source software. Thanks to the ingenuity and collaboration of our community of programmers, the ASF has grown into one of the largest open source organizations in the world. Today, more than 650,000 contributors around the world contribute to more than 350 ongoing projects, comprising more than 237 million lines of code.

    Open source is not simply a large component of the software industry -- it is one of the foundations of the modern global economy. Whether they realize it or not, most businesses, individuals, non-profits, or government agencies depend on open source; it is an indispensable part of America’s digital infrastructure.

    Projects developed from open source, like Log4j, tend to resolve problems that many people have, essentially serving as reusable building blocks for solving those problems. This enables faster innovation because it eliminates the need for every company or developer to reimplement software for already solved problems. This efficiency allows programmers to stand on the shoulders of giants. The ASF provides a vendor-neutral environment to enable interested programmers – oftentimes direct competitors of one another – to do this common work together in transparent, open-handed cooperation.

    This is the essence of open-source software: brilliant individuals contributing their time and expertise to do unglamorous work solving problems – many with the intent of incorporating the results into their employer’s products. And it’s why I’ve dedicated my professional life to it.

    Log4j – first released by Apache in 2001 – is the product of just this kind of collaboration. It performs a particular set of functions, like recording a computer’s operating events, so well that it has been used in products as diverse as storage management software, software development tools, virtualization software and (most famously) the Minecraft video game. As Log4j’s footprint grew over the years, so did its feature list. It was a 2013 addition to Log4j, along with a part of the Java programming environment, that combined in such a way that exposed this security flaw.

    The vulnerability was reported to Apache’s Log4j team in late November 2021, after having been latent for many years. The Apache Logging project and Apache’s Security team immediately got to work addressing the vulnerability in the code. The full solution was released approximately two weeks later. Given the near ubiquity of Log4j’s use, it may be months or even years before all deployed instances of this vulnerability are eliminated. As a software professional myself, I am proud of how the Logging project and the ASF’s security team (and many others across the ASF’s projects) responded and remediated last fall. We acted quickly and in accordance with practices we have adopted over many years of supporting a diverse set of open source projects. We will continue to develop our projects in responding to and preventing security vulnerabilities.

    Moreover, every stakeholder in the software industry – including its largest customers, like the federal government – should be investing in software supply chain security. While ideas like the Software Bills of Materials won’t prevent vulnerabilities, they can mitigate the impact by accelerating the identification of potentially vulnerable software. However, the ability to quickly update to the most secure and up-to-date versions remains a significant hurdle for the software industry.

    The reality is that humans write software, and as a result there will continue to be bugs, and despite best efforts some of those will include security vulnerabilities. As we continue to become ever more connected and digital, the number of vulnerabilities and potential consequences are likely to grow. There is no easy software security solution - it requires defense in depth – incorporating upstream development in open source projects, vendors that incorporate these projects, developers that make use of the software in custom applications, and even down to the organizations that deploy these applications to provide services important to their users.

    Rather than shying away from this risk, I submit that software developers, open-source communities, and federal policymakers should face it head-on together – with the determination and the vigilance it demands.

    Thank you again, and I look forward to answering any questions you might have.

Tuesday January 18, 2022

The Apache Software Foundation Announces Open Source data orchestration platform Apache® Hop™ as a Top-Level Project

Wilmington, DE —18 January 2022— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Hop™ as a Top-Level Project (TLP).

Apache Hop —the Hop Orchestration Platform— is a flexible, metadata-infused data orchestration, engineering, and integration platform. The project originated more than two decades ago as the Extract-Transform-Load (ETL) platform Kettle (Pentaho Data Integration), was refactored over several years, and entered the Apache Incubator in September 2020. 

"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Bart Maertens, Vice President of Apache Hop. "Apache Hop enables people of all skill levels to build powerful and scalable data solutions without the need to write code. As an Apache Top-Level Project, Hop is developed and used by people across the globe. Hop's full project life cycle support helps these data teams to successfully build, test and run their projects in ways that would otherwise be hard or impossible to do."

Using Apache Hop, data professionals can rapidly and affordably facilitate all aspects of data and metadata orchestration whilst supporting DevOps best practices, such as testing. Apache Hop’s Java-based visual designer, server, and configuration tools are easy to set up, deploy, and maintain across numerous platforms. Features include:

  • Lightweight “design once, run anywhere” architecture —workflows and pipelines can be designed in the Hop GUI and executed locally or remotely on the Hop native engine, on Apache Flink, Apache Kafka, Apache Spark, Google Dataflow, or AWS EMR through Apache Beam runtimes;

  • Metadata-driven —every object type in Hop describes how data is read, manipulated or written, or how workflows and pipelines need to be orchestrated. In addition, Hop itself is internally metadata-driven, using a kernel architecture with a robust engine; 

  • Visual development environment —intuitive drag-and-drop graphical user interface (GUI) enables developers to enjoy the ease and productivity of visual development rather than code. Using Hop, data engineers can focus on business logic and requirements rather than how it needs to be done;

  • Plug-in integration —more than 250 plugins make it easy to manage ecosystem complexity, and add new functionality; and

  • Built-in lifecycle management —enables developers, engineers, and administrators to manage, test, deploy, and switch between projects, workflows, pipelines, environments, purposes, Git versions and more —all from the Hop GUI.


Apache Hop has been designed to work in any scenario: on-premises, on a cloud, on a bare OS, in containers, IoT environments, large datasets, and more, on Windows, Linux, and OSX.

Many of the thousands of organizations in finance, retail, supply chain, and other sectors that use Kettle (Pentaho Data Integration; the precursor to Apache Hop) have started to look into Hop or already are in the process of upgrading to Hop.

"I'm very happy that we can now safely collaborate with any company or person across the global community under the umbrella of the Apache Software Foundation on something as cool as Apache Hop," said Matt Casters, Chief Solution Architect at Neo4j and member of the Apache Hop Project Management Committee.

"We started adopting Apache Hop in our data integration projects in early 2021 because of its flexibility, scalability and ease of use, in various scenarios ranging from classical DWH ETL processes to highly critical, real time processes," said Sergio Ramazzina, CEO and Chief Architect at Serasoft S.r.l., and member of the Apache Hop Project Management Committee. "We are impressed by how responsive the community is in solving issues and helping users approaching the platform --an important point to increase users adoption and trust. We welcome everyone joining our Hop community and contributing to the project."

"This graduation is just the beginning for Hop, and is proof that great communities build great software. The entire Hop community would like to thank the Apache Software Foundation for making this possible, especially our mentors who guided us through the Incubator," added Maertens. "We invite everyone to download and try Hop, join our chat and become part of the Hop community."

Catch Apache Hop in action at a future Hop community event. For more information and to register, visit https://hop.apache.org/community/events/ 

Availability and Oversight
Apache Hop software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Hop, visit https://hop.apache.org/ and https://twitter.com/ApacheHop 

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ 

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 820+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,400+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Replicated, Talend, Target, Tencent, Union Investment, Workday, and Yahoo!. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Hop", "Apache Hop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday October 07, 2021

The Apache Software Foundation Announces Apache® OpenOffice® 4.1.11

Updates to security and availability of leading Open Source office document productivity suite

Wilmington, DE —7 October 2021— The Apache® Software Foundation (ASF), the world’s largest Open Source foundation, announced today Apache OpenOffice® 4.1.11, the popular Open Source office-document productivity suite.

Used by millions of organizations, institutions, and individuals around the world, Apache OpenOffice has delivered 317M+ downloads* and provides more than $25M in value to users per day. Apache OpenOffice supports more than 40 languages, offers hundreds of ready-to-use extensions, and is the productivity suite of choice for governments seeking to meet mandates for using ISO/IEC standard Open Document Format (ODF) files.

"Users worldwide depend on OpenOffice to meet their office productivity needs," said Carl Marcum, Vice President of Apache OpenOffice. "We are proud to offer improved security and availability with our latest release. Businesses of all sizes across numerous industries, educational institutions, non-profits, digitally-inclusive communities, application developers, and countless others rely on Apache OpenOffice to efficiently create, manage, and deliver high-impact, integrated content."

Apache OpenOffice comprises six productivity applications: Writer (word processor), Calc (spreadsheet tool), Impress (presentation editor), Draw (vector graphics drawing editor), Math (mathematical formula editor), and Base (database management program). The OpenOffice suite ships for Windows, macOS, and Linux.

Apache OpenOffice v4.1.11
The 14th release under the auspices of the ASF, OpenOffice v4.1.11 reflects dozens of improvements, features, and bug fixes that include:

  • New Writer Fontworks gallery
  • Updated document types where hyperlink is allowed
  • Updated Windows Installer
  • Increased font size in Help


In addition, the project is mitigating 5 CVE (Common Vulnerabilities and Exposures) reports, three of which will be disclosed on 11 October, in coordination with The Document Foundation.

Apache OpenOffice delivers up to 2.4M downloads per month and is available as a free download to all users at 100% no cost, charge, or fees of any kind.

Apache OpenOffice is available on the Windows 11 Store as of 5 October 2021.

OpenOffice source code is available for anyone who wishes to enhance the applications. The Project welcomes contributions back to the project as well as its code community. Those interested in participating with Apache OpenOffice can learn more at https://openoffice.apache.org/get-involved.html .

* partial count: the number above reflects full-install downloads of Apache OpenOffice via SourceForge as of September 2021.

Tribute
Of special note, Apache OpenOffice 4.1.11 is dedicated to the memory of Dr. Patricia Shanahan, late member of the Apache OpenOffice Project Management Committee, former member of the ASF Board of Directors, former Vice President of Apache River, and contributor to Apache Community Development. More information on Patricia can be found at the ASF's memorial page http://apache.org/memorials/patricia_shanahan.html . 

Availability and Oversight
Apache OpenOffice software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. The project strongly recommends that users download OpenOffice only from the official site https://www.openoffice.org/download/ to ensure that they receive the original software in the correct and most recent version.

About Apache OpenOffice
Apache OpenOffice is a leading Open Source office-document productivity suite comprising six productivity applications: Writer, Calc, Impress, Draw, Math, and Base. OpenOffice is based around the OpenDocument Format (ODF), supports 40+ languages, and ships for Windows, macOS, and Linux. OpenOffice originated as "StarOffice" in 1985 by StarDivision, which was acquired by Sun Microsystems in 1999. The project was open-sourced under the name "OpenOffice.org", and continued development after Oracle Corporation acquired Sun Microsystems in 2010. OpenOffice entered the Apache Incubator in 2011 and graduated as an Apache Top-level Project in October 2012. Apache OpenOffice delivers up to 2.4 Million downloads each month and is the productivity suite of choice for hundreds of educational institutions and government organizations seeking to meet mandates for using ISO/IEC standard Open Document Format (ODF) files. For more information, including documentation and ways to become involved with Apache OpenOffice, visit https://openoffice.apache.org/ and https://twitter.com/ApacheOO .

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Replicated, Reprise Software, Talend, Target, Tencent Cloud, Union Investment, Workday, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF .

© The Apache Software Foundation. "Apache", "OpenOffice", "Apache OpenOffice", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

#  #  #

Monday August 30, 2021

The Apache Drill Project Announces Apache® Drill™ v1.19 Milestone Release

Open Source, enterprise-grade, schema-free Big Data SQL query engine used by thousands of organizations, including Ant Group, Cisco, Ericsson, Intuit, MicroStrategy, Tableau, TIBCO, TransUnion, Twitter, and more.

Wilmington, DE —30 August 2021— The Apache Drill Project announced the release of Apache® Drill™ v1.19, the schema-free Big Data SQL query engine for Apache Hadoop®, NoSQL, and Cloud storage.

"Drill 1.19 is our biggest release ever," said Charles Givre, Vice President of Apache Drill. "With an already short learning curve, Drill 1.19 makes it even easier for users to quickly query, analyze, and visualize data from disparate sources and complex data sets.”

An "SQL-on-Hadoop" engine, Apache Drill is easy to deploy, highly performant, able to quickly process trillions of records, and scalable from a single laptop to a 1000-node cluster. With its schema-free JSON model (the first distributed SQL query engine of its kind), Drill is able to query complex semi-structured data in situ without requiring users to define schemas or transform data. It provides plug-and-play integration with existing Hive and HBase deployments, and is extensible out-of-the-box to access multiple data sources, such as S3 and Apache HDFS, HBase, and Hive. Additionally, Drill can directly query data from REST APIs, including platforms such as Salesforce and ServiceNow.

Drill supports the ANSI SQL 2003 standard syntax ecosystem as well as dozens of NoSQL databases and file systems, including Apache HBase, MongoDB, Elasticsearch, Cassandra, REST APIs, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, NAS, local files, and more. Drill leverages familiar BI tools (such as Apache Superset, Tableau, MicroStrategy, QlikView and Excel) as well as data virtualization and visualization tools, and runs interactive queries on Hive tables with different Hive metastores.

Apache Drill v1.19
Drill is designed from the ground up to support high-performance analysis on rapidly evolving data on modern Big Data applications. v1.19 reflects more than 100 changes, improvements, and new features that include:

  • New Connectors for Apache Cassandra, Elasticsearch, and Splunk

  • New Format Reader for XML without schemas

  • Added Avro support for Kafka plugin

  • Integrated password vault for secure credential storage

  • Support for Linux ARM64 systems

  • Added limit pushdowns for file systems, HTTP REST APIs and MongoDB

  • Added streaming for Drill's REST API

  • Integration with Apache Airflow


Developers, analysts, business users, and data scientists use Apache Drill for data exploration and analysis for its enterprise-grade reliability, security, and performance. Drill's flexibility and ease-of-use have attracted thousands of users that include Ant Group, Cardlytics, Cisco, Ericsson, Intuit, MicroStrategy, Qlik, Tableau, TIBCO, TransUnion, Twitter, National University of Singapore, and more.

"Individuals, businesses, and organizations of all types rely on Apache Drill's rich functionality," added Givre. "We invite everyone to participate in our user and developer lists as well as our Slack channel, and contribute to the project to build on our momentum and help improve the future experience for all Drill users."

Catch Apache Drill in action at ApacheCon@Home, taking place online 21-23 September 2021. For more information and to register, visit https://www.apachecon.com/ .

Availability and Oversight
Apache Drill software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases.

About Apache Drill
Apache Drill is the Open Source, schema-free Big Data SQL query engine for Apache Hadoop, NoSQL, and Cloud storage. For more information, including documentation and ways to become involved with Apache Drill, visit http://drill.apache.org/ , https://twitter.com/ApacheDrill , and https://apache-drill.slack.com/ .

© The Apache Software Foundation. "Apache", "Drill", "Apache Drill", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

#  #  #

Monday August 02, 2021

The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project

Open Source distributed real-time Big Data analytics infrastructure in use at Amazon-Eero, Doordash, Factual/FourSquare, LinkedIn, Stripe, Uber, Walmart, Weibo, and WePay, among others.

Wilmington, DE —2 August 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Pinot™ as a Top-Level Project (TLP).

Apache Pinot is a distributed Big Data analytics infrastructure created to deliver scalable real-time analytics at high throughput with low latency. The project was first created at LinkedIn in 2013, open-sourced in 2015, and entered the Apache Incubator in October 2018.

"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Kishore Gopalakrishna, Vice President and original co-creator of Apache Pinot. "Pinot initially pushed the boundaries of real-time analytics by delivering insights to millions of Linkedin users. Today, as an Apache Top-Level Project, Pinot is in the hands of developers across the globe who are building it to power several user-facing  analytical applications and unlock the value of data within their organizations."

Scalable to trillions of records, Apache Pinot’s online analytical processing (OLAP) ingests both online and offline data sources from Apache Kafka, Apache Spark, Apache Hadoop HDFS, flat files, and Cloud storage in real time. Pinot is able to ingest millions of events and serve thousands of queries per second, and provide unified analytics in a distributed, fault-tolerant fashion. Features include:

  • Speed —answers OLAP queries with low latency on real-time data

  • Pluggable indexing —Sorted, Inverted, Text Index, Geospatial Index, JSON Index, Range Index, Bloom filters

  • Smart Materialized Views - Fast Aggregations via star-tree index

  • Supports different stream systems with near real-time ingestion —with Apache Kafka, Confluent Kafka, and Amazon Kinesis, as well as customizable input format, with out-of-the-box support for Avro and JSON formats

  • Highly available, horizontally scalable, and fault tolerant

  • Supports lookup joins natively and full joins using PrestoDB/Trino

Apache Pinot is used to power internal and external analytics at Adbeat, Amazon-Eero, Cloud Kitchens, Confluera, Doordash, Factual/FourSquare, Guitar Center, LinkedIn, Publicis Sapient, Razorpay, Scale Unlimited, Startree, Stripe, Traceable, Uber, Walmart, Weibo, WePay, and more.

Examples of how Apache Pinot helps organizations across numerous verticals include: 1) a fintech company uses Pinot to achieve financial data visibility across 500+ terabytes of data and sustain half a million queries per second with financial transactions; 2) a food delivery service leveraged Pinot in the midst of the COVID-19 pandemic to analyze real-time data to provide a socially-distanced pick-up experience for its riders and restaurants; and 3) a large retail chain with geographically distributed franchises and stores uses Pinot for revenue-generating opportunities by analyzing real-time data for internal use cases, as well as real-time cart analysis to increase sales.

"We rely on Apache Pinot for all our real-time analytics needs at LinkedIn," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It's battle-tested at LinkedIn scale for hundreds of our low-latency analytics applications. We believe Apache Pinot is the best tool out there to build site-facing analytics applications and we will continue to contribute heavily and collaborate with the Apache Pinot community. We are very happy to see that it's now a Top-level Apache project."

"We use Apache Pinot in our real-time analytics platform to power external user-facing applications and critical operational dashboards," said Ujwala Tulshigiri, Engineering Manager at Uber. "With Pinot's multi-tenancy support and horizontal scalability, we have scaled to hundreds of use cases that run complex aggregations queries on terabytes of data at millisecond latencies, with the minimal overhead of cluster management."

"We've been using Apache Pinot since last year, and it's been a huge win for our client’s dashboard project," said Ken Krugler, President of Scale Unlimited. "Pinot's ability to rapidly generate aggregation results over billions of records, with modest hardware requirements, was critical for the success of the project. We've also been able to provide patches to add functionality and fix issues, which the Pinot community has quickly integrated and released. There was never any doubt in our minds that Pinot would graduate from the Apache incubator and become a successful top-level project."

"Last year, we started without analytics built into our product," said Pradeep Gopanapalli, technical staff member at Confluera. "By the end of the year, we were using Apache Pinot for real-time analytics in production. Not many of our competitors can even dream of having such results. We are very happy with our choice."

"Pinot is critical to our real-time analytics platform and allowed us to scale without degrading latency," said software engineer Elon Azoulay. "Pinot enables us to onboard large datasets effortlessly, run complex queries which return in milliseconds and is super reliable. We would like to emphasize how helpful and engaged the community is and are certain that we made the right choice with Pinot, it continues to impress us and satisfy our real-time analytics needs."

"We created Pinot at LinkedIn with the goal of tackling the low-latency OLAP problem for site-facing use cases at scale. We evolved it to solve numerous OLAP use cases, and open-sourced it because there aren't many technologies in that domain," said Subbu Subramaniam, member of the Apache Pinot Project Management Committee, and Senior Staff Engineer at LinkedIn. "It is heart-warming to see such a wide adoption and great contributions from the community in improving Pinot over time."

"We are at the beginning of this transformation and we cannot wait to see every software company build real-time applications using Apache Pinot," added Gopalakrishna. "We welcome everyone to join our community Slack channel and contribute to the project."

Catch Apache Pinot in action at ApacheCon Asia online on 7 August 2021. For more information and to register, visit https://www.apachecon.com/acasia2021/

Availability and Oversight
Apache Pinot software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Pinot, visit http://pinot.apache.org/ and https://twitter.com/ApachePinot

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Talend, Tencent, Target, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Pinot", "Apache Pinot", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday May 04, 2021

Media Alert: Apache OpenOffice Recommends upgrade to v4.1.10 to mitigate legacy vulnerability

Wilmington, DE —4 May 2021— 


Who:
Apache OpenOffice, an Open Source office-document productivity suite comprising six productivity applications: Writer, Calc, Impress, Draw, Math, and Base. The OpenOffice suite is based around the OpenDocument Format (ODF), supports 41 languages, and ships for Windows, macOS, Linux 64-bit, and Linux 32-bit. Apache OpenOffice delivers up to 2.4 Million downloads each month.

What: A recently reported vulnerability affects all versions of OpenOffice through 4.1.9: they can open non-http(s) hyperlinks, which could lead to untrusted code execution.

The Apache OpenOffice Project has filed a Common Vulnerabilities and Exposures report with MITRE Corporation’s national vulnerability reporting system:

> CVE-2021-30245: Code execution in Apache OpenOffice via non-http(s) schemes in Hyperlinks
>
> Severity: moderate
>
> Credit: Fabian Bräunlein and Lukas Euler of Positive Security https://positive.security/blog/url-open-rce#open-libreoffice


The complete CVE report is available at https://www.openoffice.org/security/cves/CVE-2021-30245.html

How: Applications of the OpenOffice suite handle non-http(s) hyperlinks in an insecure way, allowing for 1-click code execution on Windows and Xubuntu systems via malicious executable files hosted on Internet-accessible file shares.

Why: The mitigation in Apache OpenOffice 4.1.10 ensures that a security warning is displayed, giving users the option of whether to continue opening the hyperlink. Best practice is to be careful when opening documents from unknown and unverified sources.
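
The kind of check described above can be sketched in a few lines of Python. This is an illustration only, not the OpenOffice implementation: the function names, the `SAFE_SCHEMES` set, and the `confirm` callback are all assumptions made for this sketch. It treats any hyperlink whose scheme is not http or https as requiring explicit confirmation before it is opened.

```python
from urllib.parse import urlsplit

# Schemes opened without a prompt (an assumption made for this sketch).
SAFE_SCHEMES = {"http", "https"}


def needs_security_warning(link: str) -> bool:
    """Return True when the hyperlink's scheme is anything other than http(s)."""
    scheme = urlsplit(link.strip()).scheme.lower()
    return scheme not in SAFE_SCHEMES


def should_open(link: str, confirm) -> bool:
    """Decide whether a hyperlink may be opened.

    `confirm` is a hypothetical callback that displays the warning and
    returns True only if the user chooses to continue.
    """
    if needs_security_warning(link):
        return bool(confirm(link))
    return True
```

For example, `should_open("smb://fileshare/tool.exe", confirm=lambda link: False)` declines to open the link, while an https link is opened without consulting the callback.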

When: The vulnerability predates OpenOffice entering the Apache Incubator. During the analysis of this issue, it was discovered that an incorrect bug fix was made by the StarOffice/OpenOffice.org developers preparing OpenOffice 2.0 in 2005, whilst under the auspices of Sun Microsystems. 


Where: Download Apache OpenOffice v4.1.10 at https://www.openoffice.org/download/

Apache OpenOffice Highlights

24 October 2020 — 300 million downloads of Apache OpenOffice
14 October 2020 — 20-year anniversary of OpenOffice
18 October 2016 — 200 million downloads of Apache OpenOffice
17 April 2014 — 100 million downloads of Apache OpenOffice
17 October 2012 — OpenOffice graduated as an Apache Top Level Project (TLP)
13 June 2011 — OpenOffice.org entered the Apache Incubator

[downloads are binary installation files]

For more information, visit https://openoffice.apache.org/ and https://twitter.com/ApacheOO

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with more than 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "OpenOffice", "Apache OpenOffice", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday April 08, 2021

The Apache Software Foundation Announces Apache® DolphinScheduler™ as a Top-Level Project

Open Source distributed Big Data visual workflow scheduler system in use at thousands of organizations, including Budweiser, China Unicom, IDG Capital, IBM China, JD.com, Lenovo, New Oriental, Nokia China, Qihoo 360, SF Express, and Tencent, among others.


Wilmington, DE —8 April 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® DolphinScheduler™ as a Top-Level Project (TLP).


Apache DolphinScheduler is a distributed, extensible visual Big Data workflow scheduler system. The project was first created at Analysys in December 2017, and entered the Apache Incubator in August 2019.


"We learned a lot about becoming a strong Open Source project during our time in the Apache Incubator," said Lidong Dai, Vice President of Apache DolphinScheduler. "Our incubation mentors helped guide us with developing our project and community the Apache Way. We are pleased to have graduated as an Apache Top-Level Project."


As a distributed and extensible data workflow scheduler platform with rich directed acyclic graph (DAG) visual interfaces, DolphinScheduler solves complex task dependencies and triggers in the data pipeline. Out-of-the-box, its easy-to-extend processing connects numerous systems to 100,000-level data task scheduling. Apache DolphinScheduler is:

  • Cloud Native —supports multi-cloud/data center workflow management, Kubernetes and Docker deployment, custom task types, and distributed scheduling, with overall scheduling capability increasing linearly with the scale of the cluster

  • Highly Reliable —decentralized multi-master and multi-worker architecture, with self-supported high availability and overload processing

  • User-Friendly —all process definition operations are visualized, with key information defined at a glance and one-click deployment

  • Supports Rich Scenarios —includes streaming, pause, recover operation, multi-tenant, and additional task types such as spark, hive, mr, shell, python, flink, sub_process, and more.
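
To illustrate how a directed acyclic graph (DAG) of task dependencies determines execution order, here is a minimal Python sketch using the standard library's `graphlib` module (Python 3.9+). It does not use DolphinScheduler's API; the task names and the `execution_batches` helper are hypothetical and exist only for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "extract_orders": set(),
    "extract_users": set(),
    "join": {"extract_orders", "extract_users"},
    "report": {"join"},
}


def execution_batches(dag):
    """Group tasks into batches whose dependencies are already satisfied.

    Tasks within one batch do not depend on each other, so a scheduler
    could hand them to different workers at the same time.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Calling `execution_batches(workflow)` groups the two extract tasks into the first batch, followed by `join` and then `report`.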

"Apache DolphinScheduler is designed for cloud-native," added Dai. "We are proud to have built a reliable and cloud friendly data workflow system while using next generation architecture and smart UI design."


Apache DolphinScheduler has more than 4,000 users in China, with Internet companies and banks forming a large percentage of users. Users include Budweiser, China Unicom, IDG Capital, IBM China, JD.com, Lenovo, New Oriental, Nokia China, Qihoo 360, SF Express, and Tencent, among others.


"Apache DolphinScheduler is an excellent data workflow open-source product," said Zhengjun Yin, Architect at China Unicom. "Its community is very friendly and gives us strong support. We save the cost of hundreds of human-months by using DolphinScheduler!"


"Apache DolphinScheduler is amazing," said Xide Gu, Architect at JD Logistics. "JD Logistics used Apache DolphinScheduler as  a stable and powerful platform to connect and control the data flow from various data sources in JDL, such as SAP Hana and Hadoop. It offers open API, easy plug-in and stable data flow development and scheduler environment. DolphinScheduler really helps JD Logistics data team accelerate development efficiency in many Agile BI projects!"

"I am honored to guide the DolphinScheduler community from day one of the incubating. In the past 1.5 years, it grows fast and healthy," said Sheng Wu, ASF Board Member and DolphinScheduler Incubator Champion. "They learned the Apache culture quickly, and have great executive capability. It is great to see the project graduating from the incubator with a diverse and active community. Being a top-level project is a new beginning for you, look forward to becoming a global and powerful project." "I am honored to witness the entire process of DolphinScheduler from open source to entry into the Apache incubator, and then to graduation to become an independent Apache top-level project," said Shi Shaofeng, Member of the Apache Kylin and Apache Incubator Project Management Committees. "During more than one year, the participants in the DolphinScheduler community have been adhering to the open-source spirit, constantly innovating and making progress. The developers and contributors join in the community constantly and make DolphinScheduler, a big data scheduling tool created by the Chinese, become more and more perfect, more and more users, and enter a virtuous cycle of development. It is expected that after graduation from the incubator, she will continue to move forward under the management of PMCs and create more value for society and the public through open-source software." "Congratulations to open source project DolphinScheduler for graduating from the Apache incubator and becoming ASF's top project," said Chen Liang, Vice President of Apache CarbonData. "DolphinScheduler has been developing the community in accordance with the Apache Way and has attracted many open-source developers to join. With the joint efforts of community members, the project has become more and more mature. Best wishes to the DolphinScheduler community!"


"We look forward to diversifying the Apache DolphinScheduler community with seed users from all over the world," added Dai. "Those interested in participating are welcome to reach out to us on our project mailing lists and other channels."


Catch Apache DolphinScheduler in action at its global MeetUp, held online in collaboration with the Apache ShardingSphere community, on 15 May 2021. Members of the DolphinScheduler and ShardingSphere Project Management Committees will share features and use cases on both projects in English. To register, visit https://www.meetup.com/dolphinscheduler-meetup-group/

Availability and Oversight

Apache DolphinScheduler software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache DolphinScheduler, visit https://DolphinScheduler.apache.org/ , https://twitter.com/DolphinSchedule , and https://asf-dolphinscheduler.slack.com/ .


About the Apache Incubator

The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ 


About The Apache Software Foundation (ASF)

Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF  


© The Apache Software Foundation. "Apache", "DolphinScheduler", "Apache DolphinScheduler", "ShardingSphere", "Apache ShardingSphere", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday March 24, 2021

The Apache® Software Foundation Celebrates 22 Years of Open Source Innovation "The Apache Way"

World's largest Open Source foundation provides $22B+ in community-led software 100% free of charge for the common good

Wilmington, DE —24 March 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today its 22nd Anniversary.

Originally established by the 21-member Apache Group, who oversaw the then-3-year-old Apache HTTP Server, the ASF today is the world's largest, vendor-neutral, Open Source foundation, comprising 800+ individual Members, 8,100+ Committers, and 40,000+ code contributors located on every continent. Conservatively valued at more than $22B, Apache’s 350+ projects and 37 incubating podlings are all freely-available to the public-at-large, at 100% no cost, and with no licensing fees.

"Over the past 22 years the ASF has evolved to meet the growing needs of the greater community," said Sander Striker, Board Chair of The Apache Software Foundation. "The ASF enables people from all over the world to collaborate, develop, and shepherd the projects and communities that are helping individuals, sustaining businesses, and transforming industries."

Advancing its mission of providing software for the public good, the ASF's projects are integral to nearly every aspect of modern computing, benefitting billions worldwide. The "Apache Way" process of community-led, collaborative development has led to breakthrough innovations in Artificial Intelligence and Deep Learning, Big Data, Build Management, Cloud Computing, Content Delivery and Management, Edge Computing and IoT, Fintech, Identity Management, Integration, Libraries, Messaging, Mobile, Search, Security, Servers, and Web Frameworks, among other categories. Projects undergoing development in the Apache Incubator span AI, Big Data, blockchain, Cloud computing, cryptography, deep learning, email, IoT, machine learning, microservices, mobile, operating systems, testing, visualization, and more.

Nearly half a million people participate in ASF projects and initiatives, including ApacheCon, the ASF's official global conference series; Community Development, which oversees contributor onboarding and mentoring and programs such as Google Summer of Code; and Diversity & Inclusion, whose programs promote diversity, equity, and inclusion across the greater Apache community.

The ASF's influence is everywhere —countless ubiquitous and mission-critical applications across dozens of industries are powered by Apache projects; the Apache License 2.0 was the top-ranked Open Source license in 2020 (source: WhiteSource); the Apache Way is the backbone for open development and inner source environments; and new users, developers, and enthusiasts are onboarding to the greater Apache community every day (the ASF has been a Google Summer of Code mentoring organization for the past 16 years, since the program's inception). The ASF is the top-ranked Open Source not-for-profit organization with the most stars on GitHub (source: GitHub).

A just-released feature on the ASF in FOSSlife [1] states, "The Apache project has undeniably changed the world … Apache remains a crucial Web server, the most popular in the field. For building Open Source communities, the lessons learned by creating the project still resonate throughout the open source world. Every project is advised to respect the Apache value of 'community over code'."

ASF operations bolster Apache projects and their communities with infrastructure support, bandwidth, connectivity, servers, hardware, development environments, legal counsel, accounting services, trademark protection, marketing and publicity, educational events, and related administrative assistance. As a United States private 501(c)(3) not-for-profit charitable organization, the ASF's day-to-day operating expenses are offset through tax-deductible sponsorships, corporate contributions, and individual donations. Current ASF Sponsors are:

Platinum: Amazon Web Services, Facebook, Google, Huawei, Microsoft, Namebase, Pineapple Fund, Tencent, and Verizon Media.

Gold: Anonymous, Baidu, Bloomberg, Cloudera, Confluent, IBM, Indeed, Reprise Software, Union Investment, and Workday.

Silver: Aetna, Alibaba Cloud Computing, Capital One, Comcast, Didi Chuxing, Red Hat, and Target.

Bronze: Bestecasinobonussen.nl, Bookmakers, Casino2k, Cerner, Curity, GridGain, Gundry MD, Host Advice, HotWax Systems, Journal Review, LeoVegas Indian Online Casino, Miro-Kredit AG, Mutuo Kredit AG, Online Holland Casino, ProPrivacy, PureVPN, RX-M, RenaissanceRe, SCAMS.info, SevenJackpots.com, Start a Blog by Ryan Robinson, Talend, The Best VPN, The Blog Starter, The Economic Secretariat, Top10VPN, Twitter, and Writers Per Hour.

Targeted Platinum: Amazon Web Services, CloudBees, DLA Piper, Fastly, JetBrains, Leaseweb, Microsoft, OSU Open Source Labs, Sonatype, and Verizon Media.

Targeted Gold: Atlassian, Datadog, Docker, PhoenixNAP, and Quenda.

Targeted Silver: HotWax Systems, Manning Publications, and Rackspace.

Targeted Bronze: Bintray, Education Networks of America, Friend of Apache Cordova, Google, Hopsie, No-IP, PagerDuty, Peregrine Computer Consultants Corporation, Sonic.net, SURFnet, and Virtru.

"Baidu has always maintained close cooperation with Apache Software Foundation. In the past, we donated Apache ECharts, Apache Doris, Apache brpc, and Apache Teaclave. We are very grateful to Apache way for promoting the growth of these projects and enabling Baidu to make greater contributions to the open source world together with ASF."
—Zhenyu Hou, Corporate Vice President of Baidu Group

"Congratulations to the Apache Software Foundation on its twenty-second anniversary! If it were not for ASF's work to incubate and steward open source projects, the internet community would not be thriving to the same degree. Open source is enabling our digital prosperity, and the ASF plays a key, behind-the-scenes role in this. We share their vision for the availability of trustworthy open-source software and are proud to be a sponsor."
—Travis Spencer, CEO of Curity

"Congratulations to the 22nd anniversary of the Apache Software Foundation! Didi Chuxing is more than honored to join the Apache family as a corporate sponsor this year. At Didi, our developers utilize and contribute to many Apache projects such as Hadoop, Kylin, and Flink etc. Sharing the same “Community Over Code” principle, we hope to drive more innovations with Apache and we look forward to further collaborations!"
—Yunbo Wang, Director of Technical Community and Open Source at Didi Chuxing

"Facebook was originally built on a stack using the Apache HTTP Server, and it's one of the many reasons we've been sponsoring, advocating, utilizing, and contributing to the ASF for the past 10 years. We're proud to be a part of the ASF community and look forward to continued support of its mission to provide Open Source software for the public good."
—Joel Marcey, Open Source Developer Advocate and Ecosystem Lead at Facebook

"We are honored to be a part of and proud to support the ASF! The Apache community continues to be an incredibly valuable resource for HotWax. Contributing to and receiving from the ASF remains a central focal point for our business, and an important part of our team philosophy."
—Mike Bates, CEO of HotWax Systems

"It is an honor to support Apache, an organization responsible for such an astounding amount of Open Source projects that truly make up the fabric of the Internet. Here's to all that's been accomplished in the last 22 years – we can't wait to see what the future of open development brings."
—Robert van der Meulen, Global Product Strategy Lead at Leaseweb

"We're extending a big congratulations to the Apache Software Foundation on their 22nd anniversary! The ASF has been a key driver for the success of open source software models and community-led development for over two decades. Microsoft is honored to engage with and contribute to the Apache community across many facets of our business including Azure big data, Hadoop and Spark – and we look forward to continuing the collaboration."
—Stormy Peters, Director of Open Source Programs Office at Microsoft

"Congratulations to the Apache Software Foundation on its 22nd anniversary! Tencent has been a user and contributor to the projects at ASF. Many developers from Tencent have been actively involved with the ASF projects as Chair or PMC. We look forward to continuing our collaboration and creating more open-source innovations with 'The Apache Way'."
—Mark Shan, Chair of Tencent Open Source Alliance


[1] FOSSlife "How the Apache Project Boosted the Free and Open Source Software Movements" https://www.fosslife.org/how-apache-project-boosted-free-and-open-source-software-movements

Additional ASF Resources

 - "Trillions and Trillions Served" documentary on the ASF https://s.apache.org/Trillions-Feature

 - About The Apache Way http://apache.org/theapacheway/

 - The Apache Way to Sustainable Open Source Success https://s.apache.org/GhnI

 - FY2020 Annual Report https://s.apache.org/FY2020AnnualReport

 - Ways to support the ASF http://apache.org/foundation/contributing.html


About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world's largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF's all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Apache HTTP Server", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday March 04, 2021

The Apache Software Foundation Announces Apache® Daffodil™ as a Top-Level Project

Open Source universal data interchange implementation of the Data Format Description Language (DFDL) standard in use at DARPA, GE Research, Naval Postgraduate School, Owl Cyber Defense, Perspecta Labs, and Raytheon BBN Technologies, among others.

Wilmington, DE —4 March 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Daffodil™ as a Top-Level Project (TLP).

Apache Daffodil is an Open Source implementation of the Data Format Description Language 1.0 specification (DFDL; the Open Grid Forum open standard framework for describing the attributes of any data format [1]) to enable universal data interchange. The project was first created at the University of Illinois National Center for Supercomputing Applications (NCSA) in 2009, and entered the Apache Incubator in August 2017.

"We're extremely excited that Apache Daffodil has achieved this important milestone in its development. The Daffodil DFDL implementation is a game changer in complex text and binary data interfaces and creates massive opportunities for organizations to easily implement highly sophisticated processes like data decomposition, inspection, and reassembly," said Michael Beckerle, Vice President of Apache Daffodil. "Instead of spending a lot of time worrying about how to deal with so many kinds of data that you need to take in, from day one you can convert all sorts of data into XML, or JSON, or your preferred data structure, and convert back if you need to write data out in its original format."
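
The round trip described above (describe a format once, parse raw data into a neutral structure, then write it back out unchanged) can be illustrated with a minimal sketch. This is not DFDL syntax and does not use the Daffodil API: the `SCHEMA` list, the field names, and the `parse`/`unparse` helpers are hypothetical stand-ins built on Python's standard `struct` and `json` modules, under the assumption of a fixed-width, big-endian record.

```python
import json
import struct

# Hypothetical toy "schema": (field name, struct format code), big-endian.
# This only mimics the idea of a declarative format description.
SCHEMA = [("msg_id", "H"), ("temperature_c", "h"), ("label", "4s")]
FORMAT = ">" + "".join(code for _, code in SCHEMA)


def parse(data: bytes) -> dict:
    """Turn raw bytes into a plain dict, driven entirely by SCHEMA."""
    values = struct.unpack(FORMAT, data)
    record = {}
    for (name, code), value in zip(SCHEMA, values):
        # Fixed-width byte strings are exposed as ASCII text.
        record[name] = value.decode("ascii") if code.endswith("s") else value
    return record


def unparse(record: dict) -> bytes:
    """Write a dict back out in the original binary layout."""
    values = []
    for name, code in SCHEMA:
        value = record[name]
        values.append(value.encode("ascii") if code.endswith("s") else value)
    return struct.pack(FORMAT, *values)


raw = bytes([0x00, 0x2A, 0xFF, 0xF6]) + b"TEMP"
record = parse(raw)
print(record)  # {'msg_id': 42, 'temperature_c': -10, 'label': 'TEMP'}

as_json = json.dumps(record)                  # neutral representation
round_tripped = unparse(json.loads(as_json))  # back to the original bytes
print(round_tripped == raw)                   # True
```

Because the same description drives both directions, adding a field in this toy means changing one list rather than two hand-written routines, which is the property the quote attributes to a schema-driven approach.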

Apache Daffodil is particularly useful in large-scale organizations, such as governments and large corporations, where massive amounts of complex and legacy data must be exchanged and made accessible every day. Daffodil is also particularly useful in cybersecurity, where data must be inspected for correctness and sanitized.

Apache Daffodil is in use at major global organizations that include DARPA, GE Research, Naval Postgraduate School, Owl Cyber Defense, Perspecta Labs, and Raytheon BBN Technologies, among others.

"We are using Daffodil to translate DFDL schema specifications into code for our Monitoring & INspection Device (MIND) as part of our work on DARPA’s Guaranteed Architecture for Physical Security (GAPS) program," said Bill Smith, Principal Engineer at GE Research. "One of our engineers has joined the Apache Daffodil Project Management Committee and is building out the new DFDL-to-C backend on a dedicated Daffodil development branch. We are now translating DFDL schemas provided by other DARPA GAPS performers to C code suitable for the small resource-constrained controllers in our MIND device. When complete, Daffodil's DFDL-to-C backend will give us the ability to annotate DFDL schemas with security policies and rapidly reconfigure our MIND device for different mission security profiles."

"Apache Daffodil is an important asset to our cross domain solutions technology stack, allowing Owl to support our customers by extending our filtering capabilities to new data types faster and with less risk," said Ken Walker, CTO at Owl Cyber Defense. "It's directly in line with our company priorities, as supporters of the Open Source community, and highly beneficial to our product lines to have this high-quality Open Source implementation of DFDL to support challenging, sometimes proprietary data formats, such as Link16, VMF, USMTF, OSIsoft PI System, and JANAP-128, without the need to develop additional software. DFDL enables our Raise-the-Bar compliant cross domain solutions to support new data types without additional rounds of lengthy lab-based testing and recertification."

"The DFDL open spec and the Apache Daffodil implementation have helped us tremendously in parsing and transforming fixed-format data in a variety of different R&D projects at BBN," said Michael Atighetchi, Lead Scientist at Raytheon BBN Technologies. "Sharing parsers through a vendor-neutral XML representation is a game changer that enables a significant speedup in developing, maturing, and transitioning advanced capabilities to help war fighters."

"Our research on applying Data Format Description Language (DFDL) is exploring how to unlock and archive a plethora of diverse data streams from unmanned systems," said Don Brutzman, Naval Postgraduate School. "Both the DFDL standard and the Apache Daffodil open-source implementation provide a big benefit for these potential capabilities. Continuing work at Naval Postgraduate School (NPS) Consortium for Robotics and Unmanned Systems Education and Research (CRUSER) hopes to make telemetry from field experimentation and simulation repeatably tractable for Big Data analytics."

"Graduation to a TLP recognizes that the Apache Daffodil project follows the rigorous software development practices that have made so many of ASF projects trusted and successful," added Beckerle. "With the increasing interest in Big Data, interoperability, and protection from malicious data, we welcome new contributors to help us further grow the Apache Daffodil community."

[1] Data Format Description Language (DFDL) v1.0 Specification https://www.ogf.org/documents/GFD.240.pdf

Availability and Oversight
Apache Daffodil software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Daffodil, visit https://daffodil.apache.org/ and https://twitter.com/ApacheDaffodil 

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,100 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Daffodil", "Apache Daffodil", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday February 16, 2021

The Apache Software Foundation Announces Apache® Gobblin™ as a Top-Level Project

Open Source distributed Big Data integration framework in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Labs, Swisscom, Verizon, and more.

Wilmington, DE —16 February 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Gobblin™ as a Top-Level Project (TLP).

Apache Gobblin is a distributed Big Data integration framework used in both streaming and batch data ecosystems. The project originated at LinkedIn in 2014, was open-sourced in 2015, and entered the Apache Incubator in February 2017.

"We are excited that Gobblin has completed the incubation process and is now an Apache Top-Level Project," said Abhishek Tiwari, Vice President of Apache Gobblin and software engineering manager at LinkedIn. "Since entering the Apache Incubator, we have completed four releases and grown our community the Apache Way to more than 75 contributors from around the world."

Apache Gobblin is used to integrate hundreds of terabytes and thousands of datasets per day by simplifying the ingestion, replication, organization, and lifecycle management processes across numerous execution environments, data velocities, scale, connectors, and more.

"Originally creating this project, seeing it come to life and solve mission-critical problems at many companies has been a very gratifying experience for me and the entire Gobblin team," said Shirshanka Das, Founder and CTO at Acryl Data, and member of the Apache Gobblin Project Management Committee.

As a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems, Apache Gobblin makes the arduous task of creating and maintaining a modern data lake easy. It supports the three main capabilities required by every data team: 

  • Ingestion and export of data from a variety of sources and sinks into and out of the data lake while supporting simple transformations. 
  • Data Organization within the lake (e.g. compaction, partitioning, deduplication).
  • Lifecycle and Compliance Management of data within the lake (e.g. data retention, fine-grained data deletions) driven by metadata.
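
As a rough illustration of what the organization and lifecycle capabilities listed above mean in practice, the following minimal sketch deduplicates records, partitions them by date, and applies a retention window. It does not use Gobblin or its APIs: the record fields (`id`, `version`, `event_date`) and the helpers `organize` and `apply_retention` are hypothetical, and everything runs in memory.

```python
from collections import defaultdict
from datetime import date, timedelta


def organize(records):
    """Deduplicate by id (keeping the highest version), then partition by date."""
    latest = {}
    for rec in records:
        current = latest.get(rec["id"])
        if current is None or rec["version"] > current["version"]:
            latest[rec["id"]] = rec
    partitions = defaultdict(list)
    for rec in latest.values():
        partitions[rec["event_date"]].append(rec)
    return dict(partitions)


def apply_retention(partitions, today, keep_days):
    """Drop whole partitions older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return {day: recs for day, recs in partitions.items() if day >= cutoff}


records = [
    {"id": 1, "version": 1, "event_date": date(2021, 2, 1)},
    {"id": 1, "version": 2, "event_date": date(2021, 2, 1)},  # newer duplicate
    {"id": 2, "version": 1, "event_date": date(2021, 2, 15)},
]
partitions = organize(records)
kept = apply_retention(partitions, today=date(2021, 2, 16), keep_days=7)
print(sorted(kept))  # only the 2021-02-15 partition survives
```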

"Apache Gobblin supports deployment models all the way from a single-process standalone application to thousands of containers running in cloud-native environments, ensuring that your data plane can scale with your company’s growth," added Das.

Apache Gobblin is in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Laboratories, Swisscom, and Verizon, among many others.

"We chose Apache Gobblin as our primary data ingestion tool at Prezi because it proved to scale, and it is a swiss army knife of data ingestion," said Tamas Nemeth, Tech Lead and Manager at Prezi. "Today, we ingest, deduplicate, and compact more than 1200 Apache Kafka topics with its help, and this number is still growing. We are looking forward to continuing to contribute to the project and helping the community enable other companies to use Apache Gobblin."

"Apache Gobblin has been at the center stage of the data management story at LinkedIn. We leverage it for various use-cases ranging from ingestion, replication, compaction, retention, and more," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It is battle-tested and serves us well at exabyte scale. We firmly believe in the data wrangling capabilities that Gobblin has to offer, and we will continue to contribute heavily and collaborate with the Apache Gobblin community. We are happy to see that Gobblin has established itself as an industry standard and is now an Apache Top-Level Project."

"Open community and meritocracy are the key drivers for Apache Gobblin's success," added Tiwari. "We invite everyone interested in the data management space to join us and help shape the future of Gobblin."

Catch Apache Gobblin in action in the upcoming hackathon planned for late Q1 2021. Details will be posted on the Apache Gobblin mailing lists and Twitter feed listed below.

Availability and Oversight
Apache Gobblin software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Gobblin, visit https://gobblin.apache.org/ and https://twitter.com/ApacheGobblin 

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ 

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,000 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Gobblin", "Apache Gobblin", "Hadoop", "Apache Hadoop", "MapReduce", "Apache MapReduce", "Mesos", "Apache Mesos", "YARN", "Apache YARN", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday February 03, 2021

The Apache Software Foundation Announces Apache® DataSketches™ as a Top-Level Project

Open Source high-performance Big Data streaming algorithm library in use at Nielsen Identity, Permutive, Splice Machine, and Verizon Media, among others.

Wilmington, DE —3 February 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® DataSketches™ as a Top-Level Project (TLP).

Apache DataSketches is a highly performant Big Data analysis library for scalable approximate algorithms. The project originated at Yahoo in 2012, was open-sourced in 2015, and entered the Apache Incubator in March 2019.

"We are excited to be part of the ASF," said Lee Rhodes, Vice President of Apache DataSketches. "We have learned a great deal from the incubation process and look forward to working with new users of our library that want to take advantage of sketching technology."

Apache DataSketches’s library of specialized streaming algorithms, known as sketches, comprises small data structures that process data at massive scale. Sketches are ideal for queries that cannot afford the time or huge compute resources needed to generate exact results. Where approximate results are acceptable, sketches are the only viable alternative for interactive queries with real-time analysis. Apache DataSketches is:

  • Fast —produces approximate results orders of magnitude faster than traditional methods, with a user-configurable size vs. accuracy tradeoff;
  • Efficient —sketch algorithms process data in a single pass for both real-time and batch;
  • Mergeable —allows for parallelization;
  • Optimized for large-scale computing environments that process Big Data —such as Apache Hadoop, Apache Spark, Apache Druid, Apache Hive, Apache Pig, and PostgreSQL;
  • Binary compatible across multiple languages and platforms —available in Java, C++, and Python;
  • Expanded Analysis —including count distinct with set operations, quantiles, most frequent items (heavy hitters), matrix computations, and more; and
  • Mathematically defined and proven error properties —provides a priori and a posteriori error estimation and upper and lower bounds with statistically derived confidence intervals.
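
To make the single-pass, mergeable, size-versus-accuracy properties listed above concrete, here is a minimal sketch of one classic technique for approximate distinct counting, the K-Minimum Values (KMV) estimator. It is an illustration only and does not use or reproduce the Apache DataSketches API: the class name `KMVSketch`, the parameter `k`, and the SHA-1-based hash are assumptions chosen for the example.

```python
import hashlib
import heapq


class KMVSketch:
    """Keeps only the k smallest hash values seen; memory is bounded by k."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (values negated) of retained hashes
        self._members = set()  # same hashes, for duplicate detection

    @staticmethod
    def _hash(item):
        # Map the item to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def _offer(self, h):
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # Replace the current largest retained hash with the smaller one.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def update(self, item):
        """Process one stream item in a single pass."""
        self._offer(self._hash(item))

    def merge(self, other):
        """Union of two sketches (assumes both use the same k)."""
        merged = KMVSketch(self.k)
        for h in self._members | other._members:
            merged._offer(h)
        return merged

    def estimate(self):
        if len(self._heap) < self.k:
            return float(len(self._heap))  # fewer than k distinct items: exact
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


# Two partial streams sketched independently, then merged.
left, right = KMVSketch(), KMVSketch()
for user_id in range(60_000):
    left.update(user_id)
for user_id in range(40_000, 100_000):
    right.update(user_id)
print(round(left.merge(right).estimate()))  # close to the true 100,000
```

A larger `k` uses more memory and gives a tighter estimate (the relative standard error of this estimator shrinks roughly as 1/sqrt(k)), which is the size-versus-accuracy tradeoff in miniature. Merging the two partial sketches gives the same result as sketching the combined stream, which is what makes parallel processing possible.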

Apache DataSketches is in use at organizations such as Nielsen Identity, Permutive, Splice Machine, and Verizon Media, among others, as well as in Apache Druid and Apache Pinot (incubating).

"The Apache DataSketches project takes powerful algorithms for data summarization and analysis, and makes them available to everyone," said Professor Graham Cormode of the University of Warwick. "While these methods are tremendously useful in practice, their descriptions were previously only in highly technical scientific papers. This project has made robust, dependable and well-documented implementations available to all. Already the library has been used for a wide range of applications, including service quality, monitoring, ad analytics and the sciences."

"Using Apache DataSketches has enabled Apache Druid users to perform common tasks such as quantiles and unique counting in a highly performant and efficient manner," said Gian Merlino, Vice President of Apache Druid. "We have worked closely together over the years to make the power of DataSketches accessible to Apache Druid users, helping us provide real-time analytics at scale."

"Sketches are fundamental to calculating many of our key company metrics," said Tom Miller, Director of Software Development Engineering at Verizon Media. "It allows us to greatly simplify our data processing and reduce storage costs by allowing us to calculate non-additive metrics across user specified dimension combinations at report time instead of having to either retain raw data or pre-calculate for each set of dimensions."

"Combining Apache Druid and DataSketches allows us to provide our customers real-time insights into their target audiences and advertising campaigns," said Yakir Buskilla, Senior Vice President of Research and Development and General Manager Israel at Nielsen Identity. "The ability to evaluate set expressions make the Theta Sketch especially powerful for multi-set cardinality estimation as well as funnel analysis."

"Apache DataSketches has provided us with a solid theoretical foundation upon which we are able to store and process data at scale - in a simple, fast and cost-efficient manner," said David Cromberge, Senior Software Engineer at Permutive. "It has been a pleasure to engage with their creators and community who have been helpful at every step of the way."

"We use DataSketches's Theta-Sketches for distinct-count aggregations that are used to solve large multi-set cardinality approximation," said Mayank Shrivastava, Committer and member of the Apache Pinot (incubating) Podling Project Management Committee. "The ability to evaluate set expressions make the Theta Sketch especially powerful for multi-set cardinality estimation as well as funnel analysis."

"We welcome those interested in streaming algorithms to visit us, learn about this exciting technology, and contribute to Apache DataSketches to make our project even better," added Rhodes.

Availability and Oversight
Apache DataSketches software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache DataSketches, visit https://datasketches.apache.org .

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ .

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B worth of software to the public at no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,000 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end-user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion-dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF .

© The Apache Software Foundation. "Apache", "DataSketches", "Apache DataSketches", "Druid", "Apache Druid", "Hadoop", "Apache Hadoop", "Hive", "Apache Hive", "Pig", "Apache Pig", "Pinot (incubating)", "Apache Pinot (incubating)", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #
