The Apache Software Foundation Blog

Monday August 02, 2021

The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project

Open Source distributed real-time Big Data analytics infrastructure in use at Amazon-Eero, Doordash, Factual/FourSquare, LinkedIn, Stripe, Uber, Walmart, Weibo, and WePay, among others.

Wilmington, DE —2 August 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Pinot™ as a Top-Level Project (TLP).

Apache Pinot is a distributed Big Data analytics infrastructure created to deliver scalable real-time analytics at high throughput with low latency. The project was first created at LinkedIn in 2013, open-sourced in 2015, and entered the Apache Incubator in October 2018.

"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Kishore Gopalakrishna, Vice President and original co-creator of Apache Pinot. "Pinot initially pushed the boundaries of real-time analytics by delivering insights to millions of Linkedin users. Today, as an Apache Top-Level Project, Pinot is in the hands of developers across the globe who are building it to power several user-facing  analytical applications and unlock the value of data within their organizations."

Scalable to trillions of records, Apache Pinot’s online analytical processing (OLAP) ingests both online and offline data sources from Apache Kafka, Apache Spark, Apache Hadoop HDFS, flat files, and Cloud storages in real time. Pinot is able to ingest millions of events and serve thousands of queries per second, and provide unified analytics in a distributed, fault-tolerant fashion. Features include:

  • Speed —answers OLAP queries with low latency on real-time data

  • Pluggable indexing —Sorted, Inverted, Text Index, Geospatial Index, JSON Index, Range Index, Bloom filters

  • Smart Materialized Views - Fast Aggregations via star-tree index

  • Supports different stream systems with near real-time ingestion —with Apache Kafka, Confluent Kafka, and Amazon Kinesis, as well as customizable input format, with out-of the box support for Avro and JSON formats

  • Highly available, horizontally scalable, and fault tolerant

  • Supports lookup joins natively and full joins using PrestoDB/Trino

Apache Pinot is used to power internal and external analytics at Adbeat, Amazon-Eero, Cloud Kitchens, Confluera, Doordash, Factual/FourSquare, Guitar Center, LinkedIn, Publicis Sapient, Razorpay, Scale Unlimited, Startree, Stripe, Traceable, Uber, Walmart, Weibo, WePay, and more.

Examples of how Apache Pinot helps organizations across numerous verticals include: 1) a fintech company uses Pinot to achieve financial data visibility across 500+ terabytes of data and sustain half million queries per second with financial transactions; 2) a food delivery service leveraged Pinot in the midst of the COVID-19 pandemic to analyze real-time data to provide a socially-distanced pick-up experience for its riders and restaurants; and 3) a large retail chain with geographically distributed franchises and stores uses Pinot for revenue-generating opportunities by analyzing real-time data for internal use cases, as well as real-time cart analysis to increase sales.

"We rely on Apache Pinot for all our real-time analytics needs at LinkedIn," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It's battle-tested at LinkedIn scale for hundreds of our low-latency analytics applications. We believe Apache Pinot is the best tool out there to build site-facing analytics applications and we will continue to contribute heavily and collaborate with the Apache Pinot community. We are very happy to see that it's now a Top-level Apache project."

"We use Apache Pinot in our real-time analytics platform to power external user-facing applications and critical operational dashboards," said Ujwala Tulshigiri, Engineering Manager at Uber. "With Pinot's multi-tenancy support and horizontal scalability, we have scaled to hundreds of use cases that run complex aggregations queries on terabytes of data at millisecond latencies, with the minimal overhead of cluster management."

"We've been using Apache Pinot since last year, and it's been a huge win for our client’s dashboard project," said Ken Krugler, President of Scale Unlimited. "Pinot's ability to rapidly generate aggregation results over billions of records, with modest hardware requirements, was critical for the success of the project. We've also been able to provide patches to add functionality and fix issues, which the Pinot community has quickly integrated and released. There was never any doubt in our minds that Pinot would graduate from the Apache incubator and become a successful top-level project."

"Last year, we started without analytics built into our product," said Pradeep Gopanapalli, technical staff member at Confluera. "By the end of the year, we were using Apache Pinot for real-time analytics in production. Not many of our competitors can even dream of having such results. We are very happy with our choice."

"Pinot is critical to our real-time analytics platform and allowed us to scale without degrading latency," said software engineer Elon Azoulay. "Pinot enables us to onboard large datasets effortlessly, run complex queries which return in milliseconds and is super reliable. We would like to emphasize how helpful and engaged the community is and are certain that we made the right choice with Pinot, it continues to impress us and satisfy our real-time analytics needs."

"We created Pinot at LinkedIn with the goal of tackling the low-latency OLAP problem for site-facing use cases at scale. We evolved it to solve numerous OLAP use cases, and open-sourced it because there aren't many technologies in that domain," said Subbu Subramaniam, member of the Apache Pinot Project Management Committee, and Senior Staff Engineer at LinkedIn. "It is heart-warming to see such a wide adoption and great contributions from the community in improving Pinot over time."

"We are at the beginning of this transformation and we cannot wait to see every software company build real-time applications using Apache Pinot," added Gopalakrishna. "We welcome everyone to join our community Slack channel and contribute to the project."

Catch Apache Pinot in action at ApacheCon Asia online on 7 August 2021. For more information and to register, visit https://www.apachecon.com/acasia2021/

Availability and Oversight
Apache Pinot software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Pinot, visit http://pinot.apache.org/ and https://twitter.com/ApachePinot

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Talend, Tencent, Target, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Pinot", "Apache Pinot", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Comments:

Post a Comment:
Comments are closed for this entry.

Calendar

Search

Hot Blogs (today's hits)

Tag Cloud

Categories

Feeds

Links

Navigation