Entries tagged [bigdata]
The Apache Software Foundation Announces Apache® YuniKorn™ as a Top-Level Project
Open Source universal Big Data and Machine Learning resource scheduler in use at Alibaba, Apple, Cloudera, Lyft, Visa, and Zillow, among others.
Wilmington, DE —16 May 2022— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® YuniKorn™ as a Top-Level Project (TLP).
Apache YuniKorn is a cloud-native, standalone Big Data and Machine Learning resource scheduler for batch jobs and long-running services on large scale distributed systems. The project was originally developed at Cloudera in March 2019, entered the Apache Incubator in January 2020, and graduated as a Top-Level Project in March 2022.
"The Apache YuniKorn community is striving together to solve the resource scheduling problems on the cloud," said Weiwei Yang, Vice President of Apache YuniKorn. "It's really great to see the Apache Way shine in the incubating process of YuniKorn. We are lucky to have such an open, collaborative, and diverse community, which is sympathetic and cares about everyone's success. This motivates us to keep evolving and gets better every day."
Apache YuniKorn natively supports Big Data application workloads and mixed workloads, and provides a unified, cross-platform scheduling experience. Features include:
- Cloud native —runs on-premises and in a variety of public cloud environments; maximizes resource elasticity with better throughput.
- Hierarchical resource queues —efficiently manages cluster resources; controls resource consumption per tenant (a configuration sketch appears after this list).
- Application-aware scheduling —recognizes users, applications, and queues; schedules according to submission order, priority, resource usage, and more.
- Job ordering —built-in robust scheduling capabilities; supports fairness-based cross-queue preemption, hierarchies, pluggable node sorting policies, and more.
- Central management console —monitors performance across different tenants; one-stop-dashboard tracks resource utilization for managed nodes, clusters, applications and queues.
- Efficiency —reduces resource fragmentation and proactively triggers up-scaling; cloud elasticity lowers overall operational costs.
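To make the hierarchical queue model above concrete, the sketch below assembles a minimal queue configuration of the kind YuniKorn typically loads as YAML from a Kubernetes ConfigMap. The partition layout, queue names, and resource values are hypothetical, and the exact keys and resource units should be verified against the YuniKorn documentation for the version in use.

```python
# Sketch of a hierarchical YuniKorn queue configuration (hypothetical values).
# YuniKorn normally reads this as YAML from a ConfigMap; exact keys and
# resource units vary by release, so verify against the official docs.
import yaml  # PyYAML

queue_config = {
    "partitions": [
        {
            "name": "default",
            "queues": [
                {
                    "name": "root",
                    "queues": [
                        {
                            "name": "tenant-a",  # per-tenant queue with its own quota
                            "resources": {
                                "guaranteed": {"memory": "100G", "vcore": "10"},
                                "max": {"memory": "200G", "vcore": "20"},
                            },
                        },
                        {
                            "name": "tenant-b",
                            "resources": {
                                "max": {"memory": "50G", "vcore": "5"},
                            },
                        },
                    ],
                }
            ],
        }
    ]
}

print(yaml.safe_dump(queue_config, sort_keys=False))
```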
In addition, the Project has announced the release of Apache YuniKorn v1.0, the fifth update since entering the Apache Incubator. Improvements include:
- Decreased memory and CPU usage
- Extended metrics and diagnostics information
- New deployment model supporting future upgrades
- Technical preview of the plugin deployment mode
Optimized for running Apache Spark on Kubernetes (the open source container orchestration system), Apache YuniKorn performs well enough to serve as an alternative to the default Kubernetes scheduler. In benchmark tests against other schedulers covering resource sharing, resource fairness, preemption, gang scheduling, and bin packing, YuniKorn delivered throughput exceeding 610 allocations per second across 2,000 nodes.
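As a rough illustration of swapping in YuniKorn for the default scheduler, the sketch below submits a pod whose schedulerName is set to yunikorn and whose labels identify the application and target queue. It uses the official Kubernetes Python client; the namespace, image, queue name, and application ID are placeholders, and the exact label or annotation conventions should be confirmed against the YuniKorn documentation for the deployed version.

```python
# Minimal sketch: submit a pod that asks to be scheduled by YuniKorn.
# Requires the official Kubernetes Python client (pip install kubernetes);
# the namespace, image, and queue/application labels here are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="sleep-demo",
        labels={
            "applicationId": "demo-app-0001",  # groups pods into one application
            "queue": "root.tenant-a",          # target leaf queue (assumed name)
        },
    ),
    spec=client.V1PodSpec(
        scheduler_name="yunikorn",             # hand scheduling to YuniKorn
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="sleep",
                image="alpine:3.18",
                command=["sleep", "60"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```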
Posted at 01:00PM May 16, 2022
by Sally Khudairi in General
The Apache Software Foundation Announces Open Source data orchestration platform Apache® Hop™ as a Top-Level Project
Wilmington, DE —18 January 2022— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Hop™ as a Top-Level Project (TLP).
Apache Hop —the Hop Orchestration Platform— is a flexible, metadata-infused data orchestration, engineering, and integration platform. The project originated more than two decades ago as the Extract-Transform-Load (ETL) platform Kettle (Pentaho Data Integration), was refactored over several years, and entered the Apache Incubator in September 2020.
"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Bart Maertens, Vice President of Apache Hop. "Apache Hop enables people of all skill levels to build powerful and scalable data solutions without the need to write code. As an Apache Top-Level Project, Hop is developed and used by people across the globe. Hop's full project life cycle support helps these data teams to successfully build, test and run their projects in ways that would otherwise be hard or impossible to do."
Using Apache Hop, data professionals can rapidly and affordably facilitate all aspects of data and metadata orchestration whilst supporting DevOps best practices, such as testing. Apache Hop’s Java-based visual designer, server, and configuration tools are easy to set up, deploy, and maintain across numerous platforms. Features include:
- Lightweight “design once, run anywhere” architecture —workflows and pipelines can be designed in the Hop GUI and executed locally or remotely on the Hop native engine, on Apache Flink, Apache Kafka, Apache Spark, Google Dataflow, or AWS EMR through Apache Beam runtimes;
- Metadata-driven —every object type in Hop describes how data is read, manipulated or written, or how workflows and pipelines need to be orchestrated. In addition, Hop itself is internally metadata-driven, using a kernel architecture with a robust engine;
- Visual development environment —intuitive drag-and-drop graphical user interface (GUI) enables developers to enjoy the ease and productivity of visual development rather than code. Using Hop, data engineers can focus on business logic and requirements rather than how it needs to be done;
- Plug-in integration —more than 250 plugins make it easy to manage ecosystem complexity, and add new functionality; and
- Built-in lifecycle management —enables developers, engineers, and administrators to manage, test, deploy, and switch between projects, workflows, pipelines, environments, purposes, Git versions and more —all from the Hop GUI.
Apache Hop is designed to work in any scenario: on-premises or in the cloud, on a bare OS or in containers, in IoT environments, and with large datasets, on Windows, Linux, and macOS.
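As a rough illustration of the "design once, run anywhere" idea, the sketch below launches a previously designed pipeline from a script through Hop's hop-run command-line tool. The option names (--project, --file, --runconfig) follow Hop's documented CLI but should be verified against the installed Hop version, and the project, pipeline path, and run configuration names are placeholders; switching the run configuration is what moves the same pipeline from the local native engine to a Beam runtime such as Spark, Flink, or Dataflow.

```python
# Minimal sketch: launch a Hop pipeline from a script via the hop-run CLI.
# The flags follow Hop's documented command-line tool but should be checked
# against your Hop version; project, pipeline, and run config are placeholders.
import subprocess

result = subprocess.run(
    [
        "./hop-run.sh",
        "--project=samples",                 # hypothetical Hop project name
        "--file=pipelines/load-orders.hpl",  # hypothetical pipeline file
        "--runconfig=local",                 # local native engine; a Beam run
                                             # configuration would target Spark,
                                             # Flink, Dataflow, etc.
    ],
    capture_output=True,
    text=True,
    check=False,
)

print(result.stdout)
if result.returncode != 0:
    print("hop-run failed:", result.stderr)
```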
Many of the thousands of organizations in finance, retail, supply chain, and other sectors that use Kettle (Pentaho Data Integration, the precursor to Apache Hop) have started to evaluate Hop or are already in the process of upgrading to it.
"I'm very happy that we can now safely collaborate with any company or person across the global community under the umbrella of the Apache Software Foundation on something as cool as Apache Hop," said Matt Casters, Chief Solution Architect at Neo4j and member of the Apache Hop Project Management Committee.
"We started adopting Apache Hop in our data integration projects in early 2021 because of its flexibility, scalability and ease of use, in various scenarios ranging from classical DWH ETL processes to highly critical, real time processes," said Sergio Ramazzina, CEO and Chief Architect at Serasoft S.r.l., and member of the Apache Hop Project Management Committee. "We are impressed by how responsive the community is in solving issues and helping users approaching the platform --an important point to increase users adoption and trust. We welcome everyone joining our Hop community and contributing to the project."
"This graduation is just the beginning for Hop, and is proof that great communities build great software. The entire Hop community would like to thank the Apache Software Foundation for making this possible, especially our mentors who guided us through the Incubator," added Maertens. "We invite everyone to download and try Hop, join our chat and become part of the Hop community."
Catch Apache Hop in action at a future Hop community event. For more information and to register, visit https://hop.apache.org/community/events/
Availability and Oversight
Apache Hop software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Hop, visit https://hop.apache.org/ and https://twitter.com/ApacheHop
About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 820+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,400+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Replicated, Talend, Target, Tencent, Union Investment, Workday, and Yahoo!. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Hop", "Apache Hop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 02:00PM Jan 18, 2022
by Sally Khudairi in Projects
The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project
Open Source distributed real-time Big Data analytics infrastructure in use at Amazon-Eero, Doordash, Factual/FourSquare, LinkedIn, Stripe, Uber, Walmart, Weibo, and WePay, among others.
Wilmington, DE —2 August 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Pinot™ as a Top-Level Project (TLP).
Apache Pinot is a distributed Big Data analytics infrastructure created to deliver scalable real-time analytics at high throughput with low latency. The project was first created at LinkedIn in 2013, open-sourced in 2015, and entered the Apache Incubator in October 2018.
"We are pleased to successfully adopt 'the Apache Way' and graduate from the Apache Incubator," said Kishore Gopalakrishna, Vice President and original co-creator of Apache Pinot. "Pinot initially pushed the boundaries of real-time analytics by delivering insights to millions of Linkedin users. Today, as an Apache Top-Level Project, Pinot is in the hands of developers across the globe who are building it to power several user-facing analytical applications and unlock the value of data within their organizations."
Scalable to trillions of records, Apache Pinot is an online analytical processing (OLAP) system that ingests data in real time from both online and offline sources, including Apache Kafka, Apache Spark, Apache Hadoop HDFS, flat files, and cloud storage. Pinot can ingest millions of events and serve thousands of queries per second while providing unified analytics in a distributed, fault-tolerant fashion. Features include the following (a brief query example appears after the list):
- Speed —answers OLAP queries with low latency on real-time data
- Pluggable indexing —sorted, inverted, text, geospatial, JSON, and range indexes, plus Bloom filters
- Smart materialized views —fast aggregations via star-tree index
- Near real-time ingestion from stream systems —supports Apache Kafka, Confluent Kafka, and Amazon Kinesis, with customizable input formats and out-of-the-box support for Avro and JSON
- Highly available, horizontally scalable, and fault tolerant
- Supports lookup joins natively and full joins using PrestoDB/Trino
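For a sense of how applications query Pinot, the sketch below issues a simple aggregation through the community pinotdb DB-API driver against a broker's SQL endpoint. The host, port, table, and column names describe a hypothetical deployment and schema, and should be adapted to a real cluster.

```python
# Minimal sketch: run an analytical query against a Pinot broker using the
# pinotdb DB-API driver (pip install pinotdb). Host, port, and the
# table/column names are placeholders for a hypothetical deployment.
from pinotdb import connect

conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
curs = conn.cursor()

# Aggregate page views per country (hypothetical table and columns).
curs.execute(
    """
    SELECT country, COUNT(*) AS views
    FROM pageviews
    GROUP BY country
    ORDER BY views DESC
    LIMIT 10
    """
)
for row in curs:
    print(row)
```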
Apache Pinot is used to power internal and external analytics at Adbeat, Amazon-Eero, Cloud Kitchens, Confluera, Doordash, Factual/FourSquare, Guitar Center, LinkedIn, Publicis Sapient, Razorpay, Scale Unlimited, Startree, Stripe, Traceable, Uber, Walmart, Weibo, WePay, and more.
Examples of how Apache Pinot helps organizations across numerous verticals include: 1) a fintech company uses Pinot to achieve financial data visibility across 500+ terabytes of data and sustain half a million queries per second on financial transactions; 2) a food delivery service leveraged Pinot in the midst of the COVID-19 pandemic to analyze real-time data and provide a socially-distanced pick-up experience for its riders and restaurants; and 3) a large retail chain with geographically distributed franchises and stores uses Pinot to uncover revenue-generating opportunities by analyzing real-time data for internal use cases, as well as to run real-time cart analysis to increase sales.
"We rely on Apache Pinot for all our real-time analytics needs at LinkedIn," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It's battle-tested at LinkedIn scale for hundreds of our low-latency analytics applications. We believe Apache Pinot is the best tool out there to build site-facing analytics applications and we will continue to contribute heavily and collaborate with the Apache Pinot community. We are very happy to see that it's now a Top-level Apache project."
"We use Apache Pinot in our real-time analytics platform to power external user-facing applications and critical operational dashboards," said Ujwala Tulshigiri, Engineering Manager at Uber. "With Pinot's multi-tenancy support and horizontal scalability, we have scaled to hundreds of use cases that run complex aggregations queries on terabytes of data at millisecond latencies, with the minimal overhead of cluster management."
"We've been using Apache Pinot since last year, and it's been a huge win for our client’s dashboard project," said Ken Krugler, President of Scale Unlimited. "Pinot's ability to rapidly generate aggregation results over billions of records, with modest hardware requirements, was critical for the success of the project. We've also been able to provide patches to add functionality and fix issues, which the Pinot community has quickly integrated and released. There was never any doubt in our minds that Pinot would graduate from the Apache incubator and become a successful top-level project."
"Last year, we started without analytics built into our product," said Pradeep Gopanapalli, technical staff member at Confluera. "By the end of the year, we were using Apache Pinot for real-time analytics in production. Not many of our competitors can even dream of having such results. We are very happy with our choice."
"Pinot is critical to our real-time analytics platform and allowed us to scale without degrading latency," said software engineer Elon Azoulay. "Pinot enables us to onboard large datasets effortlessly, run complex queries which return in milliseconds and is super reliable. We would like to emphasize how helpful and engaged the community is and are certain that we made the right choice with Pinot, it continues to impress us and satisfy our real-time analytics needs."
"We created Pinot at LinkedIn with the goal of tackling the low-latency OLAP problem for site-facing use cases at scale. We evolved it to solve numerous OLAP use cases, and open-sourced it because there aren't many technologies in that domain," said Subbu Subramaniam, member of the Apache Pinot Project Management Committee, and Senior Staff Engineer at LinkedIn. "It is heart-warming to see such a wide adoption and great contributions from the community in improving Pinot over time."
"We are at the beginning of this transformation and we cannot wait to see every software company build real-time applications using Apache Pinot," added Gopalakrishna. "We welcome everyone to join our community Slack channel and contribute to the project."
Catch Apache Pinot in action at ApacheCon Asia online on 7 August 2021. For more information and to register, visit https://www.apachecon.com/acasia2021/
Availability and Oversight
Apache Pinot software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Pinot, visit http://pinot.apache.org/ and https://twitter.com/ApachePinot
About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $22B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors that include Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Huawei, IBM, Indeed, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Talend, Tencent, Target, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Pinot", "Apache Pinot", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 12:00PM Aug 02, 2021
by Sally Khudairi in General
The Apache Cassandra Project Releases Apache® Cassandra™ v4.0, the Fastest, Most Scalable and Secure Cassandra Yet
Open Source enterprise-grade Big Data distributed database powers mission-critical deployments with improved performance and unparalleled levels of scale in the Cloud
Wilmington, DE —27 July 2021— The Apache Cassandra Project released today v4.0 of Apache® Cassandra™, the Open Source, highly performant, distributed Big Data database management platform.
"A long time coming, Cassandra 4.0 is the most thoroughly tested Cassandra yet," said Nate McCall, Vice President of Apache Cassandra. "The latest version is faster, more scalable, and bolstered with enterprise security features, ready-for-production with unprecedented scale in the Cloud."
As a NoSQL database, Apache Cassandra handles massive amounts of data across load-intensive applications with high availability and no single point of failure. Cassandra’s largest production deployments include Apple (more than 160,000 instances storing over 100 petabytes of data across 1,000+ clusters), Huawei (more than 30,000 instances across 300+ clusters), and Netflix (more than 10,000 instances storing 6 petabytes across 100+ clusters, with over 1 trillion requests per day), among many others. Cassandra originated at Facebook in 2008, entered the Apache Incubator in January 2009, and graduated as an Apache Top-Level Project in February 2010.
Apache Cassandra v4.0
Cassandra v4.0 effortlessly handles unstructured data, with thousands of writes per second. Three years in the making, v4.0 reflects more than 1,000 bug fixes, improvements, and new features that include the following (a short client example appears after the list):
- Increased speed and scalability – streams data up to 5 times faster during scaling operations and delivers up to 25% faster throughput on reads and writes, for a more elastic architecture, particularly in Cloud and Kubernetes deployments.
- Improved consistency – optimized incremental repair keeps data replicas in sync for faster, more efficient operation and consistency across replicas.
- Enhanced security and observability – audit logging tracks user access and activity with minimal impact on workload performance. A new capture-and-replay capability enables analysis of production workloads to help ensure regulatory and security compliance with SOX, PCI, GDPR, and other requirements.
- New configuration settings – newly exposed system metrics and configuration settings give operators easy access to the data they need to optimize deployments.
- Minimized latency – garbage collector pause times are reduced to a few milliseconds with no latency degradation as heap sizes increase.
- Better compression – improved compression efficiency eases unnecessary strain on disk space and improves read performance.
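For reference, the sketch below shows the kind of minimal client access the read/write improvements above apply to, using the DataStax Python driver (cassandra-driver) against a local node; the contact point, keyspace, and table are placeholders.

```python
# Minimal sketch: connect to a Cassandra 4.0 node, write a row, and read it
# back with the DataStax Python driver (pip install cassandra-driver).
# The contact point, keyspace, and table names are placeholders.
from uuid import uuid4
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])  # assumed local node
session = cluster.connect()

session.execute(
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)
session.execute(
    "CREATE TABLE IF NOT EXISTS demo.events (id uuid PRIMARY KEY, payload text)"
)

# Write one row, then read a few back.
session.execute(
    "INSERT INTO demo.events (id, payload) VALUES (%s, %s)", (uuid4(), "hello")
)
for row in session.execute("SELECT id, payload FROM demo.events LIMIT 10"):
    print(row.id, row.payload)

cluster.shutdown()
```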
Cassandra 4.0 is community-hardened and tested by Amazon, Apple, DataStax, Instaclustr, iland, Netflix, and others that routinely run clusters as large as 1,000 nodes and with hundreds of real-world use cases and schemas.
The Apache Cassandra community employed several testing and quality assurance (QA) projects and methodologies to deliver the most stable release yet. During the testing and QA period, the community generated reproducible workloads that are as close to real life as possible, while effectively verifying the cluster state against the model without pausing the workload itself.
"In our experience, nothing beats Apache Cassandra for write scaling, and we're looking forward to the performance and management improvements in the 4.0 release," said Elliott Sims, Senior Systems Administrator at Backblaze. "We rely on Cassandra to manage over one exabyte of customer data and serve over 50 billion files for our customers across 175 countries so optimizing Cassandra's capabilities and performance means a lot to us."
"Since 2016, software engineers at Bloomberg have turned to Apache Cassandra because it’s easy to use, easy to scale, and always available," said Isaac Reath, Software Engineering Team Lead, NoSQL Infrastructure at Bloomberg. "Today, Cassandra is used to support a variety of our applications, from low-latency storage of intraday financial market data to high-throughput storage for fixed income index publication. We serve up more than 20 billion requests per day on a nearly 1 PB dataset across a fleet of 1,700+ Cassandra nodes."
"Netflix uses Apache Cassandra heavily to satisfy its ever-growing persistence needs on its mission to entertain the world. We have been experimenting and partially using the 4.0 beta in our environments and its features like Audit Logging and backpressure," said Vinay Chella, Netflix Engineering Manager and Apache Cassandra Committer. "Apache Cassandra 4.0's improved performance helps us reduce infrastructure costs. 4.0's stability and correctness allow us to focus on building higher-level abstractions on top of data store compositions, which results in increased developer velocity and optimized data store access patterns. Apache Cassandra 4.0 is faster, secure, and enterprise-ready; I highly suggest giving it a try in your environments today."
"Apache Cassandra's contributors have worked hard to deliver Cassandra 4.0 as the project's most stable release yet, ready for deployment to production-critical Cloud services," said Scott Andreas, Apache Cassandra Contributor. "Cassandra 4.0 also brings new features, such as faster host replacements, active data integrity assertions, incremental repair, and better compression. The project's investment in advanced validation tooling means that Cassandra users can expect a smooth upgrade. Once released, Cassandra 4.0 will also provide a stable foundation for development of future features and the database's long-term evolution."
Apache Cassandra is in use at Activision, Apple, Backblaze, BazaarVoice, Best Buy, Bloomberg Engineering, CERN, Constant Contact, Comcast, DoorDash, eBay, Fidelity, GitHub, Hulu, ING, Instagram, Intuit, Macy's, Macquarie Bank, Microsoft, McDonalds, Netflix, New York Times, Monzo, Outbrain, Pearson Education, Sky, Spotify, Target, Uber, Walmart, Yelp, and thousands of other companies that have large, active data sets. In fact, Cassandra is used by 40% of the Fortune 100. Select Apache Cassandra case studies are available at https://cassandra.apache.org/case-studies/
In addition to Cassandra 4.0, the Project also announced a shift to a yearly release cycle, with releases to be supported for a three-year term.
Catch Apache Cassandra in action through presentations from the April 2021 Cassandra World Party https://s.apache.org/jjv2d .
Availability and Oversight
Apache Cassandra software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Cassandra, visit https://cassandra.apache.org/ and https://twitter.com/cassandra .
About Apache Cassandra
Apache Cassandra is an Open Source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. Apache Cassandra is used in some of the largest data management deployments in the world, including nearly half of the Fortune 100.
© The Apache Software Foundation. "Apache", "Cassandra", "Apache Cassandra", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 03:00PM Jul 27, 2021
by Sally Khudairi in General
The Apache® Software Foundation Welcomes its Global Community Online at ApacheCon™ Asia 2021
Asia edition of the official Apache global conference series to be held virtually, with 140+ sessions, and keynote and plenary sessions by luminaries from AliCloud, API7, DiDi Chuxing, Huawei, Kyligence, PingCAP, Tencent Cloud, Tsinghua University, and more.
Wilmington, DE —9 June 2021— The Apache® Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced keynotes, sponsors, and program for ApacheCon™ Asia, taking place online 6-8 August 2021. Registration is open and free for all attendees.
"We’re excited to hold ApacheCon Asia online following last year’s highly successful ApacheCon@Home," said Sheng Wu, ApacheCon Asia co-Chair and member of the ASF Board of Directors. "The pandemic mobilized the global Apache community to collectively produce a first-rate online event, supported by an outstanding group of sponsors. We are proud to build on ApacheCon’s new virtual format and bring the ApacheCon Asia program to participants joining us from any location."
ApacheCon is the ASF's official global conference series, first held in 1998. ApacheCon draws attendees from more than 130 countries to experience "Tomorrow's Technology Today" independent of business interests, corporate biases, or sales pitches.
ApacheCon showcases the latest breakthroughs from dozens of Apache projects, with content selected entirely by Apache projects and their communities. ApacheCon Asia joins ApacheCon@Home, taking place online 21-23 September, to meet the educational demands of the growing Apache community of developers, users, and enthusiasts worldwide.
"Tune in to ApacheCon Asia's 140+ sessions to learn the latest developments, best practices, and lessons learned with Apache projects, incubating podlings, and community-led development 'The Apache Way',” said Willem Jiang, ApacheCon Asia co-Chair and initiator of Apache Local Community Beijing. "Participants can also connect and network virtually with attendees, speakers, and sponsors in real-time, as well as revisit presentations and explore additional tracks after the event."
Participants at all levels will learn about Apache project innovations in categories that include: APIs and Microservices; Big Data; Community; Culture; Data Visualization; Incubator; Integration; IoT and IIoT; Messaging; Middleware; Observability; Streaming; Servers; Workflow and Data Governance.
Featured Apache projects include Airflow, APISIX, Arrow, Atlas, Bigtop, BookKeeper, brpc (incubating), Camel, CarbonData, Cassandra, Commons, DolphinScheduler, Doris (incubating), Druid, Dubbo, ECharts, Flink, Hadoop, HBase, Hive, HUDI, Ignite, Impala, InLong (incubating), IoTDB, Kafka, Kudu, Kylin, Liminal (incubating), MXNet (incubating), Nemo (incubating), Ozone, Pegasus (incubating), Pinot (incubating), PLC4X, Pulsar, RocketMQ, ServiceComb, ShardingSphere, SkyWalking, Sling, Spark, StreamPipes (incubating), Superset, Teaclave (incubating), Tomcat, YuniKorn (Incubating), and more.
Keynote presentations will be delivered by Dongxu Huang, CTO of PingCAP; Jianmin Wang, Dean, School of Software at Tsinghua University; Sharan Foga, ASF Board Member; and Sheng Wu, ASF Board Member. Plenary sessions will be presented by AliCloud, API7, DiDi Chuxing, Huawei, Kyligence, and Tencent Cloud.
The full program is available at https://apachecon.com/acasia2021/tracks.html
ApacheCon Asia sponsors include Strategic Sponsor Huawei; Platinum Sponsors AliCloud, API7, DiDi Chuxing, Kyligence, and Tencent Cloud; Gold Sponsors AWS and Baidu; and Silver Sponsors Imply and SphereEx. Huawei, Tencent, DiDi, AWS, Baidu, Imply, and SphereEx are also Sponsors of ApacheCon@Home at the above levels.
To sponsor ApacheCon Asia and/or ApacheCon@Home, visit https://www.apachecon.com/acah2021/2021_ApacheCon_prospectus.pdf
Register today at https://apachecon.com/acasia2021/register.html .
About ApacheCon
ApacheCon is the official global conference series of The Apache Software Foundation. Since 1998 ApacheCon has been drawing participants at all levels to explore "Tomorrow's Technology Today" across 350+ Apache projects and their diverse communities. In 2020 and 2021 ApacheCon events showcase ubiquitous Apache projects and emerging innovations virtually through sessions, keynotes, real-world case studies, community events, and more, all online and free of charge. For more information, visit http://apachecon.com/ and https://twitter.com/ApacheCon .
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 850+ individual Members and 200 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 8,200+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Confluent, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Namebase, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF .
© The Apache Software Foundation. "Apache", "Airflow", "Apache Airflow", "APISIX", "Apache APISIX", "Arrow", "Apache Arrow", "Atlas", "Apache Atlas", "Bigtop", "Apache Bigtop", "BookKeeper", "Apache BookKeeper", "Camel", "Apache Camel", "CarbonData", "Apache CarbonData", "Cassandra", "Apache Cassandra", "Commons", "Apache Commons", "DolphinScheduler", "Apache DolphinScheduler", "Druid", "Apache Druid", "Dubbo", "Apache Dubbo", "ECharts", "Apache ECharts", "Flink", "Apache Flink", "Hadoop", "Apache Hadoop", "HBase", "Apache HBase", "Hive", "Apache Hive", "HUDI", "Apache HUDI", "Ignite", "Apache Ignite", "Impala", "Apache Impala", "IoTDB", "Apache IoTDB", "Kafka", "Apache Kafka", "Kudu", "Apache Kudu", "Kylin", "Apache Kylin", "Ozone", "Apache Ozone", "PLC4X", "Apache PLC4X", "Pulsar", "Apache Pulsar", "RocketMQ", "Apache RocketMQ", "ServiceComb", "Apache ServiceComb", "ShardingSphere", "Apache ShardingSphere", "SkyWalking", "Apache SkyWalking", "Sling", "Apache Sling", "Spark", "Apache Spark", "Superset", "Apache Superset", "Tomcat", "Apache Tomcat", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 04:00PM Jun 09, 2021
by Sally Khudairi in ApacheCon
The Apache Software Foundation Announces Apache® Gobblin™ as a Top-Level Project
Open Source distributed Big Data integration framework in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Labs, Swisscom, Verizon, and more.
Wilmington, DE —16 February 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Gobblin™ as a Top-Level Project (TLP).
Apache Gobblin is a distributed Big Data integration framework used in both streaming and batch data ecosystems. The project originated at LinkedIn in 2014, was open-sourced in 2015, and entered the Apache Incubator in February 2017.
"We are excited that Gobblin has completed the incubation process and is now an Apache Top-Level Project," said Abhishek Tiwari, Vice President of Apache Gobblin and software engineering manager at LinkedIn. "Since entering the Apache Incubator, we have completed four releases and grown our community the Apache Way to more than 75 contributors from around the world."
Apache Gobblin is used to integrate hundreds of terabytes of data and thousands of datasets per day by simplifying the ingestion, replication, organization, and lifecycle management processes across numerous execution environments, data velocities, scales, connectors, and more.
"Originally creating this project, seeing it come to life and solve mission-critical problems at many companies has been a very gratifying experience for me and the entire Gobblin team," said Shirshanka Das, Founder and CTO at Acryl Data, and member of the Apache Gobblin Project Management Committee.
As a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems, Apache Gobblin makes the arduous task of creating and maintaining a modern data lake easy. It supports the three main capabilities required by every data team:
- Ingestion and export of data from a variety of sources and sinks into and out of the data lake while supporting simple transformations.
- Data Organization within the lake (e.g. compaction, partitioning, deduplication).
- Lifecycle and Compliance Management of data within the lake (e.g. data retention, fine-grain data deletions) driven by metadata.
"Apache Gobblin supports deployment models all the way from a single-process standalone application to thousands of containers running in cloud-native environments, ensuring that your data plane can scale with your company’s growth," added Das.
Apache Gobblin is in use at Apple, CERN, Comcast, Intel, LinkedIn, Nerdwallet, PayPal, Prezi, Roku, Sandia National Laboratories, Swisscom, and Verizon, among many others.
"We chose Apache Gobblin as our primary data ingestion tool at Prezi because it proved to scale, and it is a swiss army knife of data ingestion," said Tamas Nemeth, Tech Lead and Manager at Prezi. "Today, we ingest, deduplicate, and compact more than 1200 Apache Kafka topics with its help, and this number is still growing. We are looking forward to continuing to contribute to the project and helping the community enable other companies to use Apache Gobblin."
"Apache Gobblin has been at the center stage of the data management story at LinkedIn. We leverage it for various use-cases ranging from ingestion, replication, compaction, retention, and more," said Kapil Surlaker, Vice President of Engineering at LinkedIn. "It is battle-tested and serves us well at exabyte scale. We firmly believe in the data wrangling capabilities that Gobblin has to offer, and we will continue to contribute heavily and collaborate with the Apache Gobblin community. We are happy to see that Gobblin has established itself as an industry standard and is now an Apache Top-Level Project."
"Open community and meritocracy are the key drivers for Apache Gobblin's success," added Tiwari. "We invite everyone interested in the data management space to join us and help shape the future of Gobblin."
Catch Apache Gobblin in action in the upcoming hackathon planned for late Q1 2021. Details will be posted on the Apache Gobblin mailing lists and Twitter feed listed below.
Availability and Oversight
Apache Gobblin software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Gobblin, visit https://gobblin.apache.org/ and https://twitter.com/ApacheGobblin
About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,000 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Gobblin", "Apache Gobblin", "Hadoop", "Apache Hadoop", "MapReduce", "Apache MapReduce", "Mesos", "Apache Mesos", "YARN", "Apache YARN", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 02:00PM Feb 16, 2021
by Sally Khudairi in General
The Apache Software Foundation Announces Apache® Superset™ as a Top-Level Project
Open Source enterprise-grade Big Data visualization and business intelligence Web application in use at Airbnb, American Express, Dropbox, Lyft, Netflix, Nielsen, Rakuten Viki, Twitter, and Udemy, among others.
Wilmington, DE —21 January 2021— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Superset™ as a Top-Level Project (TLP).
Apache Superset is a modern, Open Source data exploration and visualization platform that enables users to easily and quickly build and explore dashboards using its simple no-code visualization builder and state-of-the-art SQL editor. The project originated at Airbnb in 2015 and entered into the Apache Incubator program in May 2017.
"It's been amazing to be an active part of growing a welcoming, diverse and engaged community over the past five years while following the ASF principles around inclusion, openness and collaboration," said Maxime Beauchemin, Vice President of Apache Superset. "At the scale and level of diversity that the Superset project has achieved, it's critical to have a solid governance model in place like the one prescribed by the ASF."
Apache Superset v1.0
Superset helps streamline the analytics process by providing an intuitive interface to rapidly explore and visualize datasets, create interactive dashboards, and model real-time business intelligence insights at scale. The platform integrates with most SQL-speaking data sources, including modern cloud-native databases, data warehouses, and engines at petabyte scale.
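Because Superset reaches those data sources through SQLAlchemy connection URIs, one low-risk way to sanity-check a connection string before registering it as a database in Superset is to open it directly with SQLAlchemy, as in the sketch below; the URI, credentials, and database name are placeholders.

```python
# Sketch: probe a SQLAlchemy URI of the kind Superset accepts when adding a
# database. The host, credentials, and database name below are placeholders.
from sqlalchemy import create_engine, text

uri = "postgresql+psycopg2://analyst:secret@warehouse.example.com:5432/analytics"
engine = create_engine(uri)

with engine.connect() as conn:
    # Simple probe query; any SQL-speaking source Superset supports would do.
    print(conn.execute(text("SELECT 1")).scalar())
```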
The Project also celebrates a major milestone with the release of Apache Superset 1.0. Features include:
- Rich library of visualizations with support for integrating custom visualizations
- Thin caching layer to optimize performance of charts and dashboards
- Code-free visualization builder
- State-of-the-art SQL editor and metadata workflow
- Extensible enterprise authentication and security model
- Easy-to-use, lightweight semantic layer
- Notification alerts and scheduled reports
"Apache Superset 1.0 is a solid, mature, self-standing solution that fully solves business intelligence and data visualization needs for modern data teams," added Beauchemin. "Superset not only covers the table stakes, but also offers guarantees, features and a fresh approach that existing BI solutions can't match."
Apache Superset is in use at Airbnb, American Express, Dropbox, Lyft, Netflix, Nielsen, Rakuten Viki, Twitter, and Udemy, among others. A list of known users is available at https://github.com/apache/superset/blob/master/INTHEWILD.md .
"Apache Superset helps Airbnb democratize data insights and make data-informed decisions," said Jeff Feng, Product Lead at Airbnb and member of the Apache Superset Project Management Committee. "Superset uniquely connects SQL analysis with data exploration for thousands of our employees each week. It also serves as a flexible and reliable platform for visualizing metrics, helping executives and knowledge workers see and understand data."
"We had an amazing journey with Superset at Dropbox," said Chloe Wang, Senior Product Manager, Data Insights Platform at Dropbox. "Superset got introduced in 2019 and soon became the most widely adopted query engine within the analytical organization. As a result, our analysts are able to make timely and high confidence product decisions."
"Before Superset, we were paying for a patchwork of proprietary tools and we kept running into limitations when it came to customizing charts and dashboards," said Amit Miran, Software Team Lead for Media Application Framework group at Nielsen. "Once the Superset project supported adding of custom visualizations, that was the turning point for us at Nielsen to start adopting Superset in large projects. We’re very excited about native dashboard filters and future support for cross filtering, which will make our viz plugins even more powerful. The excitement for the project drove me to become involved in my first open source project."
"Apache Superset is an amazing project that enables engineers to easily execute data analysis," said Grace Guo, member of the Apache Superset Project Management Committee. "I have been a Superset user and a Superset builder for a few years. I run queries in SQL Lab, visualize data using one of the many supported chart types, and build dashboards, specifically focusing on performance and product adoption metrics. As an engineer, I appreciate the ability to contribute to the product. If I see some area to improve, or need a feature which doesn’t exist, I am happy to create a PR to fix it for myself and benefit other users."
"Apache Superset’s strength lies in its community," added Beauchemin. "We invite those interested in data visualization to join our mailing lists and help shape future versions of Superset."
Learn more about the latest in v1.0 at the Apache Superset community global MeetUp on 28 January. Registration is open to all and free of charge https://s.apache.org/3cm4f
Availability and Oversight
Apache Superset software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Superset, visit https://superset.apache.org/
About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with nearly 8,000 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cloudera, Comcast, Didi Chuxing, Facebook, Google, Handshake, Huawei, IBM, Microsoft, Pineapple Fund, Red Hat, Reprise Software, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Superset", "Apache Superset", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 02:00PM Jan 21, 2021
by Sally Khudairi in General
The Apache Software Foundation Announces Apache® IoTDB™ as a Top-Level Project
Open Source Internet of Things-native database integrates with the Apache Big Data ecosystem for high-speed data ingestion, massive data storage, and complex data analysis in the cloud, in the field, and on the edge.
Wakefield, MA —23 September 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® IoTDB™ as a Top-Level Project (TLP).
Apache IoTDB is an Open Source IoT database designed to meet the rigorous data, storage, and analytics requirements of large-scale Internet of Things (IoT) and Industrial Internet of Things (IIoT) applications. The project was first developed as a research project at Tsinghua University and entered the Apache Incubator in November 2018.
"The Internet of Things, especially Industrial IoT, has swept the globe with unimaginable volumes of data,” said Xiangdong Huang, Vice President of Apache IoTDB. "To date, both Relational and Key Value-based database solutions struggle to meet the demands of IoT data management. Apache IoTDB is the missing link between current IoT data and IoT applications, and is redefining how IoT data is managed, both in the cloud and on the edge. We are proud to graduate as an Apache Top-Level Project, which is an important milestone in our project’s maturity."
Apache IoTDB provides a compact, time-series-optimized columnar data file that efficiently stores and accesses time series data. The database engine is specially optimized for time-series-oriented operations such as aggregation queries, down-sampling, and time-alignment queries. Due to its lightweight structure, high performance, and deep integration with Apache Big Data ecosystem projects (such as Flink, Hadoop, and Spark), Apache IoTDB easily meets the requirements of storing massive data sets, ingesting high-speed data, and analyzing complex data, both on the edge and in the cloud. Features include:
- High-throughput read and write: supports high-speed write access for millions of low-power, intelligently networked devices, and provides lightning-quick read access over billions of data points.
- Efficient directory structure: organizes complex metadata structures from IoT devices and large scale time series data, with a fuzzy search strategy for complex directories of time series data.
- Rich query semantics: supports time alignment for time series data across devices and sensors, computation on time series fields, and abundant aggregation functions in the time dimension (a query sketch appears after this list).
- Flexible deployment: supports running on the edge (e.g., on a Raspberry Pi) as well as forming a cluster in the cloud, and provides a tool that bridges cloud platforms and on-premises machines for data synchronization.
- Deep integration with Open Source Big Data projects: supports analysis ecosystems including Apache Flink, Hadoop, PLC4X, and Spark, as well as other Open Source applications.
- Low hardware cost: achieves a high compression ratio for disk storage.
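To illustrate the query semantics above, the sketch below runs a down-sampling query (hourly averages over a one-day window) through the Apache IoTDB Python client. The host, credentials, timeseries path, and time range are placeholders, and the client calls and SQL syntax should be checked against the documentation for the IoTDB version in use.

```python
# Minimal sketch: query IoTDB for hourly average temperature from one device,
# using the Apache IoTDB Python client (pip install apache-iotdb). Host,
# credentials, timeseries path, and time range are placeholders.
from iotdb.Session import Session

session = Session("127.0.0.1", 6667, "root", "root")
session.open(False)  # False: no RPC compression

# Down-sample raw readings into 1-hour buckets (hypothetical timeseries path).
result = session.execute_query_statement(
    "SELECT avg(temperature) FROM root.factory.device1 "
    "GROUP BY ([2020-09-01T00:00:00, 2020-09-02T00:00:00), 1h)"
)
while result.has_next():
    print(result.next())

session.close()
```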
Apache IoTDB is in use at dozens of organizations that include ArcelorMittal AMERICA, BONC Ltd., the China Meteorological Administration, Datang Xianyi, Goldwind, Haier, Lenovo, NAVINFO, pragmatic industries GmbH, Shanghai Metro, Tsinghua University, Yangtze Optical Fiber and Cable Company, and more.
"IoTDB has attained Apache Top Level project status at a time of confluence of database, IoT and AI technologies in conjunction with a wider adoption of Industry 4.0 and automation approaches to further enable remote work and increased efficiencies," said Prof. C. Mohan, recently retired IBM Fellow, Former Chief Scientist of IBM India, and a member of the US National Academy of Engineering. "I am excited since this is the first Chinese University originated open-source project to reach this status. While I have been associated with the researchers behind IoTDB as a Distinguished Visiting Professor of the School of Software at China's prestigious Tsinghua University, I have seen this project reach maturity and build up a vibrant OSS community around it. It has a bright future ahead of it and I plan to collaborate on it."
"Apache IoTDB is a perfect fit for edge computing," said Dr. Julian Feinauer, CEO at pragmatic industries GmbH. "The high compression helps to use the (limited) amount of memory we have very efficiently. IoTDB is a perfect fit, especially in IIoT use cases, where network and compute capabilities are limited on the edge."
"Apache IoTDB was initially launched by a Chinese University and then incubated successfully in the Apache Community," said Prof. Hong Mei, an academician of the Chinese Academy of Sciences. "Following the Apache Way, it has created a healthy and active international open source community. It is a successful practice of open source education and culture advancement in China."
"Apache IoTDB has made many optimizations for different runtime environments, operating systems, and workloads in both the edge and the cloud. As a core infrastructure software in Industrial Internet, it innovates a series of IoT data management and analysis techniques," said Prof. Xiangke Liao, an academician of the Chinese Academy of Engineering. "Through the open source model, Apache IoTDB shares its creative techniques to the world."
"With the continuous growth of intelligent devices, machine-generated data is growing day by day, which poses extraordinary challenges on storing process, query speed, and storage space," said Dawei Liu, architect at AutoAI Inc., a subsidiary of NAVINFO, and member of the Apache IoTDB Project Management Committee. "We tried and tested a variety of solutions and finally chose IoTDB as our core database for its high performance, openness to the enterprise, and its active community. We built our Wecloud platform based on Apache IoTDB, which has served well for BMW, Toyota, and Great Wall Motors, among other auto manufacturers. The project deeply attracted me to become a part of the community. The coolest thing is that I finally became an IoTDB committer and now share our ideas to the community."
"Apache IoTDB is an open source project and software technology innovation developed for the need of AIoT Big Data applications," said Prof. Jianmin Wang, Dean of the Tsinghua University School of Software, who originally decided to donate the project to the ASF. "It is also a very beneficial attempt for training leading talents. There will be a long way to go and the future is promising."
"Apache IoTDB is on its way to becoming a standard IoT data management and analysis solution, and we’re excited to build upon our work thus far," added Huang. "We believe Apache IoTDB will help more users and companies to solve their real problems. The process to achieve the goal is exciting and honorable, and we invite more contributors to join us. Following the Apache Way, let's bring this interesting, meaningful, and powerful software to the whole world."
A published paper on Apache IoTDB written by members of the Apache IoTDB Project Management Committee is available at http://www.vldb.org/pvldb/vol13/p2901-wang.pdf . An introduction to Apache IoTDB from ApacheCon Europe 2019 is available on Feathercast https://feathercast.apache.org/2019/09/12/hello-world-introducing-apache-iotdb-a-database-for-the-internet-of-things-xiangdong-huang-julian-feinauer/
Catch Apache IoTDB in action at ApacheCon@Home, 29 September-1 October 2020 https://www.apachecon.com/acah2020/tracks/iot.html
Availability and Oversight
Apache IoTDB software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache IoTDB, visit http://iotdb.apache.org/ and https://twitter.com/ApacheIoTDB
About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 227M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,800+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Inspur, Pineapple Fund, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "IoTDB", "Apache IoTDB", "Flink", "Apache Flink", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 11:01AM Sep 23, 2020
by Sally Khudairi in General |
|
The Apache Software Foundation Announces the 10th Anniversary of Apache® HBase™
Open Source distributed, scalable Big Data store celebrates a decade of processing zettabytes of data across highly scalable large tables for the Apache Hadoop ecosystem
Wakefield, MA —13 May 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the tenth Anniversary of Apache® HBase™, the distributed, scalable data store for the Apache Hadoop Big Data ecosystem.
"The success of Apache HBase is the success of Open Source," said Duo Zhang, Vice President of Apache HBase. "Ten years after graduating as a TLP, HBase is still among the most active projects at the ASF. We have hundreds of contributors all around the world. We speak different languages, we have different skills, but we all work together to make HBase better and better. Ten year anniversary is not the end, but a new beginning, I believe our strong community will lead the project to a bright future."
HBase originated at Powerset in 2006 as an Open Source system to run on Apache Hadoop’s Distributed File System (HDFS), similar to how BigTable ran on top of the Google File System. In 2007, a significant code contribution was added to the Apache Hadoop codebase and was integrated into the Apache Hadoop 0.15.0 release later that year. HBase continued to be developed as a sub-project of Apache Hadoop and graduated as an Apache Top-Level Project (TLP) in April 2010.
An Open Source, versioned, non-relational database, Apache HBase provides low latency random access to very large tables —billions of rows and millions of columns— atop clusters of non-specialized, commodity hardware. HBase reads, writes, and processes structured, semi-structured, and unstructured data in real-time environments.
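As a rough illustration of that random-access model, the sketch below uses the HBase Java client API to write and read a single cell. The ZooKeeper quorum address, table name ("metrics"), and column family ("cf") are placeholders, and the table is assumed to already exist:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Minimal sketch of low-latency random writes and reads with the HBase
    // Java client. The "metrics" table with column family "cf" is assumed
    // to exist; the ZooKeeper quorum is a placeholder.
    public class HBaseRandomAccess {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "zk-host"); // placeholder

            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("metrics"))) {

                // Random write: one cell addressed by row key, family, and qualifier.
                Put put = new Put(Bytes.toBytes("row-0001"));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("value"),
                        Bytes.toBytes("42"));
                table.put(put);

                // Random read of the same row.
                Result result = table.get(new Get(Bytes.toBytes("row-0001")));
                byte[] cell = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("value"));
                System.out.println("read back: " + Bytes.toString(cell));
            }
        }
    }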
Apache HBase is in use at thousands of organizations, including Adobe, Airbnb, Alibaba, Bloomberg, Flipkart, Huawei, HP, Hubspot, IBM, Microsoft, NetEase, Pinterest, Salesforce, Shopee, Tencent, Twitter, Xiaomi, and Yahoo! (now Verizon Media), among others.
Testimonials
"Congratulations on the 10th birthday of Apache HBase! Alibaba started to use HBase since January 2011 and has witnessed its growth and come along with the community through the years. The Apache HBase community has always been an open and powerful team that produced many stable, production-ready and widely used versions. Today at Alibaba, we have HBase clusters with more than 10k nodes serving hundreds of petabytes of data, as well as more than 1,000 enterprise HBase users on Alibaba Cloud. We will continue collaborating with and contributing to the HBase community and wish us all ongoing success in future!"
—Chunhui Shen and Yu Li, members of the HBase team at Alibaba
"I have worked with Apache HBase for many years and I think it is a great product. it does what it says on the tin so to speak. Ironically if you look around the NoSQL competitors, most of them are supported by start-ups, whereas HBase is only supported as part of Apache suite of products by vendors like Cloudera, Hortonworks, MapR, etc. For those who would prefer to use SQL on top, there is Apache Phoenix around which makes life easier for the most SQL-savvy world to work on HBase: problem solved. For TCO, HBase is still value for money compared to others. You don't need expensive RAM or SSD with HBase. That makes it easy to onboard it in no time. Also HBase can be used in a variety of different business applications, whereas other commercial ones are focused on narrower niche markets. Least but last happy 10th anniversary and hope HBase will go from strength to strength and we will keep using it for years to come!"
—Dr. Mich Talebzadeh, Chief Data Architect, Big Data
"Congratulations on the 10th anniversary of Apache HBase! Xiaomi started to use HBase in 2012, when our business started booming. Many key Xiaomi products and services, as well as Xiaomi's data analytics platform, require a new system to provide quick and random access to billions of rows of structured and semi-structured data. Traditional solutions are not able to handle the large volume of data brought by the quickly increasing Xiaomi user base. Among several available options, we choose HBase not only because it provides a rich set of features and excellent performance specs, but also because it has a very active, open and friendly community. Embracing open source has been part of Xiaomi's engineering culture, and our deep involvement in the development of Apache HBase demonstrates the best practices of Xiaomi's open source strategy. In the past several years, we have contributed tons of bug fixes and important features to HBase, and, in the meantime, we have contributed 9 committers and 3 PMC members to the HBase community. Looking forward, we will continue to work closely with the Apache HBase community to help the project grow, and we wish Apache HBase a wonderful future!"
—Dr. Baoqiu Cui, Vice President of Xiaomi Corporation and Technical Committee Chairman
“Congratulations on the 10th anniversary of Apache HBase! It’s great to see how the project has developed over the years and continues to have good community support around it! Salesforce has a large global footprint of Apache HBase in production storing multiple petabytes of customer data and serving several billions of queries per day for a wide variety of use cases including security, monitoring, collaboration portals, and performance caches to scale over RDBMS limitations. HBase has played a major role in Salesforce’s customer success in the BigData storage space and we continue to invest in it as one of the pillars of our multi-substrate database strategy along with Apache Phoenix for SQL access to data stored in HBase. We have contributed many features and bug fixes to HBase over the last several years, and we look forward to continuing to work with the Apache HBase community to develop the project further. Here’s to many more successful years for Apache HBase!”
—Sanjeev Lakshmanan, Senior Director, Software Development, Salesforce
“Happy 10th Apache HBase! It was around 8 years ago that we started looking at HBase to include as part of our Hosted Big Data Services stack. Fast-forward to today and it continues to be a critical offering in our stack, powering a diverse set of use cases and workloads such as ad targeting, content personalization, analytics, security, monitoring, etc. HBase enables these diverse workloads thanks to its high scalability, feature set, and performance, all of which have been continuously refined through the years. In turn our footprint continues to grow, storing petabytes of data across thousands of machines. Our success is in part thanks to the project’s success, as we benefit from our collaborations, the contributions, and other efforts by the community (e.g., mailing lists, meetups, HBaseCon, etc.). This is a testament to the open, friendly, and dedicated community around Apache HBase, which is necessary for the success of any open source project. We wish the project continued success for years to come as we continue to collaborate with and be part of the community cultivating the project.”
—Francis Liu and Thiruvel Thirumoolan, HBase Big Data Team Members, Yahoo! (now Verizon Media)
“Congratulations on the 10th anniversary of Apache HBase! It’s great to see how this project has evolved from a big data project to one that runs business critical systems and continues to accelerate with a growing community and increasing pace of development! Cloudera has over 500 customers in production using it for a range of use cases ranging from mission critical transactional applications to supporting data warehousing. Our largest customers have footprints in excess of 7,000 nodes storing over 70PB of data. Our customers choose HBase because of its resilience with some customers able to realize 100% application uptime using HBase (over the past 3 years). We plan to continue to invest in HBase (and Apache Phoenix) to ensure that we can continue to both broaden support for a variety of hybrid transactional and analytical use cases and deepen support for existing use cases. Here's to many more successful years!"
—Arun C. Murthy, Chief Product Officer, Cloudera
“Many congratulations to the Apache HBase community on the 10th anniversary. Apache HBase provides rich functionality and excellent performance, and has an open and friendly community. Huawei has been using HBase since 2010: HBase is widely used by multiple Huawei solutions running on more than 10,000 nodes and storing hundreds of PBs of data to meet our requirements. Huawei FusionInsight embodies Huawei's best practices for HBase and serves many customers across industries such as finance, telecom operators, government, energy, medical, manufacturing, and transportation. Meanwhile, Huawei team members have contributed many bug fixes and features to HBase, and successfully hosted the first HBase Asia technology conference, HBaseCon Asia 2017, in Shenzhen. Going forward, Huawei will continue to work closely with the Apache HBase community to promote community development.”
—Wei Zhi, Kai Mo and Pankaj Kumar, members of the HBase team at Huawei
“Happy 10th anniversary, HBase! At Ultra Tendency, you have been the backbone of our Dual Lambda Streaming Architecture for many years! You have served billions of queries to our customers without interruption and at low latency. Your architecture guaranteed that you were always there when we needed you, never letting us or our customers down. You are the reason why our European clients today are running flourishing new business models backed by low-latency streaming products. Our committers and contributors will continue to fix bugs and provide feature enhancements. Ultra Tendency wishes you a bright and successful future!”
—Jan Hentschel, Chief Information Officer, Ultra Tendency
“Congratulations on the 10th anniversary of Apache HBase! I can't believe it's been 10 years since the first day I tried to use Apache HBase and its ecosystem to help the business and company. It is also great to see many colleagues and friends work, discuss, and cooperate to make this system better. Some of them have also made great progress in their careers, and some are still progressing. Shopee, one of the biggest e-commerce platforms in Southeast Asia, has several large Apache HBase clusters in production supporting businesses that depend on several billions of queries per day. Apache HBase has played a significant role in Shopee and is still expanding along with Shopee's business growth. Apache HBase and its community help us a lot, and we will continue to contribute to Apache HBase. We look forward to continuing to work with the Apache HBase community to develop the project and its ecosystem further.”
—Li Luo, Manager of Data Infra department, Shopee
“At Microsoft, our mission is to empower every person and every organization on the planet to achieve more, and it’s this mission that drives our commitment to open source. Congratulations to the Apache HBase community on its 10th anniversary. Microsoft has been part of the vibrant HBase community since 2014, and today we are proud to serve the numerous enterprise customers across industries who are leveraging HBase in Azure HDInsight for their most critical business applications.”
—Tomas Talius, Director of Engineering, Azure Data Services, Microsoft
Availability and Oversight
Apache HBase software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache HBase, visit http://hbase.apache.org/ and https://twitter.com/HBase
About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 200M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,600+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "HBase", "Apache HBase", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 01:00PM May 13, 2020
by Sally Khudairi in General |
|
The Apache Software Foundation Announces Apache® ShardingSphere™ as a Top-Level Project
- ShardingSphere-JDBC —a lightweight Java framework that provides additional services at the Java JDBC (“Java Database Connectivity”) layer. Shipped as a JAR (“Java ARchive”), it requires no additional deployment or dependencies and can be considered an enhanced JDBC driver, fully compatible with JDBC and all kinds of ORM (Object/Relational Mapping) frameworks (a usage sketch follows this list).
- ShardingSphere-Proxy —a database proxy that provides a database server encapsulating the database binary protocol, so it can support any development language and terminal.
- ShardingSphere-Sidecar (TODO) —a Cloud-native database agent for the Kubernetes environment that controls access to the database in the form of a sidecar (a supporting service deployed alongside the main application). It provides a mesh layer for interacting with the database, known as “Database Mesh”.
- Completely distributed database solution that provides data sharding, distributed transactions, and data migration, as well as database and data governance features.
- Independent SQL parser for multiple SQL dialects that can be used independently of ShardingSphere.
- Pluggable micro-kernel that enables all SQL dialects, database protocols, and features to be plugged in and pulled out via service provider interfaces.
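Because ShardingSphere-JDBC exposes itself as an ordinary JDBC DataSource, application code keeps using the standard java.sql API while sharding, routing, and governance happen underneath. The sketch below assumes a hypothetical helper, buildShardingDataSource(), standing in for ShardingSphere's rule-based DataSource construction (Java API or YAML configuration); the t_order table and its columns are likewise illustrative:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.sql.DataSource;

    // Sketch: queries go through the plain JDBC API; ShardingSphere-JDBC
    // routes them to the correct shard transparently. buildShardingDataSource()
    // is a hypothetical placeholder for rule-based DataSource construction,
    // and t_order is an illustrative sharded table.
    public class ShardingJdbcSketch {
        public static void main(String[] args) throws Exception {
            DataSource dataSource = buildShardingDataSource(); // hypothetical helper

            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT order_id, user_id FROM t_order WHERE user_id = ?")) {
                ps.setLong(1, 10L);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("order_id") + " / "
                                + rs.getLong("user_id"));
                    }
                }
            }
        }

        // Placeholder: in a real application this DataSource would be created
        // from ShardingSphere's sharding-rule configuration.
        private static DataSource buildShardingDataSource() {
            throw new UnsupportedOperationException("configure ShardingSphere rules here");
        }
    }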
Posted at 03:38PM Apr 16, 2020
by Sally Khudairi in General |
|
The Apache Software Foundation Announces Apache® Rya® as a Top-Level Project
Scalable Open Source Big Data database processes queries in milliseconds; used in autonomous drones, federated situation-aware access control systems, and petabyte-scale graph modeling, among many other applications.
Wakefield, MA —24 September 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Rya® as a Top-Level Project (TLP).
Posted at 12:59PM Sep 24, 2019
by Sally Khudairi in General |
|
The Apache® Software Foundation Announces Apache Arrow™ Momentum
- C# Library
- Gandiva LLVM-based Expression Compiler
- Go Library
- Javascript Library
- Plasma Shared Memory Object Store
- Ruby Libraries (Apache Arrow and Apache Parquet)
- Rust Libraries (Parquet and DataFusion Query Engine)
Posted at 11:00AM Feb 19, 2019
by Sally Khudairi in General |
|
The Apache Software Foundation Announces Apache® Hadoop® v3.2.0
- ABFS Filesystem connector —supports the latest Azure Datalake Gen2 Storage;
- Enhanced S3A connector —including better resilience to throttled AWS S3 and DynamoDB IO;
- Node Attributes Support in YARN —allows nodes to be tagged with multiple labels based on their attributes, and supports placing containers based on expressions over these labels;
- Storage Policy Satisfier —enables HDFS (Hadoop Distributed File System) applications to move blocks between storage types so that the storage policies set on files/directories are satisfied (see the sketch after this list);
- Hadoop Submarine —enables data engineers to easily develop, train, and deploy deep learning models (in TensorFlow) on the very same Hadoop YARN cluster;
- C++ HDFS client —enables asynchronous IO to HDFS, which benefits downstream projects such as Apache ORC;
- Upgrades for long running services —supports in-place, seamless upgrades of long-running containers via the YARN Native Service API (application programming interface) and CLI (command-line interface).
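As a rough sketch of the Storage Policy Satisfier feature mentioned above, the snippet below tags an HDFS directory with a built-in storage policy; the satisfier can then move the underlying blocks to matching storage types. The path and the "COLD" policy name are placeholders, and fs.defaultFS is assumed to point at an HDFS cluster:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Minimal sketch: set a storage policy on a directory so that the
    // Storage Policy Satisfier can move its blocks to the matching
    // storage type. Path and policy name are illustrative placeholders.
    public class SetStoragePolicy {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // assumes fs.defaultFS = hdfs://...
            try (FileSystem fs = FileSystem.get(conf)) {
                Path archive = new Path("/data/archive"); // placeholder path
                fs.setStoragePolicy(archive, "COLD");     // built-in HDFS policy
                System.out.println("policy now: " + fs.getStoragePolicy(archive));
            }
        }
    }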
Posted at 11:00AM Jan 23, 2019
by Sally Khudairi in General |
|
The Apache Software Foundation Announces Apache® Airflow™ as a Top-Level Project
Open Source Big Data workflow management system in use at Adobe, Airbnb, Etsy, Google, ING, Lyft, PayPal, Reddit, Square, Twitter, and United Airlines, among others.
Wakefield, MA —8 January 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Airflow™ as a Top-Level Project (TLP).
Apache Airflow is a flexible, scalable workflow automation and scheduling system for authoring and managing Big Data processing pipelines of hundreds of petabytes. Graduation from the Apache Incubator as a Top-Level Project signifies that the Apache Airflow community and products have been well-governed under the ASF's meritocratic process and principles.
"Since its inception, Apache Airflow has quickly become the de-facto standard for workflow orchestration," said Bolke de Bruin, Vice President of Apache Airflow. "Airflow has gained adoption among developers and data scientists alike thanks to its focus on configuration-as-code. That has gained us a community during incubation at the ASF that not only uses Apache Airflow but also contributes back. This reflects Airflow’s ease of use, scalability, and power of our diverse community; that it is embraced by enterprises and start-ups alike, allows us to now graduate to a Top-Level Project."
Apache Airflow is used to easily orchestrate complex computational workflows. Through smart scheduling, database and dependency management, error handling and logging, Airflow automates resource management, from single servers to large-scale clusters. Written in Python, the project is highly extensible and able to run tasks written in other languages, allowing integration with commonly used architectures and projects such as AWS S3, Docker, Apache Hadoop HDFS, Apache Hive, Kubernetes, MySQL, Postgres, Apache Zeppelin, and more. Airflow originated at Airbnb in 2014 and was submitted to the Apache Incubator in March 2016.
Apache Airflow is in use at more than 200 organizations, including Adobe, Airbnb, Astronomer, Etsy, Google, ING, Lyft, NYC City Planning, Paypal, Polidea, Qubole, Quizlet, Reddit, Reply, Solita, Square, Twitter, and United Airlines, among others. A list of known users can be found at https://github.com/apache/incubator-airflow#who-uses-apache-airflow
"Adobe Experience Platform is built on cloud infrastructure leveraging open source technologies such as Apache Spark, Kafka, Hadoop, Storm, and more," said Hitesh Shah, Principal Architect of Adobe Experience Platform. "Apache Airflow is a great new addition to the ecosystem of orchestration engines for Big Data processing pipelines. We have been leveraging Airflow for various use cases in Adobe Experience Cloud and will soon be looking to share the results of our experiments of running Airflow on Kubernetes."
"Our clients just love Apache Airflow. Airflow has been a part of all our Data pipelines created in past 2 years acting as the ring-master and taming our Machine Learning and ETL Pipelines," said Kaxil Naik, Data Engineer at Data Reply. "It has helped us create a Single View for our client's entire data ecosystem. Airflow's Data-aware scheduling and error-handling helped automate entire report generation process reliably without any human-intervention. It easily integrates with Google Cloud (and other major cloud providers) as well and allows non-technical personnel to use it without a steep learning curve because of Airflow’s configuration-as-a-code paradigm."
"With over 250 PB of data under management, PayPal relies on workflow schedulers such as Apache Airflow to manage its data movement needs reliably," said Sid Anand, Chief Data Engineer at PayPal. "Additionally, Airflow is used for a range of system orchestration needs across many of our distributed systems: needs include self-healing, autoscaling, and reliable [re-]provisioning."
"Since our offering of Apache Airflow as a service in Sept 2016, a lot of big and small enterprises have successfully shifted all of their workflow needs to Airflow," said Sumit Maheshwari, Engineering Manager at Qubole. "At Qubole, not only are we a provider, but also a big consumer of Airflow as well. For example, our whole Insight and Recommendations platform is built around Airflow only, where we process billions of events every month from hundreds of enterprises and generate insights for them on big data solutions like Apache Hadoop, Apache Spark, and Presto. We are very impressed by the simplicity of Airflow and ease at which it can be integrated with other solutions like clouds, monitoring systems or various data sources."
"At ING, we use Apache Airflow to orchestrate our core processes, transforming billions of records from across the globe each day," said Rob Keevil, Data Analytics Platform Lead at ING WB Advanced Analytics. "Its feature set, Open Source heritage and extensibility make it well suited to coordinate the wide variety of batch processes we operate, including ETL workflows, model training, integration scripting, data integrity testing, and alerting. We have played an active role in Airflow development from the onset, having submitted hundreds of pull requests to ensure that the community benefits from the Airflow improvements created at ING. We are delighted to see Airflow graduate from the Apache Incubator, and look forward to see where this exciting project will be taken in future!"
"We saw immediately the value of Apache Airflow as an orchestrator when we started contributing and using it," said Jarek Potiuk, Principal Software Engineer at Polidea. "Being able to develop and maintain the whole workflow by engineers is usually a challenge when you have a huge configuration to maintain. Airflow allows your DevOps to have a lot of fun and still use the standard coding tools to evolve your infrastructure. This is 'infrastructure as a code' at its best."
"Workflow orchestration is essential to the (big) data era that we live in," added de Bruin. "The field is evolving quite fast and the new data thinking is just starting to make an impact. Apache Airflow is a child of the data era and therefore very well positioned, and is also young so a lot of development can still happen. Airflow can use bright minds from scientific computing, enterprises, and start-ups to further improve it. Join the community, it is easy to hop on!"
Availability and Oversight
Apache Airflow software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Airflow, visit http://airflow.apache.org/ and https://twitter.com/ApacheAirflow
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members and 7,000 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, and Union Investment. For more information, visit http://apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "Airflow", "Apache Airflow", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
# # #
Posted at 11:00AM Jan 08, 2019
by Sally Khudairi in General |
|
The Apache Software Foundation Announces Apache® HAWQ® as a Top-Level Project
- Exceptional performance: parallel processing architecture delivers high-throughput, low-latency (potentially near-real-time) query responses that can scale to petabyte-sized datasets;
- Robust ANSI SQL compliance: leverage familiar skills and achieve higher levels of compatibility for SQL-based applications and BI/data visualization tools; execute complex queries and joins, including roll-ups and nested queries (see the sketch after this list); and
- Apache Hadoop ecosystem integration: integrate and manage with Apache YARN, provision with Apache Ambari, interface with Apache HCatalog, and support Apache Parquet, Apache HBase, and others; easily scale nodes up or down to meet performance or capacity requirements.
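As a hedged illustration of the SQL capabilities listed above: HAWQ descends from the Greenplum/PostgreSQL lineage, so clients commonly connect through PostgreSQL-compatible drivers. The sketch below assumes the standard PostgreSQL JDBC driver is on the classpath; the host, port, database, credentials, and the sales table are all illustrative placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch: run an ANSI SQL roll-up against HAWQ over its
    // PostgreSQL-compatible protocol. Connection details and the sales
    // table are placeholders.
    public class HawqQuerySketch {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://hawq-master:5432/analytics"; // placeholder
            try (Connection conn = DriverManager.getConnection(url, "gpadmin", "secret");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT region, product, SUM(amount) "
                         + "FROM sales GROUP BY ROLLUP (region, product)")) {
                while (rs.next()) {
                    System.out.printf("%s | %s | %s%n",
                            rs.getString(1), rs.getString(2), rs.getString(3));
                }
            }
        }
    }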
Apache HAWQ is in use at Alibaba, Haier, VMware, ZTESoft, and hundreds of users around the world.
"We admire Apache HAWQ's flexible framework and ability to scale up in a Cloud ecosystem. HAWQ helps those seeking a heterogeneous computing system to handle ad-hoc queries and heavy batch workloads," said Kuien Liu, Computing Platform Architect at Alibaba. "Alibaba encourages more and more engineers to continue to embrace Open Source, and Apache HAWQ stands out as a star project. We are proud to have been collaborating with this community since 2015."
"Apache HAWQ is an attractive technology for Big Data applications," said Zixu Zhao, Architect at ZTESoft. "HAWQ serves as the foundation of our Big Data platform and it has been used in a lot of applications, such as interactive analytics and BI on telecom data. We congratulate HAWQ on becoming an Apache Top-Level Project."
"Becoming an Apache Top-Level Project is an important milestone," added Chang. "There is much work ahead of us, and we look forward to growing the HAWQ community and codebase."
Apache HAWQ software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache HAWQ, visit http://hawq.apache.org/ and https://twitter.com/ApacheHAWQ .
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members and 6,800 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Anonymous, ARM, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, and Union Investment. For more information, visit http://apache.org/ and https://twitter.com/TheASF
# # #
Posted at 10:00AM Aug 23, 2018
by Sally Khudairi in General |
|