Entries tagged [yarn]

Thursday May 29, 2014

BigTop hackathon at Hadoop Summit!

Time for another hackathon.

A lot of companies contribute to BigTop: Pivotal, Cloudera, Red Hat, WANdisco, Amazon, and so on. If I left yours out, feel free to leave a comment below and I'll update this post. Today, we are proud to announce that Red Hat is hosting the next BigTop hackathon, immediately following Hadoop Summit 2014 in San Jose. Hadoop is about a lot more than just source code: it's about packaging, deployment, configuration, and so on. And BigTop has embraced the difficult task of tying all this together.

Apache BigTop makes Hadoop deployment transparent

All source code is complex, regardless of the language. But what's even more complex is deploying code on a distributed system. While vendors have come a long way in making it easy to deploy Hadoop with black-box administrative or cloud tools, nobody has really opened up Hadoop by building a culture around its deployment.

Enter Apache BigTop. 

BigTop contains all the stuff that's not in the Hadoop docs. For example:

  • Puppet modules for installation and configuration of hadoop without using a tarball.
  • Vagrant recipes for deployment-from-zero.
  • Smoke tests for fine-grained testing.
  • The intersection of Java and RPM.
  • The BigPetStore app, which demonstrates to the business community how to actually use Hadoop, Gradle, Pig, and Google's JavaScript visualization widgets to build, test, and deploy a reference "Hadoop app".

BigTop is working to embrace HCFS

The community has put a lot of work into testing different Hadoop stacks on different file systems (https://wiki.apache.org/hadoop/HCFS/Progress), and the BigTop community has embraced this effort, at the cost of having to support a generic filesystem deployment and of a lot of JIRA reviewing. For example, with the recent BIGTOP-952 and BIGTOP-1200 JIRAs, we're now packaging HDFS-independent artifacts into BigTop. That paves the way for more competition, more choice, and more Hadoop hacking, which ultimately translates to a better end-user experience around Hadoop.

BigTop builds OS+Admin friendly packages for emerging ecosystem projects, fast!

If you compare Apache BigTop with other Hadoop vendor distributions, you'll find that it is on the bleeding edge. For example, you can watch this recent video demonstration of spinning up Storm on BigTop, from ApacheCon 2014: https://www.youtube.com/watch?v=VZzJxsMJahc, to see just how easy it is to deploy Spark out of the box using BigTop's deployment recipes. As new projects come forward upstream, the first place to put them is Apache BigTop. This means that if you want to try out a new animal in Hadoop's stack, you can easily do so with the BigTop stack. And again: the infrastructure around Vagrant makes it easy to build maintainable VM workflows around Hadoop app and distribution development tasks, which can easily be modified to include or exclude whichever bleeding-edge packages you like. Think of BigTop's approach to packaging and deployment as a lower-level version of Apache Ambari.

Sounds interesting? Come to the hackathon in Mountain View after Hadoop Summit!

So this is all a prelude to the BigTop hackathon that we are hosting at Red Hat. The focus will be on hacking, not presentations. But that doesn't mean you have to be an expert to get involved. Coming to this hackathon will give you a chance to pair program with the BigTop committers and try your hand at working directly on a JIRA. I think most would agree that hacking around on Apache BigTop is an excellent introduction to the Hadoop ecosystem.

The next hackathon will be from June 6th to June 9th at the Red Hat offices in Mountain View, California. For more details, ping us on the BigTop mailing list and check the meetup page (http://www.meetup.com/Bay-Area-Bigtop-Meetup/events/184893732/).

See you there!

Saturday June 22, 2013

Apache Bigtop 0.6.0 has been released

Just in time for Hadoop Summit 2013, the Apache Bigtop team is very pleased to announce the release of Apache Bigtop 0.6.0: the very first release of a fully integrated Big Data management distribution built on the most advanced Hadoop 2.x release currently available, Hadoop 2.0.5-alpha.

Apache Bigtop, as many of you might already know, is a project aimed at creating a 100% open source and community-driven Big Data management distribution based on Apache Hadoop. You can learn more about it by reading one of our earlier blog posts on Apache Blogs.

The very astute readers of this blog will notice that, given our quarterly release schedule, Bigtop 0.6.0 should have been called Bigtop 0.7.0. It is true that we skipped a quarter. Our excuse is that we spent all this extra time helping the Hadoop community stabilize the Hadoop 2.x code line and make it a robust kernel for all the applications that are now part of the Bigtop distribution. And speaking of applications, we haven’t forgotten to grow the Bigtop family. Bigtop 0.6.0 adds Apache HCatalog and Apache Giraph to the mix. The full list of Hadoop applications available as part of the Bigtop 0.6.0 release is now:

  • Apache Zookeeper 3.4.5

  • Apache Flume 1.3.1

  • Apache HBase 0.94.5

  • Apache Pig 0.11.1

  • Apache Hive 0.10.0

  • Apache Sqoop 2 (AKA 1.99.2)

  • Apache Oozie 3.3.2

  • Apache Whirr 0.8.2

  • Apache Mahout 0.7

  • Apache Solr (SolrCloud) 4.2.1

  • Apache Crunch (incubating) 0.5.0

  • Apache HCatalog 0.5.0

  • Apache Giraph 1.0.0

  • LinkedIn DataFu 0.0.6

  • Cloudera Hue 2.3.0

The list of supported Linux platforms has expanded to include:

  • CentOS/RHEL 5 and 6

  • Fedora 17 and 18

  • SuSE Linux Enterprise 11

  • OpenSUSE 12.2

  • Ubuntu LTS Lucid (10.04) and Precise (12.04)

  • Ubuntu Quantal (12.10)

We would like to invite everybody to give the Bigtop 0.6.0 binary distribution a try. All you have to do is to pick your favorite Linux distribution, follow our wiki instructions and you will have your first pseudo-distributed cluster computing Pi in no time.
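To give a feel for what a "computing Pi" job does, here is a minimal sketch in plain Python, not Hadoop code. It assumes the job meant here is the Pi estimator from Hadoop's bundled MapReduce examples, which the post does not state explicitly. Each "map task" counts quasi-random points that fall inside a circle inscribed in the unit square, and a "reduce" step sums the counts. The function names (halton, map_task, reduce_counts, estimate_pi) are hypothetical and chosen only for illustration.

```python
# Sketch only: mimics the map/reduce shape of a Pi-estimation job in plain Python.
# All function names here are hypothetical; nothing below is Bigtop or Hadoop API.


def halton(index, base):
    """Return the index-th element (1-based) of the Halton sequence for `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def map_task(offset, samples):
    """Count sample points inside and outside the circle of radius 0.5."""
    inside = 0
    for i in range(offset + 1, offset + samples + 1):
        # Shift the point so the unit square is centred on the origin.
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    return inside, samples - inside


def reduce_counts(counts):
    """Sum the (inside, outside) pairs produced by all map tasks."""
    inside = sum(pair[0] for pair in counts)
    outside = sum(pair[1] for pair in counts)
    return inside, outside


def estimate_pi(num_maps, samples_per_map):
    """Estimate Pi as 4 * (points inside the circle) / (all points)."""
    counts = [
        map_task(m * samples_per_map, samples_per_map) for m in range(num_maps)
    ]
    inside, outside = reduce_counts(counts)
    return 4.0 * inside / (inside + outside)


if __name__ == "__main__":
    print(estimate_pi(num_maps=10, samples_per_map=1000))
```

On a real cluster the map tasks would run in parallel on different nodes; here they run one after another, which is enough to show how the partial counts combine into a single estimate.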

If you’re thinking about deploying Bigtop to a fully distributed cluster, you might find our Puppet code useful; after all, we use it all the time ourselves to test Bigtop. There is brief documentation on how to run our Puppet recipes in a master-less Puppet configuration, but a typical Puppet master setup should work as well. Bigtop plays an important role in CDH, which takes all of its packaging code from Bigtop.

Finally, Apache Bigtop would not have been possible without the tireless work of all the volunteer developers. This is an amazing community to be part of, and if you would like to join us, now is the time. In fact, we decided to take advantage of Hadoop Summit drawing a lot of Hadoop developers to the San Francisco Bay Area and hold our first meeting of the Apache Bigtop Working Group on Thu, Jun 27 2013. Come join us! It is a lot of fun to build the future of Big Data management together!

Happy Big Data discoveries,
Your faithful and tireless Bigtop development team!


