Zeppelin

is an open source data analysis environment on top of Hadoop.

Learn more »

What is Zeppelin?

An analytical environment on top of Hive (and Hive-like systems).

Zeppelin provides

  • Web-based user interface for Hive, with history and job management.
  • Support for multiple Hive-like systems through pluggable drivers. Currently Hive and Shark.
  • Pluggable visualization
  • Pluggable algorithm
  • Online archive of visualizations and algorithms: ZAN (Zeppelin Archive Network)
  • Embedded cron-like scheduler
  • Report generation (Share)

Can be used for

  • Lightweight web interface for Hive and similar systems
  • Visualizing data on Hadoop using Hive
  • Sharing visualizations through an HTTP link
  • Sharing queries and algorithms through the online archive
  • Scheduling queries to automate jobs
  • Creating custom visualizations using D3, Google Charts, and any HTML/JavaScript

Check out the screenshots.

Zeppelin Stack

[Figure: Zeppelin stack. Layers, top to bottom: Zeppelin; CLI, GUI, ZAN; ZQL; Zengine; Hive ...; Hadoop]

  • Zengine is a Java framework that simplifies data analytics on Hadoop. Zeppelin generates Hive queries.
  • ZQL is an extension of HiveQL, designed for easy data analysis.
  • ZAN is the Zeppelin Archive Network; think npm for sharing libraries.
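
To make the idea of generating Hive queries from code concrete, here is a minimal sketch in Python. It is not Zengine's API (Zengine is a Java framework and its interface is not shown on this page); the helper name `build_hive_query` and the `access_log` table are hypothetical, chosen only for illustration.

```python
from typing import Optional, Sequence


def build_hive_query(
    table: str,
    columns: Sequence[str] = ("*",),
    where: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Assemble a HiveQL SELECT statement from its parts.

    Hypothetical helper for illustration only; not part of Zeppelin or Zengine.
    """
    parts = [f"SELECT {', '.join(columns)}", f"FROM {table}"]
    if where:
        # Optional filter clause, passed through as written by the caller.
        parts.append(f"WHERE {where}")
    if limit is not None:
        parts.append(f"LIMIT {limit}")
    return " ".join(parts)


if __name__ == "__main__":
    # "access_log" is an assumed table name.
    print(build_hive_query("access_log", ["status", "url"],
                           where="status >= 500", limit=10))
```

A real framework would also handle quoting, escaping, and job submission; this sketch only shows the query-generation step.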

Who uses it?

  • NFLabs - Zeppelin automates regular analytical query execution via its embedded scheduler. Our data analysts also handle on-demand analysis requests from customers using Zeppelin.

Latest News

[2014.03.29] Zeppelin 0.3.3 Released!

Check here

To-Do

This project is still in its early stages. If you'd like to be added as a contributor, please fork!