Demystifying | Using H2O Flow

26 Nov 2019 | Team Wavelabs

H2O Flow is an open-source user interface for H2O. It is a web-based interactive environment that allows you to combine code execution, text, mathematics, plots, and rich media in a single document.

With H2O Flow, you can capture, rerun, annotate, present, and share your workflow. It lets you use H2O interactively to import files, build models, and iteratively improve them. Based on your models, you can make predictions and add rich text to create vignettes of your work, all within Flow’s browser-based environment.

H2O Flow sends commands to H2O as a sequence of executable cells. The cells can be modified, rearranged, or saved to a library. Each cell contains an input field that allows you to enter commands, define functions, call other functions and access other cells or objects on the page. When you execute the cell, the output is a graphical object, which can be inspected to view additional details.

No programming experience is required to run H2O Flow. You can click your way through any H2O operation without ever writing a single line of code. You can even disable the input cells to run H2O Flow using only the GUI.

Installing H2O:

1. Download H2O. This is a zip file that contains everything you need to get started.

2. Unzip the download, then from your terminal run the following (the h2o- folder name ends with the version you downloaded):

cd ~/Downloads
cd h2o-
java -jar h2o.jar

3. Point your browser to http://localhost:54321 to open the H2O Flow dashboard.
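If the page does not load, it can help to check whether the H2O server is actually running. The sketch below is one way to do that from Python with only the standard library. It assumes the default address from step 3 and H2O's REST status endpoint /3/Cloud; the helper names cloud_status_url and is_h2o_up are made up for this post.

```python
import http.client
import json
import urllib.error
import urllib.request


def cloud_status_url(host="localhost", port=54321):
    """Build the URL of H2O's cluster status endpoint (default address assumed)."""
    return f"http://{host}:{port}/3/Cloud"


def is_h2o_up(host="localhost", port=54321, timeout=2.0):
    """Return True if something answers with valid JSON at the status URL."""
    try:
        with urllib.request.urlopen(cloud_status_url(host, port), timeout=timeout) as resp:
            json.load(resp)  # a running H2O node replies with a JSON document
            return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: treat as "not running".
        return False


if __name__ == "__main__":
    print("H2O reachable:", is_h2o_up())
```

If this prints False while the java command is still running, a firewall or a different port is the likely cause.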

H2O Flow is designed to guide you every step of the way by providing input prompts, interactive help, and example flows, so I am not going to explain every feature of the interface. Instead, I will run the simple regression example from the previous Python API post, this time without writing a single line of code.

Note: H2O's full functionality is available through every interface (Python, R, Scala, etc.), including H2O Flow. So every feature and customization you have seen in the previous post is available in Flow too.

Step 1: Import files to H2O.

Provide a web URL or a local file path to import data into H2O.

The next step is to parse the imported files.

H2O lets you adjust the auto-detected parse settings (such as column names and types) if needed, so you can control how the parsing is done.

Parsing ingests the file and creates an H2O data frame.

When we view the data frame, we get a set of actions that can be applied to it.

The next step is to split the data frame into train and test sets. Click the “Split” button to do that.

The training set is now frame_0.750 and the test set is frame_0.250. Next, click the “Build Model” button to select the algorithm.
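For readers who want to compare with the Python API from the previous post, the import, parse, and split clicks above could be sketched roughly as follows. This is a minimal sketch, not what Flow runs internally: the function names, the seed, and the file path in the usage line are assumptions, while h2o.init, h2o.import_file, and H2OFrame.split_frame are calls from the h2o Python package.

```python
def split_names(train_ratio=0.75):
    """Frame names in the style Flow displayed, e.g. frame_0.750 and frame_0.250."""
    return [f"frame_{train_ratio:.3f}", f"frame_{1 - train_ratio:.3f}"]


def import_and_split(path, train_ratio=0.75, seed=42):
    """Import a file, let H2O parse it, and split it into train and test frames."""
    import h2o  # imported lazily so this file loads even where h2o is not installed

    h2o.init()  # connect to a running local H2O, or start one
    frame = h2o.import_file(path)  # import and parse in one call
    train, test = frame.split_frame(
        ratios=[train_ratio],
        destination_frames=split_names(train_ratio),
        seed=seed,  # assumed value, only to make the split repeatable
    )
    return train, test
```

A call would look like train, test = import_and_split("path/to/data.csv"), where the path is a placeholder. H2O splits rows probabilistically, so the resulting sizes are close to 75% and 25% rather than exact.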

There are many settings, parameters, and hyper-parameters available to tune. I am going to go with the default settings here, but make sure to assign the training frame, validation frame, and response column. At the end of all the settings, click the “Build Model” button.
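The three required settings map directly onto arguments in the Python API. The sketch below assumes you already have H2O frames for training and validation, and it picks gradient boosting purely as an example algorithm, since any H2O estimator takes the same arguments. The helper names predictor_columns and build_model are hypothetical.

```python
def predictor_columns(columns, response):
    """Return every column except the response, which must be present."""
    if response not in columns:
        raise ValueError(f"response column {response!r} not found")
    return [c for c in columns if c != response]


def build_model(train, valid, response):
    """Train a model with default settings on the given H2O frames."""
    # Assumption: GBM stands in for whichever algorithm you pick in Flow.
    from h2o.estimators import H2OGradientBoostingEstimator

    model = H2OGradientBoostingEstimator()  # default hyper-parameters
    model.train(
        x=predictor_columns(train.columns, response),
        y=response,               # the response column
        training_frame=train,     # the training frame
        validation_frame=valid,   # the validation frame
    )
    return model
```

Leaving the estimator's constructor empty mirrors the choice made here of keeping Flow's default settings.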

As you can see, the model builds quickly thanks to the cluster’s collective performance.

Clicking “View” opens a dashboard for the model object with a set of useful functions.

The download POJO and MOJO options give us a Java object containing the model’s weights and parameters. It can be used for predictions in any production environment that runs Java, by writing wrapper classes around it.
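The same export is available from the Python API through the model's download_mojo method. The sketch below assumes a trained H2O model object is passed in; the function name export_mojo and the output folder name are made up for illustration.

```python
import os


def export_mojo(model, out_dir="exported_model"):
    """Save a trained model as a MOJO, together with the h2o-genmodel.jar library."""
    os.makedirs(out_dir, exist_ok=True)
    # download_mojo returns the path of the MOJO file it wrote.
    return model.download_mojo(path=out_dir, get_genmodel_jar=True)
```

Asking for the genmodel jar keeps the scoring library next to the model file, which is what a Java wrapper class would need on its classpath.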

This post gives a quick overview of how to use H2O’s Flow interface without writing a single line of code. This is especially useful for domain experts with little knowledge of programming to quickly spin up models. H2O makes it really easy and convenient with interactive help and prompts to guide us along the way.