Seahorse Overview

Seahorse is an open-source visual framework that lets you create Apache Spark applications in a fast, simple and interactive way. Creating a Spark application with Seahorse is as easy as dragging and dropping operations onto the canvas, all while connected to any Spark cluster (YARN, Mesos, Standalone) or to a bundled local Spark.

Seahorse’s interface

A Glimpse of Seahorse’s Features

About the Product

Using Seahorse, you can create complex dataflows for ETL (Extract, Transform and Load) and machine learning without knowing Spark’s internals. Seahorse provides tools to tackle real-world Big Data problems while keeping the learning curve gentle. It takes care of many complicated concepts and presents a simple, clean interface.

Seahorse emphasizes a visual approach to programming. This makes users’ applications highly readable: the logic driving the entire program is visible at a glance.

Importantly, while promoting a code-free working style, Seahorse does not limit users to a predefined set of actions. Whenever users need a non-standard action in their application, something not covered by Seahorse’s palette of operations, they can write their own transformations in Python or R.
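
As a rough illustration of what such a user-written transformation might look like in Python, the sketch below takes a Spark DataFrame and returns a new one. The entry-point name `transform(dataframe)` and the column names `amount` and `amount_gross` are assumptions made for this example, not something this overview specifies. The DataFrame methods used (`dropna`, `filter` with a SQL expression string, `selectExpr`) are standard PySpark calls.

```python
def transform(dataframe):
    """Hypothetical custom transformation: clean rows and derive a column.

    `dataframe` is expected to be a Spark DataFrame with a numeric
    `amount` column (an assumed schema, chosen for illustration).
    """
    # Drop rows where the assumed `amount` column is missing.
    cleaned = dataframe.dropna(subset=["amount"])
    # Keep only rows with a positive amount (SQL expression string).
    positive = cleaned.filter("amount > 0")
    # Keep all existing columns and add a derived one.
    return positive.selectExpr("*", "amount * 1.2 AS amount_gross")
```

Because the function only chains DataFrame methods, it stays lazy like any other Spark transformation: nothing is computed until a downstream step asks for results.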

Seahorse offers a web-based interface that presents a Spark application as a graph of operations, called a workflow. A typical Seahorse session consists of three alternating phases: adding operations to the workflow, executing the part that has already been created, and exploring the results of the execution. This establishes an interactive process during which the user is able to track what happens at each step.
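
The add, execute, explore cycle can be pictured with a small toy model in plain Python. This is only a conceptual sketch: the `Workflow` class, its methods and the operation names are hypothetical and are not Seahorse's API. It assumes each operation is a function of its upstream results, and that results of already executed operations are cached so that only new parts of the graph run.

```python
class Workflow:
    """A toy graph of operations with cached, incremental execution."""

    def __init__(self):
        self.operations = {}  # name -> (function, names of upstream operations)
        self.results = {}     # name -> output of an already executed operation
        self.executed = []    # order in which operations actually ran

    def add(self, name, func, inputs=()):
        """Register an operation and the operations it depends on."""
        self.operations[name] = (func, tuple(inputs))

    def run(self, name):
        """Execute `name` and any upstream operation that has not run yet."""
        if name not in self.results:
            func, inputs = self.operations[name]
            args = [self.run(upstream) for upstream in inputs]
            self.results[name] = func(*args)
            self.executed.append(name)
        return self.results[name]


workflow = Workflow()

# Phase 1: add operations to the workflow.
workflow.add("read", lambda: [3, -1, 4, -5, 9])
workflow.add("keep_positive", lambda rows: [r for r in rows if r > 0],
             inputs=["read"])

# Phases 2 and 3: execute what exists so far and explore the result.
preview = workflow.run("keep_positive")

# Back to phase 1: extend the workflow, then run only the new part.
workflow.add("total", sum, inputs=["keep_positive"])
total = workflow.run("total")
```

In this sketch the second `run` call reuses the cached output of `keep_positive` and executes only `total`, which mirrors the idea of executing a partially built workflow and then continuing from where it left off. The toy model does no cycle detection or error handling.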

Finally, after the workflow has been constructed, it can be exported and deployed as a standalone Spark application on production clusters.

Learn More

Learn more about Seahorse enterprise-scale deployments, which include customized set-up, security, integration and 24/7 support.