Graal Platform Documentation

Why use our platform?

Graal Platform takes a different approach to helping people build data-driven platforms. It was built out of frustration with existing "data hub", "data platform", and "big data lake" products that are too complex, too expensive, and do not help their users actually deploy workloads and run complex workflows.

Graal Platform is:

Simple to use

Graal Platform enables enterprises to quickly run data workloads in the cloud while optimizing the use of cloud resources. With our API, you can configure your resources in a consistent, repeatable fashion.

Powerful and scalable

Graal Platform optimizes cloud resource usage by adjusting resources as workload and activity change. This lets you respond faster to new business requirements.

Hybrid

Graal Platform gives you the flexibility to run the same platform in the data center and on the public cloud of your choice. Whether you use Microsoft Azure, Amazon Web Services, or Google Cloud Platform, your organization can harness the agility of the cloud while continuing to run its on-premises workloads, becoming a hybrid, data-driven enterprise. With our enterprise-ready modern data architecture, customers have a consistent experience across both the data center and the cloud.

Job scheduling

Graal Platform enables developers to build complex data transformations out of multiple component tasks. This gives greater control over complex jobs and makes it easier to schedule repeated runs of those jobs.
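
To make the idea of a job built from component tasks more concrete, here is a minimal sketch in plain Python. It does not use the Graal Platform API: the task names and the `execution_order` helper are hypothetical, and the ordering relies only on the standard library's `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "train": {"clean"},
    "report": {"aggregate", "train"},
}


def execution_order(tasks):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(tasks).static_order())


order = execution_order(workflow)
for step, task in enumerate(order, start=1):
    print(step, task)
```

A scheduler can rerun such a workflow on a timetable because the dependencies are declared once and the run order is derived from them each time.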

Management and monitoring

Graal Platform provides an open operational framework for provisioning, managing, and monitoring resources. It includes a web interface that lets administrators manage services, change configurations, and control capacity growth.

Optimized for troubleshooting

Operations teams deploy, monitor, and manage your data projects and the associated cloud resources. Graal Platform simplifies this work: our console provides a single management platform for provisioning, managing, monitoring, and securing your data platform.

Governance

Graal Platform extends data access and management with powerful tools for data governance and integration. These tools provide a reliable, repeatable, and simple framework for managing the flow of data into and out of the platform. Graal Platform has engineering relationships with many leading data management providers so that their tools work and integrate with HDP.

Comparisons to other solutions

Kubernetes

Kubernetes automates the operational tasks of container management. It includes built-in commands for deploying applications, rolling out changes, scaling applications up and down to fit changing needs, monitoring them, and more, which makes applications easier to manage.

The default Kubernetes scheduler places pods one at a time, without any context about users, applications, or queues. Kubernetes doesn't provide fine-grained controls over resource quotas, resource fairness, and priorities, which are the most important requirements for a multi-tenant computing system. In a multi-tenant environment, many users share the cluster's resources. When weights or priorities are taken into account, more important applications can obtain in-demand resources beyond their default share.
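
The role of weights in a shared cluster can be illustrated with a small weighted fair-share calculation. This is only a sketch of the general idea, not the algorithm used by Kubernetes or by Graal Platform; the tenant names, weights, and the `fair_shares` function are assumptions made for the example.

```python
def fair_shares(capacity, weights):
    """Split a cluster's capacity among tenants in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {tenant: capacity * w / total for tenant, w in weights.items()}


# Hypothetical cluster with 100 CPU cores shared by three tenants.
shares = fair_shares(100, {"analytics": 1, "ml-training": 3, "reporting": 1})
for tenant, cores in shares.items():
    print(f"{tenant}: {cores:.0f} cores")
```

Here the tenant with weight 3 is entitled to three times the capacity of each of the others, which is the kind of control a pod-by-pod scheduler has no context to apply.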

Hadoop

Hadoop is an open source framework for large-scale data processing. Hadoop enables companies to retain and make use of all the data they collect, performing complex analysis quickly and storing results securely over a number of distributed servers.

With Hadoop, large-scale data processing can respond to increased demand by "scaling out": if the data set doubles, you distribute processing over two servers; if the data set quadruples, you distribute processing over four servers. This avoids having to grow computing capacity by buying ever more expensive hardware.
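
The scale-out arithmetic above can be written as a short calculation. This is a deliberately simplified sketch: it assumes processing capacity grows linearly with the number of servers and ignores replication and coordination overhead. The `servers_needed` function and the 10 TB per-server figure are hypothetical.

```python
import math


def servers_needed(data_size_tb, capacity_per_server_tb):
    """Return how many equally sized servers are needed for a data set."""
    if capacity_per_server_tb <= 0:
        raise ValueError("capacity_per_server_tb must be positive")
    return math.ceil(data_size_tb / capacity_per_server_tb)


# Hypothetical baseline: one server handles 10 TB.
for size_tb in (10, 20, 40):
    print(size_tb, "TB ->", servers_needed(size_tb, 10), "server(s)")
```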

Mesos

Mesos is an open-source cluster manager, originally developed at UC Berkeley. It provides applications with APIs for resource management and scheduling across the cluster, and it can run both containerized and non-containerized workloads in a distributed manner. Marathon is a container orchestration framework that runs on Mesos. It provides several features typically expected from an orchestration platform, such as service discovery, load balancing, metrics, and container management APIs.

Container orchestration is not the core strength of Mesos. Compared to other solutions, Mesos is quite complex and has a steep learning curve. It is so generic that most use cases require an additional framework on top of it.

Copyright © 2023 Graal Systems