Simple, Scalable Pricing for Hive on MR3

Whether you're just getting started or running at scale, choose a plan that fits your team's needs, with full access to all features.

Free

$0

For teams getting started

  • Community support
  • Worker capacity of 512GB
  • Full access to all features
Start Free

Business

$1,700 / TB of worker capacity / month

For teams to run Hive on MR3 in production

  • Onboarding support
  • On-demand video support
  • Priority email support
  • Worker capacity from 1TB
  • Full access to all features
Contact Us

Enterprise

Custom

For large organizations with custom needs

  • Custom features on request
  • Private builds on request
  • Dedicated support engineer
  • On-demand video support
  • Priority email support
  • Unlimited worker capacity
  • Full access to all features
Contact Us

Feature Comparison

Feature | Free | Business | Enterprise
Hive on MR3 on Hadoop
Hive on MR3 on Kubernetes
Hive on MR3 in Standalone Mode
Fault Tolerance
Speculative Execution
Capacity Scheduling
High Availability
Autoscaling
In-memory/NVMe Caching (LLAP I/O)
Built-in Shuffle Service
Backpressure Handling
Secure Shuffle
Remote Shuffle Service (Celeborn)
MR3-UI (Web UI)
Prometheus Clients

How Licensing Works

Hive on MR3 is not a hosted service. Instead, users install and run it on their own infrastructure — whether on-premises, in the cloud, or in hybrid environments.

How do I get started?

To get started, users download the MR3 release from the public repository and deploy it in the way that best fits their environment. The default release corresponds to the Free plan and includes a worker capacity of 512GB. For Kubernetes, users can use the public Docker image available on Docker Hub.

How do I upgrade the plan?

Upgrading is simple. We ship a prebuilt binary that unlocks additional worker capacity. With minimal changes, you continue running Hive on MR3 — just with higher capacity.

Hive on MR3 gives you complete control. You’re not relying on a vendor-controlled black box.

Cost & Scaling

Predictable pricing and control over resource usage.

Security & Location

Deploy on-premises, in the cloud, or in hybrid environments.

Workload Flexibility

Tune your deployment and environment for your own workloads.

Frequently Asked Questions

Pricing and Plans

What does worker capacity mean?

Worker capacity is the total amount of memory that your workers can use at the same time. You can choose how many workers to run and how much memory each one uses, as long as the total stays within your worker capacity. For example, on the Free plan with a worker capacity of 512GB, you can run 8 workers, each using 64GB of memory.
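The rule above is simple arithmetic, and a short Python sketch can make it concrete. This is only an illustration: the function name `fits_capacity` and its parameters are hypothetical and are not part of Hive on MR3 or its configuration.

```python
def fits_capacity(worker_memory_gb, capacity_gb=512):
    """Return True if the combined memory of all workers stays within capacity.

    worker_memory_gb: list of per-worker memory sizes in GB (hypothetical input).
    capacity_gb: licensed worker capacity in GB (512 on the Free plan).
    """
    return sum(worker_memory_gb) <= capacity_gb


# 8 workers x 64GB = 512GB, which exactly matches the Free plan capacity.
print(fits_capacity([64] * 8))

# A ninth 64GB worker would bring the total to 576GB, above the capacity.
print(fits_capacity([64] * 9))

# Workers do not have to be the same size, only the total matters.
print(fits_capacity([128, 128, 64, 64, 64, 32, 32]))
```

The first and third calls return True (both total 512GB) and the second returns False.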

Can I evaluate Hive on MR3 with more than 512GB of worker capacity?

We offer a 30-day evaluation period with unlimited worker capacity, allowing you to test Hive on MR3 at full scale. Throughout the evaluation period, you’ll also receive custom onboarding support to help tune Hive on MR3 for optimal performance. If Hive on MR3 meets your needs, you can choose to upgrade to the Business plan.

Do you offer a discount for larger worker capacity?

Yes, we provide discounted pricing for larger deployments starting at 2TB of worker capacity. As your worker capacity increases, the rate becomes more favorable and you pay less per terabyte.
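For budgeting, the undiscounted Business list price can serve as a rough upper bound. The sketch below assumes whole terabytes and uses the listed rate of $1,700 per TB per month. The function name is hypothetical, and volume discounts are deliberately not modeled because the discounted rates are not published here.

```python
LIST_PRICE_PER_TB = 1700  # USD per TB of worker capacity per month (Business plan)


def monthly_list_price(capacity_tb):
    """Undiscounted monthly list price in USD for a given worker capacity in TB.

    Assumption: capacity is given in whole terabytes. Discounts that start
    at 2TB are NOT applied, so the result is an upper bound.
    """
    if capacity_tb < 1:
        raise ValueError("Business plan worker capacity starts at 1TB")
    return LIST_PRICE_PER_TB * capacity_tb


print(monthly_list_price(1))  # list price for the minimum Business capacity
print(monthly_list_price(4))  # upper bound before any volume discount
```

With these inputs the sketch prints 1700 and 6800. The actual price for 2TB and above should be lower once the discount is applied.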

What kind of support is available for the Business plan?

The Business plan includes comprehensive support designed to help you run Hive on MR3 smoothly in production. You get:

  • Priority email support with responses within 12 hours across time zones — and often much faster.
  • On-demand video support via Zoom or another platform of your choice.
  • Pre-deployment planning, covering deployment method, prerequisites, and resource sizing.
  • Deployment assistance, including configuration review and validation of your first query.
  • Performance tuning baseline, helping you adjust settings such as fault tolerance, speculative execution, and shuffle behavior.
  • Competitive benchmarking, with help comparing Hive on MR3 against Trino or Spark using your own queries or the TPC-DS benchmark.

This support complements the resources available to all users — including quick start guides, operations guides, community Slack, and the MR3 Google Group.

Product Capabilities

How does Hive on MR3 compare with Trino and Spark in performance?

Hive on MR3 delivers strong performance across both sequential and concurrent workloads. Based on the 10TB TPC-DS benchmark:

  • For sequential runs, Hive on MR3 performs slightly slower than Trino but significantly faster than Spark.
  • For concurrent workloads, Hive on MR3 significantly outperforms both Trino and Spark.

Unlike Trino, which may return incorrect results for some queries, Hive on MR3 consistently produces correct answers.

With Hive on MR3, you don’t have to choose between performance and correctness.

Can Hive on MR3 run batch and interactive queries in the same system?

Yes. Hive on MR3 is architected from the outset to support both interactive and batch queries in a single unified system. Its fault-tolerant execution and built-in capacity scheduling allow different types of workloads to run together efficiently. Interactive queries can be prioritized for faster response times, while batch jobs continue running reliably in the background — ensuring smooth operation without the need to manage separate systems.

What environments does Hive on MR3 support, and does it work with S3?

Hive on MR3 runs on Hadoop, on Kubernetes, or in standalone mode without a resource manager, and it works with S3. It supports both HDFS and S3 as storage, enabling full separation of compute and storage. You can deploy it on-premises, in the cloud, or in hybrid environments.

Operational Advantages

How does Hive on MR3 help simplify operations and reduce costs?

In many organizations, interactive and batch workloads are handled by separate systems — one optimized for responsiveness, the other for throughput. This approach adds complexity, increases infrastructure costs, and requires maintaining multiple platforms.

Hive on MR3 eliminates this divide by supporting both types of queries in a single fault-tolerant system. With built-in capacity scheduling, interactive queries can take priority while batch jobs keep running. The result is one platform to operate and pay for instead of two.

With Hive on MR3, one system is all you need.

How does Hive on MR3 improve resource efficiency in the cloud?

Hive on MR3 improves resource efficiency through fast autoscaling and smart caching. In cloud environments, it can scale quickly based on workload demand, efficiently combining spot and on-demand instances without risking query interruption. Selective caching can reduce repeated access to storage like S3, minimizing both latency and cost.

How easy is it to deploy Hive on MR3?

Hive on MR3 offers multiple deployment options: shell scripts for all environments, and Helm charts and a custom TypeScript generator for Kubernetes. With quick start guides and production-ready configurations, data engineers familiar with distributed systems can typically get Hive on MR3 running in about 30 minutes, given a suitable on-premises environment. In cloud environments, setup may take longer depending on provisioning, network configuration, and cloud-specific security settings.

Getting Started

What are the basic requirements for running Hive on MR3?

Hive on MR3 requires a working cluster and a database for the Hive Metastore (such as MySQL or PostgreSQL). You’ll also need writable local disks on every worker node to store intermediate query data. For shared temporary data, you can use either a PersistentVolume on Kubernetes, or distributed storage like HDFS or S3. Detailed prerequisites are available in the quick start guides.

Can I migrate my existing Hive Metastore and workloads to Hive on MR3?

Yes, migrating your existing Hive Metastore and workloads to Hive on MR3 is straightforward. Hive on MR3 uses exactly the same Metastore schema as Apache Hive. If your Apache Hive version matches the version of Hive on MR3 you intend to use, you can reuse your existing Metastore directly. For older versions such as Hive 2 or 3, you can follow the standard Hive upgrade procedure.

Existing Hive queries and User-Defined Functions (UDFs) work without changes, as Hive on MR3 preserves the same interface — only replacing the underlying execution layer. For users running Hive 3.1, a compatible build of Hive 3.1 on MR3 is available upon request.

What is the first step after meeting the requirements?

Once you meet the requirements, the first step is to download the MR3 release from the public repository and follow the quick start guides. In most cases, users find the guides clear enough to get started without needing additional help. One team even adopted Hive on MR3 in production without ever contacting us!

Where should I go for help while testing Hive on MR3?

If you need help, the best place to start is the MR3 Slack, where you can ask questions and get real-time help from the team. You can also post in the MR3 Google Group for longer discussions or support. If you prefer, you can contact us directly by email as well.

Openness and Flexibility

Is Hive on MR3 open source?

While the MR3 execution engine is not open source, Hive on MR3 is fully open source, with its source code publicly available on GitHub. It consists of two customizable components: a fork of Apache Hive extended to run on MR3, and a runtime library originally based on Apache Tez but significantly evolved over time. We provide shell scripts for rebuilding Hive on MR3 using these components — so you can apply patches or tailor the system to your needs without vendor involvement.

How is Hive on MR3 different from other open source products in practice?

Unlike some open source products that restrict key features to paid editions, Hive on MR3 offers all features even in the Free plan. It gives users substantial and targeted control over the system, with the ability to modify and rebuild key components for query compilation and execution.

The only part not open is the MR3 execution engine, which manages low-level operations like resource management, fault tolerance, and task scheduling — areas that users don’t typically need to modify. If you do need changes to the execution engine, however, you can simply reach out. We are happy to implement new features at no cost.

In practice, Hive on MR3 is more open than most open source products.

Can enterprise customers review the MR3 source code?

We understand that enterprise teams may require a deeper technical evaluation of the MR3 execution engine before adoption. While MR3 is not open source, we offer source code walkthrough sessions — live, developer-led calls that explain the architecture using detailed design documents and walk through key parts of the source code. These sessions are designed to give your team the confidence needed to evaluate MR3 for production use.

Can I apply security policies and controls of my choice?

Yes. Hive on MR3 inherits its security capabilities directly from Apache Hive, which means it supports the same integrations for authentication, authorization, and encryption — including tools like Apache Ranger, LDAP, Kerberos, SAML, and more. Because Hive on MR3 builds on Apache Hive, it benefits from the full range of security features maintained by the Hive community. As Hive continues to evolve, those improvements are naturally reflected in Hive on MR3 as well.

Am I locked in if I use Hive on MR3?

No — there is no vendor lock-in with Hive on MR3. Since it works with the standard Hive Metastore, you can switch back to Apache Hive or move to another technology whenever you choose. This flexibility is even greater if you use an open table format like Apache Iceberg.

Project Background

What is the history behind MR3?

The development of MR3 began in July 2015, following two years of preliminary research. The first official release, MR3 0.1, was launched in March 2018 and featured Hive on MR3 as its first application. Since then, we have been actively contributing to Apache Hive and expanding MR3 with new features and improvements. MR3 reflects nearly a decade of focused engineering, hands-on experience with Hive, and a long-term commitment to performance and stability.

Is Hive on MR3 used in production by other companies?

Yes. Hive on MR3 has been used in production by several companies, and a few continue to run it in production today. The system has matured through years of feedback from real-world deployments. With the release of MR3 2.0, we’re focused on making Hive on MR3 more broadly accessible to teams who can benefit from it.

Who is behind MR3?

MR3 is actively maintained by a small, dedicated team led by its original architect, who holds a PhD in computer science from Carnegie Mellon University (CMU), USA. He has been the driving force behind MR3 since its inception in 2015, and continues to guide its development and offer hands-on support to users.

A decade of focused work, continually driving MR3’s evolution.

Still Have Questions?

We're here to help. Get in touch with our team.