Whether you're just getting started or running at scale,
choose a plan that fits your team's needs
— with full access to all features.
Free
$0
For teams getting started

Business
$1,700 per TB of worker capacity (total memory allocated to all workers) per month
For teams running Hive on MR3 in production

Enterprise
Custom
For large organizations with custom needs
Feature | Free | Business | Enterprise |
---|---|---|---|
Hive on MR3 on Hadoop | ✓ | ✓ | ✓ |
Hive on MR3 on Kubernetes | ✓ | ✓ | ✓ |
Hive on MR3 in Standalone Mode | ✓ | ✓ | ✓ |
Fault Tolerance | ✓ | ✓ | ✓ |
Speculative Execution | ✓ | ✓ | ✓ |
Capacity Scheduling | ✓ | ✓ | ✓ |
High Availability | ✓ | ✓ | ✓ |
Autoscaling | ✓ | ✓ | ✓ |
In-memory/NVMe Caching (LLAP I/O) | ✓ | ✓ | ✓ |
Built-in Shuffle Service | ✓ | ✓ | ✓ |
Backpressure Handling | ✓ | ✓ | ✓ |
Secure Shuffle | ✓ | ✓ | ✓ |
Remote Shuffle Service (Celeborn) | ✓ | ✓ | ✓ |
MR3-UI (Web UI) | ✓ | ✓ | ✓ |
Prometheus Clients | ✓ | ✓ | ✓ |
Hive on MR3 is not a hosted service. Instead, users install and run it on their own infrastructure — whether on-premises, in the cloud, or in hybrid environments.
How do I get started?
To get started, users download the MR3 release from the public repository and deploy it in the way that best fits their environment. The default release corresponds to the Free plan and includes a worker capacity of 512GB. For Kubernetes, users can use the public Docker image available on DockerHub.

How do I upgrade the plan?
Upgrading is simple. We ship a prebuilt binary that unlocks additional worker capacity. With minimal changes, you continue running Hive on MR3, just with higher capacity.

Hive on MR3 gives you complete control. You're not relying on a vendor-controlled black box.
Predictable pricing and control over resource usage.
Deploy on-premises, in the cloud, or in hybrid environments.
Tune your deployment and environment for your own workloads.
Worker capacity is the total amount of memory that your workers can use at the same time. You can choose how many workers to run and how much memory each one uses, as long as the total stays within your worker capacity. For example, on the Free plan with a worker capacity of 512GB, you can run 8 workers, each using 64GB of memory.
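As a rough illustration, the constraint reduces to a simple check that the number of workers times the memory per worker stays within the plan's capacity. The sketch below is purely illustrative; the function name and values are hypothetical and are not actual MR3 configuration keys.

```typescript
// Illustrative only: checks whether a proposed worker layout fits within
// the plan's worker capacity. Names and numbers are hypothetical, not
// actual MR3 configuration keys.
function fitsWorkerCapacity(
  numWorkers: number,
  memoryPerWorkerGB: number,
  workerCapacityGB: number
): boolean {
  // Worker capacity limits the total memory used by all workers at once.
  return numWorkers * memoryPerWorkerGB <= workerCapacityGB;
}

// Free plan example: 8 workers x 64GB = 512GB, which exactly fits 512GB.
console.log(fitsWorkerCapacity(8, 64, 512));   // true
// 10 workers x 64GB = 640GB would exceed the 512GB capacity.
console.log(fitsWorkerCapacity(10, 64, 512));  // false
```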
We offer a 30-day evaluation period with unlimited worker capacity, allowing you to test Hive on MR3 at full scale. Throughout the evaluation period, you’ll also receive custom onboarding support to help tune Hive on MR3 for optimal performance. If Hive on MR3 meets your needs, you can choose to upgrade to the Business plan.
Yes, we provide discounted pricing for larger deployments starting at 2TB of worker capacity. As your worker capacity increases, the rate becomes more favorable and you pay less per terabyte.
The Business plan includes comprehensive support designed to help you run Hive on MR3 smoothly in production. You get:
This support complements the resources available to all users — including quick start guides, operations guides, community Slack, and the MR3 Google Group.
Hive on MR3 delivers strong performance across both sequential and concurrent workloads. Based on the 10TB TPC-DS benchmark:
Unlike Trino, which may return incorrect results for some queries, Hive on MR3 consistently produces correct answers.
Yes. Hive on MR3 is architected from the outset to support both interactive and batch queries in a single unified system. Its fault-tolerant execution and built-in capacity scheduling allow different types of workloads to run together efficiently. Interactive queries can be prioritized for faster response times, while batch jobs continue running reliably in the background — ensuring smooth operation without the need to manage separate systems.
Yes. Hive on MR3 runs in any environment — on Hadoop, on Kubernetes, or even in standalone mode without a resource manager. It supports both HDFS and S3, enabling full separation of compute and storage. You can deploy Hive on MR3 on-premises, in the cloud, or in hybrid environments. This flexibility allows you to tailor deployment to any infrastructure.
In many organizations, interactive and batch workloads are handled by separate systems — one optimized for responsiveness, the other for throughput. This approach adds complexity, increases infrastructure costs, and requires maintaining multiple platforms.
Hive on MR3 eliminates this divide by supporting both types of queries in a single fault-tolerant system. With built-in capacity scheduling, it allows interactive queries to take priority without delaying batch jobs. This unified design simplifies operations, reduces infrastructure costs, and eliminates the need to maintain multiple platforms.
Hive on MR3 improves resource efficiency through fast autoscaling and smart caching. In cloud environments, it can scale quickly based on workload demand, efficiently combining spot and on-demand instances without risking query interruption. Selective caching can reduce repeated access to storage like S3, minimizing both latency and cost.
Hive on MR3 offers multiple deployment options: shell scripts for all environments, and Helm charts and a custom TypeScript generator for Kubernetes. With quick start guides and production-ready configurations, data engineers familiar with distributed systems can typically get Hive on MR3 running in about 30 minutes, given a suitable on-premises environment. In cloud environments, setup may take longer depending on provisioning, network configuration, and cloud-specific security settings.
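To give a sense of the manifest-generation approach that a TypeScript-based tool enables, here is a minimal sketch under assumed names. It is not the actual generator shipped with Hive on MR3; the namespace, image, labels, and resource figures are all hypothetical.

```typescript
// Minimal sketch of generating a Kubernetes manifest from TypeScript.
// All names (namespace, image, resource figures) are hypothetical and do
// not reflect the actual TypeScript generator bundled with Hive on MR3.
interface WorkerSpec {
  namespace: string;
  image: string;
  workers: number;
  memoryGiPerWorker: number;
}

function generateWorkerDeployment(spec: WorkerSpec): object {
  return {
    apiVersion: "apps/v1",
    kind: "Deployment",
    metadata: { name: "hive-mr3-worker", namespace: spec.namespace },
    spec: {
      replicas: spec.workers,
      selector: { matchLabels: { app: "hive-mr3-worker" } },
      template: {
        metadata: { labels: { app: "hive-mr3-worker" } },
        spec: {
          containers: [{
            name: "worker",
            image: spec.image,
            resources: { requests: { memory: `${spec.memoryGiPerWorker}Gi` } },
          }],
        },
      },
    },
  };
}

// Emit the manifest as JSON, which kubectl accepts directly (kubectl apply -f -).
console.log(JSON.stringify(
  generateWorkerDeployment({
    namespace: "hive-mr3",
    image: "example/hive-mr3:latest",
    workers: 8,
    memoryGiPerWorker: 64,
  }),
  null,
  2,
));
```

Generating manifests from typed code like this makes it easy to keep worker counts and memory settings consistent with the worker capacity of your plan.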
Hive on MR3 requires a working cluster and a database for the Hive Metastore (such as MySQL or PostgreSQL). You’ll also need writable local disks on every worker node to store intermediate query data. For shared temporary data, you can use either a PersistentVolume on Kubernetes, or distributed storage like HDFS or S3. Detailed prerequisites are available in the quick start guides.
Yes, migrating your existing Hive Metastore and workloads to Hive on MR3 is straightforward. Hive on MR3 uses the same Metastore schema as Apache Hive, with no differences at all. If your Apache Hive version matches the version of Hive on MR3 you intend to use, you can directly reuse your existing Metastore. For older versions such as Hive 2 or 3, you can follow the standard Hive upgrade procedure.
Existing Hive queries and User-Defined Functions (UDFs) work without changes, as Hive on MR3 preserves the same interface — only replacing the underlying execution layer. For users running Hive 3.1, a compatible build of Hive 3.1 on MR3 is available upon request.
Once you meet the requirements, the first step is to download the MR3 release from the public repository and follow the quick start guides. In most cases, users find the guides clear enough to get started without needing additional help. One team even adopted Hive on MR3 in production without ever contacting us!
If you need help, the best place to start is the MR3 Slack, where you can ask questions and get real-time help from the team. You can also post in the MR3 Google Group for longer discussions or support. If you prefer, you can contact us directly by email as well.
While the MR3 execution engine is not open source, Hive on MR3 is fully open source, with its source code publicly available on GitHub. It consists of two customizable components: a fork of Apache Hive extended to run on MR3, and a runtime library originally based on Apache Tez but significantly evolved over time. We provide shell scripts for rebuilding Hive on MR3 using these components — so you can apply patches or tailor the system to your needs without vendor involvement.
Unlike some open source products that restrict key features to paid editions, Hive on MR3 offers all features even in the Free plan. It gives users substantial and targeted control over the system, with the ability to modify and rebuild key components for query compilation and execution.
The only part not open is the MR3 execution engine, which manages low-level operations like resource management, fault tolerance, and task scheduling — areas that users don’t typically need to modify. If you do need changes to the execution engine, however, you can simply reach out. We are happy to implement new features at no cost.
We understand that enterprise teams may require a deeper technical evaluation of the MR3 execution engine before adoption. While MR3 is not open source, we offer source code walkthrough sessions — live, developer-led calls that explain the architecture using detailed design documents and walk through key parts of the source code. These sessions are designed to give your team the confidence needed to evaluate MR3 for production use.
Yes. Hive on MR3 inherits its security capabilities directly from Apache Hive, which means it supports the same integrations for authentication, authorization, and encryption — including tools like Apache Ranger, LDAP, Kerberos, SAML, and more. Because Hive on MR3 builds on Apache Hive, it benefits from the full range of security features maintained by the Hive community. As Hive continues to evolve, those improvements are naturally reflected in Hive on MR3 as well.
No — there is no vendor lock-in with Hive on MR3. Since it works with the standard Hive Metastore, you can switch back to Apache Hive or move to another technology whenever you choose. This flexibility is even greater if you use an open table format like Apache Iceberg.
The development of MR3 began in July 2015, following two years of preliminary research. The first official release, MR3 0.1, was launched in March 2018 and featured Hive on MR3 as its first application. Since then, we have been actively contributing to Apache Hive and expanding MR3 with new features and improvements. MR3 reflects nearly a decade of focused engineering, hands-on experience with Hive, and a long-term commitment to performance and stability.
Yes. Hive on MR3 has been used in production by several companies, and a few continue to run it in production today. The system has matured through years of feedback from real-world deployments. With the release of MR3 2.0, we’re focused on making Hive on MR3 more broadly accessible to teams who can benefit from it.