Multi-Tenant Applications in OrientDB
Dave Bechberger

OrientDB is one of several popular graph data stores on the market today. It takes a multi-model approach, combining the power of a graph database with the flexibility of a document data store. If you have decided to build your multi-tenant application on top of OrientDB, you are in luck, as it has several built-in methods for handling multi-tenancy. In this post, I’ll look at three specific approaches: Graph Partitioning, Clustering, and Separate Databases.

Note: For this blog series we will talk about a multi-tenant application as an application in which two or more tenants are served software by a set of resources. A "tenant" in this case is a group of users with common access to a dedicated share of the data, configuration and resources of the system. For an overview on reducing complexity in multi-tenant applications, read Part 1 of this series, Multi-Tenant Applications: Reduce the Complexity.

Graph Partitioning

OrientDB has support for Record-Level Security that can be leveraged to create a Partitioned Graph per tenant. A Partitioned Graph is a subgraph of data that is only available to specifically authorized subsets of users. You accomplish this by leveraging the robust security model that OrientDB provides to logically isolate one tenant’s data from other tenants. In order to enable this functionality, there are three basic steps:

  1. Create a database and extend the V and E super classes with ORestricted. ORestricted is a special class whose security properties will now be inherited by all vertexes and edges in the graph and are used to restrict access to those entities.
  2. Create users and roles in that database. OrientDB allows you to create roles that can then be assigned specific permissions (All, Read, Update, Delete). You are then able to create users who are assigned to those roles. With this robust security model you can not only isolate one tenant from another but also enforce application roles within a single tenant (for example, administrators can modify master data but all others can only read it).
  3. Finally, create a new graph for each tenant and associate a user/role with it. When you connect as a specific user, you will automatically be restricted to viewing only the Partitioned Graph associated with that user.
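The first two steps above can be sketched as a small Python helper that produces the OrientDB SQL to run. This is a minimal sketch, not a definitive setup script: the statement shapes follow the OrientDB 2.x documentation as I understand it and should be checked against your version, and the function names, the `tenant1_user` account, and the choice of the built-in `writer` role are assumptions made for illustration.

```python
import re

# Conservative whitelist for names that get spliced into SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def restrict_graph_statements():
    """Step 1: make the base vertex and edge classes inherit from ORestricted."""
    return [
        "ALTER CLASS V SUPERCLASS ORestricted",
        "ALTER CLASS E SUPERCLASS ORestricted",
    ]


def create_tenant_user_statement(user_name, password, role_name="writer"):
    """Step 2: create a user for one tenant and attach an existing role.

    Hypothetical helper: it only builds the SQL text, it does not run it.
    """
    for value in (user_name, role_name):
        if not _IDENTIFIER.match(value):
            raise ValueError(f"unsafe identifier: {value!r}")
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    return (
        f"INSERT INTO OUser SET name = '{user_name}', status = 'ACTIVE', "
        f"password = '{password}', "
        f"roles = (SELECT FROM ORole WHERE name = '{role_name}')"
    )


if __name__ == "__main__":
    for statement in restrict_graph_statements():
        print(statement)
    print(create_tenant_user_statement("tenant1_user", "s3cret"))
```

Step 3 needs no extra statement in this sketch: once V and E are restricted, the records a tenant's user creates are the ones that user is allowed to see.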

Since the partitioning of the graph occurs at a low level within the core of OrientDB, this approach also has the advantage of being enforced for other clients, such as those using Gremlin or the Java API. However, there is a price to pay here: for each record returned by a query, a low-level hook is called to process the security model. In most cases this is not a problem, but if you have a complicated security model, such as one with multiple levels of role inheritance, the cost of processing it can become noticeable.
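To make the per-record cost concrete, here is a toy model of that check in plain Python. It is not OrientDB's implementation; the `Record` and `User` types are invented for illustration, and the `allow` set only stands in for the `_allow` field that ORestricted adds to each record.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    data: dict
    # Stand-in for ORestricted's _allow field: user or role names with access.
    allow: set = field(default_factory=set)


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def can_read(user, record):
    """The per-record check: the user or one of its roles must be listed."""
    return user.name in record.allow or bool(user.roles & record.allow)


def query(user, records):
    """Every candidate record passes through the check before it is returned."""
    return [record for record in records if can_read(user, record)]


if __name__ == "__main__":
    records = [
        Record({"name": "Acme order"}, allow={"tenant1_user"}),
        Record({"name": "Globex order"}, allow={"tenant2_role"}),
    ]
    tenant1 = User("tenant1_user")
    print([r.data["name"] for r in query(tenant1, records)])
```

The check runs once per candidate record, so anything that makes a single check more expensive, such as walking a chain of inherited roles, is multiplied by the size of the result set.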

Note: An excellent walkthrough of how to create Partitioned Graphs is available here.

Pros

Cons

Clustering

One of the unique features of OrientDB is Clusters, which allow you to specify how the data in each class (vertex) is grouped on disk. By creating a separate cluster for each class per tenant, you can physically isolate one tenant’s data from another. One major drawback to using this methodology is that, due to the class inheritance structure, you will also need to logically isolate that tenant’s data. This means that you must explicitly include that tenant’s cluster identifier in each query. The added need for logical isolation increases the development complexity of this method but allows for additional flexibility.

For example, given a database with a class called Customer, you would create a custom cluster for each tenant (Customer_Tenant1, Customer_Tenant2, etc.) as they are added to your system. You would include the cluster identifier in each query to limit the results to just a single tenant's data, e.g., SELECT FROM cluster:Customer_Tenant1.

One advantage of using this methodology is that aggregating data across all tenants is as simple as only using the base class name in the query instead of the tenant cluster identifier, e.g., SELECT FROM Customer.
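Because the cluster name is part of the query text rather than a bound parameter, it helps to build these queries in one place and validate the tenant identifier there. The sketch below assumes the Customer_Tenant1 naming convention used above; the function names are hypothetical and the helpers only build query strings.

```python
import re

# Cluster and class names are spliced into the query, so whitelist them.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name):
    if not isinstance(name, str) or not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def cluster_name(class_name, tenant_id):
    """Customer + Tenant1 -> Customer_Tenant1."""
    return f"{_checked(class_name)}_{_checked(tenant_id)}"


def tenant_query(class_name, tenant_id):
    """Query limited to a single tenant's cluster."""
    return f"SELECT FROM cluster:{cluster_name(class_name, tenant_id)}"


def all_tenants_query(class_name):
    """Query across every tenant by using only the base class name."""
    return f"SELECT FROM {_checked(class_name)}"


if __name__ == "__main__":
    print(tenant_query("Customer", "Tenant1"))
    print(all_tenants_query("Customer"))
```

Keeping the cross-tenant query in a separately named function is a deliberate choice: forgetting the tenant identifier then fails loudly instead of silently returning every tenant's data.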

Pros

Cons

Separate Databases

OrientDB has the capability to run multiple databases on a single OrientDB server. You can leverage this functionality to give each tenant in your system a unique database. This provides each tenant with complete physical isolation of its data from other tenants' data while allowing you to leverage a shared infrastructure. This simplifies the development of applications by removing the need to handle any logical isolation, but adds some complexity to the operational aspects of the system. With this method you will have to coordinate database upgrades/migrations, handle tenant resource concerns, and route tenants to different databases. While all of these operational aspects are well understood and not unique to OrientDB, they do add to the overall workload required to make this method work efficiently.
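The routing piece can be as small as a lookup from tenant to database URL. The following is a minimal sketch under stated assumptions: the `TenantDatabaseRouter` class, the tenant identifiers, and the database names are all hypothetical, and the URLs use OrientDB's `remote:<host>/<database>` form without opening any connection.

```python
class TenantDatabaseRouter:
    """Maps each tenant to its own database on a shared OrientDB server."""

    def __init__(self, server_url):
        # e.g. "remote:localhost"; the database name is appended per tenant.
        self._server_url = server_url.rstrip("/")
        self._databases = {}

    def register(self, tenant_id, database_name):
        if tenant_id in self._databases:
            raise ValueError(f"tenant already registered: {tenant_id!r}")
        self._databases[tenant_id] = database_name

    def url_for(self, tenant_id):
        """Return the database URL for one tenant, failing on unknown tenants."""
        try:
            database_name = self._databases[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant: {tenant_id!r}") from None
        return f"{self._server_url}/{database_name}"

    def all_urls(self):
        """Every tenant database, e.g. to apply one migration to each in turn."""
        return [self.url_for(tenant_id) for tenant_id in sorted(self._databases)]


if __name__ == "__main__":
    router = TenantDatabaseRouter("remote:localhost")
    router.register("tenant1", "tenant1_db")
    router.register("tenant2", "tenant2_db")
    print(router.url_for("tenant1"))
    print(router.all_urls())
```

Raising on an unknown tenant, rather than falling back to a default database, keeps a misrouted request from ever touching another tenant's data.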

Pros

Cons

Conclusions

OrientDB has robust support for both physical and logical isolation using a variety of methods. Each of the options presented here has pros and cons, but each is also the right fit for certain use cases. If you are looking to completely physically separate tenant data, then you should look into using Separate Databases. If you are fine with only a logical separation of data, then Graph Partitioning is probably the best option. If you end up needing something in between, then Clustering is a possible solution.

For other posts in this series, see:

Multi-Tenant Applications: Reduce the Complexity

Multi-Tenant Applications in Neo4j

Multi-Tenant Applications in DataStax Graph
