In this blog series we will take a look at some of the technical challenges and complexities of building a multi-tenant application as well as some strategies to help reduce those complexities. In subsequent posts we will discuss building multi-tenant apps with a graph data store and some of the additional levels of complexity that arise when using two common graph data stores, Neo4j and DataStax Enterprise Graph.
So you’re going to build a multi-tenant application and now it’s up to you to figure out how to make it all work. Ask any software engineer who has built one and they will tell you that multi-tenant applications are inherently more complicated than single-tenant applications. That complexity comes from the added overhead required to ensure that your tenants’ data are secured and isolated from one another (e.g., Tenant 1 can’t see Tenant 2’s customer list) and that large tenants don’t adversely affect other tenants in the same environment (e.g., Tenant 1 does not use all the resources, thereby slowing the performance for Tenant 2). The overhead caused by these requirements may take the form of either operational or developmental complexity, but the key to building an effective system in any multi-tenant scenario is to reduce that complexity.
Note: For this blog series we are going to talk about a multi-tenant application as an application in which two or more tenants are served software by a set of resources. A tenant in this case is a group of users with common access to a dedicated share of the data, configuration and resources of the system.
The goal of any sort of multi-tenant project is to provide the tenant with an isolated set of data and resources while providing the business with the ability to scale the number of tenants using the system with the minimum amount of developmental and operational overhead. In order to achieve these competing goals, the architectures of multi-tenant applications run the spectrum in methodology, design and implementation details. However, they can often be bucketed into a few different categories:
- Shared Nothing: A single instance per customer with no sharing of resources. In this scenario each tenant is given its own vertical slice of hardware resources, and tenants are physically isolated from one another. This leads to lower developmental complexity but higher operational costs, both through operational complexity and the lack of hardware reuse.
- Shared Everything: A single instance for all customers with all resources shared. In this scenario each tenant shares the same vertical slice of hardware resources, and they are logically isolated from one another by the application. This leads to higher developmental complexity but lower overall operational complexity with a high degree of hardware reuse.
- Somewhere on the spectrum of options in between: This hybrid approach usually consists of having certain resources shared between instances and separated logically, and other resources being physically separated. There is a wide range of approaches in this arena, and it’s the one that most applications find themselves in. The overall complexity is driven by the final solution, but there is often a compromise between data security and operational cost.
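To make the three categories concrete, here is a minimal sketch of how a tenant might be mapped to resources under each one. The names (`Isolation`, `Placement`, `place_tenant`, and the `app-…`/`db-…` labels) are hypothetical and invented for illustration, and the hybrid branch shows just one of many possible splits: a shared application tier with physically separate data.

```python
from dataclasses import dataclass
from enum import Enum


class Isolation(Enum):
    SHARED_NOTHING = "shared_nothing"
    SHARED_EVERYTHING = "shared_everything"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Placement:
    app_instance: str   # hypothetical label for the application instance
    database: str       # hypothetical label for the data store


def place_tenant(tenant_id: str, model: Isolation) -> Placement:
    """Return the illustrative resources that serve a tenant under each model."""
    if model is Isolation.SHARED_NOTHING:
        # Every tenant gets its own application instance and its own database.
        return Placement(f"app-{tenant_id}", f"db-{tenant_id}")
    if model is Isolation.SHARED_EVERYTHING:
        # All tenants share one instance and one database; isolation is logical.
        return Placement("app-shared", "db-shared")
    # Hybrid (one possible split): shared application tier, separate databases.
    return Placement("app-shared", f"db-{tenant_id}")
```

Under this sketch, adding a tenant in the shared-nothing model means provisioning two new resources, while the shared-everything model provisions none and relies entirely on the application to keep tenants apart.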
Each of these different architectural approaches can be mixed and matched to solve a specific problem, and each brings with it varying levels of both operational complexity and developmental complexity. Operational complexity includes managing aspects of a system such as provisioning, infrastructure, data migration and/or configuration management. Developmental complexity includes managing aspects of the system such as data security, authorization/authentication, business logic complexity and/or customer isolation. There is a variety of different techniques you can use to help reduce these complexities, but the end goal is to strike a balance that minimizes the operational and developmental complexity within the time and budget constraints.
Development tooling reduces complexity by making it easier to provide security and fine-grained access control on the data. These tools can either be in the application or in the data store, but both have the same unifying goal of simplifying and reducing the amount of programming required to provide the proper data isolation and security between customers’ data.
Application tooling usually takes one of two forms. The first form is a framework or library that is integrated into your application (e.g., Spring Boot [Java], Dropwizard [Java], Express [Node.js]) to help the development team centralize and standardize the enforcement of data isolation and security policies. The second form is manipulation of the data model to allow for easier logical separation of tenants’ data. This may mean something as simple as adding a tenant_id property to each entity or something far more complex, but the end goal of this sort of tooling is to give development teams an easy method to shard tenant data so that it stays isolated. Most commonly, these two forms are used in conjunction with each other, but integrating them requires development, testing and maintenance resources, which tends to increase developmental complexity while lowering operational complexity.
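The two forms of application tooling can be sketched together: a tenant_id property on each entity, plus a single place in the application where the tenant filter is applied. This is a minimal sketch under stated assumptions, not a framework integration; `Customer`, `TenantScopedRepository` and the sample rows are hypothetical, and an in-memory list stands in for a real data store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    tenant_id: str   # the property that marks which tenant owns the record
    name: str


class TenantScopedRepository:
    """Hypothetical single choke point that applies the tenant filter on reads."""

    def __init__(self, rows: list[Customer]) -> None:
        self._rows = rows

    def list_customers(self, tenant_id: str) -> list[Customer]:
        # Refuse unscoped queries instead of silently returning everything.
        if not tenant_id:
            raise ValueError("a tenant_id is required for every query")
        return [row for row in self._rows if row.tenant_id == tenant_id]


# Illustrative data: two tenants sharing one logical store.
repo = TenantScopedRepository([
    Customer("tenant-1", "Acme"),
    Customer("tenant-2", "Globex"),
    Customer("tenant-1", "Initech"),
])
tenant_1_names = [c.name for c in repo.list_customers("tenant-1")]
```

Because every read goes through `list_customers`, the filter is written and tested once. The cost is the one described above: the team now owns this code and must make sure nothing bypasses it.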
Data-store tooling tends to require minimal development effort but higher levels of operational effort to install and maintain. In the relational world, this sort of tooling exists in some enterprise-level databases in features such as Row-Level Security (SQL Server) and Virtual Private Database, or VPD (Oracle), and in the many other databases that support creating separate databases/schemas on the same server. Each of these tools works at the data-store level to provide logical separation of tenants’ data. They lower developmental complexity, and are frequently transparent from the development perspective, but they increase operational complexity, as they sometimes require significant effort to install, configure and maintain. With these tools you are also constrained to the features and functionality supported by the chosen data store, which can vary greatly. The trade-off is that these tools tend to provide a higher level of assurance of data security because, once installed and configured, they are applied implicitly and do not require any explicit action by the application to invoke them.
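As a rough illustration of the separate-databases option, the sketch below gives each tenant its own database and routes by tenant. It assumes SQLite in-memory databases from Python's standard library as a stand-in for per-tenant databases or schemas on a real server, and the `TenantDatabases` class is hypothetical. It does not demonstrate Row-Level Security or VPD, which are configured inside the database itself.

```python
import sqlite3


class TenantDatabases:
    """Hypothetical router that keeps one SQLite database per tenant."""

    def __init__(self) -> None:
        self._connections: dict[str, sqlite3.Connection] = {}

    def connection_for(self, tenant_id: str) -> sqlite3.Connection:
        # Lazily create an isolated in-memory database the first time a
        # tenant is seen; each ":memory:" connection is its own database.
        if tenant_id not in self._connections:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE customers (name TEXT NOT NULL)")
            self._connections[tenant_id] = conn
        return self._connections[tenant_id]


dbs = TenantDatabases()
dbs.connection_for("tenant-1").execute("INSERT INTO customers VALUES ('Acme')")
dbs.connection_for("tenant-2").execute("INSERT INTO customers VALUES ('Globex')")

# The query carries no tenant filter; isolation comes from the connection.
rows = dbs.connection_for("tenant-1").execute(
    "SELECT name FROM customers"
).fetchall()
```

Note that the SELECT statement is identical for every tenant, which is what makes this style of isolation largely transparent to developers. The operational side (creating, migrating and backing up one database per tenant) is where the added effort shows up.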
Dealing with multi-tenant environments is inherently an operationally complex undertaking and requires a highly skilled team to effectively navigate the difficulties involved. There is rich tooling support for these sorts of complex deployments, and it has matured rapidly over the past few years to tackle exactly these sorts of problems. Tools such as continuous integration/deployment platforms (e.g., ElectricFlow, Jenkins, Travis CI) help automate deployment and migration tasks while minimizing deployment cycle time; infrastructure management tools (e.g., Kubernetes, DC/OS, Nomad) automate the management of your virtualization and/or container infrastructure; infrastructure automation tools (e.g., Chef, Puppet) manage your configuration as code; and configuration management systems (e.g., ZooKeeper, Consul) provide centralized services to retrieve dynamically changing configuration information across instances. Using these tools in combination with scalable infrastructure like AWS or Azure can drastically reduce the operational overhead involved in maintaining these sorts of complex systems and provide architectural models that were unachievable even just a few years ago.
Divide and Conquer
Another way to simplify the complexities of multi-tenant applications is to look for areas where common functionality and data can be broken out and shared among all the users. For example, maybe you have a section of your application that provides a set of master data, such as country names or units of measure, that is shared across clients and not specific to any single one. This data rarely changes and is not usually unique to a specific customer. If you break this out from the main application and create only a single instance that all customers integrate with, you have effectively reduced the footprint of your system. This method is especially effective if you are already building in small, deployable units such as with microservices or an SOA. Breaking your application along some of these logical boundaries can decrease the system’s footprint while retaining the functionality and data that need to be unique among customers.
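The master-data example can be sketched as one shared service that every tenant instance integrates with, while tenant-specific data stays inside each instance. The class names (`MasterDataService`, `TenantApp`) and the sample countries are hypothetical; in a microservices setup the shared object would typically be a separate deployed service rather than an in-process instance.

```python
class MasterDataService:
    """Hypothetical shared instance for reference data that is not tenant-specific."""

    def __init__(self) -> None:
        # Illustrative master data; rarely changes and belongs to no tenant.
        self._countries = {"US": "United States", "DE": "Germany"}

    def country_name(self, code: str) -> str:
        return self._countries[code]


class TenantApp:
    """Per-tenant instance: owns tenant data, borrows shared master data."""

    def __init__(self, tenant_id: str, master_data: MasterDataService) -> None:
        self.tenant_id = tenant_id
        self._master_data = master_data
        self._customers: list[tuple[str, str]] = []   # (name, country code)

    def add_customer(self, name: str, country_code: str) -> None:
        self._customers.append((name, country_code))

    def customer_report(self) -> list[str]:
        # Tenant data is local; country names come from the shared service.
        return [
            f"{name} ({self._master_data.country_name(code)})"
            for name, code in self._customers
        ]


shared = MasterDataService()            # created once for all tenants
tenant_1 = TenantApp("tenant-1", shared)
tenant_2 = TenantApp("tenant-2", shared)
tenant_1.add_customer("Acme", "US")
tenant_2.add_customer("Globex", "DE")
```

Only the customer lists are duplicated per tenant here; the country table exists once, which is the footprint reduction described above.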