ZooKeeper Usage 2: Observable<ZooKeeper>
Sebastian Good

This is a multi-part series:

1. ephemeral nodes, 2. observable<zookeeper>, 3. where is zookeeper?, 4. .net api and tasks, 5. c# iobservable

In the last post, we used ZooKeeper as a service registry. When services started, they registered with ZooKeeper at a pre-agreed place (/services/{dataset-name}). Clients could list the available data servers and decide which ones to connect to, or request that new ones be launched. Thanks to ephemeral nodes, servers can crash and their registry entries are automatically deleted. Today we’ll talk about three use cases for watching changes in ZooKeeper.

Service Locator Updates

Man Overboard!

If a client is self-routing to a server via our service registry, he will want to know as soon as possible that his server has shut down or crashed. How do we accomplish this?

Loss of ephemeral node is reported to client

It turns out to be straightforward. When the client first reads /service/X, he also requests a watch on it (Step 1 above). In the native Java interface, this means registering a callback. In Step 2, the server crashes, causing the ephemeral node /service/X to be deleted. In Step 3, the callback registered in Step 1 is called, and the client is notified that his server is gone.
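The three steps can be sketched with a toy in-memory registry. This is only a model of the behaviour, not the ZooKeeper client API: the class ToyRegistry, its methods, and the address "host-a:9000" are all made up for illustration. The event name mirrors the real NodeDeleted event type.

```javascript
// Toy in-memory stand-in for a ZooKeeper server; NOT the real client API.
class ToyRegistry {
  constructor() {
    this.nodes = new Map();    // path -> data
    this.watches = new Map();  // path -> callbacks waiting for the next change
  }
  create(path, data) {
    this.nodes.set(path, data);
  }
  // Read a node and, optionally, leave a one-shot watch on it.
  getData(path, watcher) {
    if (!this.nodes.has(path)) throw new Error("no node: " + path);
    if (watcher) {
      const list = this.watches.get(path) || [];
      list.push(watcher);
      this.watches.set(path, list);
    }
    return this.nodes.get(path);
  }
  // Models the cleanup after the owning session dies; fires each pending watch once.
  delete(path) {
    this.nodes.delete(path);
    const pending = this.watches.get(path) || [];
    this.watches.delete(path);
    for (const w of pending) w({ type: "NodeDeleted", path });
  }
}

const registry = new ToyRegistry();
registry.create("/service/X", "host-a:9000");  // server registers itself (hypothetical address)

const events = [];
// Step 1: the client reads the node and requests a watch.
const address = registry.getData("/service/X", (e) => events.push(e));
// Step 2: the server crashes, so its ephemeral node disappears.
registry.delete("/service/X");
// Step 3: the callback has run; the client knows its server is gone.
console.log(address, events);
```

In a real deployment the deletion in Step 2 is performed by ZooKeeper itself once the server's session ends; here it is called by hand.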

What should the client do? ZooKeeper doesn’t solve that problem, but it at least gives you the tools to decide.

Tell me everything

In the case of a service manager responsible for monitoring or launching the necessary services to satisfy client demand, we may want to be notified when any server launches or shuts down. Here we discover one of the reasons the ZooKeeper document registry is hierarchical. It turns out you can watch a document's list of children. A service manager would probably start by reading the /services document and its children, leaving a watch on the child list. Any time a child node was created or deleted, the manager would receive a callback notifying it that something had changed. (A change to a child's data is reported by a separate data watch on that child.) It could then take appropriate action.

Note that because this is a distributed system, the manager has to assume that its information may be stale by the time it arrives. ZooKeeper makes some specific guarantees about the order in which clients are notified of changes. In particular, a client sees the watch event before it sees the changed data. The event itself only says that something changed, and a standard watch fires only once, so the client needs to re-query the child nodes (setting a new watch as it does so) to figure out the state of the world.
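The read, get notified, re-read loop might look like the sketch below. Again this is a toy in-memory model, not the ZooKeeper API: ToyTree, its methods, and the child names dataset-a and dataset-b are invented for the example. The point is the shape of refresh, which re-arms the one-shot watch every time it re-reads the list.

```javascript
// Toy in-memory stand-in for ZooKeeper child watches; NOT the real client API.
class ToyTree {
  constructor() {
    this.children = new Map();      // parent path -> Set of child names
    this.childWatches = new Map();  // parent path -> pending one-shot callbacks
  }
  // List children and optionally leave a one-shot watch on the parent.
  getChildren(parent, watcher) {
    if (watcher) {
      const list = this.childWatches.get(parent) || [];
      list.push(watcher);
      this.childWatches.set(parent, list);
    }
    return [...(this.children.get(parent) || [])].sort();
  }
  addChild(parent, name) { this.mutate(parent, (set) => set.add(name)); }
  removeChild(parent, name) { this.mutate(parent, (set) => set.delete(name)); }
  mutate(parent, change) {
    const set = this.children.get(parent) || new Set();
    change(set);
    this.children.set(parent, set);
    const pending = this.childWatches.get(parent) || [];
    this.childWatches.delete(parent);  // watches fire once, then are gone
    for (const w of pending) w({ type: "NodeChildrenChanged", path: parent });
  }
}

const tree = new ToyTree();
const snapshots = [];

// The manager's loop: read the list, re-arm the watch, repeat on every event.
function refresh() {
  const servers = tree.getChildren("/services", refresh);
  snapshots.push(servers);
}

refresh();                                   // initial read, nothing registered yet
tree.addChild("/services", "dataset-a");     // a server starts
tree.addChild("/services", "dataset-b");     // another server starts
tree.removeChild("/services", "dataset-a");  // the first one crashes
console.log(snapshots);
```

Note that the event carries no list of what changed. The manager learns the new state only from the fresh getChildren call inside refresh.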

Through careful structuring of the document hierarchy, clients can observe critical changes in state, such as service availability changes.

Job Progress

If we can watch servers come and go, can we do something a little more fine-grained? Sure! In our case, our data servers served not only interactive clients, but also ran batch jobs which could take several minutes or hours. Clients wanted to observe progress and report it graphically to the user, up in a web browser.

A progress bar

Rather than poll the server (and interrupt its work), we can simply use ZooKeeper as a place to exchange this information. Periodically, the batch job on the server would update a document in Zookeeper with information like this:

{ status: 'running', complete: 0.25 }

The client could observe changes on this document and update HTML nearly directly in place, with the following pseudocode. (Ignoring error conditions and the like for the purpose of clarity.)

// Push each status update from ZooKeeper straight into the view model.
observeJobStatus("12345").subscribe(
  function(status) {
    mvvmModel.percentComplete(status.complete);
  });

Latency in this model is a few milliseconds, and the code is very clear to read.

Should the job node be ephemeral or permanent? This is a design question for your team, but it’s likely it should be ephemeral. ZooKeeper isn’t meant to be a long-term database, e.g. of all jobs ever run. It’s fundamentally meant to coordinate distributed processes and hold just enough state for that purpose. One nice thing about using this approach instead of a message bus (which, for instance, would just publish status updates) is that clients can connect before there is data, or after messages have already been sent, and ZooKeeper will hold all the state necessary for the notifications. (If you make the job nodes ephemeral, you must account for the possibility that a job which has already run is now absent from ZooKeeper, and show the user something sensible. Perhaps when jobs are complete their information is sent to a permanent database.)
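One way an observeJobStatus helper like the one in the pseudocode might be built is by chaining one-shot data watches, re-arming the watch on every read. The sketch below runs against a toy in-memory store, not the real ZooKeeper API. ToyStore, the /jobs/{id} path layout, and the unsubscribe handle are assumptions made for the example. It also simplifies one thing: the real client reports a missing node as an error rather than returning nothing.

```javascript
// Toy in-memory node store with one-shot data watches; NOT the real ZooKeeper API.
class ToyStore {
  constructor() {
    this.data = new Map();     // path -> JSON string
    this.watches = new Map();  // path -> pending one-shot callbacks
  }
  getData(path, watcher) {
    if (watcher) {
      const list = this.watches.get(path) || [];
      list.push(watcher);
      this.watches.set(path, list);
    }
    return this.data.get(path);  // undefined until the job writes something
  }
  setData(path, value) {
    this.data.set(path, value);
    const pending = this.watches.get(path) || [];
    this.watches.delete(path);
    for (const w of pending) w();
  }
}

const store = new ToyStore();

// Hypothetical helper: turns repeated one-shot watches into a stream.
function observeJobStatus(jobId) {
  const path = "/jobs/" + jobId;  // assumed layout; the post does not name it
  return {
    subscribe(onNext) {
      let active = true;
      const read = () => {
        if (!active) return;
        const raw = store.getData(path, read);  // re-arm the watch on each read
        if (raw !== undefined) onNext(JSON.parse(raw));
      };
      read();
      return { unsubscribe() { active = false; } };
    },
  };
}

const seen = [];
store.setData("/jobs/12345", JSON.stringify({ status: "running", complete: 0.25 }));
// This subscriber arrives late but still receives the current state first.
const sub = observeJobStatus("12345").subscribe((s) => seen.push(s.complete));
store.setData("/jobs/12345", JSON.stringify({ status: "running", complete: 0.5 }));
sub.unsubscribe();
store.setData("/jobs/12345", JSON.stringify({ status: "done", complete: 1 }));
console.log(seen);
```

The late subscriber is the interesting part: because the node holds the latest status, a client that connects after the first update still starts from the current state instead of waiting for the next message.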

Configuration Data

Configuration data is another great use case for ZooKeeper. As if configuration weren’t complicated enough, it’s even worse in a distributed system or service-oriented system. You can’t really check most information in config files into source control, or at least you shouldn’t. The acceptance database server doesn’t move from data-center-1 to data-center-2 just because you checked in a new feature. And no matter what branch a user is developing on locally, it’s now in data-center-2. Some configuration information really needs to come from a central authority.

In this case, arguably the central use case for ZooKeeper when it was first implemented, clients all know how to find ZooKeeper, and then load configuration data (e.g. as XML or JSON) from it when they start up, rather than from a file. If you want all your production services to start using a different SMTP server, then you just update the /configuration/production/smtp node on ZooKeeper.

Your service might read configuration once on startup, but it’s even better if it watches for changes in the relevant configuration nodes. New service requests could start using the new values. Or a naive approach in a cloud-like deployment model might just kill the whole service whenever its configuration changes, confident that a new instance will be launched to handle incoming requests.
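The "new requests use the new values" option could be sketched as follows, once more against a toy in-memory node rather than the real ZooKeeper API. The getData and setData functions, the JSON shape, and the smtp-1 and smtp-2 host names are assumptions for illustration. Only the node path comes from the text above.

```javascript
// Toy in-memory config node with one-shot watches; NOT the real ZooKeeper API.
const nodes = new Map();    // path -> JSON string
const watches = new Map();  // path -> pending one-shot callbacks

function getData(path, watcher) {
  if (watcher) watches.set(path, [...(watches.get(path) || []), watcher]);
  return nodes.get(path);
}
function setData(path, value) {
  nodes.set(path, value);
  const pending = watches.get(path) || [];
  watches.delete(path);
  for (const w of pending) w();
}

const SMTP_PATH = "/configuration/production/smtp";
setData(SMTP_PATH, JSON.stringify({ host: "smtp-1.example.com" }));  // hypothetical value

// The service keeps the latest config in memory and re-arms the watch on each load.
let smtp = null;
function loadSmtp() {
  smtp = JSON.parse(getData(SMTP_PATH, loadSmtp));
}
loadSmtp();

// Each request reads whatever config is current when it starts.
function handleRequest() {
  return "sending via " + smtp.host;
}

const before = handleRequest();
setData(SMTP_PATH, JSON.stringify({ host: "smtp-2.example.com" }));  // operator edits the node
const after = handleRequest();
console.log(before, "->", after);
```

Requests already in flight keep the value they started with, and only later requests pick up the change. That is usually what you want for something like an SMTP host.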

In the next post we’ll talk about how you find ZooKeeper to begin with. After all, who watches the watchers?