Apache ZooKeeper is a software project of the Apache Software Foundation. It provides services for large distributed systems, including an open source distributed configuration service, a naming registry and a synchronization service. Initially rolled out as a sub-project of Hadoop, ZooKeeper is now a top-level Apache project in its own right. At present, ZooKeeper is used by companies such as Yahoo!, Reddit, eBay and Rackspace, as well as by Solr, an open source search system.
These services are consumed by distributed applications. ZooKeeper's architecture supports high availability through redundant services: if the ZooKeeper server a client is talking to does not answer, the client can ask another one. ZooKeeper nodes store their data in a hierarchical name space, much like a file system or a tree data structure. Clients can read from and write to the nodes, which gives them a shared configuration service.
The aim of ZooKeeper is to distill the essence of complex services such as synchronization and configuration management into a centralized coordination service with an easy-to-use interface. Applications do not need to implement group management, consensus and presence protocols on their own; ZooKeeper implements them. Applications use it to store and mediate updates to important configuration information.
The simple service and interface of ZooKeeper offer the following benefits:
1) Speed – ZooKeeper is fast with read-dominated workloads and performs best when reads outnumber writes, ideally at a ratio of around 10:1.
2) Reliability – With no single point of failure, the ZooKeeper service is highly reliable. It is replicated over a set of hosts that are all aware of each other, and it remains available as long as a critical mass (a majority) of the servers is available.
3) Simplicity – To keep things simple for users, ZooKeeper stores data in a hierarchical name space, much like a file system or a tree data structure.
4) Organized – ZooKeeper maintains an ordered record of all transactions, which can be used to build higher-level abstractions.
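The reliability point can be made concrete with a little arithmetic. The sketch below assumes that the "critical mass" is a strict majority of the servers in the ensemble; the function names are illustrative only and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must contain at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def is_available(ensemble_size: int, servers_up: int) -> bool:
    """True if the servers that are still up form a majority (assumed rule)."""
    return servers_up >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print ensemble size, quorum size and tolerated failures side by side.
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption a 5-server ensemble survives two failures, while a 4-server ensemble tolerates only one, the same as 3 servers. That is why odd ensemble sizes are the usual choice.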
How Does ZooKeeper Work?
Distributed processes coordinate with each other through znodes, a shared hierarchical name space of data registers. Each znode is identified by a path whose elements are separated by a slash (“/”). Every znode except the root has a parent, and a znode that has children cannot be deleted. The name space looks like a normal file system; what makes ZooKeeper highly reliable is its redundant, replicated service.
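The znode rules described above can be illustrated with a toy in-memory model. This is only a sketch of the data model, not a ZooKeeper client: the class `ZNodeTree`, its method names and the exception types it raises are all hypothetical and chosen for illustration.

```python
class ZNodeTree:
    """Toy model of a znode name space (illustrative, not a real client)."""

    def __init__(self):
        # Map each absolute path to its data; the root always exists.
        self._data = {"/": b""}

    @staticmethod
    def _parent(path: str) -> str:
        # "/app/config" -> "/app", "/app" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path: str, data: bytes = b"") -> None:
        if not path.startswith("/") or path.endswith("/") or "//" in path:
            raise ValueError(f"invalid path: {path!r}")
        if path in self._data:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._data:
            raise KeyError(f"parent does not exist: {self._parent(path)}")
        self._data[path] = data

    def get(self, path: str) -> bytes:
        return self._data[path]

    def children(self, path: str) -> list:
        # Names of the direct children of `path`, sorted for stable output.
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self._data
            if p != "/" and self._parent(p) == path
        )

    def delete(self, path: str) -> None:
        if path == "/":
            raise ValueError("the root cannot be deleted")
        if path not in self._data:
            raise KeyError(f"no such node: {path}")
        if self.children(path):
            # Mirrors the rule that a znode with children cannot be deleted.
            raise RuntimeError(f"node has children: {path}")
        del self._data[path]


if __name__ == "__main__":
    tree = ZNodeTree()
    tree.create("/app")
    tree.create("/app/config", b"timeout=30")
    print(tree.children("/app"))
    try:
        tree.delete("/app")
    except RuntimeError as err:
        print("refused:", err)
```

In the usage at the bottom, deleting `/app` is refused until `/app/config` has been removed first, which is the parent/child constraint from the paragraph above.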
The service is replicated over a set of machines, each of which maintains an in-memory image of the data tree along with the transaction logs. A client connects to a single ZooKeeper server and keeps a TCP connection to it, through which it sends requests and receives responses. This architecture gives ZooKeeper high availability and throughput with low latency, but it also means that the size of the database ZooKeeper can manage is limited by memory.
If you require further information on ZooKeeper, please feel free to contact us.