Distributed Concurrency Control and Distributed Reliability Protocols
Distributed Concurrency Control
Concurrency control involves the synchronization of concurrent accesses to the distributed database, such that the integrity of the database is maintained. If the result on the database of concurrent execution of a set of transactions is equivalent to some serial (one after the other) execution of the same set of transactions, then the concurrently executing transactions are said to have maintained the consistency of the distributed database. This is known as the serializability condition. In terms of the ACID properties, concurrency control algorithms maintain the consistency and isolation properties of transactions.
Concurrency control of distributed transactions requires a distributed synchronization algorithm which has to ensure that concurrent transactions are not only serializable at each site where they execute, but that they are also globally serializable. This implies that the order in which they execute at each site have to be identical.
Distributed concurrency control algorithms can be grouped into two general classes as pessimistic, which synchronize the execution of user requests before the transaction starts, and optimistic, which execute the requests and then perform a validation check to ensure that the execution has not compromised the consistency of the database. Two fundamental primitives that can be used with both approaches are locking, which is based on the mutual exclusion of accesses to data items (similar to semaphores in operating systems), and timestamping, where the transactions are executed in some order. There are variations of these schemes as well as hybrid algorithms that attempt to combine the two basic mechanisms. Locking-based algorithms cause distributed deadlocks requiring protocols to handle them (see Distributed Deadlock Detection).
Distributed Reliability Protocols
Distributed database systems are potentially more reliable since there are multiple copies of each system component, which eliminates single points-of-failure, and data may be replicated to ensure that access can be provided in case of system failures. Distributed reliability protocols maintain the atomicity and durability properties by (a) ensuring that either all of the actions of a transaction that is executing at different sites complete successfully (called commit) or none of them complete successfully (called abort), and (b) ensuring that the modifications made to the database by committed transactions survive failures. The first requires atomic commit protocols (see Commit Protocols), while the second requires recovery protocols.
The most common atomic commit protocol is the two-phase commit (2PC) where the transaction is committed in two steps: first all the sites where the transaction has executed are polled to make sure they are ready to commit, and then they are instructed to actually commit the transaction (see Two-Phase Commit). The result is uniform commitment (or abort) at every site where the transaction executes.
Recovery protocols are the converse of atomic commitment protocols. They ensure that the system can recover to a consistent state following a failure.