A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (distributed DBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to the users. The term distributed database system (DDBS) is typically used to refer to the combination of the DDB and the distributed DBMS. Distributed DBMSs are similar to distributed file systems (see Distributed File Systems) in that both facilitate access to distributed data. However, there are important differences in structure and functionality, and these characterize a distributed database system:
- Distributed file systems simply allow users to access files that are located on machines other than their own. These files have no explicit structure (i.e., they are flat), and the relationships among data in different files (if there are any) are not managed by the system and are the users' responsibility. A DDB, on the other hand, is organized according to a schema that defines both the structure of the distributed data and the relationships among the data. The schema is defined according to some data model, which is usually relational or object-oriented (see Distributed Database Schemas).
- A distributed file system provides a simple interface that allows users to open, read/write (records or bytes), and close files. A distributed DBMS has the full functionality of a DBMS. It provides high-level, declarative query capability, transaction management (both concurrency control and recovery), and integrity enforcement. In this regard, distributed DBMSs differ from transaction processing systems as well, since the latter provide only some of these functions.
- A distributed DBMS provides transparent access to data, while in a distributed file system the user has to know (to some extent) the location of the data. A DDB may be partitioned (called fragmentation) and replicated in addition to being distributed across multiple sites. None of this is visible to the users. In this sense, distributed database technology extends the concept of data independence, which is a central notion of database management, to environments where data are distributed and replicated over a number of machines connected by a network. Thus, from a user's perspective, a DDB is logically a single database even if physically it is distributed.
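The transparency described in the last point can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions, not how any particular distributed DBMS is built: the names `Site`, `Catalog`, `register`, and `scan` are hypothetical, the "network" is a set of in-memory objects, and replica selection is reduced to picking the first copy. What it shows is that the caller names only the logical table, while a catalog resolves fragments and replicas behind the interface.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical network node that stores fragments of logical tables."""
    name: str
    fragments: dict = field(default_factory=dict)  # fragment name -> list of rows


@dataclass
class Catalog:
    """Hypothetical global catalog: logical table -> fragments -> replica sites."""
    placement: dict = field(default_factory=dict)

    def register(self, table, fragment, sites, rows):
        # Replication: store a copy of the fragment at every listed site.
        for site in sites:
            site.fragments[fragment] = [dict(r) for r in rows]
        self.placement.setdefault(table, {})[fragment] = list(sites)

    def scan(self, table, predicate=lambda row: True):
        # The caller names only the logical table; fragments and replicas
        # are resolved here and never exposed.
        result = []
        for fragment, sites in self.placement[table].items():
            replica = sites[0]  # assumption: any replica is equally good
            result.extend(r for r in replica.fragments[fragment] if predicate(r))
        return result


# One logical "employee" table, fragmented by region and partly replicated.
paris, tokyo = Site("paris"), Site("tokyo")
catalog = Catalog()
catalog.register("employee", "emp_eu", [paris],
                 [{"id": 1, "name": "Ana", "region": "EU"}])
catalog.register("employee", "emp_asia", [tokyo, paris],
                 [{"id": 2, "name": "Kenji", "region": "ASIA"}])

# The query mentions neither sites nor fragments.
rows = catalog.scan("employee", lambda r: r["id"] == 2)
all_ids = sorted(r["id"] for r in catalog.scan("employee"))
```

In a distributed file system, by contrast, the caller would have to know that the second record lives in a file on the Tokyo machine and open it there. The sketch also omits everything else a distributed DBMS provides, such as declarative queries, transactions, and integrity enforcement.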