What is a distributed database?

17.12.2021, by chrissikraus

A distributed database is not stored centrally but is distributed across different storage locations. It is administered by a distributed database management system.

When working with a distributed database, users usually do not notice that the data may be physically separated.

A database is usually pictured as a single central server that stores all information and through which the data is accessed and processed. In a distributed database, by contrast, the information is stored at several physical locations. Depending on requirements, each location holds either a replica of the entire data set or only individual fragments of it.

The physical locations can be different computers at a single company site or geographically separate sites. The setup can range from a small local installation with a few stations to a global distributed database with hundreds or thousands of locations.

The data is often processed locally at the location where the corresponding information is stored, independently of the other distributed segments. The overarching administration, on the other hand, is organized centrally.

Management of distributed databases

To manage the distributed data as a whole efficiently, a central system is helpful. This is referred to as a Distributed Database Management System (DDBMS for short). Such a system allows users, for example, to access all data on demand despite the distributed storage.

To do this, the DDBMS must map the logical relationships between the fragmented data. For the user, working with such a database feels much like working with a centrally stored one. The DDBMS gives access to all of the data, and no special knowledge of the physical organization of the distributed database is required.
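The mapping between logical data and physical location can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not how any particular DDBMS is implemented: the `Site` and `DDBMS` classes, the `catalog` dictionary and the site names are all hypothetical.

```python
class Site:
    """A hypothetical storage location that holds part of the data."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # primary key -> record


class DDBMS:
    """Toy coordinator: keeps a catalog of which site stores which key."""

    def __init__(self):
        self.catalog = {}  # primary key -> Site

    def insert(self, site, key, record):
        # Store the record at the chosen site and remember its location.
        site.rows[key] = record
        self.catalog[key] = site

    def get(self, key):
        # The caller passes only the key; the catalog resolves the site.
        return self.catalog[key].rows[key]


berlin = Site("berlin")
tokyo = Site("tokyo")
db = DDBMS()
db.insert(berlin, 1, {"customer": "Ada"})
db.insert(tokyo, 2, {"customer": "Kenji"})

# Both lookups look the same although the records live at different sites.
first = db.get(1)
second = db.get(2)
```

The caller never names a site when reading, which is the sense in which the physical organization stays hidden from the user.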

Another important task of the DDBMS is to synchronize the distributed data at regular intervals. If data is changed, deleted or newly created at one storage location, these updates must be applied at every other storage location that holds the same information. This is the only way to ensure that every site is working with up-to-date and correct data. A DDBMS can perform this synchronization automatically, which makes working with the data more convenient and reliable.
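A periodic synchronization run of this kind might look roughly like the following Python sketch. It assumes that every site holds a full copy, that each local change is logged with a global sequence number, and that the latest change to a key wins. These are simplifying assumptions for illustration; the `Site` class, the `_clock` counter and the `synchronize` function are hypothetical names.

```python
import itertools

_clock = itertools.count(1)  # hypothetical global sequence counter


class Site:
    """A hypothetical site that logs its local changes until the next sync."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.log = []  # pending changes: (seq, op, key, value)

    def write(self, key, value):
        self.data[key] = value
        self.log.append((next(_clock), "put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append((next(_clock), "del", key, None))


def synchronize(sites):
    """Replay all pending changes, in sequence order, at every site."""
    changes = sorted(change for site in sites for change in site.log)
    for site in sites:
        for _seq, op, key, value in changes:
            if op == "put":
                site.data[key] = value
            else:
                site.data.pop(key, None)
    for site in sites:
        site.log.clear()


a, b = Site("a"), Site("b")
a.write("x", 1)
b.write("x", 2)   # later change to the same key
a.write("y", 3)
b.delete("y")     # later deletion of a key created elsewhere
synchronize([a, b])
```

After the run, both sites hold the same contents. Real systems need more careful conflict handling than "latest change wins", so this only shows the basic idea of propagating changes, deletions and new records.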

Types of distributed data storage

As already mentioned, a distributed database can either replicate the entire data set or break it into fragments.

Replication

The database is stored in its entirety at several locations. This creates redundancy and thus protects against data loss if a site fails. Furthermore, the data can be deliberately distributed geographically in order to ensure high local availability.

This type of distributed database is useful if the entire database needs to be quickly accessible regardless of location. One disadvantage is that the complete database must be kept up to date at every location so that users do not receive outdated information.
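Both the benefit and the drawback of replication can be shown in a small Python sketch. It is only an illustration under assumptions: `Replica`, `ReplicatedDatabase`, the `online` flag and the `preferred` parameter are hypothetical, and writes here simply skip sites that are offline.

```python
class Replica:
    """A hypothetical site holding a full copy of the database."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True


class ReplicatedDatabase:
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Every reachable replica receives the write; offline ones miss it.
        for replica in self.replicas:
            if replica.online:
                replica.data[key] = value

    def read(self, key, preferred=None):
        # Try the preferred (e.g. nearest) replica first, then the others.
        ordered = sorted(self.replicas, key=lambda r: r is not preferred)
        for replica in ordered:
            if replica.online and key in replica.data:
                return replica.data[key]
        raise KeyError(key)


eu, us = Replica("eu"), Replica("us")
db = ReplicatedDatabase([eu, us])
db.write("price", 10)

eu.online = False                                     # one site fails
value_after_failure = db.read("price", preferred=eu)  # served by "us"

db.write("price", 12)                                 # "eu" misses this update
eu.online = True
stale_value = eu.data["price"]                        # old value until resynchronized
```

The read still succeeds while one site is down, which is the redundancy benefit. The returning site holds an outdated value until it is brought up to date, which is the maintenance cost described above.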

Fragmentation

The database is broken down into sections, which are distributed to different physical locations. This is advantageous, for example, if each location regularly needs only certain data. Only a much smaller part of the database then has to be stored locally, and each location can work more independently of the overall system.

In addition, each station has the data it needs for daily work quickly available on site. The DDBMS synchronizes the relevant data and, if necessary, can integrate all of the data.
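As a rough illustration, the following Python sketch splits a table row by row according to the site each row belongs to and then rebuilds the full table from the pieces. Splitting by rows is only one possible way to form sections and is an assumption here, as are the function names `fragment_by` and `reconstruct` and the sample data.

```python
from collections import defaultdict


def fragment_by(rows, column):
    """Split rows into fragments, one per distinct value of `column`."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[column]].append(row)
    return dict(fragments)


def reconstruct(fragments):
    """Rebuild the full table as the union of all fragments."""
    return [row for rows in fragments.values() for row in rows]


orders = [
    {"id": 1, "site": "hamburg", "item": "bolt"},
    {"id": 2, "site": "munich", "item": "nut"},
    {"id": 3, "site": "hamburg", "item": "gear"},
]

fragments = fragment_by(orders, "site")
hamburg_local = fragments["hamburg"]  # the part the Hamburg site stores locally
full_table = reconstruct(fragments)   # integration of all data on demand
```

Each site stores only its own fragment, while the union of all fragments still yields every original row.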

The two approaches can be used separately or as a hybrid solution.

Different types of distributed databases

Distributed databases can be organized homogeneously or heterogeneously.

Homogeneous distributed databases

With a homogeneous structure, every site is equipped identically: all locations use the same hardware, the same operating system and the same database applications. All stations therefore speak “the same language”, and data exchange and integration via the DDBMS are straightforward to implement.

Heterogeneous distributed databases

With a heterogeneous structure, the hardware, software and operating system can differ at each location, e.g. to suit local needs. The data models may also vary from place to place, so incompatible technologies can collide. In these cases, communication between the individual stations requires translation, e.g. via custom-built interfaces.
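Such a translating interface could be sketched in Python as a set of adapters that convert each site's local format into one common record shape. Everything here is hypothetical and chosen only for illustration: the two storage formats, the field names (`cust_no`, `full_name`, `ID`, `NAME`) and the adapter classes.

```python
import csv
import io


class DictStoreAdapter:
    """Wraps a hypothetical site that keeps records as dicts with local field names."""

    def __init__(self, records):
        self.records = records

    def fetch_all(self):
        # Rename the local fields to the common schema.
        return [{"id": r["cust_no"], "name": r["full_name"]} for r in self.records]


class CsvStoreAdapter:
    """Wraps a hypothetical site that exports its records as CSV text."""

    def __init__(self, csv_text):
        self.csv_text = csv_text

    def fetch_all(self):
        # Parse the CSV and convert each row to the common schema.
        reader = csv.DictReader(io.StringIO(self.csv_text))
        return [{"id": int(row["ID"]), "name": row["NAME"]} for row in reader]


def integrated_view(adapters):
    """Combine the translated records of all sites into one list."""
    return [record for adapter in adapters for record in adapter.fetch_all()]


site_a = DictStoreAdapter([{"cust_no": 1, "full_name": "Ada"}])
site_b = CsvStoreAdapter("ID,NAME\n2,Kenji\n")
customers = integrated_view([site_a, site_b])
```

Every additional local format needs its own adapter, which hints at why the effort grows with the number of different technologies involved.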

The DDBMS must be able to integrate the various technologies in such a way that administration of and insight into the overall system work smoothly. In practice, this approach can become so complex that the maintenance effort, and thus the cost-effectiveness of such a system, is called into question.