Data Modeling as a missing link between databases and applications Accelerate database queries with data modeling
Data modeling is crucial for the smooth interaction of applications and databases. After all, users hate long waits and want information to be available to them immediately.
Companies on the topic
The performance and variability of database queries can be significantly increased with data modelling.
(©monsitj – stock.adobe.com)
Data models are basic data management units in which data is stored and made available again. At the same time, they are the link between databases and applications. Structured storage allows application developers to implement targeted access to the data in their programs.
The way they are stored in databases typically depends on the requirements specified in the service level agreements (SLA) of the business application. The database takes over the data management and processing as well as feature provision for the apps.
The aim is to be able to process the read and write accesses of the applications with the lowest possible latency time of a few milliseconds, and thus to provide customers with the most comfortable user experience possible with minimized waiting times. For example, if an application has a distribution of 70 percent read and 30 percent write operations, the data model is ideally designed exactly for this.
The biggest time-wasters in data processing are JOIN operations. They may be necessary, for example, if several addresses of a customer are stored in the database. An optimized data modeling will therefore try to reduce these JOIN accesses.
However, in relational database systems (RDBMS), they are inevitable. As soon as an attribute (such as the said address) appears several times, references must be swapped out and merged again. All these JOINs take time and the application has to wait for them to complete.
JSON documents as desired objects
In NoSQL databases, on the other hand, data is not stored in rigid tables, but in JSON documents. In it, all the data is already there. JOIN operations are only necessary if the information is spread over several documents. Every JSON document is an object from the application’s point of view.
Object-based programming languages such as Python, Java, C++ or C# prefer database objects such as JSON when programming and executing applications. And the developers working with it do not think in relations, but in objects. Data modeling in the form of JSON documents therefore makes your work much easier. The data model of a NoSQL database can be defined and implemented from the application development.
Examples of the primary data structures in NoSQL databases: maps, lists, sets, and queues.
NoSQL databases that allow data modeling directly from the application go one step further-and in real time. It can be optimized at app runtime without the need for a database administrator. For example, a new attribute in a JSON document, such as a Twitter or LinkedIn nickname, is simply adopted without the data structure having to be laboriously changed.
Administrators define the frameworks
Even if database administrators are relieved of many routine tasks, they will not be superfluous. As a data engineer, your task is to ensure that the data structures and data types selected by the various applications are harmonized across the applications so that they are compatible and interoperable with one another. The considerations therefore go in the direction of a data model framework based on best practices for the respective requirements of the company.
A good database helps with so – called query and index planners and advisors. They provide information on specific database queries and help to uncover potential bottlenecks that cause latency during application development. Query and index advisors then offer corresponding suggestions on how these bottlenecks can be avoided or bypassed – always based on the selected data model or framework.
Help through full-text search
For queries in colloquial language, as they are typically used in search engines, full-text search (FTS) within the database is a valuable feature. Although it is not directly part of the data model, it helps to increase the performance of indices and thus also reduce latencies.
FTS is similar to fuzzy, usually very time-consuming LIKE queries in SQL, which do not specifically search for a term, but for term complexes, such as “book”, “books”, “books”, … For this, the FTS function forms a reverse index, which divides the search into parallel individual queries and thus accelerates enormously.
Steffen Schneider (Image: Couchbase)
As a positive side effect, the scope of terms can be extended more easily, in the example also “library”. A database platform that supports both application-controlled data models and AGVS is the prerequisite for fulfilling the user’s desire for as latency-free answers to their queries as possible.
* Steffen Schneider is Head of Solutions Engineering Central Europe at Couchbase