Last year (2020), Tellusant had a visitor with a computer science background. Among other things, we showed him a few of our databases. He marveled and said, “This is so cool, I don’t think I have ever seen computational databases!” We asked what he meant. “Databases are used for queries. You slice and dice the data to find what you are looking for. You guys add on complex calculations within the queries. I have never seen this so thoroughly implemented.”
In 2007, one of our principals got a similar reaction from a developer who had been commissioned to develop the C-GIDD database.* He failed and apologized: “I and my team have never worked with such extensive computations within the queries. This is not how databases are meant to be used.”
So, what is the difference between computational and query databases? The table below gives a high-level definition of each.
Let us start with the more common query databases. The science and mathematics behind them are well known, and queries are astonishingly fast given how complex they are to execute. Think of a Google Search query.
In its purest form, a query database selects a subset of its data based on a query from a user (human or not). We encounter such databases many times a day.
They may have some non-trivial computations performed after the query, but this is not the main purpose.
In its purest form, a computational database has no queries. It computes on the entire dataset. In practice, though, such databases are dominated by computations while queries still play an important role.
Why not separate the two logical tasks? Start with the query and then do the computation. This is often possible, but not always. For Tellusant’s needs, such separation is usually not possible.
This is because our models depend on calculations on both the queried and the non-queried data simultaneously. And they have to perform those calculations quickly, so users are not frustrated.
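A minimal sketch of why the two steps cannot always be separated. The figures are made up and the names (`incomes`, `query_only`, `share_and_rank`) are hypothetical, not Tellusant’s models. The user asks about a single region, yet its share of the total and its rank can only be computed from every row, including the rows the query did not select.

```python
# Hypothetical data: region -> total household income (made-up figures).
incomes = {"A": 500.0, "B": 300.0, "C": 150.0, "D": 50.0}


def query_only(data, selected):
    """Pure query: return just the selected rows, with no computation."""
    return {key: data[key] for key in selected}


def share_and_rank(data, selected):
    """Computation that needs the queried and non-queried rows together."""
    total = sum(data.values())  # uses every row, not only the selection
    ordered = sorted(data, key=data.get, reverse=True)  # rank over all rows
    return {
        key: {"share": data[key] / total, "rank": ordered.index(key) + 1}
        for key in selected
    }


subset = query_only(incomes, ["C"])
result = share_and_rank(incomes, ["C"])
print(subset)
print(result)
```

Running the computation on `subset` alone would give region C a share of 100 per cent and a rank of 1, which is why the query cannot simply come first.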
We estimate that more than 90 per cent of databases are query oriented, but the share of computational databases is growing. The growth comes from models becoming more complex and from the slice and dice (query) delivery of services being commoditized.
This growth is most pronounced for higher-order cognitive processes within global corporations, especially those that need to be updated frequently. Making such processes more efficient and accurate is exactly Tellusant’s mission.
The table below shows some computational and query database uses.
The examples show that computational databases are most important when data are dynamic and variables depend on each other. They are also advantageous when a user’s query is “free form” and requires significant post-query calculations. That is, one cannot rely on pre-calculated tables since the database does not know in advance what the user will ask for.
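The “free form” case can be illustrated with a small sketch, assuming a user may ask for the share of households above any income threshold. Because the threshold is only known when the query arrives, the answer cannot be looked up in a pre-calculated table and has to be computed on demand. The data and the function name `share_above` are hypothetical.

```python
# Hypothetical household incomes (made-up figures).
household_incomes = [12_000, 18_500, 25_000, 40_000, 61_000, 95_000, 150_000, 230_000]


def share_above(incomes, threshold):
    """Share of households with income strictly above an arbitrary threshold."""
    if not incomes:
        raise ValueError("incomes must not be empty")
    count = sum(1 for income in incomes if income > threshold)
    return count / len(incomes)


# The threshold is chosen by the user at query time, not known in advance.
print(share_above(household_incomes, 50_000))
print(share_above(household_incomes, 20_000))
```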
Our experience is that computational databases benefit from cloud hosting and that standard development tools handle the computations well. In our case, this is mainly Python code applied to an MS Azure PostgreSQL database, and we also perform the queries in Python.
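As a rough sketch of this pattern, the example below issues a query from Python and then runs the computation in Python as well. It is an assumption-laden illustration: an in-memory SQLite database from the standard library stands in for the hosted PostgreSQL database so that the example runs anywhere, and the `income` table, its columns, and its figures are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the hosted PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE income (region TEXT, year INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO income VALUES (?, ?, ?)",
    [("A", 2019, 100.0), ("A", 2020, 110.0), ("B", 2019, 200.0), ("B", 2020, 190.0)],
)

# Query step, issued from Python: fetch the rows the computation needs.
rows = conn.execute(
    "SELECT region, year, value FROM income ORDER BY region, year"
).fetchall()
conn.close()

# Computation step, also in Python: year-over-year growth per region.
by_region = {}
for region, year, value in rows:
    by_region.setdefault(region, {})[year] = value

growth = {
    region: values[2020] / values[2019] - 1.0
    for region, values in by_region.items()
}
print(growth)
```

With PostgreSQL, the same structure would apply through a database driver in place of `sqlite3`; only the connection setup and the parameter placeholder style would differ.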
As we build our SaaS platform, this approach will be crucial to us.
• • •
* Acquired in 2015 by The Economist Group.