DATA QUALITY
What does data quality mean?

We have already touched on this subject in our last blog entry. Here, we would like to discuss it in more detail. If you add up our collective IT experience, you will probably arrive at several hundred years, maybe even a whole millennium. During all this time, in all our discussions, one topic was always sure to come up: data – and not because data are a classic topic!

No matter where you go, no matter who you talk to, everyone keeps saying that the data are bad – regardless of whether the data were provided via a data transfer or served as a source for analyses or marketing activities. When you have listened to this complaint for so many years, two questions beg to be asked:

– What exactly constitutes good data anyway?
– How can I ensure that data does not turn “bad”?

What exactly constitutes good data?

Although it is tempting, let us not start a philosophical discussion about “good” and “bad.” Data are not a four-course meal that is supposed to taste good, nor are they a manufactured object whose production quality can be validated against ISO parameters. Data are data, nothing more and nothing less. And in my opinion, they are neither good nor bad.

Data are always embedded in a context or ecosystem, and both are crucial for the two criteria that follow. Data are information, and information needs to be interpreted. If the interpretation is conclusive and the data are useful for the intended purpose, then the data are good – and vice versa. And since these two conditions are joined by a logical “AND”, which only holds if both parts are true, data are bad more often than they are good.

But let us take a closer look at these individual requirements.

Conclusiveness

A single data record does not tell you anything; it only makes sense in a chain of additional information. A name and an address, for instance, may be interesting, but only if they are used in the context of an offer, an order, or a purchase order – and even more so when payments are posted against individual transactions. This is, of course, only a small example. Data only appear conclusive if this form of correlation is understood.

You always need a summarizing signal that tells you how to evaluate any given status. It has to be quickly apparent and understandable, because only then does the whole become conclusive and thus correct. This, however, is already the preliminary stage of our next step.

Interpretation

Conclusiveness refers to the processing of individual operations. This, too, is a form of analysis in which information is summarized and forwarded. Just think, by way of example, of the netted position lists per customer or the total annual revenue displayed directly in the customer mask.

The actual interpretation, however, happens in the statistics and reports. Here, large amounts of data are consolidated, grouped, and displayed depending on their meaning. The data are only good when the result of these processes is both sufficient and understandable.

Usefulness

It is not easy to keep the individual parameters apart, since everything is so closely related. Analyses on their own are helpful and necessary, but here they become an integral part of further processing.

A system like Odoo is a rather extensive data kraken that not only helps to transfer data between departments, to control the necessary subsequent steps and, ultimately, to make them transparent, but also to process them for sales and marketing.

In both cases, knowledge of the customer base is essential. Let us find out why.

Sales

Good customer, bad customer – how much attention do I give to whom? Who has priority? How much does the customer owe me? These are essential questions in sales. If a customer calls and I already have the relevant data at hand, I save on “postage”, as it were, and do not need to chase them up by phone.

Marketing

Here, things become more complicated, because marketing requires correspondingly larger amounts of data for a very specific purpose, such as newsletters, campaigns, or trade fairs. Consequently, both customer profiles and transactions are selected according to various criteria. This process a) necessitates complex queries and b) requires additional filtering at different points – what such a selection can look like is sketched below.
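A minimal sketch of such a selection via Odoo's ORM, run from an Odoo shell for instance; the standard res.partner fields are assumed, and the partner tag “Trade Fair” is purely hypothetical:

    # Hedged sketch of a marketing selection; field names follow the
    # standard res.partner model, the tag name is hypothetical.
    recipients = env['res.partner'].search([
        ('customer_rank', '>', 0),                # actual customers only
        ('email', '!=', False),                   # newsletters need an address
        ('country_id.code', '=', 'DE'),           # example criterion: country
        ('category_id.name', '=', 'Trade Fair'),  # hypothetical partner tag
    ])

A real campaign would add further manual filtering on top of such a query.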

In this case, the data can only be called good if large amounts of information could be processed and the campaign was successful.

Quality processes

Now we come to another question: how do we keep our data “clean”?

The better question here would be: what can go wrong? If you summarize the possibilities, you will come up with the following options:

– Insufficient or incomplete information
– Incorrect information
– Incorrectly processed transactions

Let us consider these points and what tools we can use to avoid them.

Insufficient or incomplete information

This problem often occurs with addresses, but it also appears at various other stages. The first countervailing measure that usually comes to mind is the introduction of mandatory fields. In a specific, narrowly delimited application, this may well be a good solution, since a mandatory field can make sense in such a small, well-defined space. In an integrated system, however, it is difficult to distinguish whether information is missing because someone forgot to enter it or because it was simply unknown at the time of input.

Going back to our “address” example: it is also possible that the inquiry was received via e-mail without a signature, leaving the address incomplete. Would mandatory fields have solved the issue? Of course, you can enter anything just to complete the address, but that would defeat the purpose of a mandatory field.

From a technical point of view, it is easier to check whether a field is empty and may still need to be filled than to check whether it has been filled AND filled correctly. It is always better not to enforce a procedure.

An alternative could be to integrate a check at the status change of a process and inform the user about it. In the case above, an address should be complete by the time an order confirmation is issued. At that point, you can check whether all fields have been filled and provide a corresponding reminder.
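What such a reminder could look like is sketched below – a minimal, non-blocking variant, assuming a custom module that extends sale.order and the standard res.partner address fields; the exact trigger point will differ from setup to setup:

    # Hedged sketch: leave a non-blocking reminder in the chatter when a
    # sales order is confirmed although its invoicing address is incomplete.
    from odoo import models, _

    class SaleOrder(models.Model):
        _inherit = 'sale.order'

        def action_confirm(self):
            res = super().action_confirm()
            for order in self:
                partner = order.partner_invoice_id
                missing = [f for f in ('street', 'zip', 'city', 'country_id')
                           if not partner[f]]
                if missing:
                    # Deliberately do not block the confirmation; just remind.
                    order.message_post(body=_(
                        "Reminder: the invoicing address of %s is incomplete "
                        "(missing: %s).") % (partner.display_name,
                                             ', '.join(missing)))
            return res

The design choice here is exactly the one argued for above: the user is informed, but the procedure is not enforced.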

So what does the suggested solution look like in Odoo?
There are no mandatory fields; instead, you can choose from the following options:

  1. A filter is set up for each plausibility check, including the desired criteria. A user group regularly runs the filter over the data records and edits the hits (see the sketch after this list).
  2. A software tool checks critical points and flags them with messages.
  3. Nothing replaces a human better than another human. In other words, all essential information is displayed centrally in one mask so that users themselves can quickly check the data for completeness. Take, as an example, the tax ID, which should be checked once a year. Odoo solves this by displaying the tax number directly below the invoicing address. The task here is therefore to check the tax number manually before confirming the invoice.
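Option 1 can be set up without any programming, but for the sake of illustration, here is a hedged sketch of how such a saved filter could be created via the ORM – for instance, for the yearly tax ID review; the standard res.partner fields are assumed:

    # Hedged sketch: store a plausibility check as a saved, shared filter
    # (ir.filters) so a user group can re-run it regularly from the list view.
    env['ir.filters'].create({
        'name': 'Customers without tax ID',
        'model_id': 'res.partner',
        'domain': "[('customer_rank', '>', 0), ('vat', '=', False)]",
        'user_id': False,  # not tied to one user, i.e. shared
    })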

IMPORTANT NOTE:

At this point, we also need to note that each additional field should be considered carefully. The more fields there are, the higher the probability of incomplete data entries.

Incorrect information

As already mentioned in the previous paragraph, it is much more difficult to identify data as “incorrect” from a technical perspective. Technology can only represent an abstraction of reality, so it is hard to evaluate whether a data entry is wrong or merely strange but explainable.

In this case, you can only work with plausibility checks. As described above, the best way to realize this in Odoo is probably a pre-configured filter that selects records according to specified criteria and displays them for manual assessment. You should also review regularly whether additional plausibility checks have become necessary in the meantime.

The best tool here is Odoo's grouping. For example, if you group by country or city, you will quickly notice when unexpected “other countries” or “other cities” have crept in.
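The same grouping can be reproduced programmatically; a minimal sketch using the ORM's read_group, again assuming the standard res.partner fields:

    # Hedged sketch: group customers by country to spot unexpected values;
    # unusual or empty groups stand out immediately in the counts.
    groups = env['res.partner'].read_group(
        domain=[('customer_rank', '>', 0)],
        fields=['country_id'],
        groupby=['country_id'],
    )
    for group in groups:
        print(group['country_id'], group['country_id_count'])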

Incorrectly processed transactions

Suppose a process has become more complicated, something has gone wrong somewhere, and you have already tried unsuccessfully to correct the problem. In that case, the probability is high that every following step will lead you further astray.

How can you identify this problem? Not easily, since most of the time the processes have already been completed and the errors only surface by chance. If it does happen frequently, you have an excellent training topic, of course – or you have to ask whether the process and the processing chain themselves might be the root of the evil. However, such cases are more the exception than the rule, which makes them even more challenging to detect.

You might say that things like this simply happen, and that is how it is. But it is advisable not to dismiss such issues entirely. Time changes processes and considerations; in other words, revisions should be performed regularly, if only to ensure that the systems are still being used correctly.

The best solution here is to consolidate data via reports or statistics according to different parameters. The result should show no deviations – or at least only deviations that are expected and easy to explain. If that is not the case, troubleshooting is worth your while; in most cases, you will find incorrect processes.
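As an illustration, here is a hedged sketch of such a cross-check, assuming the field names of Odoo 13 or later: it compares the ordered amount of each confirmed sales order against what was actually invoiced. A real reconciliation would also have to account for credit notes and currencies.

    # Hedged sketch: flag confirmed orders whose invoiced total deviates
    # from the ordered total although Odoo considers them fully invoiced.
    for order in env['sale.order'].search([('state', '=', 'sale'),
                                           ('invoice_status', '=', 'invoiced')]):
        invoiced = sum(order.invoice_ids
                       .filtered(lambda move: move.state == 'posted')
                       .mapped('amount_total'))
        if abs(order.amount_total - invoiced) > 0.01:
            print("Deviation on %s: ordered %s, invoiced %s"
                  % (order.name, order.amount_total, invoiced))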

Conclusion

The purpose of IT is to support people and processes. Look at developments in the automotive sector: cars are not yet able to drive autonomously, or at least only in delimited, controllable, and predictable scenarios, even though reports from car manufacturers try to suggest otherwise. In most cases, even after so many years of dedicated development, the only available tools are instruments that provide the driver with additional information or support.

In other words, those who intend to increase data quality will find their solutions not in programming but in the organization of processes. Unfortunately, Odoo cannot do anything to change this.

15 September, 2020