
On the Internet, expert information is repeated across many different sources. Common knowledge should be gathered in one place: repetition of the same information and erroneous or uncertain information should be eliminated, and unclear expressions should be clarified. Once a shared collection of the same facts existed, different actors could link to it from their own home pages. From that top level, one could also drill down to a deeper level and examine the information and goods provided by the various actors. Basic image material could also be shared. Each actor would only produce the part of the data and images where their content differs from the basic data. Different industries share common basic data and could use the same pieces of information. If the data were split into pieces, different combinations could be assembled from them.
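As a rough sketch of this idea, a shared base record could be combined with each actor's own differences roughly as follows. The record fields, keys, and actor names here are invented for illustration only, not a proposed standard:

```python
# A minimal sketch of "shared base data plus per-actor differences".
# All names and values below are illustrative assumptions.

BASE_FACTS = {
    "product/steel-bolt-m8": {
        "material": "stainless steel",
        "thread": "M8",
        "standard": "ISO 4017",
    },
}

# Each actor stores only the fields where it differs from the base record.
ACTOR_OVERRIDES = {
    "actor-a": {"product/steel-bolt-m8": {"price_eur": 0.45, "delivery_days": 2}},
    "actor-b": {"product/steel-bolt-m8": {"price_eur": 0.39, "coating": "zinc"}},
}

def combined_view(actor: str, key: str) -> dict:
    """Merge the shared base record with one actor's own differences."""
    record = dict(BASE_FACTS.get(key, {}))
    record.update(ACTOR_OVERRIDES.get(actor, {}).get(key, {}))
    return record

print(combined_view("actor-b", "product/steel-bolt-m8"))
```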
For example, Finnish-language medical and environmental information could be combined into a single data repository, supplemented with material from foreign sources translated into Finnish. In that context, the facts could also be checked. A fact may hold in some situations and cases but not in others. The data structure should be built with artificial intelligence programs and logical reasoning in mind: if a condition holds, then a conclusion does or does not follow, and the probability of that conclusion is, for example, 50%. Once the information was compressed in this way, it would be faster to retrieve, read, and understand. The information should be updated continuously, and then the update only needs to be made in one place. At the very least, public sector bodies could merge identical information. Similarly, private actors in the same industry could pool their common basic knowledge, although the need to maintain competition could prevent this, or the pooled repositories could become monopolies. Even so, it would be good for society not to waste resources on duplicating information work, but to spend the resources saved on something else.
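The rule-like structure hinted at above ("if this holds, then that holds, with some probability") could, for example, be sketched like this. The rules and probabilities below are invented examples, not real medical or environmental facts:

```python
# A minimal sketch of storing facts as condition -> conclusion rules with an
# attached probability. The rule contents are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class Rule:
    condition: str      # what must hold for the rule to apply
    conclusion: str     # what is then claimed to be true
    probability: float  # estimated probability that the conclusion holds

RULES = [
    Rule("patient has fever and cough", "patient has influenza", 0.50),
    Rule("lake phosphorus level is high", "algal bloom occurs in summer", 0.70),
]

def conclusions_for(observed_condition: str):
    """Return conclusions and probabilities of rules whose condition matches."""
    return [(r.conclusion, r.probability)
            for r in RULES if r.condition == observed_condition]

print(conclusions_for("patient has fever and cough"))
```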
Similarly, there could be only one directory of companies, actors, and bodies, used by both top-level and lower-level portals. However, competing providers would not accept this: everyone wants to produce their own version, and there are already several competing directory services. Even if they lost their income in this area, they could concentrate on something else. Exploiting and retrieving information requires a great deal of development of artificial intelligence applications, and this would be made easier by collecting a common repository of information for the artificial intelligence to use. Creating a common basic data repository would require the cooperation of several actors. The idea is to avoid doing the same work many times.
We would need a whole new Internet where every piece of information exists only once. Of course, there should still be multiple backups and copies to prevent data loss in the event of an accident. The problem with collecting and aggregating data is that not everyone agrees on the truthfulness of the data, and not all actors can be made to work together. There is a lot of uncertain and empirical information that is true in some cases and not in others. It takes time, money, and independent experts to verify, validate, and compile the information. At least on a small scale this idea could be implemented, for example with user-friendliness in mind, if we limit ourselves to the information we actually focus on. In problem solving and development, all that is needed is the related information: causes, problems, development needs, solutions and their risks, pros and cons. But even then the amount of knowledge swells, because problems are directly or indirectly related to everything else. There should be simplified, concise information for ordinary users, while more in-depth and broader knowledge should be available to experts.

Copyright issues can also be a problem. There is a wealth of copyrighted information whose copying requires authorization and copyright fees. There are also professional secrets and other confidential information that cannot be given to the general public for free. One must define what is copyrighted and who owns the copyright even when the data is combined and edited. If operators link to shared information, they should pay those who have collected and maintain it. The goal would be a well-summarized, simple set of basic information from a variety of sources, with the source information marked up and the copyright belonging to the compilers. Such a knowledge structure therefore requires collaboration as well as cost-sharing, revenue-sharing, and copyright agreements between the different parties.
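A compiled summary entry with its sources marked up and the compiler's copyright recorded might, for instance, look like the following sketch. The field names, sources, URLs, and licence wording are invented illustrations:

```python
# A minimal sketch of a compiled basic-information entry whose sources are
# marked up and whose compiler holds the copyright to the summary.
# All field names, sources, and URLs below are illustrative assumptions.

summary_entry = {
    "id": "fact/tbe-vaccine",
    "summary": "A TBE vaccine is recommended in high-risk areas.",
    "sources": [
        {"publisher": "public health institute", "url": "https://example.org/tbe"},
        {"publisher": "foreign journal (translated)", "url": "https://example.org/tbe-en"},
    ],
    "compiled_by": "shared-repository consortium",
    "copyright": "summary (c) compiler; in-depth content remains with the original sources",
}

def cite(entry: dict) -> str:
    """Produce a short citation line listing the sources behind a summary."""
    urls = ", ".join(s["url"] for s in entry["sources"])
    return f'{entry["summary"]} (sources: {urls})'

print(cite(summary_entry))
```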
Knowledge itself, understanding language expressions, and knowing how to structure and transform information into objects require a lot of research, time, and development, including the standardization of sentences, since the same thing can be expressed with many different phrases. Also, when the same sentence is placed in a different context, it may already convey different information. Phrases, like words, are thus context-bound and can have other meanings in different contexts. The information also needs to be evaluated for significance, importance, and weight: what is kept and what can be omitted. Free-form text in particular contains many filler words and expressions that make the reading more natural and pleasant. Reduced to bare facts, the content would be shorter, but it could also become catalog-like and blunt. The same information may be more meaningful to some readers than to others. Here people's different tastes and styles come into play: the kind of text they prefer to read and find clearest. An expert may understand even an incomplete text in their own field, but someone who knows nothing about it needs a thorough explanation, because they lack the basic knowledge and understanding of the field's concepts.
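The standardization of sentences mentioned above could, at its simplest, mean mapping different phrasings of the same claim to one structured fact object, roughly as in this sketch. The phrasings and fact fields are invented examples:

```python
# A minimal sketch of mapping variant phrasings of a claim to one canonical
# structured fact object. The sentences and fields are illustrative only.

CANONICAL_FACT = {"subject": "vitamin D", "relation": "recommended_daily_dose", "value_ug": 10}

PARAPHRASES = {
    "The recommended daily dose of vitamin D is 10 micrograms.": CANONICAL_FACT,
    "You should get 10 µg of vitamin D per day.": CANONICAL_FACT,
    "A daily vitamin D intake of 10 micrograms is recommended.": CANONICAL_FACT,
}

def to_fact(sentence: str):
    """Map a known phrasing to its canonical fact object (None if unknown)."""
    return PARAPHRASES.get(sentence)

print(to_fact("You should get 10 µg of vitamin D per day."))
```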
When only very brief summaries of established basic data are compiled into the top portal, they are no longer copyrighted by anyone other than the author of the summary. And when the abstracts link to the websites of different actors and contributors, the owners of the pages behind those links are responsible for their in-depth content, both in terms of copyright and trustworthiness. However, this is not the ideal model of knowledge and the Internet.
Artificial intelligence is developing, and good information resources would be needed for it. Creating appropriate, large-scale information resources requires a great deal of time, money, and expertise. Good data warehouses and well-developed artificial intelligence would enable faster and better services. But there, too, there is a risk of distorting decision-making, because information and programs are never perfect, and in the end someone has to evaluate the results of reasoning and automatic choices. It is important to know what information a result is based on, and whether it rests, at least in part, on biased or uncertain information.
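Knowing what information a result is based on could, at its simplest, mean carrying the supporting facts along with the conclusion, roughly as in the following sketch. The facts, certainty values, and threshold are invented illustrations:

```python
# A minimal sketch of recording which facts an automated conclusion is based
# on, so the result can later be checked for uncertain or biased inputs.
# The facts, certainty scores, and threshold below are illustrative assumptions.

facts = [
    {"claim": "air quality in the area is poor", "source": "sensor network", "certainty": 0.9},
    {"claim": "respiratory complaints have increased", "source": "survey", "certainty": 0.5},
]

conclusion = {
    "result": "recommend limiting outdoor exercise",
    "based_on": facts,  # provenance: every input fact travels with the result
}

# Flag the conclusion if any supporting fact falls below a certainty threshold.
uncertain = [f for f in conclusion["based_on"] if f["certainty"] < 0.7]
if uncertain:
    print("Result relies on uncertain information:", [f["claim"] for f in uncertain])
```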
Veikko J. Pyhtilä, 6/9/2017