Endeavor of Gathering Knowledge
Ever since people began to pass on their knowledge, especially by writing it down, they have faced the challenge of not only gathering it but also ordering it and making sense of it by connecting it to knowledge they had already acquired. Would you like to arrange and interconnect your collected knowledge in a safe place?
When the first libraries were created, the need to order books by theme became paramount. It was important to find books not only by the name of a specific author but also by a specific subject, such as “Mathematics”, or an even narrower one, such as “Algebra”.
Internet to the Rescue
At that time it was impossible to find all books or articles that mentioned a specific idea in their text. That had to wait for the age of computers and the internet. Now we are able to index large quantities of books and other content, and by now most of that content is searchable. Even ordinary people, not only scientists and researchers, can easily search for content.
But the internet and computers still help only marginally in the way they are used today. As of this writing (in 2020), more than 2.65 million new books will be published around the globe this year alone. And people around the world will publish more than 1.2 billion new blog posts during the year. The sheer mass of these bits and pieces of more or less reliable, serious, and useful information makes it daunting to find your way around and pick what helps you.
Internet Search Is Not Enough
To find anything useful, we need to get a grip on the quality of those pieces of information. We need to be able to hunt down connections between related pieces that help us form a comprehensive mental model of the world around us.
These four tools help us stay on top of the challenge:
- automatic categorization
- semantic linking and recommendation
- reliability and trustworthiness index
- historic relevancy checking
Let us briefly analyze how these tools can help us.
Making Sense of Information
Automatic Categorization
By analyzing a text, it is now fairly easy to extract concepts from it and build categories that evolve over time. The tedious task of classifying texts according to a specific (and often individual) scheme is done for you automatically.
Note: The following features are still under development.
Instead of just looking for a specific set of keywords, you can search for specific concepts like “Self-organization” in the context of “Agility”, “Agile Software Engineering”, “Teamwork”, and “Organizational Structure”.
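To make the difference from keyword search concrete, here is a minimal sketch of a concept search in Python. It assumes each document has already been tagged with concepts by some categorization step. The `Document` class, the `search_by_concept` function, and the sample documents are hypothetical illustrations, not part of the actual product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    """A text together with the concepts extracted from it (hypothetical model)."""
    title: str
    concepts: set[str] = field(default_factory=set)


def search_by_concept(docs: list[Document], concept: str, context: set[str]) -> list[Document]:
    """Return documents tagged with `concept` that share at least one
    context concept, ordered by how many context concepts they share."""
    scored = [
        (len(doc.concepts & context), doc)
        for doc in docs
        if concept in doc.concepts
    ]
    relevant = [(overlap, doc) for overlap, doc in scored if overlap > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant]


# Invented sample data for illustration only.
docs = [
    Document("Scaling agile teams", {"Self-organization", "Agility", "Teamwork"}),
    Document("Ant colonies", {"Self-organization", "Biology"}),
    Document("Sprint planning basics", {"Agility", "Teamwork"}),
]
results = search_by_concept(
    docs, "Self-organization", {"Agility", "Teamwork", "Organizational Structure"}
)
print([doc.title for doc in results])  # prints ['Scaling agile teams']
```

A plain keyword search for “Self-organization” would also return the article on ant colonies. The context concepts filter it out.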
Semantic Linking and Recommendation
Categorizing information is just the starting point that enables further processing and analysis. By linking concepts semantically, we are able not only to accrue new information but also to transform it into new knowledge by connecting two or more seemingly unrelated concepts. An example of this is the concept of “Organic Growth” in the context of “Corporate Expansion”: tracing it back to its root meaning in the realm of “Biology” can help you arrive at new conclusions.
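One simple way to picture semantic linking is as a graph of concepts in which related concepts are connected. The sketch below is only an illustration: it assumes the links already exist, and both the hand-written `graph` and the `find_link` function are hypothetical. It uses a breadth-first search to find the shortest chain of concepts between two realms.

```python
from collections import deque


def find_link(graph, start, goal):
    """Breadth-first search for the shortest chain of linked concepts.

    Returns the chain as a list of concept names, or None if the two
    concepts are not connected.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hand-written links for illustration; a real system would derive them from texts.
graph = {
    "Corporate Expansion": ["Organic Growth", "Acquisition"],
    "Organic Growth": ["Corporate Expansion", "Biology"],
    "Biology": ["Organic Growth"],
    "Acquisition": ["Corporate Expansion"],
}

chain = find_link(graph, "Corporate Expansion", "Biology")
print(chain)  # prints ['Corporate Expansion', 'Organic Growth', 'Biology']
```

Here “Organic Growth” is the bridging concept that connects the business realm with the biological one.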
Reliability and Trustworthiness Index
Information sources differ in trustworthiness and reliability. Think of a blog post written by yourself. If you have a background in software engineering and authentication systems and you write about these concepts in your post, you would have quite high trustworthiness as a subject matter expert.
If you write about what to do during a pandemic, however, you might not be a reliable source of information. You could possibly improve your reliability rating by citing well-known and accepted sources.
It is very difficult to measure reliability and trustworthiness, even for academic research papers. So our approach has to be seen as just an experimental approximation. We will try to improve it based on your feedback.
Several factors contribute to the Reliability and Trustworthiness Index (RTI):
The rating is higher when…
- …the book or article is cited in other books and articles. Not only does the number of citations matter, but also whether the citing books or articles are published by trustworthy and well-known publishers.
- …the book or article contains citations of other books and articles that have a high rating themselves.
- …the blog post cites or links to blog posts, articles or books that have a high rating themselves.
- …the blog post is linked to or cited by other blog posts that have a high rating.
- …the author of the blog post, article or book has a high reputation.
The rating is lower when…
- …purely advertising-oriented websites (which we usually filter out) reference the article, blog post or book.
- …the blog post or article does not contain any citations.
- …the author is little known or unknown.
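The factors listed above could be combined into a single score along the following lines. This is a rough sketch, not the actual formula behind the Reliability and Trustworthiness Index (RTI): the `Source` class, the weights, the penalty for advertising sites, and the damping rule are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Source:
    """Inputs for the rating of one book, article or blog post (hypothetical model)."""
    cited_by_ratings: list[float] = field(default_factory=list)  # ratings (0..1) of works citing this one
    cites_ratings: list[float] = field(default_factory=list)     # ratings (0..1) of works this one cites
    author_reputation: float = 0.0                               # 0 (unknown) .. 1 (renowned)
    ad_site_references: int = 0                                  # references from advertising sites


def damped_mean(ratings: list[float]) -> float:
    """Grows with both the number and the quality of ratings; 0.0 if there are none."""
    return sum(ratings) / (len(ratings) + 1)


def rti(source: Source) -> float:
    """Combine the factors into a score between 0 and 1 (assumed weights)."""
    score = (
        0.4 * damped_mean(source.cited_by_ratings)
        + 0.3 * damped_mean(source.cites_ratings)
        + 0.3 * source.author_reputation
    )
    score -= 0.05 * source.ad_site_references  # assumed penalty per advertising site
    return max(0.0, min(1.0, score))


# Invented example sources.
expert_post = Source(cited_by_ratings=[0.9, 0.8], cites_ratings=[0.7], author_reputation=0.9)
unknown_post = Source(author_reputation=0.1, ad_site_references=2)
print(rti(expert_post) > rti(unknown_post))  # prints True
```

Because each rating depends on the ratings of the sources around it, a real system would probably have to compute the scores iteratively over the whole citation graph rather than one source at a time.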
Historic Relevancy Checking
Information ages, but unfortunately not like a good wine that becomes better with age. On the contrary, it often becomes outdated or even irrelevant within a few years. Think about a manual for a specific software version. Because software update cycles keep getting shorter, this kind of information quickly loses its value.
We observe that this also holds for concepts that have been superseded by more recent ones. They may have historical value, since they let us follow the evolution of ideas. But human knowledge is constantly growing, with ideas building on one another. It is therefore useful to be able to judge the value of a publication or blog post on a timeline.
Unouit makes it possible to implement a system that automatically examines the relevancy of information as time goes by and visualizes the aging of a digital document. We do this by indicating to a user how useful a specific piece of information is in the present. Ultimately, that makes it possible for you to arrange and interconnect your collected knowledge.
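As an illustration of how the aging of a document might be modeled, the sketch below assumes that relevancy decays exponentially with a half-life that depends on the kind of document. This is an assumption made for the example, not a description of how Unouit actually computes relevancy. The `HALF_LIFE_YEARS` values and the `relevancy` function are hypothetical.

```python
from datetime import date

# Assumed half-lives in years: after this time a document is taken to be
# half as relevant as on the day it was published.
HALF_LIFE_YEARS = {
    "software_manual": 2.0,
    "blog_post": 4.0,
    "book": 10.0,
}


def relevancy(published: date, today: date, kind: str) -> float:
    """Return a value between 0 and 1 that halves with every half-life elapsed."""
    age_years = max(0.0, (today - published).days / 365.25)
    return 0.5 ** (age_years / HALF_LIFE_YEARS[kind])


today = date(2020, 1, 1)
manual_score = relevancy(date(2018, 1, 1), today, "software_manual")
book_score = relevancy(date(2018, 1, 1), today, "book")
print(manual_score < book_score)  # prints True
```

A two-year-old software manual has lost about half of its relevancy under these assumptions, while a book of the same age has lost far less. Such a score could drive a visual indicator next to each document.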