Human ingenuity and computer technology are creating more and more novel products and services in our digital age. Over the past decade, technological advances and computer science have ushered in an era in which we have drones that deliver pizza and robo-advisers that make investments for us.
These technological breakthroughs have, inevitably, given rise to many buzzwords. The jury is still out on one of the latest - Artificial Intelligence or AI.
Long before Artificial Intelligence became the hype that it is today, computer science had advanced to create a “language” for machines to enable them to understand the meaning of words and the relationships between concepts. Indeed, Semantic Technology - which is the driving force behind these advances - has been developing for more than a decade.
Some of the largest and best-known organizations in media, scientific publishing, healthcare and digital humanities - to name just a few - have been using Ontotext’s Semantic Technology offerings for years. Both large enterprises and SMEs have been using the underlying principles of the Semantic Web and Linked Data to manage their information assets and to seamlessly integrate and cost-efficiently analyze data from proprietary datasets and from external sources.
Semantic Technology develops languages to express rich, self-describing interrelations of data in a form that machines can understand and process. As a result, machines are not only able to index vast amounts of data, but are also capable of storing, managing and retrieving information based on meaning and logical relationships. Meaning and relationships in machine-readable format are what this technology is all about. Semtech products are based on universally recognized rules - a kind of “textbook” for this language - which describe the world in which concepts exist.
Let’s take a look at one very short sentence and see how it can be interpreted by people and machines.
“Jon Snow is in King’s Landing.”
To the human brain, it’s immediately obvious that Jon Snow is a person, a male, and he is at some place.
To a fan of the Game of Thrones fantasy books or TV series, who has knowledge about the named entities in the statement, that place would be familiar - the capital and largest city in the Seven Kingdoms in Westeros. The fan would also know who Jon Snow is and how he is connected to the other fictional characters in the books or the TV series.
To a machine, however, it can be confusing to process this statement. Fortunately, there’s Named Entity Recognition, which recognizes sequences of words in a text that are names of people, places or companies. NER is only the beginning of the Semantic Technology process to discover relationships between concepts.
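As a rough illustration of what Named Entity Recognition produces, here is a minimal sketch that spots names by looking them up in a small hand-written list (a "gazetteer"). This is an assumption made for brevity: production NER systems rely on statistical or neural models rather than a fixed list, and the names `GAZETTEER` and `recognize_entities` are hypothetical.

```python
# Toy gazetteer: a hand-written lookup of known names and their entity types.
# Assumption: real NER systems learn this from data instead of using a fixed list.
GAZETTEER = {
    "Jon Snow": "Person",
    "King's Landing": "Location",
}


def recognize_entities(text):
    """Return (name, type, start offset) for each gazetteer name found in text."""
    # Treat curly and straight apostrophes alike so "King’s" matches "King's".
    normalized = text.replace("\u2019", "'")
    found = []
    for name, entity_type in GAZETTEER.items():
        start = normalized.find(name)
        if start != -1:
            found.append((name, entity_type, start))
    # Report entities in the order they appear in the sentence.
    return sorted(found, key=lambda item: item[2])


entities = recognize_entities("Jon Snow is in King\u2019s Landing.")
for name, entity_type, start in entities:
    print(f"{name} -> {entity_type} (offset {start})")
```

The output only says which spans are names and what kind of thing each one is. It says nothing yet about how the person and the place relate to each other.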
One of the founding principles of Linked Open Data is the Uniform Resource Identifier (URI). It’s a single global identifier, a kind of unique ID, for every linked thing, so that we can distinguish between things or know that a thing in one dataset is the same as a thing in a different dataset because they share one and the same URI.
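To make the role of the URI concrete, the sketch below models two independent datasets as sets of (subject, predicate, object) statements and gathers everything said about one URI. The `example.org` URIs, the property names and the `describe` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical URI standing in for a globally unique identifier of Jon Snow.
JON = "http://example.org/resource/Jon_Snow"

# Two independent datasets, each a set of (subject, predicate, object) triples.
# The property URIs and values are illustrative, not taken from a real dataset.
books_dataset = {
    (JON, "http://example.org/prop/creator", "George R. R. Martin"),
}
tv_dataset = {
    (JON, "http://example.org/prop/gender", "male"),
    ("http://example.org/resource/Kings_Landing",
     "http://example.org/prop/type", "City"),
}


def describe(uri, *datasets):
    """Collect every (predicate, object) pair stated about uri in any dataset."""
    facts = set()
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            if subject == uri:
                facts.add((predicate, obj))
    return facts


merged = describe(JON, books_dataset, tv_dataset)
for predicate, obj in sorted(merged):
    print(predicate, "=", obj)
```

Because both datasets use the same URI for Jon Snow, their statements can be combined with a plain equality check and no name matching or guesswork.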
Semantic Technology uses universal standards to represent data on the Web and to link one concept to another. To do this, it uses ontologies - the ‘textbooks’ of rules that help machines to know how one person is connected to another, or to a place or organization.
There are Linked Open Data databases that are being continuously developed. One is DBpedia - an open, free and comprehensive reference database that extracts structured information from Wikipedia and makes it available on the Semantic Web.
Back to our ‘Jon Snow’ example: the entry in DBpedia contains a lot of information about who this character is. It also lists many properties pertaining to the fictional character in the A Song of Ice and Fire series of fantasy novels by the US author George R. R. Martin, and its television adaptation Game of Thrones. This list describes the relationships of Jon Snow, and we don’t mean just the ‘significant other’ relationship.
The DBpedia list describes - via properties and values - the relationship between Jon Snow and his creator (the author of the books), for example, or the actor who portrays him in the series. There are also properties and values stating Jon Snow’s gender, the titles he holds and his aliases. We also have the owl:sameAs property - one of the most important properties in Semantic Technology. It maps the same concept across two or more datasets, where each dataset can describe that concept with different features and relations to other concepts. The union of these datasets results in significantly richer data.
In Jon Snow’s case, we have many values for this property. For example, in Spanish his last name ‘Snow’ is Nieve (the Spanish word for snow), while in Italian, this fictional character is simply known as Jon Snow. The owl:sameAs mapping allows machines to know that regardless of the name in various languages, we’re still talking about the same fictional character.
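The effect of owl:sameAs can be sketched in a few lines: identifiers linked by the property are grouped together, and a description of any one of them then draws on the statements made about all of them. The `example.org` URIs, the short property names and the helper functions below are hypothetical. Only the owl:sameAs URI itself is the real one from the OWL vocabulary.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # real OWL property URI

# Hypothetical identifiers for the same character in two language datasets.
EN = "http://example.org/en/Jon_Snow"
ES = "http://example.org/es/Jon_Nieve"

triples = [
    (EN, "label", "Jon Snow"),
    (EN, "creator", "George R. R. Martin"),
    (ES, "label", "Jon Nieve"),
    (EN, OWL_SAME_AS, ES),
]


def same_as_finder(triples):
    """Group URIs linked by owl:sameAs and return a function mapping a URI to its group."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # shorten the chain as we walk it
            node = parent[node]
        return node

    for subject, predicate, obj in triples:
        if predicate == OWL_SAME_AS:
            parent[find(subject)] = find(obj)
    return find


def merged_description(uri, triples):
    """Collect properties stated about uri or about anything declared the same as it."""
    find = same_as_finder(triples)
    root = find(uri)
    return {
        (predicate, obj)
        for subject, predicate, obj in triples
        if predicate != OWL_SAME_AS and find(subject) == root
    }


print(sorted(merged_description(ES, triples)))
```

Asking about the Spanish identifier returns the English label and the creator as well, which is the "richer data" that the union of datasets provides.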
The owl:sameAs property, together with all the other properties of a concept, allows for a rich expression of relationships on the Semantic Web, where people are connected to many other people or places, and one named entity links to many others. Thanks to these relationships, machines are able to infer knowledge that is not explicitly stated in the source material.
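Inference of this kind can be illustrated with a very small rule engine. The sketch below assumes a single hand-written rule (if X is in, or located in, Y and Y is located in Z, then the same holds between X and Z) and applies it repeatedly until nothing new appears. The predicate names `isIn` and `locatedIn` and the `infer` function are hypothetical. Real semantic graph databases apply standardized rule sets rather than this toy loop.

```python
# Explicit facts, based on the example sentence and the setting described earlier.
facts = {
    ("Jon Snow", "isIn", "King's Landing"),
    ("King's Landing", "locatedIn", "Seven Kingdoms"),
    ("Seven Kingdoms", "locatedIn", "Westeros"),
}


def infer(facts):
    """Naive forward chaining: apply the containment rule until no new fact appears."""
    known = set(facts)
    while True:
        new = set()
        for a, p1, b in known:
            for c, p2, d in known:
                # Rule: (a p1 b) and (b locatedIn d) imply (a p1 d).
                if b == c and p2 == "locatedIn" and p1 in ("isIn", "locatedIn"):
                    new.add((a, p1, d))
        if new <= known:
            return known
        known |= new


inferred = infer(facts)
for fact in sorted(inferred - facts):
    print("inferred:", fact)
```

None of the input facts say that Jon Snow is in Westeros, yet that statement is among the results because it follows from the chain of relationships.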
If you still think that Semantic Technology is just another buzzword and something fictional like Jon Snow and the fantasy world he ‘lives’ in, you may have to think again. These technologies and the way Ontotext applies them in its semantic graph database GraphDB are used by organizations such as the BBC, Elsevier, Springer Nature and AstraZeneca to create smarter and more interlinked content.
The BBC made its content smarter with the use of the Ontotext Platform architecture. One of the biggest scientific publishers uses GraphDB to power its data management platform – Springer Nature SciGraph. AstraZeneca remodeled its knowledge repository with the iSIM (intelligent study information mining) system to quickly identify patterns and relationships, important in studies and drug therapies.
As you can see, Semantic Technology is not just fashionable hype or another buzzword. It is transforming content and knowledge management. Semantic Technology has been in use for years, and will be used for years to come. It is changing our digital world now, and it is the future.
© Sirma Group 2018