Exploring the intrinsic semantics of data

Pierre Lévy, in *The Semantic Sphere*, combines the human sciences with computer science and the cognitive sciences, beginning the theoretical and conceptual work toward the collaborative construction of a "global hypercortex coordinated by a metalanguage".

The Web has a serious problem to solve: the absurd number of unstructured documents, which prevents computers from understanding what they mean.

Remember this word: meaning.

One field of knowledge that remains underexplored, despite its importance, deals with the semantics embedded in documents themselves. The intrinsic potential of texts, created through natural language, combined with technologies such as Artificial Intelligence, thesauri, ontologies and data markup, promises to revolutionize the way we index, organize, and retrieve information.

Traditional information retrieval systems have been superseded by the increasing use of semantic retrieval techniques (PINHEIRO DE MELO GOMES; MARTINS DE ARAÚJO ALTOUNIAN, 2016).

(…) enable the understanding of concepts in their context and purpose. Some technologies have contributed to this reality, such as semantic data, used in the semantic web, natural language processing, and neural networks. The thesaurus also presents itself as a semantic component that impacts the performance of information retrieval systems. Thesauri are artificial language tools in a specific domain, formed by a system of interrelated concepts. 1

Semantic information retrieval in the context of external control.

Information retrieval systems

Information retrieval systems (IRS) generally use isolated words as descriptors and retrieval units.

Although they work well for many retrieval purposes, they fail mainly because they do not consider the implicit context carried by the query as a whole. This happens because they are not prepared to handle how the words or concepts in it are related.

In practice, the relationships between terms

Research on information and data

Research in this area ranges from the use of deep natural language structures, such as verb and noun phrases, in indexing and information retrieval, as in the work of Kuramoto, Moreiro and Souza (KURAMOTO, 1996 and 1999; MOREIRO et al., 2003; SOUZA, 2005), to the use of tools that represent semantic and conceptual relationships, such as thesauri and ontologies, which have long been used to expand the range of information retrieved and to assess contexts.
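To make the idea of expanding retrieval with a thesaurus more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the tiny thesaurus, the three documents, and the function names `keyword_search` and `expanded_search` are all hypothetical and invented for this example; none of them comes from the works cited above.

```python
import re

# Hypothetical toy thesaurus: each term maps to related terms.
# The entries are invented for illustration, not taken from a real vocabulary.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "price": {"cost"},
}

# Hypothetical documents, also invented for illustration.
DOCUMENTS = {
    "doc1": "Average car price in 2021",
    "doc2": "What an automobile will cost you this year",
    "doc3": "Recipes for a quick dinner",
}


def tokenize(text):
    """Lowercase the text and split it into a set of isolated words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query, documents):
    """Return ids of documents that contain every query word literally."""
    terms = tokenize(query)
    return [doc_id for doc_id, text in documents.items() if terms <= tokenize(text)]


def expanded_search(query, documents, thesaurus):
    """Return ids of documents matching each query word or one of its related terms."""
    hits = []
    for doc_id, text in documents.items():
        words = tokenize(text)
        # A query term counts as matched if the document contains the term
        # itself or any term the thesaurus relates to it.
        if all(words & ({term} | thesaurus.get(term, set())) for term in tokenize(query)):
            hits.append(doc_id)
    return hits


print(keyword_search("car price", DOCUMENTS))
print(expanded_search("car price", DOCUMENTS, THESAURUS))
```

The literal search only finds the document that repeats the query words, while the expanded search also reaches the document that talks about the same concepts with different words. A real system would rely on a curated thesaurus or ontology and on ranking, which this sketch leaves out.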

Library Science and Information Science use methodologies and techniques such as those described above and play a fundamental role in the theoretical and methodological definition of this field.

And what does SEO have to do with this?

How many times have you, fellow SEO, worked on a project where all the information is correctly tagged, structured, and related?

How many times has the project you were asked to optimize had a defined ontology, taxonomy or thesaurus?

If you answered anything other than zero, you're lucky.

In the vast majority of cases, we work on projects where the content lives in pages, spreadsheets, files (such as PDFs), and databases, without structure, relationships, or description. As a result, machines cannot recover any semantic meaning from it.
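For contrast, here is a minimal sketch of what a machine-readable description of a page can look like. It assumes the schema.org vocabulary serialized as JSON-LD, which is one common form of data markup but is not something this article prescribes. The function name `article_json_ld` and the list of topics are hypothetical choices made for the example.

```python
import json


def article_json_ld(headline, author_name, topics):
    """Build a schema.org Article description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        # "about" links the page to the concepts it discusses.
        "about": [{"@type": "Thing", "name": topic} for topic in topics],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# The topic list below is an assumption chosen for illustration.
markup = article_json_ld(
    "Exploring the intrinsic semantics of data",
    "Alexander Rodrigues Silva",
    ["Semantic information retrieval", "Thesauri", "Ontologies"],
)
print(markup)
```

The output is plain JSON that names the type of the content, who wrote it, and which concepts it is about, which is the kind of structure, relationship, and description most projects are missing.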

Remember the word I asked you to keep in mind, meaning? Well, this is what it was for. Semantics is about meaning, and the point is to understand that the Web needs indexing, organization, and information retrieval work based on semantic retrieval techniques.

Works like Pierre Lévy's, on systems that create ontologies with automated tools, need to be on our radar. These are the tools that will allow us to transform the volume of unstructured data into semantically relevant information.


1- PINHEIRO DE MELO GOMES, B.; MARTINS DE ARAÚJO ALTOUNIAN, M. Semantic information retrieval in the context of external control. Revista do TCU, September/December 2016. Available at: https://revista.tcu.gov.br/ojs/index.php/RTCU/article/view/1376/1522. Accessed on: April 3, 2021.


Hello, I'm Alexander Rodrigues Silva, SEO specialist and author of the book "Semantic SEO: Semantic Workflow". I've worked in the digital world for over two decades, focusing on website optimization since 2009. My choices have led me to delve into the intersection between user experience and content marketing strategies, always with a focus on increasing organic traffic in the long term. My research and specialization focus on Semantic SEO, where I investigate and apply semantics and connected data to website optimization. It's a fascinating field that allows me to combine my background in advertising with library science. In my second degree, in Library and Information Science, I seek to expand my knowledge in Indexing, Classification, and Categorization of Information, seeing an intrinsic connection and great application of these concepts to SEO work. I have been researching and connecting Library Science tools (such as Domain Analysis, Controlled Vocabulary, Taxonomies, and Ontologies) with new Artificial Intelligence (AI) tools and Large Language Models (LLMs), exploring everything from Knowledge Graphs to the role of autonomous agents. In my role as an SEO consultant, I seek to bring a new perspective to optimization, integrating a long-term vision, content engineering, and the possibilities offered by artificial intelligence. For me, SEO work is a strategy that needs to be aligned with your business objectives, but it requires a deep understanding of how search engines work and an ability to understand search results.
