The paradox of information gain and what SEO needs to learn from information theory.
When we think of "information," we usually imagine an accumulation of facts, data, or news. But what if the true nature of information is not accumulation, but transformation? What if gaining information is not like filling a bucket, but like changing the shape of the bucket itself?
For us SEOs, this distinction is the difference between the SEO of the past and Semantic SEO, which, from my point of view, is the SEO of the present and points to the future.
The concept of "information gain" is one of the most counterintuitive, yet simultaneously one of the most powerful in modern science. To begin to understand it, we need to abandon the idea that information is merely content . Instead, we will explore it as a process of substantial transformation, guided by three revolutionary perspectives.
Remember that we have an article here on the Semantic Blog that talks more specifically about what data, information, and knowledge are from our point of view: Difference between data, information, and knowledge .
First, Claude Shannon , the father of information theory, taught us that information is that which reduces uncertainty . Then, Norbert Wiener , the founder of cybernetics, defined it in a complementary way: just as the amount of information in a system is a measure of its degree of organization , entropy is a measure of its disorganization; one is simply the negative of the other. Finally, Information Science , through thinkers like B.C. Brookes, gave us the most radical definition: information is that which transforms a person's state of knowledge .
In this video, you'll get a more complete overview of the process I briefly described, focusing on Wiener's work due to its seminal importance in understanding information as something quantifiable and systematizable.
These three views are not merely academic; they perfectly describe many computerized systems and, in particular, one that interests us greatly: Google. A search algorithm is, in essence, a cybernetic system (Wiener) that seeks to reduce user uncertainty (Shannon) in order to, ideally, transform their state of knowledge (Brookes).
We need, once and for all, in our field, to put an end to this nonsense that academic knowledge is detached from our daily lives. I have written article after article that presents many theories and applications that prove to be more than useful for our work.
That being said, I ask that you prepare to discover that the information is not in the data, but in the surprise; not in what is said, but in the change it provokes in the listener. Prepare to discover what it truly means to "optimize" content.
The Zero Point of Information
Why predictability is useless and "Keyword Stuffing" is dead.
It may seem strange to think this way, but a message carries no information. Think about it: if I tell you something you already know with 100% certainty, what do you gain? Nothing. Your uncertainty hasn't been reduced; your knowledge hasn't changed.
Once we understand this, it seems simple, but reaching that point of understanding is not easy at all.
This is the basis of Claude Shannon's theory. For him, the informational content of a message is directly proportional to its improbability. The more surprising the message, the more information it contains. An event with a probability of 1 (absolute certainty) has, by definition, zero information.
The classic analogy is the toss of a coin. A fair coin, with a 50% chance of heads and a 50% chance of tails, presents the maximum uncertainty and therefore the greatest potential for information in a single toss. A coin with two heads, on the other hand, offers no information at all, as the result is always the same. The surprise has been eliminated.
Here, we find the first essential lesson for SEO.
For years, early SEO relied on certainty. "Keyword stuffing" was an attempt to create a 100% predictable message. If an SEO professional wanted a good ranking for "red shoes," they would create text that said: "Buy our red shoes. Our red shoes are the best red shoes on the market."
For a modern search algorithm, this message is like a two-sided coin: it offers no information. It doesn't reduce user uncertainty; it merely repeats what they've already typed. The information gain is zero.
We often seek certainty, but communication, learning, and science itself thrive in the unexpected. If you understand Google as a mechanism for repetition, you're missing the most fundamental part of its usefulness. See Google as a discovery engine, and new opportunities will appear. In this way, it becomes clear that it searches for pages that surprise the user with knowledge they don't possess.
This principle is not merely a philosophical curiosity; it is the mathematical foundation of many modern communication technologies, including search algorithms.
Shannon's entropy for a set of possible outcomes is given by the formula:
$$H(X) = - \sum_{i=1}^{n} P(x_i) \log_b P(x_i)$$
- H(X): the entropy of the system;
- n: the number of possible outcomes;
- P(x_i): the probability of outcome x_i;
- b: the base of the logarithm (base 2 gives entropy in bits).
Shannon formalized this in his entropy measure (H), which represents uncertainty. H = 0 if and only if one outcome has probability 1 and all the others have probability 0. In other words, H vanishes only when we are certain of the outcome; otherwise, H is positive.
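The measure is easy to compute directly. Here is a minimal Python sketch of the formula, applied to the coin example from earlier (the function name is mine):

```python
import math

def entropy(probs, base=2):
    """Shannon entropy: H(X) = -sum of p * log_b(p) over all outcomes.
    Outcomes with probability 0 contribute nothing to the sum."""
    return sum(-p * math.log(p, base) for p in probs if p > 0)

fair_coin = [0.5, 0.5]   # maximum uncertainty for a single toss
two_headed = [1.0, 0.0]  # the result is always the same

print(entropy(fair_coin))   # 1.0 bit: the most informative single toss
print(entropy(two_headed))  # 0.0 bits: no surprise, no information
```

The fair coin yields exactly 1 bit of information per toss; the two-headed coin yields zero, which is the mathematical statement of "a 100% predictable message carries no information."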
This seems complicated, and it really is if we don't dedicate time to understanding it, so let's use an analogy with our world of SEO:
Imagine an ideal search engine results page (SERP), from Google's point of view: it's one with high entropy, as it offers a variety of answers that cover the multiple facets of user uncertainty, maximizing the potential for information gain. A SERP where all 10 results say exactly the same thing is a low-informative SERP.
Let's pause for a moment and jump to the future, or rather, to our present: if we recall the concept of Query Fan-out , we see AI models doing exactly that. Multiple questions are formulated to generate information gain and reduce uncertainty. But let's go back to Shannon!
But this mathematical definition, focused on the message, is only the beginning of the story. It tells us nothing about the meaning of the message or what makes it useful. For that, we need to look at the structure of communication itself.
The surprising truth about language.
And what does Google BERT have to do with this?
Here's a fact that might shake your perception of language: according to Shannon, about 50% of common English is redundant. In Portuguese, I haven't found any studies on this subject, but this statement doesn't mean that half of what we say is useless; rather, it means that redundancy is determined by the statistical structure of the language itself.
In his theory of communication, "redundancy" is the part of a message that is not freely chosen, but follows the rules and patterns of language. For example, in the sentence "the cat climbed onto the roof," the grammatical structure and the likelihood of certain words following others (such as an article before a noun) fill in much of the content.
I want you to keep this information in mind: in a sentence, certain words follow others. This will explain something important about AI models, which, apparently, our SEO friends stubbornly refuse to understand.
But, going back to redundancy, it's important to clarify that, far from being a flaw, it's an important and even brilliant characteristic. It's what allows us to understand a conversation in a noisy environment (even when we don't hear all the words in a sentence), mentally correct typos, and even complete words or phrases that are cut off, as in games where words are mixed up or even omitted. Even so, we manage to understand the meaning .
Redundancy is the language's defense mechanism against noise and error , ensuring that the message reaches its destination.
This “redundancy” is exactly what Natural Language Processing (NLP) algorithms, such as Google’s BERT (Bidirectional Encoder Representations from Transformers), exploit. BERT doesn’t “read” your content like a human. It analyzes statistical patterns. It predicts missing words based on the context provided by the surrounding redundant words.
This brings us back to what I asked you to keep in mind: BERT, being a Transformer like GPT, works in a very similar way: it predicts, based on statistics, what the next words in the context will be and generates a sentence.
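The statistical idea behind that prediction can be sketched without a neural network at all. The toy below is not BERT (which uses transformer networks over the full bidirectional context), but it shows the same principle on a bigram model: the language's own structure tells us which words tend to follow which. The corpus and function names are illustrative:

```python
from collections import Counter, defaultdict

# A tiny illustrative "corpus"; real models train on billions of words.
corpus = "the cat climbed onto the roof the cat sat on the mat".split()

# Count which word follows which: the statistical structure of the language.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev_word):
    """Fill a gap with the statistically most likely continuation,
    which is exactly what linguistic redundancy makes possible."""
    following = bigrams[prev_word]
    return following.most_common(1)[0][0] if following else None

print(predict("the"))  # "cat": the most frequent word after "the"
```

Even this crude model "corrects" a missing word by leaning on redundancy; transformer models do the same thing with a vastly richer notion of context.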
Think about it: do you really think it's possible to influence a system that generates this kind of response by creating FAQs? By structuring lists? Or by any other strategy created out of sheer desperation stemming from a lack of understanding of how the models work?
When we write, half of what we write is determined by the structure of the language and the other half is freely chosen. For semantic SEO, this is a turning point. Google uses the redundant part (the structure) to understand the syntax and the "free" part (your choice of words, your entities ) to understand the meaning. That's why, in the Semantic Workflow, we recommend using expert writers.
In a project related to medicine (our Domain of Knowledge), having doctors or residents involved makes all the difference. The specialist's mental and linguistic framework already contains all the entities and concepts, with their definitions, as well as all the relationships between these concepts. When this person writes, all this knowledge is presented in the content in a very natural way, expressing the structure and its relationships of meaning.
Now you've gained a little information; you know that redundancy shows us how the structure of language helps us receive the message, but this still doesn't explain what happens in our minds when we receive it. What, in fact, constitutes the "gain" of information?
Information is what transforms what you know.
Up to now, we have focused on information as a property of the message. But Information Science (IS) invites us to take a step further and focus on the effect that the message has on the receiver, although it does not use these terms, which I imported from Communication Theory.
In this view, information is not an object to be transferred, but a force that promotes cognitive change. This is the perspective that connects to Semantic SEO and to Google's Helpful Content Update.
Information scientist BC Brookes summarized this idea in a "fundamental equation":
$$K(S) + \Delta I = K(S + \Delta S)$$
Let's translate: a knowledge structure K(S) is transformed by an information increment ΔI, resulting in a new knowledge structure K(S + ΔS). Information is not simply "added" to a pile of facts; it reorganizes, restructures, and sometimes even demolishes what we knew before. Now I hope what you read has the same impact it had on me when I realized this:
Traditional SEO focused on K(S). It optimized for what the user already knew (the keyword they typed).
Semantic SEO focuses on the Δ transition to achieve K(S + ΔS).
Our job isn't to optimize a page about "The paradox of information gain and SEO." Our job is to create a page that takes the user's K(S) (their basic notion of what "information" means) and transforms it into K(S + ΔS), their new understanding of how Shannon, Brookes, and subjectivity impact SEO.
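Brookes's equation is conceptual, not computational, but a playful toy model can make the "restructuring" point concrete. Below, knowledge is a set of beliefs and an information increment can demolish an old belief rather than just pile on top of it; the `"not "` prefix rule and all the example statements are my own illustrative assumptions:

```python
def absorb(knowledge: frozenset, increment: frozenset) -> frozenset:
    """Toy model of K(S) + delta-I = K(S + delta-S): the increment is not
    merely appended; a statement prefixed with 'not ' also removes its
    positive counterpart, i.e. it restructures the knowledge state."""
    updated = set(knowledge)
    for fact in increment:
        if fact.startswith("not "):
            updated.discard(fact[4:])  # demolish the contradicted belief
        updated.add(fact)
    return frozenset(updated)

k_s = frozenset({"keywords drive ranking"})                      # K(S)
delta_i = frozenset({"not keywords drive ranking",
                     "entities drive ranking"})                  # delta-I
k_s_delta = absorb(k_s, delta_i)                                 # K(S + delta-S)
```

After `absorb`, the old belief is gone and a new structure is in place: the knowledge state was reorganized, not just enlarged, which is exactly the difference between accumulation and transformation.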
This perspective connects directly to other important theories. Here, Shannon's concept of "uncertainty" is radically reimagined. It is no longer a mathematical uncertainty in the transmission of a signal, but a cognitive gap, a "state of uncertainty" to be resolved.
And it is for this reason (grounded in science and academia) that in semantic SEO projects each optimized article, page, or piece of content brings in hundreds upon hundreds of different searches. Because we generate a new knowledge structure in each piece of content, we provide algorithms with information relevant to many types of searches. We maximize information gain, we reduce uncertainty, and the algorithm loves that.
NJ Belkin's "Anomalous State of Knowledge" (ASK) theory perfectly describes why someone resorts to a search engine. The user searches because they perceive an anomaly in their mental map of the world. Information, therefore, is the solution to a cognitive problem.
There are other valid theories for understanding why we create and use search tools. In this article you will find a very interesting one: Kuhlthau's Information Search Process.
This means that gaining information is a deeply personal event. The same document can be transformative for one person and irrelevant for another. The focus shifts from what is written on paper to the change that occurs in the reader's mind. And, in the article I mentioned above, you will find a proposal for including users' feelings in your content strategy!
The concept of information, from the perspective of information science, must satisfy a dual requirement: on the one hand, the information must be the result of a transformation of the generator's knowledge structures… and, on the other hand, it must be something that, when perceived, affects and transforms the receiver's state of knowledge, with a profound impact on how they felt when they realized they needed information they didn't have.
The Relativity of Information
Keywords are brilliant and useless at the same time.
If information is what transforms us, then its value is completely relative and contextual. And this is where SEO based purely on keywords fails irrecoverably.
A striking example of this is the "Case of Mark Twain's Painting," described by researcher Peter Ingwersen.
This imaginative exercise is a classic example from the field of Information Science, used to illustrate and explain the process of searching for and retrieving information. Although it is not a real historical event involving the author, the story is used as an allegory to demonstrate the challenges and dynamics of human behavior when searching for data in information systems. Let's look at it:
Twain describes an oil painting of the last meeting between Generals Lee and Jackson. He observes that, without a caption, the painting means nothing. The same image (the raw data) could be interpreted in countless ways, some even contradictory:
- First meeting between Lee and Jackson
- Last meeting between Lee and Jackson
- Jackson asking Lee for a match
- Jackson reporting a major victory
- Jackson apologizing for a major defeat
Each of these "captions" generates completely different information in the viewer's mind.
In semantic SEO, your content (article, video, image) is the painting. The "captions" are the entities you use to provide context. If your article is only about the keyword "Jackson," Google has no way of knowing if the user is searching for the entity "Michael Jackson" or "Andrew Jackson." The keyword, by itself, is ambiguous and doesn't provide any additional information. The information obtained depends entirely on the "pre-understanding" and context of the viewer.
The case of Mark Twain exposes the basic limitation of the purely mathematical theory of information. The reduced uncertainty lies not in the "painting" as a sign, but in the mind of the observer.
How does Google solve this? By mapping entities in its knowledge graph . The job of semantic SEO is to provide the appropriate captions for our "painting" (our publication).
We can do this in several ways:
- By using structured data and tagging our content with schema.org, we will be explicitly "captioning" our article for the search algorithm. We are saying: "This article is not about just any Jackson; it's about Andrew Jackson [Entity: Person], the seventh president of the USA [Entity: Title]."
- Creating a knowledge graph with systems like Wordlift, connecting words that represent entities in that graph and exposing it to algorithms.
- Structuring the content strategy and the entire project based on a thorough analysis of the Knowledge Domain, and presenting this structure in the form of menus, categories, and guidelines for content creation.
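The first of those options, the explicit "caption," can be made concrete. A minimal sketch of schema.org markup for the Andrew Jackson example, built and serialized as JSON-LD in Python (the property values are illustrative, not taken from a real page):

```python
import json

# An illustrative JSON-LD "caption" that disambiguates which Jackson
# the page is about, using standard schema.org Person properties.
person_markup = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Andrew Jackson",
    "jobTitle": "7th President of the United States",
    "sameAs": "https://en.wikipedia.org/wiki/Andrew_Jackson",
}

json_ld = json.dumps(person_markup, indent=2)
# This is the snippet that would go into the page's HTML head.
print(f'<script type="application/ld+json">\n{json_ld}\n</script>')
```

The `sameAs` link to a well-known reference page is what anchors the name to one specific entity in the knowledge graph, exactly the role of the caption under the painting.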
To learn how to do this, I recommend reading my book: Semantic SEO: Semantic Workflow.
This idea connects to the concept of the "three worlds" by philosopher Karl Popper, which Brookes applied to information science.
Our reality consists of… three interconnected and somehow interdependent worlds, which in part interpenetrate each other. These three worlds are: the Physical World, World 1, of bodies and physical states, phenomena and forces; the Psychic World, World 2, of emotions and unconscious psychic processes; and World 3, of Intellectual Products.
Karl Popper
- World 1: the physical world.
- World 2: the subjective world of our mental states (where the Need for Information arises and the Gain of Information takes place).
- World 3: the world of objective and recorded knowledge (books, art, science… and your website).
The acquisition of information, the transformation, is an entirely subjective event that occurs in each viewer's World 2. Our website (World 3) and the structured data (World 3) are the tools we use to influence the user's World 2.
Ingwersen, a key figure in the study of the intersection between human cognition and information retrieval, uses the "Case of Mark Twain's Painting" to highlight several concepts that are fundamental to understanding modern search:
The representation of knowledge: this is the most important point for us. The case illustrates how knowledge is represented (or “modeled”) in an information system and how this representation directly influences the user's ability to find it. Information about the painting (the main entity) can be cataloged in various ways: by the author's name (another entity), by the date of the work (an attribute), or by the person portrayed (a third entity). An efficient search algorithm and a competent semantic SEO strategy need to consider, connect, and disambiguate all these different representations. This is precisely what structured data is for: it provides the unambiguous “legend” that connects the points in the Knowledge Graph.
The dynamic nature of the need for information: this case clearly demonstrates that the search for information is rarely a linear process. It is not a matter of a user knowing exactly what they want, nor of a system simply delivering. On the contrary, it is a substantial cycle of trial, error, and learning. The user's understanding, and therefore their search intent, which is anchored in the need for information, evolves and transforms with each new interaction with the search algorithm and the results it presents.
"Cognitive Interaction" or the human factor: Ingwersen emphasizes that the success of a search is not solely due to the system's technology. The determining factor is how the user's brain (the subjective "World 2") interacts with the presented information. Intuition, the ability to interpret ambiguous contexts, and the skill to make unexpected connections are crucial to the process. The search engine is not dealing with a static query; it is dealing with a functioning mind.
The role of the system as a facilitator of discovery : if the search is a discovery, the ideal search algorithm should act as a facilitator. The system should be designed not only to "answer" but also to actively assist the user in the cognitive process. It does this by offering relevant suggestions (such as "People also ask" or related searches), organizing results in a useful way (grouping topics and entities), and allowing flexible queries that adapt to the constantly evolving need for information.
The implication for our digital age is therefore profound.
This subjectivity is the reason why "relevance" is such a complex problem for search algorithms and artificial intelligence. Information gain is not an inherent property of a document, but something created in the dynamic interaction between a text and a specific user at a specific moment.
A good, legible label is often worth, for informational purposes, a ton of meaningful attitude and expression within a historical context.
Panofsky, E. (1955). Meaning in the Visual Arts. Doubleday
Panofsky's phrase connects wittily to Twain's painting: clear labeling is very valuable for those who need the information.
For SEO purposes, we can paraphrase: "Good structured data and clear context are worth a ton of keywords to a search algorithm."
Navigating the ocean of uncertainty
SEO as transformation
Phew, I hope you've stayed with me on this journey. Yes, working through so many difficult concepts was, for me, something of an epic. Writing articles that require this much research would have been impossible for me years ago, but the agent+semantic approach makes it feasible.

This research led me from a rigorous, mathematical definition of information as surprise, courtesy of Claude Shannon, to a profoundly human and cognitive vision, in which information is a transformative force. I hope it has brought you a great informational boost, a real gain.
So far we have seen that real information does not arise from certainty, but from the reduction of uncertainty. And this connects to a fact that triggers all of this: the perception that we are lacking something, which generates the need for information.
We discovered that the redundancy in our language, far from being a flaw, is what makes it robust and allows Google to understand it.
And most importantly, we understand that true informational gain is not about accumulating data, but about allowing our knowledge to be actively restructured.
SEO, therefore, is not the practice of having content; it is the practice of designing what the content makes happen in the user. Information is the change, the reorganization, the mental "click" that alters our worldview.
This leaves us with a final thought:
If the true measure of information is the change it brings about, how can we design our SEO strategies and our own websites to be more open to transformation?
The answer is to stop focusing solely on K(S) (what the user typed) and start obsessively focusing on creating Δ (surprising, useful, and contextualized content) that leads them to K(S + ΔS) (the transformed state of knowledge).
And for that, semantic SEO is unbeatable!