New proposal for Web Content Management

This article aims to propose a new methodology for planning and managing web content optimized for search engines, as an alternative to the search intent-focused model created by Google. I will critique the search intent model through a brief historical analysis of the documentation created by the company, based on the cognitivist paradigm of Information Science.

I will describe Kuhlthau's information search process, its influences, and possible implications for content, based on the book "Manual de Estudos de Usuário da Informação" by Murilo Bastos da Cunha, Sueli Angelica do Amaral, and Edmundo Brandão Dantas. Finally, I will propose an alternative methodology to the search intent-focused model.

Critique of the search intent model

Currently, the model that seeks to understand why people search online, based on the intent behind each user action, is a standard in content creation and SEO.

This model, despite its undeniable contribution, has a problem that has not been explored by professionals in our field: the use of only one dimension in understanding the information search process (intention) not only limits our understanding but also ignores other aspects of the subjectivity of those seeking information, especially on the web.

But first, let's look to the past and understand how we got here.

Historical analysis

Google began using search intent to try to predict what each search meant, as we can see in the Search Quality Evaluator Guidelines documentation.

According to the Search Engine Journal website, there is a science behind using intent in information retrieval on the web. The article "Utilizing Search Intent in Topic Ontology-based User Profile for Web Mining" by Xujuan Zhou, Sheng-Tang Wu, Yuefeng Li, Yue Xu, Raymond Y. K. Lau, and Peter D. Bruza investigates the effectiveness of information retrieval processes. It demonstrates that there is a problem with using generic user profiles in web mining systems, proposing the use of search intent to enrich these profiles.

Reading the research abstract helps us to understand it better:

It is known that taking web user profiles into account can increase the effectiveness of web mining systems. However, due to the dynamic and complex nature of web users, automatically acquiring valuable user profiles is very challenging. Ontology-based user profiles can provide more accurate user information. This research emphasizes the acquisition of information about search intents. This article presents a new approach to developing user profiles for web search. The model considers user search intents through the Pattern-Taxonomy Model (PTM) process. Initial experiments show that the user profile based on search intent is more valuable than the generic PTM user profile. Developing a user profile that contains user search intents is essential for effective web search and retrieval.

In the document, we see that the research segments search intent into two objectives at a basic level:

  • When a user uses a term and searches for specific information about that term;
  • When a user uses a term to represent a topic and therefore wants more general information.

Let's give a simple example:

I need to buy a new refrigerator for my house. My first search might be using "refrigerator" or "refrigerators," simply because I need to explore the topic.

After searching in several places, I go back to the search and use a more specific term: 40-liter stainless steel refrigerator.

It is clear that this research was used as a basis for studies to try to understand users' search intent and what the impact was on search effectiveness.

I want to highlight another excerpt from the research:

Currently, web search does not consider the user's search intent. Most search engines simply disregard the web user's profile. User profiles are an important source of metadata for Information Retrieval (IR) processes. To improve accuracy and increase the efficiency of information access, the web search process needs to evolve further with the ability to incorporate the user's search intent. However, valuable web user profiles are difficult to acquire without manual intervention.

I want to highlight from this passage the concept of precision in information retrieval. As Araújo Júnior states in "Precision in the information search and retrieval process":

Precision is a fundamental concept for evaluating the quality of information retrieval, while also representing the measure of interest (useful information) of what was found in a search and information retrieval process for the user, who qualifies the retrieved information as useful or useless according to their needs.

In the same book, Araújo Júnior recalls that Lancaster (1998) defines precision “as the extent to which the items retrieved in a search and information retrieval process in a database are considered useful. High precision is given when most or all of the items (…) retrieved are considered useful.”
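Lancaster's definition can be expressed as a simple ratio: useful items retrieved divided by total items retrieved. The sketch below is my own illustration of that idea; the function name and sample documents are hypothetical, not code from any of the cited works.

```python
# Illustrative sketch of precision as Lancaster defines it: the share of
# retrieved items that the user judges useful. Sample data is invented.

def precision(retrieved: list[str], judged_useful: set[str]) -> float:
    """Useful retrieved items / total retrieved items."""
    if not retrieved:
        return 0.0
    useful = sum(1 for item in retrieved if item in judged_useful)
    return useful / len(retrieved)

results = ["doc1", "doc2", "doc3", "doc4"]   # what the search returned
useful_to_user = {"doc1", "doc3", "doc4"}    # what the user found useful
print(precision(results, useful_to_user))    # 0.75
```

High precision, in these terms, simply means that most of what came back was useful to the person searching, regardless of how many milliseconds the search took.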

The point here is that we need to go beyond understanding the results page—whether it returned the millions of results it should have and in how many milliseconds. We need to know if those results were actually relevant to the people who performed those searches.

How do we know that? We'll talk more about that later.

In 2016, Paul Haahr (one of Google's leading research engineers, with the company for over a decade) and Gary Illyes (Webmaster Trends Analyst) gave a presentation called "How Google returns results from a ranking engineer's perspective," which can be viewed here:

How Google returns results from a ranking engineer's perspective

I did some research and here's a summary for you:

In their presentation, Illyes and Haahr provide an engineer's perspective on how Google classifies information, makes algorithm changes, and how this process determines information retrieval in the SERP, demonstrating the challenges involved in delivering the most relevant results to users.

In this presentation, we hear about the famous "more than 200 ranking signals" that determine the order of search results, and learn that search algorithms are constantly evolving.

We also hear about small changes to the algorithms, which are made almost daily, and about core updates, which can profoundly alter the SERP.

Although the presentation is old, I recommend watching it because it shows the basic process of a retrieval tool, the one our search engines still use: crawling, analysis of crawled documents, ranking, and retrieval. This hasn't changed and won't change.

(I know you're thinking: what about Google Play? Well, it's not an information retrieval tool, it generates information.)

In short, this was the approach of the study that inspired Google, so to speak, to expand this model and create the four search intents.

Description of the Google documentation

As is common knowledge, Google uses human reviewers to perform quality control on the information it retrieves. And these reviewers have extensive documentation they must follow to ensure the quality of their work.

I don't intend to describe all the processes that are detailed in the documentation (which every SEO professional should read). For the purpose of this article, we will focus on the process that uses the user's search intent.

The passage I want to highlight begins by analyzing the meaning they gave to the term "multiple meanings of a search," with a visual example, which I reproduce below:


They divide searches (queries) related to meanings into four segments:

  • Queries with multiple meanings: Many queries have more than one meaning. For example, the query [apple] could refer to the computer brand or the fruit. We will call these possible meanings query interpretations.
  • Dominant interpretation: The dominant interpretation of a query is what most users mean when they type the query. Not all queries have a dominant interpretation. The dominant interpretation should be clear to you, especially after doing a little web research.
  • Common interpretation: A common interpretation of a query is what many or some users mean when they type a query. A query can have several common interpretations.
  • Minor interpretations: Sometimes you will find less common interpretations. These are interpretations that few users have in mind. We will call these minor interpretations.
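To make the [apple] example concrete, the interpretations above could be represented as data. This is a hypothetical sketch of mine: the class, the field names, and the "Apple Records" minor interpretation are my own inventions, not Google's; the guidelines only define the three labels.

```python
# Hypothetical sketch: one query, several interpretations, each carrying
# the label the guidelines give it (dominant, common, or minor).

from dataclasses import dataclass

@dataclass
class Interpretation:
    meaning: str
    kind: str  # "dominant", "common", or "minor"

query = "apple"
interpretations = [
    Interpretation("Apple, the computer brand", "dominant"),
    Interpretation("apple, the fruit", "common"),
    Interpretation("Apple Records, the music label", "minor"),  # invented example
]

dominant = [i for i in interpretations if i.kind == "dominant"]
print(f"[{query}] dominant interpretation: {dominant[0].meaning}")
```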

And they go on to say that searches change over time, and they give us the example of the former US president, Bush.

In 1994, a search for the president of the United States returned information about George Bush. In 2004, it also returned information about George Bush, but not the same person: the first was the father of the second.

Search for President Bush

To solve this semantic problem, that is, to understand the meaning behind each query, Google engineers sought a way to understand what each search entails, something the system couldn't previously take into account.

Trying to understand the intent behind each search was the approach taken. Which, at that time, with the resources of the algorithms, was the appropriate solution.

In my view, considering the challenges of a search engine, it's a smart thing to do because it's a simple approach that reduces the number of variables to be taken into account, it's programmable, and it makes the work of the evaluators more standardized.

As we all know, the intents used by Google are:

  • Know queries (searches to find out something): some of which are simple, well-known searches;
  • Do queries (searches to do something): when the user is trying to accomplish a goal or engage in an activity;
  • Website queries: when the user is looking for a specific website or page;
  • Visit-in-person queries: some look for a specific business or organization, others for a category of businesses.

To cater to all these types of searches, and seeking to understand the search intent of their visitors, companies have created platforms that attempt to infer, based on a wide variety of criteria, how to categorize each search within these four intents, which have been summarized as follows:

  • Navigational;
  • Informational;
  • Commercial;
  • Transactional.
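As a deliberately naive illustration of this categorization, a rule-based sketch might look like the following. The cue lists and the function are hypothetical simplifications of mine; real platforms rely on far richer signals (SERP features, click data, language models) than keyword matching.

```python
# Naive, hypothetical sketch of labeling queries with the four summarized
# intents. Any query matching no cue falls into the informational bucket.

INTENT_CUES = {
    "transactional": ["buy", "price", "coupon", "order"],
    "commercial": ["best", "review", "vs", "comparison"],
    "navigational": ["login", "official site", ".com"],
}

def classify_intent(query: str) -> str:
    q = query.lower()
    for intent, cues in INTENT_CUES.items():
        if any(cue in q for cue in cues):
            return intent
    return "informational"  # default bucket for everything else

print(classify_intent("buy 40-liter stainless steel refrigerator"))  # transactional
print(classify_intent("what is a frost-free refrigerator"))          # informational
```

Even this toy version shows the model's reductionism: every query is forced into exactly one of four boxes, with no room for the searcher's feelings or stage in the process, which is precisely the critique developed below.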

I see that the simplicity of this approach is also its weakness. The problem I perceive in this approach is that it only considers one aspect of people's motivations in seeking information, and summarizes it under the label of intention, without even bothering to define what intention means for this model.

This article does not intend to debate or discover what intent means to Google; that is a task that its researchers should have undertaken. Or, if they did, they should have published it in the documentation. There are 371 occurrences of the word "intent" in the documentation, and none of them address the definition or meaning of the term.

The widespread dissemination of this Google model has generated a blind race in the content creation market, a frantic search to understand user intent without bothering to reflect on what that really means.

Critique of the model

As I briefly discussed earlier in this text, I understand that the search intent model was important for its pioneering nature, but that it has weaknesses that can be resolved with the use of more robust models.

I need to raise a point that, in my opinion, greatly helps us understand the information retrieval process.

The beginning of this process is triggered by what Brenda Dervin, in 1983, called a cognitive void, which occurs when we are in a situation where the answers we have to our questions are no longer sufficient to fill the gap between what we know and what we need to know.

Information Gap

Extracted from the book "Information User Study Manual".

This feeling of uncertainty leads us to wander in search of information that already exists, but is not part of our existing set of known information.

Understanding search intent doesn't take into account this void, nor the feelings it triggers.

Therefore, this article proposes an alternative to this model, based on the work of an American researcher and educator, Carol Kuhlthau, who understands the importance of identifying the information needs of the various user segments.

Identifying the user's information needs is the vital point of my proposal, and we agree with Cunha (2015) when he states that this is a vital step for understanding the strategic and overriding objectives in defining the collection development policy and for knowing whether the demands are met (emphasis added).

In our work environment, we need to understand that the collection includes all types of content created and made available for users (current, potential, and non-users) to access.

Therefore, bringing studies from Information Science back to our job market: we need to understand the information needs of those searching for what we are offering on the web, since the content we publish on this network is our collection, which requires more appropriate policies for its development.

This article does not address this broad subject, but it does use the model created by Kuhlthau to begin this process.

Who is Carol Kuhlthau?

Carol Collier Kuhlthau (born December 2, 1937) is a retired American educator, researcher, and international speaker on learning in school libraries, information literacy, and information-seeking behavior.

Biography

Kuhlthau was born in New Brunswick, New Jersey, USA. She graduated from Kean University in 1959, completed a master's degree in Library Science at Rutgers University in 1974, and earned a doctorate in Education in 1983. She was a professor in the Department of Library and Information Science at Rutgers University for over 20 years and has been professor emerita since 2006.

Carol Kuhlthau founded the Center for International Scholarship in School Libraries.

Kuhlthau's information search process

In theoretical terms, this article and the proposals we make in it are linked to the cognitivist paradigm, with Ellis, Dervin, Kuhlthau, and Wilson as the researchers chosen as its basis.

We caution that care must be taken not to innovate in theory while repeating previous models in application, as Carlos Alberto Ávila Araújo warns in his text "User Studies: Theoretical Plurality, Diversity of Objects" on page 8:

The cognitive model of these studies, by prioritizing the understanding of information needs based on a gap, an absence of certain knowledge to perform a specific activity, ends up rigidifying a way of understanding users as beings endowed with a specific need that would be satisfied by a specific source of information.

Therefore, this article, in addition to contrasting the search intent-based web content creation model with our proposal, proposes a new content creation model that needs to be tested and validated in practice.

Influence of Jean Piaget and Brenda Dervin on Kuhlthau

In 1991, Kuhlthau introduced her model under the name Information Search Process (ISP). It describes how feelings, thoughts, and actions are present in six stages of information seeking.

The ISP model is based on the work of Jean Piaget, that is, on the four stages of cognitive development in children, namely: sensorimotor, preoperational, concrete operational, and formal operational.

Kuhlthau introduced the experience of information seeking from the individual's perspective, emphasizing the vital role of affect in information seeking, and proposed the uncertainty principle as a conceptual framework for libraries and information services.

I understand that a search engine falls under the category of information services.

Kuhlthau's work is among the most cited by professors of library and information science and is one of the most widely used conceptualizations by researchers in the field.

The ISP model represents a watershed moment in the development of new strategies for understanding how people search for information.

Information Search Process (ISP)

Also within the alternative approach, Carol Kuhlthau (1991) advocated the Constructive Process Approach for conducting user studies and developed the Information Search Process (ISP) model, represented in the table below, taken from the book Manual de Estudos de Usuário da Informação (Information User Studies Handbook).

Information retrieval process

Stages in the ISP | Feelings at each stage | Thoughts at each stage | Actions at each stage | Appropriate tasks
1. Initiation | Uncertainty | General/vague | Searching for pre-existing information | Recognition
2. Selection | Optimism | — | — | Identification
3. Exploration | Confusion/frustration/doubt | — | Searching for relevant information | Investigation
4. Formulation | Clarity | Targeted/clear | — | Formulation
5. Collection | Sense of direction/confidence | Increased interest | Searching for focused or relevant information | Connection
6. Presentation | Relief/satisfaction or disappointment | Clear or focused | — | Completion
Source: KUHLTHAU (1991, p. 367)

According to the authors, during the information-seeking process (uncertainty principle), Kuhlthau considers that the level of uncertainty fluctuates and can be observed in six stages, divided into three fields of experience: emotional, cognitive, and physical.

  • The initiation stage, when there is recognition of the need for information;
  • The selection stage, which begins the work of defining the field or topic of investigation;
  • The exploration stage, when documents on the topic are explored, leading to an expansion of the general theme (for example, reading secondary sources);
  • The formulation stage, in which the focus or perspective of the problem is established;
  • The collection stage, which involves interaction with information systems and services to gather information;
  • And the presentation stage, the "end" of the search and the "solution" to the problem.

According to González-Teruel (2005, p. 72), Kuhlthau's information-seeking process model identifies the need for information with the state of uncertainty that commonly causes anxiety and lack of confidence. For the researcher, uncertainty is a natural state, especially in the early stages of the information-seeking process.

Rolim and Cendón (2013) argue that the stages can be visualized through the dynamic nature of the information search process, since this process involves the construction of knowledge and meaning. Formulating a focus of interest affects the search process, because establishing a focus requires interpreting existing information. The nature of the information found alters the user's position; redundant information can be annoying, while new information may require a reconstruction of existing knowledge, causing anxiety. The user's attitude influences the search outcome, as the search involves personal choices, and interest increases as the focus is defined and the research progresses.

It's interesting to note in this passage the reciprocity between our needs and what we can find when we actively search for information to fulfill those needs. Feelings such as anxiety, frustration, annoyance, enthusiasm, euphoria, and joy are not normally taken into account when we study the searches of our visitors, but they are part of each individual who searches.

Proposal for a new methodology for planning search-optimized content

Based on this approach, we describe and propose a new content creation model based on Kuhlthau's research (1991), which I describe in more detail below. Before proceeding, I want to provide some important details, mainly because the model I describe is an adaptation of Kuhlthau's work.

Adapting the information search process

In the table below, the steps I selected for adapting the ISP to create search-optimized content are marked with an asterisk (they appear highlighted in green in the original).

Stages in the ISP | Feelings at each stage | Thoughts at each stage | Actions at each stage | Appropriate tasks
Initiation * | Uncertainty | General/vague | Searching for pre-existing information | Recognition
Selection | Optimism | — | — | Identification
Exploration * | Confusion/frustration/doubt | — | Searching for relevant information | Investigation
Formulation | Clarity | Targeted/clear | — | Formulation
Collection * | Sense of direction/confidence | Increased interest | Searching for focused or relevant information | Connection
Presentation | Relief/satisfaction or disappointment | Clear or focused | — | Completion
Source: KUHLTHAU (1991, p. 367), adapted by me.

The intention is to propose a new way of planning content creation based on three stages of the "Information Search Process" table proposed by Kuhlthau:

  • Initiation;
  • Exploration;
  • Collection.

All six phases are present in any information retrieval process, but in my view, the phases I have selected are perfectly aligned with web content management.

I understand that this model allows for the creation of a structure that aligns creation with the needs of our users, aligning them with a management and creation process that helps satisfy these informational needs at each stage of the search.

Our proposal aims to consider the importance of the needs, desires, demands, expectations, attitudes, behaviors, and other practices in the use of information by users seeking information that creators can provide in their content.

As a counterpoint, however, we cite Hewins (1990 apud Cunha, 2015) to emphasize that:

There are unique characteristics for each user and others that are common to several, therefore, systems must be developed considering the flexibility needed to adapt to all users.

It is clear that Hewins was dealing with information retrieval systems, but we can broaden that meaning and say that for our proposal to be minimally effective, we need to develop a model that can be adapted to both cases: specific audiences and a general audience.

In 1968, Taylor, speaking about the questions that library users asked, established an interesting relationship between the questions asked and the real needs of the users.

This work generated a classification of information needs that is interesting for our work:

  • visceral;
  • conscious;
  • formalized;
  • compromised.

Visceral need

It may be related to a vague dissatisfaction, but that feeling isn't strong enough to generate a question. However, this state can change if the person has access to some information that alters that feeling and encourages them to ask the question.

Conscious Need

There is a mental confusion that, in a way, influences the formulation of the question.

Formalized Need

When an individual manages to resolve this confusion to some extent, they formulate the question that triggers the information-seeking process.

Compromised Need

This refers to a question that has been modified or reformulated so that it can be understood by an information system that cannot handle, for example, natural language search. Here, the user needs to adapt to the system, not the other way around.

Addressing the subject from a marketing perspective, Kotler (2000, p. 43) points out five types of needs:

  • stated needs (the customer wants an inexpensive TV);
  • real needs (the customer wants a TV that will last a long time, but doesn't care about the price);
  • unstated needs (the customer expects good service from the seller);
  • "something more" needs (the customer would like the seller to include a sound system as a free gift);
  • secret needs (the customer wants to be seen by friends as a smart consumer).

With this in mind, we created the model below, which aims to serve as a basis for generating content management and creation models that take into account the phases we have already described: initiation, exploration, and collection.

I would like to remind you that the most important thing is not the model, which can and should be constantly adapted, updated and improved, but rather a new way of thinking about how we create content on the web.

A new type of paradigm that is driven by the needs of the people who require the services and products your company wants to deliver to them.

I want to remind you of our theoretical basis for creating the model below, with this table:

Stages in the ISP | Feelings at each stage | Thoughts at each stage | Actions at each stage | Appropriate tasks
Initiation | Uncertainty | General/vague | Searching for pre-existing information | Recognition
Exploration | Confusion/frustration/doubt | — | Searching for relevant information | Investigation
Collection | Sense of direction/confidence | Increased interest | Searching for focused or relevant information | Connection
Source: KUHLTHAU (1991, p. 367), model adapted by me.

Briefing creation template

Create content that addresses {SENTIMENT RESOLUTION} and generates {APPROPRIATE TASKS} related to {ACTIONS}, aiming to reduce {SENTIMENTS} experienced by visitors in the {STAGE} stage.

Example:

Create specific and focused content that generates recognition related to the search for pre-existing information, aiming to reduce the uncertainty experienced by visitors who are in the initiation stage.

This template should be used when generating content briefs, with the goal of creating content aligned with visitors' feelings, the stage of the journey they are at, and how the content should be crafted to meet those needs.
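The briefing template can be sketched as a small function that fills the placeholders from the three adapted ISP stages. This is a hypothetical illustration of mine: the dictionary, function name, and slightly adapted wording are not part of Kuhlthau's model or of any existing tool.

```python
# Hypothetical sketch: generating a content brief from the three ISP stages
# selected in this proposal, using values from the adapted Kuhlthau table.

ISP_STAGES = {
    "initiation": {
        "feelings": "uncertainty",
        "actions": "searching for pre-existing information",
        "task": "recognition",
    },
    "exploration": {
        "feelings": "confusion, frustration, and doubt",
        "actions": "searching for relevant information",
        "task": "investigation",
    },
    "collection": {
        "feelings": "need for direction and confidence",
        "actions": "searching for focused or relevant information",
        "task": "connection",
    },
}

def generate_brief(stage: str) -> str:
    """Fill the briefing template for one ISP stage."""
    s = ISP_STAGES[stage]
    return (
        f"Create content that generates {s['task']} related to {s['actions']}, "
        f"addressing the {s['feelings']} experienced by visitors "
        f"in the {stage} stage."
    )

print(generate_brief("initiation"))
```

In practice, a brief like this would sit at the top of each content request, so that writers plan around the visitor's stage and feelings rather than around a single intent label.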

In conclusion. For now.

To conclude this first version of my proposal connecting SEO, Content Management, and the ISP Model, I want to say that those of us who work with and on the Web have much to learn from Library Science and Information Science.

There is a vast body of knowledge generated over centuries of study in this area that is intimately linked to our work. Understanding this connection is what led me, at almost 50 years old, to re-enter university and take another course.

That has been my mission since the day I taught my first class: to connect these two worlds.

Hello, I'm Alexander Rodrigues Silva, SEO specialist and author of the book "Semantic SEO: Semantic Workflow". I've worked in the digital world for over two decades, focusing on website optimization since 2009. My choices have led me to delve into the intersection between user experience and content marketing strategies, always with a focus on increasing organic traffic in the long term. My research and specialization focus on Semantic SEO, where I investigate and apply semantics and connected data to website optimization. It's a fascinating field that allows me to combine my background in advertising with library science. In my second degree, in Library and Information Science, I seek to expand my knowledge in Indexing, Classification, and Categorization of Information, seeing an intrinsic connection and great application of these concepts to SEO work. I have been researching and connecting Library Science tools (such as Domain Analysis, Controlled Vocabulary, Taxonomies, and Ontologies) with new Artificial Intelligence (AI) tools and Large-Scale Language Models (LLMs), exploring everything from Knowledge Graphs to the role of autonomous agents. In my role as an SEO consultant, I seek to bring a new perspective to optimization, integrating a long-term vision, content engineering, and the possibilities offered by artificial intelligence. For me, SEO work is a strategy that needs to be aligned with your business objectives, but it requires a deep understanding of how search engines work and an ability to understand search results.
