New Proposal for Content Management on the Web

By Alexander Rodrigues Silva, in Information Retrieval, Semantic SEO, Web content management

This article proposes a new methodology for planning and managing web content optimized for searches as an alternative to the model focused on search intent created by Google. I will criticize the search intent model, through a brief historical analysis of the documentation created by the company, based on the cognitive paradigm of Information Sciences.

I will describe Kuhlthau’s information search process, its influences, and its possible implications for creating web content, based on the book Manual de Estudos de Usuário da Informação by Murilo Bastos da Cunha, Sueli Angelica do Amaral, and Edmundo Brandão Dantas. I will then propose an alternative methodology to the model focused on search intent.

Criticism of the search intent model

Currently, the model that seeks to understand why people search online, based on the intention behind each user’s initiative, is a standard in content creation and SEO.

This model, despite its unequivocal contribution, has a problem that professionals in our field have not explored: relying on a single dimension (intention) to understand the information search process not only limits our understanding but also ignores other aspects of the subjectivity of those looking for information, mainly on the web.

But first, let’s look to the past and understand how we got here.

Historical analysis

Google started using search intent to try to predict what each search meant, as we can see in the Search Quality Evaluator Guidelines documentation.

According to the Search Engine Journal website, there is a science behind the use of intent in retrieving information on the web. The article Utilizing Search Intent in Topic Ontology-based User Profile for Web Mining by Xujuan Zhou, Sheng-Tang Wu, Yuefeng Li, Yue Xu, Raymond Y.K. Lau, and Peter D. Bruza researches the effectiveness of information retrieval processes. It demonstrates a problem using generic user profiles in Web Mining systems, proposing using search intent to enrich these profiles.

Reading the Research Abstract helps us to understand better:

It is common knowledge that taking into account web user profiles can increase the effectiveness of web mining systems. However, due to network users’ dynamic and complex nature, the automatic acquisition of (user) profiles that bring value was very challenging. Ontology-based user profiles can provide information about users that is more accurate. This research emphasizes the acquisition of information about search intentions. This article presents a new approach to developing user profiles for web searches. The model considers the user’s search intentions through the PTM (Pattern-Taxonomy Model) process. Initial experiments show that the user profile based on search intent is more valuable than the generic PTM user profile. Developing a user profile that captures the user’s search intent is essential for effective web search and retrieval. – Translation by me.

In the paper, we see that the research divides search intent into two basic objectives:

When the user uses a term and searches for specific information about that term;
When the user uses a term to represent a topic and therefore wants more general information.

Let’s take a simple example:

I need to buy a new refrigerator for my house. My first search can be done using refrigerators, coolers, or fridges. Simply because I need to explore the subject.

After looking in several places, I returned to the search and used a more specific term: 40-liter stainless steel refrigerator.

This research was used as the basis of studies to understand users’ search intent and the impact on search effectiveness.

I would like to highlight another part of the research:

Currently, web search does not consider user search intent. Most search engines disregard the web user’s profile. User profiles are an essential metadata source for Information Retrieval (IR) processes. To improve accuracy and increase information access efficiency, the Web search process needs to evolve further with the ability to incorporate the user’s search intent. However, valuable web user profiles are difficult to acquire without manual intervention.

I want to emphasize in this excerpt the concept of precision in information retrieval. As quoted by Araújo Júnior in Precision in the process of searching and retrieving information:

Precision is a fundamental concept for evaluating the quality of information retrieval; at the same time that it represents the measure of interest (helpful information) of what was found in the process of search and retrieval of information for the user, which qualifies the information retrieved as useful or useless according to your needs.

In the same book, Araújo Júnior recalls that Lancaster (1998) defines precision “as the extent to which items retrieved in searching and retrieving information in a database are considered useful. High accuracy is given when most or all retrieved items (…) are considered useful.”
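To make Lancaster’s definition concrete, precision can be expressed as the share of retrieved items that were judged useful. The sketch below is only an illustration and is not taken from the works cited; the function name `precision` and the sample document identifiers are hypothetical, and treating an empty result list as precision 0 is an assumption.

```python
def precision(retrieved, useful):
    """Return the fraction of retrieved items that were judged useful."""
    retrieved = list(retrieved)
    if not retrieved:
        # Assumption: an empty result list is scored as 0 rather than undefined.
        return 0.0
    useful = set(useful)
    hits = sum(1 for item in retrieved if item in useful)
    return hits / len(retrieved)


# Hypothetical example: four results returned, three of them judged useful.
results = ["doc1", "doc2", "doc3", "doc4"]
judged_useful = {"doc1", "doc3", "doc4", "doc9"}
print(precision(results, judged_useful))  # 0.75
```

Note that the judgment of usefulness comes from the person searching, not from the system, which is exactly why the number of results or the response time says nothing about precision.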

The point here is that we must go beyond understanding the results page, whether it returned the millions of results it should return and in how many milliseconds. We need to know if those results were relevant to those who made those searches.

How to know this? We’ll talk more about that later.

In 2016, Paul Haahr (one of Google’s top search engineers, who has been with the company for over a decade) and Gary Illyes (Webmaster Trends Analyst) gave a presentation called “How Google returns results from a ranking engineer’s perspective,” which can be seen here:

How Google returns results from a ranking engineer’s perspective

I did some research, and here’s a rundown for you:

In the presentation, Illyes and Haahr provide an engineer’s view of how Google classifies information, makes changes to the algorithm, and how this process determines the retrieval of information in the SERP, showing how challenging it is to provide the most relevant results to users.

In this presentation, we hear about the famous 200+ ranking signals that determine the order in search results and that search algorithms constantly evolve.

We also hear about the small algorithm changes made almost daily and the core updates that can profoundly change the SERP.

Although the presentation is old, I recommend watching it because it covers the basic process of a retrieval tool such as a search engine: crawling, analysis of the crawled documents, ranking, and retrieval. That process doesn’t change, and it won’t change.

(I know what you’re thinking: what about Google Bard? Well, it’s not an information retrieval tool; it generates information)

In summary, this was the approach of the study that, so to speak, inspired Google to expand the model and create the four search intents.

Google documentation description

As is well known, Google uses human reviewers to do quality control on the information it retrieves. And these evaluators have extensive documentation that they need to follow to ensure the quality of their work.

I don’t intend to describe all the processes detailed in the documentation (which every SEO professional should read). For this article, let’s focus on the process that uses the user’s search intent.

The excerpt I want to highlight begins by analyzing the meaning they gave to the term “multiple meanings of a search” with a visual example, which I reproduce below:

Searches with multiple meanings

They divide searches (queries), according to their meanings, into four segments:

Queries with multiple meanings: Many queries have more than one meaning. For example, the query [apple] might refer to the computer brand or the fruit. We will call these possible meanings query interpretations.
Dominant interpretation: The dominant interpretation of a query is what most users mean when they type the query. Not all queries have a dominant interpretation. The dominant interpretation should be clear to you, especially after doing a little web research.
Common interpretation: A common interpretation of a query is what many or some users mean when they type a query. A query can have several common interpretations.
Minor interpretations: Sometimes, you will find less common interpretations. These are interpretations that few users have in mind. We will call these minor interpretations.

They go on to state that the meaning of a search changes over time, and they give us the example of former US President Bush.

In 1994, a search for President Bush returned information about George H. W. Bush. By 2004, the same search returned information about George W. Bush: not the same person, but the son of the first.

Search for President Bush

Google’s engineers were trying to solve a semantic problem: understanding the meaning behind each query, something the system could not take into account before.

Seeking to understand the intention behind each search was the path they found, and at that time, given the algorithmic resources available, it was the right solution.

In my view, taking into account the challenges of a search engine, it is an intelligent thing to do, as it is a simple approach, which reduces the number of variables to be taken into account, is programmable, and makes the work of evaluators more standardized.

As we all know, the intents used by Google are:

Know queries: some of which are Know Simple queries;
Do queries: when the user is trying to reach a goal or engage in an activity;
Website queries: when the user searches for a specific site or page;
Visit-in-person queries: some are looking for a specific company or organization, others are looking for a category of companies.

respond to all these types of searches, trying to understand the search intentions of its visitors.

Companies have created platforms that try to infer, based on the most diverse criteria, how they can categorize each search within these four intentions, which were summarized as follows:

Navigational;
Informational;
Commercial;
Transactional.

I see that the simplicity of this approach is also its weakness. The problem with this approach is that it only considers one aspect of people’s motivations in seeking information and summarizes it under the label of intention without even bothering to define what intention means for this model.

This article is not intended to debate or determine what Google means by intention; that is a task its researchers should have carried out or, if they did, should have published in the documentation. There are 371 occurrences of the word intent in the documentation, and none deals with the definition or meaning of the term.

The wide dissemination of this Google model generated a blind race in the content creation market, an unbridled quest to understand the user’s intention without taking the trouble to reflect on what it means.

Criticism of the model

As I discussed briefly earlier in this text, I understand that the search intent model was necessary for its pioneering nature but that it has weaknesses that can be resolved with the use of more robust models.

I need to raise a point that, for me, helps us understand the process of searching for information.

This process is triggered by what Brenda Dervin called, in 1983, the “cognitive void,” which arises when the answers we have to our questions are no longer sufficient to bridge the gap between our current knowledge and what we need to know.

Information Gap

Extracted from the book “Manual de Estudos de Usuário da Informação” (Information User Studies Manual)

This feeling of uncertainty leads us to wander in search of information that already exists but is not in our set of known information.

The understanding of search intentions does not consider this void nor the feelings that are triggered by it.

Therefore, this article proposes an alternative to this model based on the work of an American researcher and educator, Carol Kuhlthau, who understands the importance of identifying the information needs of different user segments.

The identification of the user’s information needs is the vital point of my proposal, and we agree with Cunha (2015) when he states that this is an essential step towards understanding the strategic and preponderant objectives in the delimitation of the collection development policy and to know whether the demands are met (emphasis mine).

In our work environment, we must understand that the collection is all types of content created and made available for users (current, potential, and non-users) to access.

Therefore, once again, bringing Information Science studies to our job market: we need to understand the information needs of those looking for what we are offering on the web since the contents we publish on this network are our collection, which needs the most appropriate policies for their development.

This article does not deal with this broad subject but uses the model created by Kuhlthau to start this process.

Who is Carol Kuhlthau?

Carol Collier Kuhlthau (born December 2, 1937) is a retired American educator, researcher, and international speaker on learning in school libraries, information literacy, and information-seeking behavior.

Biography

Kuhlthau was born in New Brunswick, New Jersey, USA. She graduated from Kean University in 1959 and completed a Master’s in Library Science at Rutgers University in 1974 and a Doctor of Education in 1983. She was a professor in the Department of Library and Information Science at Rutgers University for over 20 years and has been Professor Emerita since 2006.

Carol Kuhlthau founded the Center for International Scholarship in School Libraries.

Kuhlthau’s information search process

In theoretical terms, this article and the proposals we make in it are linked to the cognitive paradigm, within which Ellis, Dervin, Kuhlthau, and Wilson are the researchers chosen as a basis.

Care must be taken not to innovate in theory while repeating previous models in practice, as Carlos Alberto Ávila Araújo warns in his text User Studies: Theoretical Plurality, Diversity of Objects on page 8:

The cognitive model of these studies, by privileging the understanding of the need for information from a gap, from a lack of specific knowledge to perform a certain activity ends up imprisoning a way of understanding users as beings endowed with a particular need that would be satisfied by a specific source of information.

For this reason, this article, in addition to contrasting the web content creation model based on search intent with our proposal, presents a new content creation model that still needs to be tested and validated in practice.

Influence of Jean Piaget and Brenda Dervin on Kuhlthau

In 1991, Kuhlthau’s model was introduced under the name Information Search Process (ISP). It describes how feelings, thoughts, and actions are present in six information-seeking stages.

The ISP model is based on the work of Jean Piaget, that is, on the four stages of cognitive development in children: sensorimotor, preoperational, concrete operational, and formal operational.

Kuhlthau introduced the holistic experience of information seeking from the individual’s perspective, emphasizing the vital role of affect in information seeking, and proposed the uncertainty principle as a conceptual framework for libraries and information services.

I understand that a search engine falls under the category of information services.

Kuhlthau’s work is among the most cited by library and information science professors and is one of the most used conceptualizations by researchers in the area.

The ISP model represents a watershed in developing new strategies for understanding how people search for information.

Information Search Process (ISP)

Also within the scope of the alternative approach, Carol Kuhlthau (1991) defended the constructivist process (Constructive Process Approach) to carry out user studies and developed the model of the information search process (Information Search Process – ISP), represented in the Figure below, extracted from the book Information User Studies Manual (Manual de Estudos de Usuário da Informação).

Figure 5.8 – Information search process

Stages in ISP	Feelings Common to Each Stage	Thoughts Common to Each Stage	Actions Common to Each Stage	Appropriate Tasks
1. Initiation	Uncertainty	General / vague	Search for pre-existing information	Recognize
2. Selection	Optimism			Identify
3. Exploration	Confusion/Frustration/Doubt		Seeking Relevant Information	Investigate
4. Formulation	Clarity	Narrowed/ Clearer		Formulate
5. Collection	Sense of direction/Confidence	Increased interest	Seeking Relevant or Focused Information	Gather
6. Presentation	Relief/Satisfaction or Disappointment	Clearer or Focused		Complete

Source: KUHLTHAU (1991, p. 367)

According to the authors, Kuhlthau considers that, during the information search process (uncertainty principle), the level of uncertainty fluctuates and can be observed in six stages, divided into three fields of experience: emotional, cognitive, and physical:

The initiation stage, when there is recognition of the need for information;
The selection stage, when the work of delimiting the field or topic of investigation begins;
The exploration stage, in which documents on the topic are explored, leading to an expansion of the general topic (for example, reading secondary sources);
The formulation stage, in which the focus or perspective of the problem is established;
The collection stage, in which information is gathered through interaction with information systems and services;
And the presentation stage, the “end” of the search and the “solution” of the problem.

According to González-Teruel (2005, p. 72), Kuhlthau’s information search process model identifies the need for information with the state of uncertainty that commonly causes anxiety and lack of confidence. Uncertainty is a natural state for the researcher, especially in the early stages of the information search process.

Rolim and Cendón (2013) argue that the steps can be viewed from the dynamic nature of the information search process, as there is the construction of knowledge and meaning in this process. The formulation of a focus of interest affects the search process because it is necessary to interpret the existing information to establish the focus. The nature of the information found alters the user’s position because if the information is redundant, it can generate annoyance. Still, new information can require reconfiguring unavailable knowledge, causing anxiety. The user’s attitude influences the search result, as their search implies personal choices, and interest increases as the focus is defined and the search progresses.

It is interesting to notice in this passage the reciprocity between our needs and what we find when we search for information to meet these needs. Feelings such as anxiety, frustration, annoyance, enthusiasm, euphoria, and joy are generally not considered when we study our visitors’ searches, but they are part of each searcher.

Proposal of a new methodology for planning search-optimized content

Based on this approach, I propose a new content creation model built on Kuhlthau’s research (1991), which I describe below. First, I want to explain some important points in detail, mainly because the model is an adaptation of Kuhlthau’s work.

Adaptation of the information search process

The table below shows the stages of the ISP. The steps I selected for adapting it to create search-optimized content, highlighted in green in the original table, are Initiation, Exploration, and Collection.

Stages in ISP	Feelings Common to Each Stage	Thoughts Common to Each Stage	Actions Common to Each Stage	Appropriate Tasks
Initiation	Uncertainty	General / vague	Seeking Background Information	Recognize
Selection	Optimism			Identify
Exploration	Confusion/Frustration/Doubt		Seeking Relevant Information	Investigate
Formulation	Clarity	Narrowed/ Clearer		Formulate
Collection	Sense of direction/Confidence	Increased interest	Seeking Relevant or Focused Information	Gather
Presentation	Relief/Satisfaction or Disappointment	Clearer or Focused		Complete

Source: KUHLTHAU (1991, p. 367), adapted by me.

The intention is to propose a new way of planning content creation based on three stages of the information search process proposed by Kuhlthau:

Initiation;
Exploration;
Collection.

All six phases are present in any information search process, but I believe the phases I have selected are perfectly aligned with content management on the web.

I understand that this model allows us to build a structure that aligns content creation with the needs of our users, through a management and creation process that helps satisfy these informational needs at each search stage.

Our proposal aims to consider the importance of needs, desires, demands, expectations, attitudes, behaviors, and other practices in using information by users looking for information that creators can provide in their content.

As a counterpoint, we quote Hewins (1990), as cited in Cunha (2015), to point out that

there are unique characteristics for each user and others that are common to several, so it will be necessary that the systems be developed considering the necessary flexibility so that they are adaptable to all users.

Of course, Hewins was talking about information retrieval systems. Still, we can broaden this sense and say that for our proposal to be minimally effective, we need to develop a model that can be adapted to both cases: specific audiences and a general audience.

Taylor, in 1968, writing about the questions that library users asked, established an interesting relationship between the questions asked and the real needs of users.

This work generated a classification of information needs that is interesting for our work:

visceral,
conscious,
formalized,
compromised.

Visceral need

It may be related to a vague dissatisfaction, but that feeling is not strong enough to prompt a question. This state can change if the person gains access to some information that changes this feeling and encourages them to ask the question.

Conscious need

There is confusion on a mental level that somehow influences the formulation of the question.

Formalized need

When the individual manages to resolve this confusion to a certain degree, they formulate the question that triggers the process of searching for information.

Compromised need

It is the query that has been modified or reworked so that it can be understood by an information system that cannot handle, for example, a natural language search. Here the user needs to adapt to the system and not vice versa.

When Kotler (2000, p. 43) deals with the subject from the point of view of marketing, he warns that there are these five types of needs:

Stated needs (the customer wants a cheap TV);
Real needs (the customer wants a TV that lasts a long time but doesn’t care about the price);
Unstated needs (the customer expects good service from the seller);
Delight Needs (the customer would like the salesperson to include a sound as a bonus);
Secret needs (the customer wants to be seen by his friends as a smart consumer).

With that in mind, we have created the template below, which aims to generate briefs for managing and creating content that take into account the phases already described: initiation, exploration, and collection of information.

I want to remind you that the most important thing is not the model, which can and should be constantly adapted, updated, and improved, but new thinking about how we create content on the web.

A new kind of paradigm that is motivated by the needs of the people who need the services and products your company wants to deliver to them.

I want to recall our theoretical basis for creating the model below, with this table:

Stages in ISP	Feelings Common to Each Stage	Thoughts Common to Each Stage	Actions Common to Each Stage	Appropriate Tasks
Initiation	Uncertainty	General / vague	Seeking Background Information	Recognize
Exploration	Confusion/Frustration/Doubt		Seeking Relevant Information	Investigate
Collection	Sense of direction/Confidence	Increased interest	Seeking Relevant or Focused Information	Gather

Source: KUHLTHAU (1991, p. 367), adapted by me.

Briefing creation template

Create content that {RESOLVES FEELING} and generates {APPROPRIATE TASK} about {ACTION} to reduce the {FEELING} of visitors who are in the {STAGE} stage.

Example:

Create specific and focused content that generates awareness about pre-existing information in order to reduce the uncertainty that visitors who are in the initiation stage feel.
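For teams that want to generate these briefs consistently, the template can be filled in programmatically. The following is a minimal sketch, assuming the slot values come from the adapted ISP table and from the example above; the names `ISP_STAGES`, `BRIEF_TEMPLATE`, and `build_brief` are hypothetical, and the wording that resolves the feeling is passed in by the content planner because the table does not define it.

```python
# Slot values for two stages, based on the adapted ISP table and the
# example above. The dictionary layout and key names are illustrative
# assumptions, not part of Kuhlthau's model.
ISP_STAGES = {
    "initiation": {
        "feeling": "uncertainty",
        "action": "pre-existing information",
        "task": "awareness",
    },
    "exploration": {
        "feeling": "confusion",
        "action": "relevant information",
        "task": "investigation",
    },
}

BRIEF_TEMPLATE = (
    "Create {resolves_feeling} content that generates {task} about {action} "
    "in order to reduce the {feeling} that visitors who are in the "
    "{stage} stage feel."
)


def build_brief(stage, resolves_feeling):
    """Fill the briefing template for one ISP stage.

    `resolves_feeling` describes how the content should address the
    feeling; it is chosen by the planner, not derived from the table.
    """
    row = ISP_STAGES[stage]  # raises KeyError for a stage that is not listed
    return BRIEF_TEMPLATE.format(
        resolves_feeling=resolves_feeling, stage=stage, **row
    )


print(build_brief("initiation", "specific and focused"))
```

Keeping the stage data separate from the sentence template means the table can be revised, or a stage added, without touching the wording of the brief.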

The template above should be used when writing content briefs, so that content is created in line with the users’ feelings, the stage they are in, and the way the content should be produced to meet these needs.

In conclusion. For now.

To conclude this first version of my proposal that connects SEO, Content Management, and the ISP Model, I want to say that those of us who work with and on the Web have much to learn from Librarianship and Information Sciences.

An enormous body of knowledge generated over centuries of studies in this area is closely linked to our work. Understanding this connection is what led me, almost 50 years old, to re-enter university and take another course.

My mission since I took my first class has been to connect these two worlds.