Pinned and More Like This Queries

Elasticsearch has a handful of advanced queries that serve specialized purposes. For example, the more_like_this query finds similar-looking documents, and the pinned query gives a few chosen documents more importance than the rest. We look at these two queries in this short article.

Pinned query

You may have seen a few sponsored search results appearing at the top of the result set when querying your favorite e-commerce website such as Amazon. Suppose we want to implement such functionality in our application using Elasticsearch. Well, fret not; a pinned query is at hand.

The pinned query adds chosen documents to the result set so that they appear at the top of the list. It does so by giving them higher relevance scores than the other results. Let’s look at an example query, given in the following listing, that demonstrates this functionality.

GET iphones/_search
{
  "query": {
    "pinned": {
      "ids": ["1", "3"],
      "organic": {
        "match": {
          "name": "iPhone 12"
        }
      }
    }
  }
}

The pinned query in the listing has two moving parts: the organic and ids blocks.

Let’s look at the organic block first. It houses the search query; in this case, we are searching for iPhone 12 in our iphones index. On its own, this query should ideally return two documents: iPhone 12 and iPhone 12 Mini.

However, when you run this query with the data (check out my GitHub repo for the queries and data), you receive the documents iPhone and iPhone 13 (these two are not iPhone 12!) in addition to iPhone 12 and iPhone 12 Mini.

The reason for this is the ids field. This field lists the additional documents that must be added to the results and shown at the top of the list (the sponsored results). Elasticsearch does this by synthetically giving them higher relevance scores than the organic matches.

In short, the pinned query adds high-priority documents to the result set. These documents outrank the others in the list, which is how sponsored results are created.

You may be wondering if the pinned results have any scoring: can one or some of the pinned results be prioritized over the other(s)? Unfortunately, the answer is no. These documents are presented in the order in which we list their IDs in the query: "ids":["1", "3"], for example.
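The ordering rule can be pictured with a small Python simulation. This is only a sketch of the behavior described here, not how Elasticsearch implements it: the function name pin_results, the organic document IDs "2" and "4", and their scores are all made up for illustration, and the sketch assumes every pinned ID exists in the index.

```python
def pin_results(pinned_ids, organic_hits):
    """Simulate the result order of a pinned query (illustration only).

    pinned_ids: IDs in the order they should appear at the top.
    organic_hits: (doc_id, score) pairs returned by the organic query.
    """
    pinned = set(pinned_ids)
    # Organic hits keep their usual ordering: highest score first.
    ranked = sorted(organic_hits, key=lambda hit: hit[1], reverse=True)
    # A document that is both pinned and an organic match is listed once,
    # in its pinned position.
    rest = [doc_id for doc_id, _ in ranked if doc_id not in pinned]
    return list(pinned_ids) + rest


# Hypothetical organic matches for "iPhone 12" with made-up scores.
organic = [("4", 0.9), ("2", 1.7)]

print(pin_results(["1", "3"], organic))
print(pin_results(["3", "1"], organic))
```

Swapping the order of the IDs in the first argument is the only way to change which pinned document comes first; the organic scores play no part in it.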

The other query we look at in this article is more_like_this, the topic of the next section.

Looking at the More Like This (more_like_this) query

You may have noticed that Netflix, Amazon Prime Video, or another of your favorite streaming apps shows you More Like This movies when you browse a title. For example, figure 12.13 shows the More Like This movies I see when I visit Paddington 2.

Figure 12.13 Viewing More Like This movies

Users often need to search for documents that are “similar to” or “like” a given one: for example, finding research papers similar to those on COVID and SARS, or querying movies like The Godfather. Let’s jump right into an example to understand the use case better.

Let’s say that we are collecting a list of profiles about some people. To create a set of profiles, we index sample documents into the profiles index as the code in the following listing demonstrates.

PUT profiles/_doc/1
{
  "name": "John Smith",
  "profile": "John Smith is a capable carpenter"
}

PUT profiles/_doc/2
{
  "name": "John Smith Patterson",
  "profile": "John Smith Patterson is a pretty plumber"
}

PUT profiles/_doc/3
{
  "name": "Smith Sotherby",
  "profile": "Smith Sotherby is a gentle painter"
}

PUT profiles/_doc/4
{
  "name": "Frances Sotherby",
  "profile": "Frances Sotherby is a gentleman"
}
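If you would rather load the four profiles in one request, the same documents can be expressed as a body for the _bulk endpoint. The following Python sketch only builds that newline-delimited body; the helper name to_bulk_body is made up for illustration, and sending the body to the cluster is left out.

```python
import json

# The same four sample profiles as in the listing, keyed by document ID.
profiles = {
    "1": {"name": "John Smith",
          "profile": "John Smith is a capable carpenter"},
    "2": {"name": "John Smith Patterson",
          "profile": "John Smith Patterson is a pretty plumber"},
    "3": {"name": "Smith Sotherby",
          "profile": "Smith Sotherby is a gentle painter"},
    "4": {"name": "Frances Sotherby",
          "profile": "Frances Sotherby is a gentleman"},
}


def to_bulk_body(index, docs):
    """Build a newline-delimited body for POST _bulk (hypothetical helper)."""
    lines = []
    for doc_id, source in docs.items():
        # Each document takes two lines: an action line, then the source.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


print(to_bulk_body("profiles", profiles))
```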

There’s nothing surprising about these documents; they’re just profiles of a bunch of ordinary people. Now that we have these documents indexed, let’s find out how we can ask Elasticsearch to fetch documents that are similar to the text gentle painter or capable carpenter, or even retrieve documents with a similar name, such as Sotherby.

That’s exactly what the more_like_this query helps us with. The next listing creates a query to search profiles more like Sotherby.

GET profiles/_search
{
  "query": {
    "more_like_this": {
      "fields": ["name", "profile"],
      "like": "Sotherby",
      "min_term_freq": 1,
      "max_query_terms": 12,
      "min_doc_freq": 1
    }
  }
}

The more_like_this query accepts text in the like parameter and matches it against the fields listed in the fields parameter. The query also accepts a few tuning parameters, such as the minimum term frequency (min_term_freq), the minimum document frequency (min_doc_freq), and the maximum number of terms (max_query_terms) that the query should select. If we want to give the user a better experience when showing similar documents, the more_like_this query is the right choice.
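To try other inputs, such as gentle painter, it can help to assemble the request body in code. The following Python sketch builds the same body as the listing above. The helper name build_mlt_query is made up for illustration, and its default values simply mirror the listing rather than Elasticsearch’s own defaults. The second call assumes that like may also reference an already indexed document by _index and _id, which is worth checking against the documentation for your Elasticsearch version.

```python
import json


def build_mlt_query(like, fields, min_term_freq=1, min_doc_freq=1,
                    max_query_terms=12):
    """Return a search body for a more_like_this query (hypothetical helper).

    The defaults copy the values used in the listing; they are not
    Elasticsearch's built-in defaults.
    """
    return {
        "query": {
            "more_like_this": {
                "fields": fields,
                "like": like,
                "min_term_freq": min_term_freq,
                "min_doc_freq": min_doc_freq,
                "max_query_terms": max_query_terms,
            }
        }
    }


# Free text, as in the listing.
by_text = build_mlt_query("gentle painter", ["name", "profile"])

# Assumption: a document already in the index, referenced by index and ID.
by_doc = build_mlt_query([{"_index": "profiles", "_id": "3"}],
                         ["name", "profile"])

print(json.dumps(by_text, indent=2))
```

The printed JSON can be pasted as the body of GET profiles/_search.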

That’s pretty much it for the pinned and more_like_this queries.


These short articles are condensed excerpts taken from my book Elasticsearch in Action, Second Edition. The code is available in my GitHub repository.
