samwellwang

ES Query Statement

Elasticsearch

Elasticsearch is a distributed search engine built on Lucene that can search large volumes of data quickly and accurately. Its query DSL is powerful: it supports query types such as match, term, and bool (with clauses like must and should), and it can also run aggregations and Painless scripts. Today, let's learn how to use these statements in ES to find the desired data and make our work more efficient. For example, if you want to find a customer's order records or view sales data within a certain time period, ES can easily handle it. In short, ES is more than just easy to use. Once you learn this syntax and these techniques, you will be more proficient in your CRUD operations! (No more copying and tweaking other people's query DSL)

  1. Query Syntax

The query syntax of ES mainly includes the following parts: query type, query condition, filter condition, and sort condition. Among them, the query type includes match, term, bool, etc., the query condition specifies the fields and keywords to be queried, the filter condition is used to filter the query results, and the sort condition is used to sort the query results.
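
To make these four parts concrete, here is a minimal sketch in Python that assembles a search request body as a plain dictionary and prints it as JSON. The helper build_search_body and the field names content, status.keyword, and publish_date are assumptions made for illustration; they do not describe any real index.

```python
import json


def build_search_body(keyword, status, sort_field="publish_date"):
    """Assemble a search body with a query, a filter, and a sort clause.

    All field names here are hypothetical placeholders.
    """
    return {
        "query": {
            "bool": {
                # Query condition: scored full-text match on the keyword.
                "must": [{"match": {"content": keyword}}],
                # Filter condition: exact match that does not affect scoring.
                "filter": [{"term": {"status.keyword": status}}],
            }
        },
        # Sort condition: largest sort_field value first.
        "sort": [{sort_field: {"order": "desc"}}],
    }


body = build_search_body("Elasticsearch", "published")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into any client that sends a search request, which keeps the sketch independent of a particular client library.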

1.1 match query

The match query is the most commonly used query in ES. It analyzes the query text and finds documents whose specified field contains the resulting terms. For example, to find documents that contain the keyword "Elasticsearch", you can use the following DSL statement:

{
  "query": {
    "match": {
      "content": "Elasticsearch"
    }
  }
}

Where "content" is the field name to be queried, and "Elasticsearch" is the keyword to be searched. If you want to search for keywords in multiple fields, you can use the following statement:

{
  "query": {
    "multi_match": {
      "query": "Elasticsearch",
      "fields": ["title", "content"]
    }
  }
}

1.2 term query

The term query is used to match the value of a specific field exactly. For example, to find documents with the "status" field value of "published", you can use the following DSL statement:

{
  "query": {
    "term": {
      "status.keyword": "published"
    }
  }
}

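Note that ".keyword" is needed only when the field is mapped as text with a keyword sub-field, which is what dynamic mapping creates for strings by default. If the field itself is mapped as keyword, query it by its plain name.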
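
The difference between an analyzed text field and its keyword sub-field can be sketched with a small simulation. This is not how ES is implemented: the analyze function below is only a rough stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and all function names are made up for this example.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def term_matches_text(field_value, term):
    """A term query on a text field compares the raw term with the indexed tokens."""
    return term in analyze(field_value)


def term_matches_keyword(field_value, term):
    """A term query on a keyword field compares the raw term with the whole value."""
    return term == field_value


stored = "Pending Review"
print(term_matches_text(stored, "Pending Review"))     # False
print(term_matches_keyword(stored, "Pending Review"))  # True
print(term_matches_text(stored, "pending"))            # True
```

The term itself is never analyzed, so an exact-value lookup only behaves as expected against the unanalyzed keyword value.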

1.3 bool query

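The bool query combines multiple query conditions through four clause types: must, filter, should, and must_not. Documents have to match every must and filter clause (filter does not contribute to the score), should clauses are optional by default when a must or filter clause is present, and must_not excludes matching documents. For example, to find documents where the "title" field contains the keyword "Elasticsearch" and the "status" field value is "published", you can use the following DSL statement: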

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" } },
        { "term": { "status.keyword": "published" } }
      ]
    }
  }
}

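Note that must and should can be used together. When a bool query in query context contains a must or filter clause, the should clauses become optional and only raise the score of the documents that match them; set minimum_should_match if some of them must match. :)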
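
A bool query may carry several clause types at once. The following Python sketch builds such a body; the helper build_bool_query, the "tutorial" keyword, and the category.keyword field are assumptions for illustration. With a must or filter clause present, minimum_should_match defaults to 0 in query context, so the sketch sets it to 1 only when the should clause is meant to be required.

```python
import json


def build_bool_query(require_should=False):
    """Combine must, filter, should, and must_not in one bool query."""
    bool_clause = {
        # Required and scored.
        "must": [{"match": {"title": "Elasticsearch"}}],
        # Required but not scored.
        "filter": [{"term": {"status.keyword": "published"}}],
        # Optional by default here; matching documents get a higher score.
        "should": [{"match": {"content": "tutorial"}}],
        # Documents matching this clause are excluded.
        "must_not": [{"term": {"category.keyword": "archived"}}],
    }
    if require_should:
        # Force at least one should clause to match.
        bool_clause["minimum_should_match"] = 1
    return {"query": {"bool": bool_clause}}


print(json.dumps(build_bool_query(), indent=2))
print(json.dumps(build_bool_query(require_should=True), indent=2))
```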

  2. Aggregation Query

Aggregation query is a very useful feature in ES, which can perform grouping, statistical, and sorting operations on query results. ES supports various aggregation methods, including terms, range, date_histogram, etc.

2.1 terms aggregation

The terms aggregation is used to group and count a field. For example, to group and count the "category" field, you can use the following DSL statement:

{
  "aggs": {
    "group_by_category": {
      "terms": {
        "field": "category.keyword"
      }
    }
  }
}

Where "group_by_category" is the name of the aggregation, and "category.keyword" is the field to be aggregated.
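
To show what the terms aggregation returns, here is a small in-memory simulation in Python. It is a sketch of the semantics only, not of how ES computes the result: buckets carry a key and a doc_count, are ordered by doc_count descending, and are limited to 10 by default. The sample documents and the function name terms_buckets are invented for this example.

```python
from collections import Counter


def terms_buckets(docs, field, size=10):
    """Group documents by a field value and count them, largest group first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


docs = [
    {"category": "books"},
    {"category": "music"},
    {"category": "books"},
    {"category": "games"},
    {"category": "books"},
    {"category": "music"},
]
buckets = terms_buckets(docs, "category")
for bucket in buckets:
    print(bucket)
```

In a real request, adding "size": 0 at the top level of the body returns only the aggregation results without the matching documents.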

2.2 range aggregation

The range aggregation is used to perform range statistics on a field. For example, to perform range statistics on the "price" field, you can use the following DSL statement:

{
  "aggs": {
    "price_range": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50 },
          { "from": 50, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}

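Where "price_range" is the name of the aggregation, "price" is the field to be aggregated, and "ranges" specifies the ranges to be counted. Each range includes its "from" value and excludes its "to" value, so a price of exactly 50 falls into the second bucket.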
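
The bucket boundaries are easy to get wrong, so here is a minimal Python simulation of the range aggregation semantics: "from" is inclusive and "to" is exclusive. The sample prices and the function name range_buckets are made up for illustration, and the bucket keys only imitate the style ES uses in its response.

```python
def range_buckets(values, ranges):
    """Count values per range, treating 'from' as inclusive and 'to' as exclusive."""
    buckets = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        count = sum(
            1
            for v in values
            if (lo is None or v >= lo) and (hi is None or v < hi)
        )
        # Open-ended sides are written as "*", as in an ES response.
        key = f"{'*' if lo is None else float(lo)}-{'*' if hi is None else float(hi)}"
        buckets.append({"key": key, "doc_count": count})
    return buckets


prices = [20, 50, 75, 100, 250]
ranges = [{"to": 50}, {"from": 50, "to": 100}, {"from": 100}]
for bucket in range_buckets(prices, ranges):
    print(bucket)
```

Here 50 is counted in the middle bucket and 100 in the last one, so the three buckets hold 1, 2, and 2 documents.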

  3. Advanced Techniques

In addition to basic query and aggregation operations, ES also supports some advanced techniques, including painless scripts and nested queries.

3.1 painless script

Painless is the scripting language built into ES. It has a Java-like syntax and a rich API, so it can handle more complex data processing and calculations. For example, to add a computed field "score" to the query results, you can use the following DSL statement:

{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "score": {
      "script": {
        "lang": "painless",
        "source": "_score * doc['price'].value"
      }
    }
  }
}

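Where "_score" represents the document score, and "doc['price'].value" reads the value of the "price" field. With match_all every document has a score of 1.0, so pair the script with a scoring query such as match if the product should be meaningful.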
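
Scripts are compiled and cached by their source text, so values that change between requests are better passed through "params" than written into the source. The Python sketch below builds such a body; the field name adjusted_price, the parameter factor, and the helper build_script_field_body are assumptions for illustration.

```python
import json


def build_script_field_body(factor):
    """Build a search body whose script field multiplies price by a parameter."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "adjusted_price": {
                "script": {
                    "lang": "painless",
                    # The source stays constant across requests, so the
                    # compiled script can be reused from the cache.
                    "source": "doc['price'].value * params.factor",
                    # Only the parameter value changes per request.
                    "params": {"factor": factor},
                }
            }
        },
    }


print(json.dumps(build_script_field_body(0.9), indent=2))
```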

3.2 nested query

The nested query is used to query sub-documents nested in documents. For example, in a blog system, each article may contain multiple comments, and each comment is a sub-document nested in the article. To query articles in which at least one comment contains "Elasticsearch", you can use the following DSL statement:

{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": {
          "comments.content": "Elasticsearch"
        }
      }
    }
  }
}

Where "comments" is the path of the nested field (it must be mapped with the nested type), and "comments.content" is the field to be queried in the sub-document.
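
A nested query returns the parent document as soon as any one of its sub-documents matches. The Python simulation below sketches that behaviour on in-memory data; it uses naive whitespace tokenization instead of a real analyzer, and the sample articles and the function name nested_match are invented for this example.

```python
def nested_match(articles, path, field, term):
    """Return ids of articles where at least one nested child contains the term."""
    term = term.lower()
    return [
        article["id"]
        for article in articles
        if any(
            term in child.get(field, "").lower().split()
            for child in article.get(path, [])
        )
    ]


articles = [
    {"id": 1, "comments": [{"content": "Elasticsearch is fast"}, {"content": "Nice post"}]},
    {"id": 2, "comments": [{"content": "Thanks for sharing"}]},
    {"id": 3, "comments": []},
]
print(nested_match(articles, "comments", "content", "Elasticsearch"))  # [1]
```

Article 1 is returned even though only one of its two comments mentions the term.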

  4. Notes

When using ES for CRUD operations, pay attention to the following points:

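  • When running exact-value queries on a text field that has a keyword sub-field, add ".keyword" after the field name; a field mapped directly as keyword needs no suffix.
  • When querying fields mapped as nested, use the nested query.
  • must and should can be combined, but should clauses become optional when a must or filter clause is present unless minimum_should_match is set.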
  • When using painless scripts, pay attention to security and performance issues.

In summary, when using ES for CRUD operations, you need to be proficient in the syntax and techniques of its DSL statements, and pay attention to avoiding possible pitfalls. I hope this article can be helpful to you!

  5. References
    The examples above were written against ES 6.8.17. I have also used ES 7 for searching vector type fields, and I will write a separate article about that in the future.