Contents
Two ways to search
Query DSL
match_all
match
match_phrase
multi_match
bool
filter
term & .keyword
aggregations
Two ways to search
URL + query parameters
GET /bank/_search?q=*&sort=account_number:asc
URL + request body
GET /bank/_search
{"query": {"match_all": {}},"sort": [{"account_number": "asc"}]
}
hits: the search results section of the response
hits.hits: the array of matching documents
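For comparison, here is a minimal Python sketch of the two request shapes, using only the standard library. It only builds the path and body and sends nothing; the helper names `uri_search` and `body_search` are made up for illustration, while the `bank` index and sort field come from the examples above.

```python
import json
from urllib.parse import urlencode

def uri_search(index, q, sort):
    """URI search: every search option travels in the query string."""
    # safe="*:" keeps '*' and ':' readable instead of percent-encoding them.
    return f"/{index}/_search?" + urlencode({"q": q, "sort": sort}, safe="*:")

def body_search(index, query, sort):
    """Request-body search: the options travel as a JSON document."""
    body = {"query": query, "sort": sort}
    return f"/{index}/_search", json.dumps(body)

print(uri_search("bank", "*", "account_number:asc"))
path, body = body_search("bank", {"match_all": {}}, [{"account_number": "asc"}])
print(path, body)
```

The request-body form is the one the rest of these notes use, since nested queries are awkward to express in a query string.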
Query DSL
Query DSL (Domain Specific Language) is the JSON-based language Elasticsearch uses to build complex queries.
Basic structure
{
QUERY_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
When the query targets a specific field
{
QUERY_NAME:{
FIELD_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
}
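The two templates above can be sketched as one small Python helper that returns the corresponding dict. The name `leaf_query` is hypothetical and not part of any Elasticsearch client; it only mirrors the QUERY_NAME / FIELD_NAME / ARGUMENT nesting.

```python
def leaf_query(query_name, field=None, **arguments):
    """Build {QUERY_NAME: {ARGUMENT: VALUE}} or, when a field is given,
    {QUERY_NAME: {FIELD_NAME: {ARGUMENT: VALUE}}}."""
    if field is None:
        return {query_name: arguments}
    return {query_name: {field: arguments}}

# No field: match_all takes no arguments at all.
print(leaf_query("match_all"))
# With a field: the long form of a match query on "address".
print(leaf_query("match", "address", query="mill lane"))
```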
match_all
GET /bank/_search
{"query": {"match_all": {}},"sort": [{"account_number": {"order": "asc"}}],"from": 0,"size": 5,"_source": ["account_number","balance"]
}
match
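Targets a single field. Full-text search: the query text is analyzed (tokenized) and the resulting terms are looked up in the inverted index. By default a document matches if it contains any one of the terms.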
GET /bank/_search
{"query": {"match": {"address": "mill lane"}},"_source": ["account_number","address"]
}
match_phrase
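Phrase matching. The query text is still analyzed, but the resulting terms must all appear in the field, adjacent and in the same order, so the query behaves like one complete unit rather than a set of independent words.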
GET /bank/_search
{"query": {"match_phrase": {"address": "mill lane"}},"_source": ["account_number","address"]
}
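A toy Python model of the difference between match and match_phrase. The `analyze` function is a rough stand-in for the standard analyzer (lowercase, split on non-word characters), and the real engine also scores and ranks results, which this sketch ignores. The sample addresses are illustrative only.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase and split into terms."""
    return re.findall(r"\w+", text.lower())

def match(field_text, query_text):
    """match: true if ANY query term occurs in the field (default OR operator)."""
    field_terms = set(analyze(field_text))
    return any(term in field_terms for term in analyze(query_text))

def match_phrase(field_text, query_text):
    """match_phrase: all query terms must occur adjacent and in the same order."""
    field_terms = analyze(field_text)
    query_terms = analyze(query_text)
    n = len(query_terms)
    if n == 0:
        return False
    # Slide a window of n terms over the field and compare it with the query.
    return any(field_terms[i:i + n] == query_terms
               for i in range(len(field_terms) - n + 1))

for address in ["198 Mill Lane", "990 Mill Road", "789 Madison Street"]:
    print(address, match(address, "mill lane"), match_phrase(address, "mill lane"))
```

In this model "990 Mill Road" satisfies match (it contains "mill") but not match_phrase, which mirrors why the match example returns more hits than the match_phrase one.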
multi_match
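Matches against multiple fields. The query text is analyzed, and a document matches if any term matches in any of the listed fields.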
GET /bank/_search
{"query": {"multi_match": {"query": "mill Movico", "fields": ["address","city"]}},"_source": ["account_number","address","city"]
}
bool
Compound query that combines other query clauses.
GET /bank/_search
{"query": {"bool": {"must": [{"match": {"gender": "M"}},{"match": {"address": "mill"}}],"must_not": [{"match": {"age": "28"}}],"should": [{"match": {"firstname": "winnie"}}]}}
}
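must: every listed condition has to match
must_not: none of the listed conditions may match
should: the listed conditions may or may not match; the ones that do match raise the relevance score. (If the bool query has no must or filter clause, at least one should clause has to match.)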
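The clause semantics can be sketched with a toy evaluator in Python. Everything here is an illustrative assumption: clauses are plain predicates, the score is just a count of matching must and should clauses (Elasticsearch actually uses BM25), and the sample documents are made up.

```python
def bool_score(doc, must=(), must_not=(), should=()):
    """Toy bool query: None means the document is excluded,
    otherwise return a score counting matched must and should clauses."""
    if not all(clause(doc) for clause in must):
        return None
    if any(clause(doc) for clause in must_not):
        return None
    matched_should = sum(1 for clause in should if clause(doc))
    # Without any must clause, at least one should clause has to match.
    if not must and should and matched_should == 0:
        return None
    return len(must) + matched_should

is_male = lambda d: d["gender"] == "M"
on_mill = lambda d: "mill" in d["address"].lower().split()
is_28 = lambda d: d["age"] == 28
is_winnie = lambda d: d["firstname"].lower() == "winnie"

docs = [  # made-up documents for illustration
    {"firstname": "Winnie", "gender": "M", "age": 38, "address": "198 Mill Lane"},
    {"firstname": "Forbes", "gender": "M", "age": 28, "address": "990 Mill Road"},
    {"firstname": "Parker", "gender": "M", "age": 36, "address": "715 Mill Avenue"},
    {"firstname": "Amber", "gender": "F", "age": 32, "address": "880 Holmes Lane"},
]
scores = [bool_score(d, must=[is_male, on_mill], must_not=[is_28], should=[is_winnie])
          for d in docs]
print(scores)
```

The should clause does not exclude anyone here; it only lifts the document whose firstname matches above the other hit.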
filter
A query condition that does not contribute to the score; it behaves like a must clause that adds no points.
GET /bank/_search
{"query": {"bool": {"must": [{"match": {"gender": "M"}},{"match": {"address": "mill"}}],"must_not": [{"match": {"age": "28"}}],"should": [{"match": {"firstname": "winnie"}}],"filter": {"range": {"balance": {"gte": 40000,"lte": 50000}}}}}
}
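Compared with the query before the filter was added, fewer documents are hit, but the relevance scores of the remaining hits are unchanged.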
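A small Python sketch of adding a range filter to an existing bool search body without touching its scoring clauses. The helper name `add_range_filter` is hypothetical; the field and bounds are the ones from the example above.

```python
import copy

def add_range_filter(body, field, gte=None, lte=None):
    """Return a copy of a bool search body with a range clause added under filter."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    result = copy.deepcopy(body)  # leave the caller's body untouched
    bool_clause = result["query"]["bool"]
    existing = bool_clause.get("filter", [])
    if isinstance(existing, dict):  # filter may be a single clause or a list
        existing = [existing]
    bool_clause["filter"] = existing + [{"range": {field: bounds}}]
    return result

scored = {"query": {"bool": {"must": [{"match": {"address": "mill"}}]}}}
filtered = add_range_filter(scored, "balance", gte=40000, lte=50000)
print(filtered)
```

Because the must list is copied unchanged, the scoring part of the query stays the same and only the set of candidate documents shrinks.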
term & .keyword
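Exact match: the search value is compared directly with the indexed terms, without any tokenizing or analysis.
Suited to non-text fields; slightly faster than match.
"Because ES analyzes text fields when it stores them, using term to exactly match a complete text value is very difficult."
Use term for non-text fields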
GET /bank/_search
{"query": {"term": {"account_number": {"value": "136"}}}
}
For an exact match on a text field, query its .keyword sub-field
GET /bank/_search
{"query": {"match": {"address.keyword": "198 Mill Lane"}}
}
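Why term struggles on a text field can be sketched with a toy index in Python. This assumes the default dynamic mapping, where a string is stored both as an analyzed text field and as an unanalyzed .keyword sub-field; `analyze` is a rough stand-in for the standard analyzer and the function names are made up.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase and split into terms."""
    return re.findall(r"\w+", text.lower())

def index_string(value):
    """Store the analyzed terms (text field) and the raw value (.keyword)."""
    return {"text_terms": set(analyze(value)), "keyword": value}

def term_on_text(indexed, value):
    """term against the text field: the value must equal one indexed term."""
    return value in indexed["text_terms"]

def term_on_keyword(indexed, value):
    """Exact comparison against the .keyword sub-field."""
    return value == indexed["keyword"]

idx = index_string("198 Mill Lane")
print(term_on_text(idx, "198 Mill Lane"))     # whole string is not a stored term
print(term_on_text(idx, "mill"))              # a single lowercase term is
print(term_on_keyword(idx, "198 Mill Lane"))  # raw value matches exactly
```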
aggregations
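Runs aggregations, which compute statistics over the data and group it. They are similar to GROUP BY and the aggregate functions (SUM, AVG, COUNT and so on) in SQL.
- Bucket Aggregations: assign documents to buckets, where each bucket represents one group.
- Metric Aggregations: compute statistics such as sum, average, maximum and minimum.
- Pipeline Aggregations: run a second round of computation over the results of other aggregations.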
GET /bank/_search
{"query": {"match_all": {}},"size": 0, "aggs": {"balanceAvg":{"avg": {"field": "balance"}},"ageAgg": {"terms": {"field": "age","size": 10},"aggs": {"balanceAvg":{"avg": {"field": "balance"}},"genderAgg": {"terms": {"field": "gender.keyword","size": 10}, "aggs": {"balanceAvg": {"avg": {"field": "balance"}}}}}}}
}
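terms: bucket aggregation (grouping)
avg: metric aggregation (average)
The request computes the average balance over all accounts in the index,
then groups by age and computes the average balance for each age,
then nests a gender grouping and computes the average balance for each age and gender combination.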
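The nesting can be sketched by computing the same three levels locally in Python. The four documents are made up for illustration, and the result layout only loosely imitates the shape of an aggregation response.

```python
from collections import defaultdict

docs = [  # made-up sample documents
    {"age": 30, "gender": "M", "balance": 1000},
    {"age": 30, "gender": "F", "balance": 3000},
    {"age": 30, "gender": "M", "balance": 2000},
    {"age": 40, "gender": "F", "balance": 5000},
]

def avg_balance(group):
    """Metric aggregation: average of the balance field."""
    return sum(d["balance"] for d in group) / len(group)

def bucket_by(group, field):
    """Bucket aggregation: one bucket per distinct value of the field."""
    buckets = defaultdict(list)
    for d in group:
        buckets[d[field]].append(d)
    return dict(buckets)

result = {"balanceAvg": avg_balance(docs), "ageAgg": {}}
for age, age_docs in bucket_by(docs, "age").items():
    result["ageAgg"][age] = {
        "doc_count": len(age_docs),
        "balanceAvg": avg_balance(age_docs),          # average within this age
        "genderAgg": {
            gender: {"doc_count": len(g_docs), "balanceAvg": avg_balance(g_docs)}
            for gender, g_docs in bucket_by(age_docs, "gender").items()
        },                                            # average per age and gender
    }
print(result)
```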