This article explains how to use the match_phrase query in Elasticsearch. The walkthrough below creates a small index, inspects how a text field is tokenized, and shows when a phrase query does and does not match.
Create the index:
PUT test_phrase
Set the index mapping:
PUT /test_phrase/_mapping/_doc
{
"properties": {
"name": {
"type":"text"
}
}
}
Result:
{
"mapping": {
"_doc": {
"properties": {
"name": {
"type": "text"
}
}
}
}
}
Insert a document:
PUT test_phrase/_doc/2
{
"name":"我爱北京天安门"
}
Query the data:
POST test_phrase/_search
{
"query": {"match_all": {}}
}
Result (the index also holds a document with _id 1, whose insert step is not shown here):
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_phrase",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "我爱北京天安门"
}
},
{
"_index" : "test_phrase",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "王乃康"
}
}
]
}
}
Inspect the tokens produced by the analyzer:
POST test_phrase/_analyze
{
"field": "name",
"text": "我爱北京天安门"
}
Result:
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "爱",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "北",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "京",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<IDEOGRAPHIC>",
"position" : 3
},
{
"token" : "天",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<IDEOGRAPHIC>",
"position" : 4
},
{
"token" : "安",
"start_offset" : 5,
"end_offset" : 6,
"type" : "<IDEOGRAPHIC>",
"position" : 5
},
{
"token" : "门",
"start_offset" : 6,
"end_offset" : 7,
"type" : "<IDEOGRAPHIC>",
"position" : 6
}
]
}
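The output above shows one token per character, each with a character offset range and a position. As a rough illustration only, the Python sketch below reproduces that shape for this example. The function name `analyze` is hypothetical, and splitting per character is an assumption that merely mimics what the response shows for this Chinese text; it is not the real Elasticsearch analyzer.

```python
def analyze(text):
    """Return one token per character with offsets and position.

    Assumption: every character becomes its own token, as in the
    _analyze response above. Real analyzers behave differently for
    other scripts and configurations.
    """
    return [
        {
            "token": ch,
            "start_offset": i,
            "end_offset": i + 1,
            "position": i,
        }
        for i, ch in enumerate(text)
    ]


if __name__ == "__main__":
    for token in analyze("我爱北京天安门"):
        print(token)
```

The `position` values are what a phrase query relies on: it compares the positions of the query terms with the positions stored for the indexed terms.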
Phrase query for "我":
POST test_phrase/_search
{
"query": {
"match_phrase": {
"name": {
"query": "我"
}
}
}
}
Result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "test_phrase",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.2876821,
"_source" : {
"name" : "我爱北京天安门"
}
}
]
}
}
Analysis:
POST test_phrase/_analyze
{
"field": "name",
"text": "我"
}
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
}
]
}
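The query text "我" is analyzed into a single token at position 0. The indexed tokens of the document "我爱北京天安门" also contain "我" (at position 0), so the phrase query is satisfied and the document is returned. With a single term there are no relative positions to check; the term only has to be present.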
Phrase query for "我爱":
POST test_phrase/_search
{
"query": {
"match_phrase": {
"name": {
"query": "我爱"
}
}
}
}
Result:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.5753642,
"hits" : [
{
"_index" : "test_phrase",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.5753642,
"_source" : {
"name" : "我爱北京天安门"
}
}
]
}
}
Analysis:
POST test_phrase/_analyze
{
"field": "name",
"text": "我爱"
}
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "爱",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
}
]
}
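The query text "我爱" is analyzed into "我" at position 0 and "爱" at position 1. The indexed tokens contain both terms, and "我" (0) and "爱" (1) appear there in the same order at adjacent positions. This meets the phrase requirement, so the document is returned.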
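The two checks described above (every query term is present, and the terms sit at consecutive positions in the same order) can be sketched in Python. This is a toy model under stated assumptions: one token per character, as in the analyzer output, and no slop. The helpers `term_positions` and `phrase_match` are hypothetical names for illustration, not part of any Elasticsearch API.

```python
def term_positions(text):
    """Map each single-character term to the list of positions where it occurs."""
    index = {}
    for pos, ch in enumerate(text):
        index.setdefault(ch, []).append(pos)
    return index


def phrase_match(doc, query):
    """Return True if the query terms occur in doc at consecutive positions, in order."""
    index = term_positions(doc)
    terms = list(query)
    if not terms or any(term not in index for term in terms):
        return False  # empty query, or at least one term is missing from the document
    # Try every occurrence of the first term as the start of the phrase.
    return any(
        all(start + offset in index[term] for offset, term in enumerate(terms))
        for start in index[terms[0]]
    )


if __name__ == "__main__":
    doc = "我爱北京天安门"
    print(phrase_match(doc, "我"))    # True
    print(phrase_match(doc, "我爱"))  # True
    print(phrase_match(doc, "爱我"))  # False: both terms exist, but in the wrong order
```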
Phrase query for "我北":
POST test_phrase/_search
{
"query": {
"match_phrase": {
"name": {
"query": "我北"
}
}
}
}
Result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
Analysis:
POST test_phrase/_analyze
{
"field": "name",
"text": "我北"
}
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "北",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
}
]
}
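In the query, "我" is at position 0 and "北" is at position 1. In the index, "我" is at position 0 but "北" is at position 2. Both query terms exist among the indexed terms, but their relative positions do not match: they are adjacent in the query and two positions apart in the document. The search therefore returns no hits.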
Fix: add "slop": 1, which lets a term be one position away from where the query expects it:
POST test_phrase/_search
{
"query": {
"match_phrase": {
"name": {
"query": "我北",
"slop": 1
}
}
}
}
Result:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.37229446,
"hits" : [
{
"_index" : "test_phrase",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.37229446,
"_source" : {
"name" : "我爱北京天安门"
}
}
]
}
}
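To see why a slop of 1 is enough here, the sketch below estimates the smallest slop a query would need against a document. It is a simplified model, loosely based on how sloppy phrase distance is commonly described for Lucene: subtract each term's offset within the query from its position in the document, then take the spread (max minus min) of those shifted values. The assumptions are one token per character and no special handling of repeated terms; `min_slop` and `sloppy_phrase_match` are hypothetical helper names.

```python
from itertools import product


def term_positions(text):
    """Map each single-character term to the list of positions where it occurs."""
    index = {}
    for pos, ch in enumerate(text):
        index.setdefault(ch, []).append(pos)
    return index


def min_slop(doc, query):
    """Smallest slop at which query matches doc in this simplified model.

    Returns None when the query is empty or a query term is missing from doc.
    """
    index = term_positions(doc)
    terms = list(query)
    if not terms or any(term not in index for term in terms):
        return None
    best = None
    # Try every combination of one document position per query term.
    for combo in product(*(index[term] for term in terms)):
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        spread = max(shifted) - min(shifted)
        if best is None or spread < best:
            best = spread
    return best


def sloppy_phrase_match(doc, query, slop=0):
    """Return True if query matches doc within the given slop."""
    needed = min_slop(doc, query)
    return needed is not None and needed <= slop


if __name__ == "__main__":
    doc = "我爱北京天安门"
    print(min_slop(doc, "我北"))                     # 1
    print(sloppy_phrase_match(doc, "我北", slop=0))  # False
    print(sloppy_phrase_match(doc, "我北", slop=1))  # True
```

In this model "我" sits at 0 - 0 = 0 and "北" at 2 - 1 = 1, so the spread is 1, which agrees with the query above matching once "slop": 1 is set.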
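We can use a simple match query as a must clause. This query decides which documents are included in the result set, and the minimum_should_match parameter trims the long tail. We can then add other, more specific queries as should clauses. Every should clause that matches increases the relevance of the matching documents.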
GET /my_index/my_type/_search
{
"query": {
"bool": {
"must": {
"match": { # the must clause includes or excludes documents from the result set
"title": {
"query": "quick brown fox",
"minimum_should_match": "30%"
}
}
},
"should": {
"match_phrase": { # the should clause raises the relevance score of documents that match it
"title": {
"query": "quick brown fox",
"slop": 50
}
}
}
}
}
}
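When the same text feeds both clauses, it can be convenient to build the request body in code. The following is a minimal sketch that only assembles the JSON shown above as a Python dictionary. The function name `build_phrase_boost_query` and its parameter names are hypothetical; sending the body to a cluster is left out because it depends on the client and version in use.

```python
import json


def build_phrase_boost_query(field, text, min_match="30%", slop=50):
    """Build a bool query: match decides inclusion, match_phrase only boosts the score."""
    return {
        "query": {
            "bool": {
                # must: a document has to satisfy this clause to be returned
                "must": {
                    "match": {
                        field: {
                            "query": text,
                            "minimum_should_match": min_match,
                        }
                    }
                },
                # should: optional here, but a match adds to the relevance score
                "should": {
                    "match_phrase": {
                        field: {
                            "query": text,
                            "slop": slop,
                        }
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = build_phrase_boost_query("title", "quick brown fox")
    print(json.dumps(body, indent=2))
```

Keeping the phrase query in the should clause means documents whose terms are far apart or out of order are still returned by the must clause; they simply score lower than documents where the terms appear close together.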
Original article: https://my.oschina.net/u/3727895/blog/4492313