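Transcoded by SimpRead; original at blog.csdn.net
Preface
- Course video: the Java project "谷粒商城" (Guli Mall), an architect-level hands-on Java project pitched at the Alibaba P6-P7 level
- Study documents:
API documentation: 谷粒商城 (Guli Mall) API documentation
These are personal study notes only; if they infringe on anything, contact me and I will remove them.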
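I. ELASTICSEARCH
0. Introduction
Elasticsearch: the official distributed search and analytics engine | Elastic
Full-text search is one of the most common requirements, and the open-source Elasticsearch is currently the first choice among full-text search engines.
It can store, search, and analyze massive amounts of data quickly. Wikipedia, Stack Overflow, and GitHub all use it.
Elasticsearch is built on the open-source library Lucene. Lucene cannot be used directly, however: you have to write your own code to call its interfaces. Elasticsearch wraps Lucene and exposes a REST API, so it works out of the box.
REST API: cross-platform by nature.
Official documentation: Elasticsearch Guide [8.13] | Elastic
Official Chinese edition: Preface | Elasticsearch: The Definitive Guide | Elastic
Community Chinese translations:
https://es.xiaoleilu.com/index.html
Getting Started - elasticsearch Chinese documentation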
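1. Basic concepts
1.1 Index
As a verb, it corresponds to insert in MySQL;
as a noun, it corresponds to a Database in MySQL.
1.2 Type
One or more types can be defined inside an Index.
A type is similar to a Table in MySQL; data of the same type is stored together.
1.3 Document
A Document is one piece of data stored under a given Index and a given Type. Documents are in JSON format; a Document is like a row of content in a MySQL Table.
1.4 Inverted index mechanism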
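An inverted index maps each term to the documents that contain it, so a search looks up terms instead of scanning every document. The following is a minimal sketch in Python, assuming whitespace tokenization and a plain count of matching terms as the score. Elasticsearch's real analyzers and relevance scoring are more involved, and every function name here is illustrative only.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do much more."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Score each document by how many query terms it contains (OR semantics)."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for doc_id in index.get(term, []):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Sample addresses borrowed from the bank data set used later in these notes.
docs = {
    970: "990 Mill Road",
    136: "198 Mill Lane",
    431: "263 Aviation Road",
}
index = build_inverted_index(docs)
results = search(index, "mill road")
print(index["mill"])
print(results)
```

The document containing both terms ranks first, which mirrors how a multi-word match query orders its hits by relevance.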
2. Installing ES with Docker
Install Elasticsearch in Docker
2.1 Pull the images (elasticsearch and kibana)
docker pull elasticsearch:7.6.2 # stores and searches the data
docker pull kibana:7.6.2 # visualizes the data
2.2 Create the instances
2.2.1 ElasticSearch
1. Configuration
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
2. Start Elasticsearch
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.6.2
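Problem: 192.168.119.127:9200 is unreachable and the service keeps exiting.
Fix: make sure the mounted directories have the right permissions (the chmod -R 777 /mydata/elasticsearch/ step above).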
green open .kibana_task_manager_1 T87Lxcb5T22_HUNZ1Ak4QA 1 0 2 0 51.5kb 51.5kb
green open .apm-agent-configuration ps92glGfTkW6ID2ozvoofw 1 0 0 0 283b 283b
green open .kibana_1 CuxQb2nORlybswxvYfSQCA 1 0 5 0 22.6kb 22.6kb
3. Test
View the elasticsearch version information: http://192.168.119.127:9200/
Show the elasticsearch node information: http://192.168.119.127:9200/_cat/nodes
127.0.0.1 15 96 0 0.06 0.10 0.10 dilm * 5da55b2ace4f
Start elasticsearch automatically on boot
{
"name":"John Doe"
}
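From now on, plugins can be installed from outside the container (in the mounted plugins directory), followed by a restart;
Note in particular:
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \ In a test environment, set the initial and maximum heap size for ES; otherwise the default heap is too large and ES fails to start.
2.2.2 Kibana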
{
"_index": "customer",
"_type": "extrnal",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
Be sure to change http://192.168.119.127:9200 to the address of your own virtual machine.
Start kibana automatically on boot
{
"_index": "customer",
"_type": "extrnal",
"_id": "1",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"found": true,
"_source": {
"name": "John Doe"
}
}
Open Kibana: http://192.168.119.127:5601
3. Basic retrieval
3.1 _cat
1) GET /_cat/nodes: list all nodes
For example: http://192.168.119.127:9200/_cat/nodes
DELETE customer/external/1
DELETE customer
Note: * marks the master node of the cluster.
2) GET /_cat/health: check the health of ES
For example: http://192.168.119.127:9200/_cat/health
{action:{metadata}}\n
{request body }\n
{action:{metadata}}\n
{request body }\n
Note: green means the cluster is healthy.
3) GET /_cat/master: show the master node
For example: http://192.168.119.127:9200/_cat/master
POST /customer/extrnal/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"John Doe"}
4) GET /_cat/indices: list all indices, equivalent to show databases; in MySQL
For example: http://192.168.119.127:9200/_cat/indices
#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
"took" : 85,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "customer",
"_type" : "extrnal",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "customer",
"_type" : "extrnal",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
}
]
}
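3.2 Indexing a document
To save a piece of data, specify which index and which type it goes under, and which unique identifier to use.
PUT customer/external/1 saves document 1 under the external type of the customer index, with the body: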
POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"my first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"my second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"my updated blog post"}}
192.168.119.127:9200/customer/extrnal/1
#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
"took" : 120,
"errors" : false,
"items" : [
{
"delete" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 1,
"result" : "not_found",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 404
}
},
{
"create" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 2,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "website",
"_type" : "blog",
"_id" : "p4CoJ48BfGmvnTegFeja",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1,
"status" : 201
}
},
{
"update" : {
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 3,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1,
"status" : 200
}
}
]
}
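Both PUT and POST work.
POST creates a document. Without an id, one is generated automatically. With an id, POST creates the document if it does not exist yet, and otherwise modifies it and increments its version number;
PUT can create or modify. PUT must specify an id; because of that, PUT is generally used for modifications, and omitting the id results in an error.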
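The PUT/POST rules above can be captured in a small helper. This is a sketch only: `index_request` is a hypothetical function, not part of any Elasticsearch client, and it merely builds the method and path for the typed endpoints used in these notes.

```python
def index_request(index, doc_type, doc_id=None, method="POST"):
    """Return the (method, path) pair for saving one document."""
    if method not in ("PUT", "POST"):
        raise ValueError("documents are saved with PUT or POST")
    if doc_id is None:
        if method == "PUT":
            # PUT addresses a specific document, so it needs an id.
            raise ValueError("PUT requires an explicit id")
        # POST without an id lets Elasticsearch generate one.
        return method, f"/{index}/{doc_type}"
    return method, f"/{index}/{doc_type}/{doc_id}"


auto_id = index_request("customer", "external")
fixed_id = index_request("customer", "external", doc_id=1, method="PUT")
print(auto_id)
print(fixed_id)
```

Rejecting PUT without an id on the client side mirrors the error the server returns for the same request.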
Below is the test data in Postman:
After the data is created successfully, 201 Created indicates that the record was inserted.
{
"account_number": 1,
"balance": 39225,
"firstname": "Amber",
"lastname": "Duke",
"age": 32,
"gender": "M",
"address": "880 Holmes Lane",
"employer": "Pyrami",
"email": "amberduke@pyrami.com",
"city": "Brogan",
"state": "IL"
}
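Meaning of the returned JSON: fields that start with an underscore are metadata and describe basic information about the document.
"_index": "customer": which index (comparable to a database) the data lives in;
"_type": "external": which type the data lives under;
"_id": "1": the id of the saved document;
"_version": 1: the version of the saved document;
"result": "created": a document was created here. If the same document is PUT again, the result becomes updated and the version number changes as well.
Next, use POST.
When adding data without an id, an id is generated automatically and the result is created:
POSTing again still creates a new document:
When adding data with an id, that id is used and the result is created:
POSTing again with the same id gives the result updated.
3.3 Viewing a document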
GET /customer/external/1
http://192.168.119.127:9200/customer/external/1
GET /bank/_search
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" },
{"balance":"desc"}
]
}
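Result:
{
"_index": "customer", // which index
"_type": "external", // which type
"_id": "1", // document id
"_version": 2, // version number
"_seq_no": 1, // concurrency-control field, incremented on every update; used for optimistic locking
"_primary_term": 1, // same purpose; changes when the primary shard is reassigned, e.g. after a restart
"found": true,
"_source": { // the actual content
"name": "John Doe"
}
}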
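To update conditionally, append ?if_seq_no=0&if_primary_term=1
With "if_seq_no=1&if_primary_term=1", the modification is applied only when the sequence number and primary term match the current values; otherwise nothing is modified.
Example: update the document with id=1 to name=1, then to name=2, starting from _seq_no=0 and _primary_term=1.
(1) Update name to 1
192.168.119.127:9200/customer/external/1?if_seq_no=0&if_primary_term=1
(2) Update name to 2
192.168.119.127:9200/customer/external/1?if_seq_no=0&if_primary_term=1
The update fails, because _seq_no is no longer 0.
(3) Query the latest data
192.168.119.127:9200/customer/external/1
_seq_no has changed to 2.
(4) Update again with the current value; the update succeeds
192.168.119.127:9200/customer/external/1?if_seq_no=2&if_primary_term=1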
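The compare-then-write behaviour of if_seq_no and if_primary_term can be simulated in memory. The sketch below assumes a single document whose sequence number grows by one per write; in a real index the counter is shared by every document on the same shard, so it can jump by more than one between two reads. The `Doc` class and `VersionConflict` exception are hypothetical names for illustration.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected _seq_no/_primary_term is stale."""


class Doc:
    def __init__(self, source):
        self.source = source
        self.seq_no = 0
        self.primary_term = 1

    def update(self, source, if_seq_no, if_primary_term):
        # Apply the write only if the caller saw the latest version.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current is {self.seq_no}")
        self.source = source
        self.seq_no += 1  # every successful write advances the sequence number
        return self.seq_no


doc = Doc({"name": "John Doe"})
doc.update({"name": "1"}, if_seq_no=0, if_primary_term=1)  # accepted

conflict = False
try:
    # Same stale seq_no as before, so this write is rejected.
    doc.update({"name": "2"}, if_seq_no=0, if_primary_term=1)
except VersionConflict:
    conflict = True

# Re-read the current values and retry; this time the write is accepted.
doc.update({"name": "2"}, if_seq_no=doc.seq_no, if_primary_term=doc.primary_term)
print(conflict, doc.seq_no, doc.source)
```

Re-reading before retrying is the usual pattern for optimistic locking: the loser of a race fetches the new state and decides whether its change still applies.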
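3.4 Updating a document
POST customer/external/1/_update
{ "doc": { "name": "John Doew" } }
or
POST customer/external/1
{ "name": "John Doe2" }
or
PUT customer/external/1
{ "name": "John Doe" }
- Differences: POST with _update compares the request against the existing document; if nothing changed, it does nothing and the document version does not increase. PUT always saves the data again and increments the version.
With _update, the original data is compared, and identical data results in no operation at all.
Which one to use depends on the scenario:
for highly concurrent updates, do not use _update;
for highly concurrent reads with occasional updates, use _update: it compares before updating and recomputes the allocation rules.
- Adding attributes while updating
POST customer/external/1/_update
{ "doc": {"name": "Jane Doe", "age": 20} }
PUT and POST without _update work as well.
1) POST update with _update
192.168.119.127:9200/customer/extrnal/1/_update
Running the same update again does nothing, and the sequence number does not change.
This form of POST update compares against the existing data; when it is identical, nothing is executed and neither _version nor _seq_no changes.
2) POST update without _update
Repeating the same update still succeeds every time; the request is not compared against the existing data.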
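The difference between the two update styles can be sketched with two small functions operating on an in-memory document. This is a simplified model under stated assumptions: the merge is shallow, the counters grow by exactly one per write, and both function names are made up for illustration.

```python
def partial_update(doc, partial):
    """POST .../_update: merge `partial` into _source, skipping no-op writes."""
    merged = {**doc["_source"], **partial}
    if merged == doc["_source"]:
        return "noop"  # nothing changed: _version and _seq_no stay as they are
    doc["_source"] = merged
    doc["_version"] += 1
    doc["_seq_no"] += 1
    return "updated"


def reindex(doc, source):
    """PUT/POST without _update: replace _source and always bump the counters."""
    doc["_source"] = dict(source)
    doc["_version"] += 1
    doc["_seq_no"] += 1
    return "updated"


doc = {"_source": {"name": "John Doe"}, "_version": 1, "_seq_no": 0}
first = partial_update(doc, {"name": "Jane Doe", "age": 20})
second = partial_update(doc, {"name": "Jane Doe", "age": 20})
third = reindex(doc, {"name": "Jane Doe", "age": 20})
print(first, second, third, doc["_version"], doc["_seq_no"])
```

The second call changes nothing because the merged document equals the stored one, while the full replacement in the third call bumps the counters even though the content is identical.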
3.5 Deleting a document or an index
Delete document "1" from the index
{
"took" : 17,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "0",
"_score" : null,
"_source" : {
"account_number" : 0,
"balance" : 16623,
"firstname" : "Bradshaw",
"lastname" : "Mckenzie",
"age" : 29,
"gender" : "F",
"address" : "244 Columbus Place",
"employer" : "Euron",
"email" : "bradshawmckenzie@euron.com",
"city" : "Hobucken",
"state" : "CO"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "1",
"_score" : null,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "amberduke@pyrami.com",
"city" : "Brogan",
"state" : "IL"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "2",
"_score" : null,
"_source" : {
"account_number" : 2,
"balance" : 28838,
"firstname" : "Roberta",
"lastname" : "Bender",
"age" : 22,
"gender" : "F",
"address" : "560 Kingsway Place",
"employer" : "Chillium",
"email" : "robertabender@chillium.com",
"city" : "Bennett",
"state" : "LA"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "3",
"_score" : null,
"_source" : {
"account_number" : 3,
"balance" : 44947,
"firstname" : "Levine",
"lastname" : "Burks",
"age" : 26,
"gender" : "F",
"address" : "328 Wilson Avenue",
"employer" : "Amtap",
"email" : "levineburks@amtap.com",
"city" : "Cochranville",
"state" : "HI"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "4",
"_score" : null,
"_source" : {
"account_number" : 4,
"balance" : 27658,
"firstname" : "Rodriquez",
"lastname" : "Flores",
"age" : 31,
"gender" : "F",
"address" : "986 Wyckoff Avenue",
"employer" : "Tourmania",
"email" : "rodriquezflores@tourmania.com",
"city" : "Eastvale",
"state" : "HI"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "5",
"_score" : null,
"_source" : {
"account_number" : 5,
"balance" : 29342,
"firstname" : "Leola",
"lastname" : "Stewart",
"age" : 30,
"gender" : "F",
"address" : "311 Elm Place",
"employer" : "Diginetic",
"email" : "leolastewart@diginetic.com",
"city" : "Fairview",
"state" : "NJ"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : null,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "7",
"_score" : null,
"_source" : {
"account_number" : 7,
"balance" : 39121,
"firstname" : "Levy",
"lastname" : "Richard",
"age" : 22,
"gender" : "M",
"address" : "820 Logan Street",
"employer" : "Teraprene",
"email" : "levyrichard@teraprene.com",
"city" : "Shrewsbury",
"state" : "MO"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "8",
"_score" : null,
"_source" : {
"account_number" : 8,
"balance" : 48868,
"firstname" : "Jan",
"lastname" : "Burns",
"age" : 35,
"gender" : "M",
"address" : "699 Visitation Place",
"employer" : "Glasstep",
"email" : "janburns@glasstep.com",
"city" : "Wakulla",
"state" : "AZ"
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "9",
"_score" : null,
"_source" : {
"account_number" : 9,
"balance" : 24776,
"firstname" : "Opal",
"lastname" : "Meadows",
"age" : 39,
"gender" : "M",
"address" : "963 Neptune Avenue",
"employer" : "Cedward",
"email" : "opalmeadows@cedward.com",
"city" : "Olney",
"state" : "OH"
},
"sort" : [
]
}
]
}
}
Example: delete all data of the customer index
Delete the "customer" index
3.6 Elasticsearch batch operations: bulk
Syntax:
QUERY_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
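The actions in a bulk request are independent of each other: when one of them fails, the others still run.
The bulk API executes all actions in order. If a single action fails for any reason, processing continues with the remaining actions. When the bulk API returns, it provides a status for each action (in the same order as they were sent), so you can check whether a specific action failed.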
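Because each action carries its own status, a client usually builds the newline-delimited body programmatically and then scans the response item by item. The following sketch assumes the action/metadata/source layout described above; `build_bulk_body` and `items_with_error_status` are hypothetical helper names, and the sample response is abbreviated from the kind of output shown in these notes.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, source) triples into NDJSON for _bulk.

    Each action is one metadata line, optionally followed by one source line.
    A delete action takes no source line. The body must end with a newline.
    """
    lines = []
    for action, metadata, source in actions:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def items_with_error_status(bulk_response):
    """Return (action, id, status) for every item whose status is 400 or above."""
    found = []
    for item in bulk_response["items"]:
        for action, result in item.items():
            if result["status"] >= 400:
                found.append((action, result["_id"], result["status"]))
    return found


body = build_bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("delete", {"_id": "123"}, None),
])

# Abbreviated response: the index succeeded, the delete found nothing.
sample_response = {
    "errors": False,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"delete": {"_id": "123", "status": 404}},
    ],
}
print(body, end="")
print(items_with_error_status(sample_response))
```

The response keeps the order of the request, so the position of an item is enough to match it back to the action that produced it.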
Example 1: run several operations at once
Postman cannot execute this request, so use Kibana instead.
{
QUERY_NAME:{
FIELD_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
}
Result:
GET bank/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 5,
"sort": [
{
"account_number": {
"order": "desc"
}
}
]
}
Example 2: bulk operations against the whole index
GET bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"balance": {
"order": "desc"
}
}
],
"from": 0,
"size": 5,
"_source": ["balance","firstname"]
}
Result:
{
"took" : 26,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "248",
"_score" : null,
"_source" : {
"firstname" : "West",
"balance" : 49989
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "854",
"_score" : null,
"_source" : {
"firstname" : "Jimenez",
"balance" : 49795
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "240",
"_score" : null,
"_source" : {
"firstname" : "Oconnor",
"balance" : 49741
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "97",
"_score" : null,
"_source" : {
"firstname" : "Karen",
"balance" : 49671
},
"sort" : [
]
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "842",
"_score" : null,
"_source" : {
"firstname" : "Meagan",
"balance" : 49587
},
"sort" : [
]
}
]
}
}
3.7 Sample test data
A fictitious JSON sample of customer bank account information is provided. Every document has the following schema.
GET bank/_search
{
"query": {
"match": {
"balance": 16418
}
}
}
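Import the test data from https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json with POST bank/account/_bulk.
4. Advanced search
4.1 Search API
ES supports two basic ways to search:
- sending the search parameters through the REST request URI (URI + search parameters)
- sending them through the REST request body (URI + request body)
Retrieving information
- Every search starts from _search
GET bank/_search retrieves everything under bank, including type and docs
GET bank/_search?q=*&sort=account_number:asc searches using request parameters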
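The two styles differ only in where the search parameters travel. The sketch below builds both requests without sending them; the base URL is the VM address used throughout these notes, and `uri_search` and `body_search` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://192.168.119.127:9200"  # the VM address used in these notes


def uri_search(index, q, sort):
    """Style 1: everything goes into the query string (percent-encoded)."""
    return f"{BASE_URL}/{index}/_search?{urlencode({'q': q, 'sort': sort})}"


def body_search(index, query, sort):
    """Style 2: the URL stays fixed and the query travels as a JSON body."""
    url = f"{BASE_URL}/{index}/_search"
    body = json.dumps({"query": query, "sort": sort})
    return url, body


uri_style = uri_search("bank", "*", "account_number:asc")
url, body = body_search("bank", {"match_all": {}}, [{"account_number": "asc"}])
print(uri_style)
print(url)
print(body)
```

The request-body style scales better to nested queries, which is why the rest of these notes use it.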
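Explanation of the response:
- took: how long Elasticsearch took to execute the search (in milliseconds)
- timed_out: whether the search timed out
- _shards: how many shards were searched, with counts of successful and failed shards
- hits: the search results
- hits.total: the total number of matching documents
- hits.hits: the actual array of search results (the first 10 documents by default)
- sort: the sort key of each result (absent when sorting by score)
- _score and max_score: the relevance score and the highest score (used for full-text search)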
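Reading these fields out of a parsed response is plain dictionary access. The sketch below uses a response abbreviated from the sample output in these notes; `summarize` is a hypothetical helper and assumes the 7.x response shape, where hits.total is an object with a value field.

```python
def summarize(response):
    """Pull the commonly used fields out of a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_failed": response["_shards"]["failed"],
        "total": hits["total"]["value"],
        "max_score": hits["max_score"],
        "sources": [hit["_source"] for hit in hits["hits"]],
    }


# Abbreviated copy of a response from these notes (only two source fields kept).
response = {
    "took": 8,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {
                "_index": "bank",
                "_id": "20",
                "_score": 1.0,
                "_source": {"account_number": 20, "balance": 16418},
            }
        ],
    },
}
summary = summarize(response)
print(summary)
```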
Searching with URI + request body
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "20",
"_score" : 1.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "elinorratliff@scentric.com",
"city" : "Ribera",
"state" : "WA"
}
}
]
}
}
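With HTTP client tools such as Postman, a GET request may not be able to carry a request body; switching to POST works the same way. We POST a JSON-style query body to the _search API.
Note that once the search results are returned, Elasticsearch is done with the request and does not keep any server-side resources or a cursor over the results.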
GET bank/_search
{
"query": {
"match": {
"address": "Kings"
}
}
}
Response:
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 6.216692,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "20",
"_score" : 6.216692,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "elinorratliff@scentric.com",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "722",
"_score" : 6.216692,
"_source" : {
"account_number" : 722,
"balance" : 27256,
"firstname" : "Roberts",
"lastname" : "Beasley",
"age" : 34,
"gender" : "F",
"address" : "305 Kings Hwy",
"employer" : "Quintity",
"email" : "robertsbeasley@quintity.com",
"city" : "Hayden",
"state" : "PA"
}
}
]
}
}
(1) Only 10 hits are returned, because results are paginated;
(2) For details on the fields, see:
https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-search.html
The response also provides the following information about the search request:
- took: how long it took Elasticsearch to run the query, in milliseconds
- timed_out: whether or not the search request timed out
- _shards: how many shards were searched and a breakdown of how many shards succeeded, failed, or were skipped
- max_score: the score of the most relevant document found
- hits.total.value: how many matching documents were found
- hits.sort: the document's sort position (when not sorting by relevance score)
- hits._score: the document's relevance score (not applicable when using match_all)
4.2 Query DSL
4.2.1 Basic syntax
Elasticsearch provides a JSON-style DSL (domain-specific language) for executing queries, known as the Query DSL. The language is very comprehensive and can feel a bit complicated at first;
the best way to learn it is to start from some basic examples.
- Typical structure of a query clause
GET bank/_search
{
"query": {
"match": {
"address": "mill road"
}
}
}
- For a query on a specific field, the structure is:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 32,
"relation" : "eq"
},
"max_score" : 8.926605,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 8.926605,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 5.4032025,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "leelong@comverges.com",
"city" : "Movico",
"state" : "MT"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "431",
"_score" : 3.5234027,
"_source" : {
"account_number" : 431,
"balance" : 13136,
"firstname" : "Laurie",
"lastname" : "Shaw",
"age" : 26,
"gender" : "F",
"address" : "263 Aviation Road",
"employer" : "Zillanet",
"email" : "laurieshaw@zillanet.com",
"city" : "Harmon",
"state" : "WV"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "436",
"_score" : 3.5234027,
"_source" : {
"account_number" : 436,
"balance" : 27585,
"firstname" : "Alexander",
"lastname" : "Sargent",
"age" : 23,
"gender" : "M",
"address" : "363 Albemarle Road",
"employer" : "Fangold",
"email" : "alexandersargent@fangold.com",
"city" : "Calpine",
"state" : "OR"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "532",
"_score" : 3.5234027,
"_source" : {
"account_number" : 532,
"balance" : 17207,
"firstname" : "Hardin",
"lastname" : "Kirk",
"age" : 26,
"gender" : "M",
"address" : "268 Canarsie Road",
"employer" : "Exposa",
"email" : "hardinkirk@exposa.com",
"city" : "Stouchsburg",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "873",
"_score" : 3.5234027,
"_source" : {
"account_number" : 873,
"balance" : 43931,
"firstname" : "Tisha",
"lastname" : "Cotton",
"age" : 39,
"gender" : "F",
"address" : "432 Lincoln Road",
"employer" : "Buzzmaker",
"email" : "tishacotton@buzzmaker.com",
"city" : "Bluetown",
"state" : "GA"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "83",
"_score" : 3.5234027,
"_source" : {
"account_number" : 83,
"balance" : 35928,
"firstname" : "Mayo",
"lastname" : "Cleveland",
"age" : 28,
"gender" : "M",
"address" : "720 Brooklyn Road",
"employer" : "Indexia",
"email" : "mayocleveland@indexia.com",
"city" : "Roberts",
"state" : "ND"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "88",
"_score" : 3.5234027,
"_source" : {
"account_number" : 88,
"balance" : 26418,
"firstname" : "Adela",
"lastname" : "Tyler",
"age" : 21,
"gender" : "F",
"address" : "737 Clove Road",
"employer" : "Surelogic",
"email" : "adelatyler@surelogic.com",
"city" : "Boling",
"state" : "SD"
}
}
]
}
}
A concrete query example
GET bank/_search
{
"query": {
"match_phrase": {
"address": "mill road"
}
}
}
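query defines how to search;
- match_all is the query type that matches every document; ES lets you combine many query types inside query to build complex searches;
- besides query, other parameters such as sort and size can be passed to change the result;
- from + size together implement pagination;
- sort: when sorting by multiple fields, later fields only order documents whose earlier fields are equal; otherwise the earlier field decides;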
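Pagination with from + size comes down to one multiplication. The sketch below builds a request body for a 1-based page number; `page_query` is a hypothetical helper, and the field names come from the bank sample data used in these notes.

```python
def page_query(page, page_size, sort_fields):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        # Later sort fields only break ties among equal earlier fields.
        "sort": [{field: {"order": order}} for field, order in sort_fields],
    }


third_page = page_query(3, 5, [("balance", "desc"), ("account_number", "asc")])
print(third_page)
```

Page 3 with 5 hits per page skips the first 10 hits, so the body carries from=10 and size=5.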
4.2.2 Returning only some fields
{
"took" : 48,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 9.845243,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 9.845243,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
}
]
}
}
Query result:
GET bank/_search
{
"query": {
"match_phrase": {
"address": "990 Mill"
}
}
}
4.2.3 match query
- Primitive (non-string) types: exact match
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 10.806405,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 10.806405,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
match returns the documents with balance=16418.
Query result:
GET bank/_search
{
"query": {
"match": {
"address.keyword": "990 Mill"
}
}
}
- Strings: full-text search
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
In full-text search, the search condition is tokenized before matching, and the results are finally sorted by score.
Query result:
GET bank/_search
{
"query": {
"match": {
"address.keyword": "990 Mill Road"
}
}
}
- Strings with multiple words (tokenization + full-text search)
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.5032897,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.5032897,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Result
GET bank/_search
{
"query": {
"multi_match": {
"query": "mill",
"fields": ["address","city"]
}
}
}
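The query returns all records whose address contains mill, or road, or mill road, along with their relevance scores.
4.2.4 match_phrase [phrase matching]
The value to match is treated as one whole phrase (its terms are not matched independently) when searching.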
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 5.6291604,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.6291604,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 5.6291604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 5.6291604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.6291604,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "leelong@comverges.com",
"city" : "Movico",
"state" : "MT"
}
}
]
}
}
Finds all records whose address contains the phrase mill road, along with their relevance scores.
Result:
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
]
}
}
}
- Difference between match_phrase and match on the keyword sub-field
Consider the following examples:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 6.0824604,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.0824604,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 6.0824604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 6.0824604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
}
]
}
}
Query result:
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "38"
}
}
]
}
}
}
Using match on the keyword sub-field
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.0824604,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.0824604,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Query result: nothing matched
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "18"
}
}
],
"should": [
{
"match": {
"lastname": "Wallace"
}
}
]
}
}
}
Change the match condition to "990 Mill Road"
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 12.585751,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 12.585751,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "136",
"_score" : 6.0824604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "345",
"_score" : 6.0824604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
}
]
}
}
One document is returned
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"address": "mill"
}
},
{
"range": {
"balance": {
"gte": 10000,
"lte": 20000
}
}
}
]
}
}
}
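When a text field is matched through its keyword sub-field, the condition must equal the entire field value: it is an exact match.
match_phrase does phrase matching: a document matches as long as its text contains the phrase.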
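The three behaviours can be contrasted with a toy model. This is only a sketch under simplifying assumptions: whitespace tokenization with lowercasing stands in for the real analyzer, scoring is ignored, and the three function names are invented for illustration.

```python
def tokens(text):
    """Stand-in analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """match: true if ANY query token occurs in the field (OR semantics)."""
    field = set(tokens(field_value))
    return any(token in field for token in tokens(query))


def match_phrase(field_value, query):
    """match_phrase: the query tokens must appear adjacent and in order."""
    field, phrase = tokens(field_value), tokens(query)
    n = len(phrase)
    return any(field[i:i + n] == phrase for i in range(len(field) - n + 1))


def keyword_match(field_value, query):
    """keyword sub-field: the whole, unanalyzed value must equal the query."""
    return field_value == query


address = "990 Mill Road"
print(match("263 Aviation Road", "mill road"))
print(match_phrase(address, "990 Mill"))
print(keyword_match(address, "990 Mill"))
print(keyword_match(address, "990 Mill Road"))
```

This matches the results recorded in these notes: the phrase "990 Mill" finds the address, while the same string against address.keyword finds nothing until the full value "990 Mill Road" is supplied.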
4.2.5 multi_match [matching multiple fields]
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 6.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Matches documents whose city or address contains mill; the query string is tokenized during the search.
Query result:
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"address": "mill"
}
}
],
"filter": {
"range": {
"balance": {
"gte": "10000",
"lte": "20000"
}
}
}
}
}
}
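4.2.6 bool for compound queries
A compound clause can combine any other query clauses, including other compound clauses. This means compound clauses can be nested inside each other to express very complex logic.
- must: every condition listed under must has to be satisfied
Example: find documents with gender=m and address=mill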
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
Query result:
GET bank/_search
{
"query": {
"term": {
"address": "mill Road"
}
}
}
- must_not: none of the conditions listed under must_not may match
Example: find documents with gender=m and address=mill, but with age not equal to 38
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
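Query result:
"aggs":{
"aggs_name (the name of this aggregation, shown in the result set)":{
"AGG_TYPE (the aggregation type: avg, term, terms)":{}
}
},
- should: the conditions listed under should ought to be met; a document that meets them gets a higher relevance score, but the set of returned documents does not change. If the query contains only should with a single matching rule, the should condition becomes the default match condition and does change the result.
Example: lastName should equal Wallace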
GET bank/_search
{
"query": {
"match": {
"address": "Mill"
}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": {
"field": "age"
}
},
"balanceAvg": {
"avg": {
"field": "balance"
}
}
},
"size": 0
}
Query result:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 38,
"doc_count" : 2
},
{
"key" : 28,
"doc_count" : 1
},
{
"key" : 32,
"doc_count" : 1
}
]
},
"ageAvg" : {
"value" : 34.0
},
"balanceAvg" : {
"value" : 25208.0
}
}
}
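As the result shows, the higher the relevance, the higher the score.
4.2.7 Filter [result filtering]
Not every query needs to produce a score, especially queries used only for filtering documents. To avoid computing scores unnecessarily, Elasticsearch detects these situations automatically and optimizes query execution.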
Each must, should, and must_not element in a Boolean query is referred to as a query clause. How well a document meets the criteria in each must or should clause contributes to the document's relevance score. The higher the score, the better the document matches your search criteria. By default, Elasticsearch returns documents ranked by these relevance scores.
The criteria in a must_not clause is treated as a filter. It affects whether or not the document is included in the results, but does not contribute to how documents are scored. You can also explicitly specify arbitrary filters to include or exclude documents based on structured data.
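The difference between a scoring clause and a filter clause can be illustrated with a small Java sketch. It is not how Elasticsearch is implemented: the class FilterVsScoreSketch, the toy scoring rule (1.0 per occurrence of the word) and the helper names are all assumptions made for illustration. The sample data reuses the four "mill" accounts that appear in the query results of these notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterVsScoreSketch {
    /** Hypothetical toy document holding only the two fields used here. */
    static final class Doc {
        final String address;
        final int balance;
        Doc(String address, int balance) { this.address = address; this.balance = balance; }
    }

    /** A hit: the document plus the score collected from scoring clauses. */
    static final class Hit {
        final Doc doc;
        final double score;
        Hit(Doc doc, double score) { this.doc = doc; this.score = score; }
    }

    /** Toy "must" clause: adds 1.0 per occurrence of the word (not real BM25). */
    static double mustScore(Doc d, String word) {
        double score = 0;
        for (String token : d.address.toLowerCase().split("\\s+")) {
            if (token.equals(word)) score += 1.0;
        }
        return score;
    }

    /** Toy "filter" clause: a yes/no range check that never touches the score. */
    static boolean inRange(Doc d, int gte, int lte) {
        return d.balance >= gte && d.balance <= lte;
    }

    /** must decides the score; filter only decides whether the document is kept. */
    static List<Hit> search(List<Doc> docs, String word, int gte, int lte) {
        List<Hit> hits = new ArrayList<>();
        for (Doc d : docs) {
            double score = mustScore(d, word);
            if (score > 0 && inRange(d, gte, lte)) hits.add(new Hit(d, score));
        }
        return hits;
    }

    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("990 Mill Road", 19648),
            new Doc("198 Mill Lane", 45801),
            new Doc("715 Mill Avenue", 9812),
            new Doc("288 Mill Street", 25571));
    }

    public static void main(String[] args) {
        for (Hit h : search(sampleDocs(), "mill", 10000, 20000)) {
            System.out.println(h.doc.address + " score=" + h.score);
        }
    }
}
```

In this sketch the range check removes three of the four documents, while the score of the remaining one is exactly what the must clause alone would have produced.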
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"ageAvg": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
This first finds all documents matching address=mill, then filters the results by 10000<=balance<=20000.
Query result:
{
"took" : 49,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 31,
"doc_count" : 61,
"ageAvg" : {
"value" : 28312.918032786885
}
},
{
"key" : 39,
"doc_count" : 60,
"ageAvg" : {
"value" : 25269.583333333332
}
},
{
"key" : 26,
"doc_count" : 59,
"ageAvg" : {
"value" : 23194.813559322032
}
},
{
"key" : 32,
"doc_count" : 52,
"ageAvg" : {
"value" : 23951.346153846152
}
},
{
"key" : 35,
"doc_count" : 52,
"ageAvg" : {
"value" : 22136.69230769231
}
},
{
"key" : 36,
"doc_count" : 52,
"ageAvg" : {
"value" : 22174.71153846154
}
},
{
"key" : 22,
"doc_count" : 51,
"ageAvg" : {
"value" : 24731.07843137255
}
},
{
"key" : 28,
"doc_count" : 51,
"ageAvg" : {
"value" : 28273.882352941175
}
},
{
"key" : 33,
"doc_count" : 50,
"ageAvg" : {
"value" : 25093.94
}
},
{
"key" : 34,
"doc_count" : 49,
"ageAvg" : {
"value" : 26809.95918367347
}
},
{
"key" : 30,
"doc_count" : 47,
"ageAvg" : {
"value" : 22841.106382978724
}
},
{
"key" : 21,
"doc_count" : 46,
"ageAvg" : {
"value" : 26981.434782608696
}
},
{
"key" : 40,
"doc_count" : 45,
"ageAvg" : {
"value" : 27183.17777777778
}
},
{
"key" : 20,
"doc_count" : 44,
"ageAvg" : {
"value" : 27741.227272727272
}
},
{
"key" : 23,
"doc_count" : 42,
"ageAvg" : {
"value" : 27314.214285714286
}
},
{
"key" : 24,
"doc_count" : 42,
"ageAvg" : {
"value" : 28519.04761904762
}
},
{
"key" : 25,
"doc_count" : 42,
"ageAvg" : {
"value" : 27445.214285714286
}
},
{
"key" : 37,
"doc_count" : 42,
"ageAvg" : {
"value" : 27022.261904761905
}
},
{
"key" : 27,
"doc_count" : 39,
"ageAvg" : {
"value" : 21471.871794871793
}
},
{
"key" : 38,
"doc_count" : 39,
"ageAvg" : {
"value" : 26187.17948717949
}
},
{
"key" : 29,
"doc_count" : 35,
"ageAvg" : {
"value" : 29483.14285714286
}
}
]
}
}
}
A filter does not compute a relevance score when it is applied:
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"genderAgg": {
"terms": {
"field": "gender.keyword"
},
"aggs": {
"balanceAvg": {
"avg": {
"field": "balance"
}
}
}
},
"ageBalanceAvg": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
Query result:
{
"took" : 119,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 31,
"doc_count" : 61,
"genderAgg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "M",
"doc_count" : 35,
"balanceAvg" : {
"value" : 29565.628571428573
}
},
{
"key" : "F",
"doc_count" : 26,
"balanceAvg" : {
"value" : 26626.576923076922
}
}
]
},
"ageBalanceAvg" : {
"value" : 28312.918032786885
}
}
]
....... // remaining buckets omitted
}
}
}
Every document has "_score" : 5.4032025, which shows that the balance condition did not contribute to the relevance score.
4.2.8、term
Like match, term matches the value of a field. Use match for full-text fields and term for exact values in non-text fields.
Avoid using the term query for text fields.
By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult.
To search text field values, use the match query instead.
https://www.elastic.co/guide/en/elasticsearch/reference/7.6/query-dsl-term-query.html
Query using term:
{
"bank" : {
"mappings" : {
"properties" : {
"account_number" : {
"type" : "long"
},
"address" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"age" : {
"type" : "long"
},
"balance" : {
"type" : "long"
},
"city" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"email" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"employer" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"firstname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"gender" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"lastname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"state" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
Query result:
PUT /my_index
{
"mappings": {
"properties": {
"age": {
"type": "integer"
},
"email": {
"type": "keyword"
},
"name": {
"type": "text"
}
}
}
}
Not a single document matched.
After switching to match, 32 documents are matched.
In other words, use match for full-text fields and term for non-text fields.
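Why a term query on a text field can come back empty is easier to see in a small Java sketch. The analyze method below is only a simplified stand-in for the standard analyzer (lowercase, split on anything that is not a letter or digit); the class and method names are hypothetical, and real analysis handles more cases, such as keeping "dog's" as one token.

```java
import java.util.ArrayList;
import java.util.List;

class TermVsMatchSketch {
    /** Simplified stand-in for the standard analyzer: lowercase, split on non-alphanumerics. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** term: the query value is NOT analyzed and must equal one indexed token exactly. */
    static boolean termMatches(String fieldValue, String queryValue) {
        return analyze(fieldValue).contains(queryValue);
    }

    /** match: the query text is analyzed too; sharing any token is enough (default OR). */
    static boolean matchMatches(String fieldValue, String queryText) {
        List<String> indexed = analyze(fieldValue);
        for (String t : analyze(queryText)) {
            if (indexed.contains(t)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String address = "990 Mill Road";
        System.out.println(analyze(address));                    // indexed tokens
        System.out.println(termMatches(address, "mill Road"));   // exact-token lookup
        System.out.println(matchMatches(address, "mill Road"));  // analyzed lookup
    }
}
```

Under these assumptions the indexed tokens are 990, mill and road, so the unanalyzed value "mill Road" equals none of them, while the analyzed query shares the tokens mill and road.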
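4.2.9、Aggregation (running aggregations)
Aggregations provide the ability to group your data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return hits and, in the same response, aggregated results that are kept separate from the hits. This is powerful and efficient: you can run a query together with multiple aggregations and get the results of each back in one go, using a concise and simplified API and avoiding extra network round trips.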
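As a rough analogy only, a terms aggregation plus avg aggregations behave like a group-by with counts and averages. The Java sketch below reproduces that idea in memory with streams; the class AggregationSketch and its Account type are hypothetical, and the four sample accounts are the ones returned for address "mill" elsewhere in these notes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregationSketch {
    /** Hypothetical minimal account holding only the aggregated fields. */
    static final class Account {
        final int age;
        final long balance;
        Account(int age, long balance) { this.age = age; this.balance = balance; }
    }

    /** Rough analogue of a terms aggregation on age: value -> document count. */
    static Map<Integer, Long> ageBuckets(List<Account> hits) {
        return hits.stream()
                .collect(Collectors.groupingBy(a -> a.age, TreeMap::new, Collectors.counting()));
    }

    /** Rough analogue of an avg aggregation on age. */
    static double avgAge(List<Account> hits) {
        return hits.stream().mapToInt(a -> a.age).average().orElse(Double.NaN);
    }

    /** Rough analogue of an avg aggregation on balance. */
    static double avgBalance(List<Account> hits) {
        return hits.stream().mapToLong(a -> a.balance).average().orElse(Double.NaN);
    }

    static List<Account> sampleHits() {
        return Arrays.asList(
            new Account(28, 19648),
            new Account(38, 45801),
            new Account(38, 9812),
            new Account(32, 25571));
    }

    public static void main(String[] args) {
        List<Account> hits = sampleHits();
        System.out.println("ageAgg=" + ageBuckets(hits));
        System.out.println("ageAvg=" + avgAge(hits));
        System.out.println("balanceAvg=" + avgBalance(hits));
    }
}
```

With this sample data the sketch yields two accounts aged 38, an average age of 34.0 and an average balance of 25208.0, which agrees with the ageAgg, ageAvg and balanceAvg values shown in the Elasticsearch response in these notes.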
- size: 0 hides the search hits
- aggs: runs the aggregation. The aggregation syntax is as follows:
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "my_index"
}
(1) Find the age distribution and the average age of everyone whose address contains mill, without showing the details of these people
{
"my_index" : {
"aliases" : { },
"mappings" : {
"properties" : {
"age" : {
"type" : "integer"
},
"email" : {
"type" : "keyword"
},
"employee-id" : {
"type" : "keyword",
"index" : false
},
"name" : {
"type" : "text"
}
}
},
"settings" : {
"index" : {
"creation_date" : "1588410780774",
"number_of_shards" : "1",
"number_of_replicas" : "1",
"uuid" : "ua0lXhtkQCOmn7Kh3iUu0w",
"version" : {
"created" : "7060299"
},
"provided_name" : "my_index"
}
}
}
}
Query result:
PUT /my_index/_mapping
{
"properties": {
"employee-id": {
"type": "keyword",
"index": false
}
}
}
A more complex example:
(2) Aggregate by age and compute the average balance of the people in each age group
POST _reindex [fixed syntax]
{
"source":{
"index":"twitter"
},
"dest":{
"index":"new_twitters"
}
}
Output:
POST _reindex [fixed syntax]
{
"source":{
"index":"twitter",
"type":"twitter"
},
"dest":{
"index":"new_twitters"
}
}
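(3) Find the full age distribution and, within each age group, the average balance for M, the average balance for F, and the overall average balance of the group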
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "account", // the type is account
"_id" : "1",
"_score" : 1.0,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "amberduke@pyrami.com",
"city" : "Brogan",
"state" : "IL"
}
},
...
Output:
PUT /newbank
{
"mappings": {
"properties": {
"account_number": {
"type": "long"
},
"address": {
"type": "text"
},
"age": {
"type": "integer"
},
"balance": {
"type": "long"
},
"city": {
"type": "keyword"
},
"email": {
"type": "keyword"
},
"employer": {
"type": "keyword"
},
"firstname": {
"type": "text"
},
"gender": {
"type": "keyword"
},
"lastname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"state": {
"type": "keyword"
}
}
}
}
4.3、Mapping
4.3.1、Field types
(1) Core types
- String
text, keyword
- Numeric
long, integer, short, byte, double, float, half_float, scaled_float
- Date
date
- Boolean
boolean
- Binary
binary
(2) Complex types
- Array
Arrays do not need a dedicated type
- Object
object, for a single JSON object
- Nested
nested, for arrays of JSON objects
- Geo
Geo-point
geo_point, for latitude/longitude coordinates
Geo-shape
geo_shape, for complex shapes such as polygons
(3) Specialized types
- IP
ip, for IPv4 and IPv6 addresses
- Completion
completion, provides auto-complete suggestions
- Token count
token_count, counts the number of tokens in a string
- Attachment
See the mapper-attachments plugin, which supports indexing attachments such as Microsoft Office formats, Open Document formats, ePub, HTML and so on as the attachment data type.
- Percolator
Accepts queries written in the query DSL
(4) Multi-fields
It is often useful to index the same field in different ways for different purposes. For example, a string field can be mapped as a text field for full-text search and also as a keyword field for sorting and aggregations. Alternatively, you can index a text field with the standard analyzer, the english analyzer, and the french analyzer.
This is the purpose of multi-fields. Most data types support multi-fields through the fields parameter.
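4.3.2、Mapping
A mapping defines how a document and the fields it contains are stored and indexed. For example, a mapping is used to define:
which string fields should be treated as full-text fields;
which fields contain numbers, dates, or geolocations;
whether all fields in the document can be indexed (the _all setting);
the format of date values;
custom rules that control the mapping of dynamically added fields.
View the mapping information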
GET bank/_mapping
POST _reindex
{
"source": {
"index": "bank",
"type": "account"
},
"dest": {
"index": "newbank"
}
}
- Modify the mapping information
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
Automatically inferred mapping types
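The default inference rules can be approximated by the Java sketch below. It is a simplified model, not Elasticsearch code: the class DynamicMappingSketch and the returned labels are hypothetical, and date detection and numeric detection for strings are deliberately left out.

```java
import java.util.List;
import java.util.Map;

class DynamicMappingSketch {
    /**
     * Approximates the default dynamic mapping of a parsed JSON value.
     * Assumption: date detection and numeric detection on strings are ignored.
     */
    static String guessType(Object value) {
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer || value instanceof Long) return "long";
        if (value instanceof Float || value instanceof Double) return "float";
        if (value instanceof Map) return "object";
        if (value instanceof List) {
            // An array takes the type of its first non-null element.
            for (Object element : (List<?>) value) {
                if (element != null) return guessType(element);
            }
            return "no field added";
        }
        if (value instanceof String) return "text with keyword sub-field";
        return "no field added"; // null values do not create a field
    }

    public static void main(String[] args) {
        System.out.println(guessType(32));                 // age
        System.out.println(guessType("880 Holmes Lane"));  // address
        System.out.println(guessType(true));
    }
}
```

This is consistent with the bank mapping shown in these notes, where age and balance were mapped as long and every string field became text with a keyword sub-field.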
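4.3.3、Changes in newer versions
ElasticSearch 7: the type concept is removed
- In a relational database two tables are independent: columns with the same name in different tables do not affect each other. That is not the case in ES. Elasticsearch is a search engine built on Lucene, and fields with the same name under different types in ES are ultimately handled the same way in Lucene.
* Two user_name fields under two different types of the same ES index are actually treated as the same field, so you must define the same field mapping in both types. Otherwise, same-named fields in different types conflict during processing, which reduces Lucene's efficiency.
* Types were removed to make ES process data more efficiently.
- Elasticsearch 7.x
* The type parameter in the URL is optional. For example, indexing a document no longer requires a document type.
- Elasticsearch 8.x
* The type parameter in the URL is no longer supported.
- Solution:
* Migrate indices from multiple types to a single type, with a separate index for each type of document
* Migrate all the type data under an existing index to the designated location. See Data migration for details.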
Elasticsearch 7.x
- Specifying types in requests is deprecated. For instance, indexing a document no longer requires a document type. The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST {index}/_doc for auto-generated ids. Note that in 7.0, _doc is a permanent part of the path, and represents the endpoint name rather than the document type.
- The include_type_name parameter in the index creation, index template, and mapping APIs will default to false. Setting the parameter at all will result in a deprecation warning.
- The _default_ mapping type is removed.
Elasticsearch 8.x
- Specifying types in requests is no longer supported.
- The include_type_name parameter is removed.
1) Create a mapping
Create an index and specify its mapping
#! Deprecation: [types removal] Specifying types in reindex requests is deprecated.
{
"took" : 2954,
"timed_out" : false,
"total" : 1000,
"updated" : 0,
"created" : 1000,
"deleted" : 0,
"batches" : 1,
"version_conflicts" : 0,
"noops" : 0,
"retries" : {
"bulk" : 0,
"search" : 0
},
"throttled_millis" : 0,
"requests_per_second" : -1.0,
"throttled_until_millis" : 0,
"failures" : [ ]
}
Output:
POST _analyze
{
"analyzer": "standard",
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
View the mapping
{
"tokens" : [
{
"token" : "the",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "2",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<NUM>",
"position" : 1
},
{
"token" : "quick",
"start_offset" : 6,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 2
},
{
"token" : "brown",
"start_offset" : 12,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 3
},
{
"token" : "foxes",
"start_offset" : 18,
"end_offset" : 23,
"type" : "<ALPHANUM>",
"position" : 4
},
{
"token" : "jumped",
"start_offset" : 24,
"end_offset" : 30,
"type" : "<ALPHANUM>",
"position" : 5
},
{
"token" : "over",
"start_offset" : 31,
"end_offset" : 35,
"type" : "<ALPHANUM>",
"position" : 6
},
{
"token" : "the",
"start_offset" : 36,
"end_offset" : 39,
"type" : "<ALPHANUM>",
"position" : 7
},
{
"token" : "lazy",
"start_offset" : 40,
"end_offset" : 44,
"type" : "<ALPHANUM>",
"position" : 8
},
{
"token" : "dog's",
"start_offset" : 45,
"end_offset" : 50,
"type" : "<ALPHANUM>",
"position" : 9
},
{
"token" : "bone",
"start_offset" : 51,
"end_offset" : 55,
"type" : "<ALPHANUM>",
"position" : 10
}
]
}
Output:
[root@hadoop-104 ~]# curl http://localhost:9200
{
"name" : "0adeb7852e00",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "9gglpP0HTfyOTRAaSe2rIg",
"version" : {
"number" : "7.6.2", # the version number is 7.6.2
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
"build_date" : "2020-03-26T06:34:37.794943Z",
"build_snapshot" : false,
"lucene_version" : "8.4.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
[root@hadoop-104 ~]#
2) Add a new field mapping
[root@localhost ~]# docker exec -it elasticsearch /bin/bash
[root@d6f951c0ac1d elasticsearch]#
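Here "index": false means the new field cannot be searched; it is only a redundant stored field.
3) Update a mapping
The mapping of an existing field cannot be updated. To change it, create a new index and migrate the data.
4) Data migration
First create new_twitter with the correct mapping, then migrate the data as follows.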
[root@d6f951c0ac1d elasticsearch]# cd plugins/
[root@d6f951c0ac1d plugins]# wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip
Migrate the data under a type of the old index
[root@d6f951c0ac1d plugins]# pwd
/mydata/elasticsearch/plugins
[root@d6f951c0ac1d plugins]# ll
总用量 4400
-rw-r--r--. 1 root root 4504473 5月 7 13:26 elasticsearch-analysis-ik-7.6.2.zip
[root@d6f951c0ac1d plugins]# mkdir ik
[root@d6f951c0ac1d plugins]# unzip elasticsearch-analysis-ik-7.6.2.zip -d ik
Archive: elasticsearch-analysis-ik-7.6.2.zip
creating: ik/config/
inflating: ik/config/main.dic
inflating: ik/config/quantifier.dic
inflating: ik/config/extra_single_word_full.dic
inflating: ik/config/IKAnalyzer.cfg.xml
inflating: ik/config/surname.dic
inflating: ik/config/suffix.dic
inflating: ik/config/stopword.dic
inflating: ik/config/extra_main.dic
inflating: ik/config/extra_stopword.dic
inflating: ik/config/preposition.dic
inflating: ik/config/extra_single_word_low_freq.dic
inflating: ik/config/extra_single_word.dic
inflating: ik/elasticsearch-analysis-ik-7.6.2.jar
inflating: ik/httpclient-4.5.2.jar
inflating: ik/httpcore-4.4.4.jar
inflating: ik/commons-logging-1.2.jar
inflating: ik/commons-codec-1.9.jar
inflating: ik/plugin-descriptor.properties
inflating: ik/plugin-security.policy
For more details see: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-reindex.html
GET /bank/_search
[root@d6f951c0ac1d plugins]# chmod -R 777 ik/
[root@d6f951c0ac1d plugins]# ll
总用量 0
drwxrwxrwx. 3 root root 243 5月 7 13:31 ik
GET /bank/_search
Suppose we want to change age to integer
[root@d6f951c0ac1d elasticsearch]# cd bin/
[root@d6f951c0ac1d bin]# ll
total 19604
-rwxr-xr-x. 1 elasticsearch root 1915 Mar 26 2020 elasticsearch
-rwxr-xr-x. 1 elasticsearch root 491 Mar 26 2020 elasticsearch-certgen
-rwxr-xr-x. 1 elasticsearch root 483 Mar 26 2020 elasticsearch-certutil
-rwxr-xr-x. 1 elasticsearch root 982 Mar 26 2020 elasticsearch-cli
-rwxr-xr-x. 1 elasticsearch root 433 Mar 26 2020 elasticsearch-croneval
-rwxr-xr-x. 1 elasticsearch root 4316 Mar 26 2020 elasticsearch-env
-rwxr-xr-x. 1 elasticsearch root 1828 Mar 26 2020 elasticsearch-env-from-file
-rwxr-xr-x. 1 elasticsearch root 121 Mar 26 2020 elasticsearch-keystore
-rwxr-xr-x. 1 elasticsearch root 440 Mar 26 2020 elasticsearch-migrate
-rwxr-xr-x. 1 elasticsearch root 126 Mar 26 2020 elasticsearch-node
-rwxr-xr-x. 1 elasticsearch root 172 Mar 26 2020 elasticsearch-plugin
-rwxr-xr-x. 1 elasticsearch root 431 Mar 26 2020 elasticsearch-saml-metadata
-rwxr-xr-x. 1 elasticsearch root 438 Mar 26 2020 elasticsearch-setup-passwords
-rwxr-xr-x. 1 elasticsearch root 118 Mar 26 2020 elasticsearch-shard
-rwxr-xr-x. 1 elasticsearch root 427 Mar 26 2020 elasticsearch-sql-cli
-rwxr-xr-x. 1 elasticsearch root 19986912 Mar 26 2020 elasticsearch-sql-cli-7.6.2.jar
-rwxr-xr-x. 1 elasticsearch root 426 Mar 26 2020 elasticsearch-syskeygen
-rwxr-xr-x. 1 elasticsearch root 426 Mar 26 2020 elasticsearch-users
-rwxr-xr-x. 1 elasticsearch root 346 Mar 26 2020 x-pack-env
-rwxr-xr-x. 1 elasticsearch root 354 Mar 26 2020 x-pack-security-env
-rwxr-xr-x. 1 elasticsearch root 353 Mar 26 2020 x-pack-watcher-env
[root@d6f951c0ac1d bin]# elasticsearch-plugin
A tool for managing installed elasticsearch plugins
Non-option arguments:
command
Option Description
------ -----------
-E <KeyValuePair> Configure a setting
-h, --help Show help
-s, --silent Show minimal output
-v, --verbose Show verbose output
View the mapping of "newbank":
GET /newbank/_mapping
The mapping type of age has been changed to integer.
Migrate the data from bank into newbank
[root@d6f951c0ac1d bin]# elasticsearch-plugin -h
A tool for managing installed elasticsearch plugins
Commands
--------
list - Lists installed elasticsearch plugins
install - Install a plugin
remove - removes a plugin from Elasticsearch
Non-option arguments:
command
Option Description
------ -----------
-E <KeyValuePair> Configure a setting
-h, --help Show help
-s, --silent Show minimal output
-v, --verbose Show verbose output
[root@d6f951c0ac1d bin]# elasticsearch-plugin list
ik
Output:
[root@d6f951c0ac1d bin]# exit;
exit
[root@localhost plugins]# docker restart elasticsearch
elasticsearch
[root@localhost plugins]#
View the data in newbank
GET /newbank/_search
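4.4、Tokenization
A tokenizer receives a stream of characters, breaks it up into individual tokens (usually individual words), and outputs a stream of tokens.
For example, the whitespace tokenizer splits text whenever it sees whitespace. It would split the text "Quick brown fox!" into [Quick, brown, fox!].
The tokenizer is also responsible for recording the order or position of each term (used for phrase and word proximity queries) and the start and end character offsets of the original word that the term represents (used for highlighting search snippets).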
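The bookkeeping a tokenizer does (term, position, start and end offsets) can be sketched in a few lines of Java. This is an illustrative re-implementation of the whitespace splitting idea, not the Lucene tokenizer itself; the class WhitespaceTokenizerSketch and its Token type are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceTokenizerSketch {
    /** One emitted token with the metadata described above. */
    static final class Token {
        final String term;
        final int position;     // order of the token in the stream
        final int startOffset;  // index of the first character (inclusive)
        final int endOffset;    // index after the last character (exclusive)
        Token(String term, int position, int startOffset, int endOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
        @Override public String toString() {
            return term + "@" + position + "[" + startOffset + "," + endOffset + ")";
        }
    }

    /** Splits on whitespace only, keeping punctuation and case untouched. */
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip any run of whitespace.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            // Consume characters up to the next whitespace.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) {
                tokens.add(new Token(text.substring(start, i), position++, start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Quick brown fox!"));
    }
}
```

The offsets are what make highlighting possible: the original characters of a matched term can be located again without re-tokenizing the text.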
Elasticsearch provides many built-in tokenizers, which can be used to build custom analyzers.
About analyzers: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis.html
GET my_index/_analyze
{
"text":"我是中国人"
}
Result:
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "中",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "国",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<IDEOGRAPHIC>",
"position" : 3
},
{
"token" : "人",
"start_offset" : 4,
"end_offset" : 5,
"type" : "<IDEOGRAPHIC>",
"position" : 4
}
]
}
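4.4.1、Install the ik analyzer
The network was not configured at first; it has to be set up before ik can be downloaded
The network of my own virtual machine
View the network configuration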
cd /etc/sysconfig/network-scripts/
Change the gateway
vi ifcfg-ens33
Restart the network:
service network restart
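Test
All languages use the "Standard Analyzer" by default, but it does not handle Chinese well, so a Chinese analyzer needs to be installed.
Note: the default automatic installation via elasticsearch-plugin install xxx.zip cannot be used
https://github.com/medcl/elasticsearch-analysis-ik/releases Install the release that matches your es version
When elasticsearch was installed earlier, the container's "/usr/share/elasticsearch/plugins" directory was mapped to "/mydata/elasticsearch/plugins" on the host, so the most convenient approach is to download "elasticsearch-analysis-ik-7.6.2.zip" and unzip it into that folder. After installation, restart the elasticsearch container.
If you do not mind the extra steps, you can also do the following.
(1) Check the elasticsearch version: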
GET my_index/_analyze
{
"analyzer": "ik_smart",
"text":"我是中国人"
}
(2) Enter the plugins directory inside the es container
- docker exec -it <container id> /bin/bash
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中国人",
"start_offset" : 2,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
}
]
}
GET my_index/_analyze
{
"analyzer": "ik_max_word",
"text":"我是中国人"
}
The wget command was not found and has to be installed first
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "是",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中国人",
"start_offset" : 2,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "中国",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "国人",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 4
}
]
}
- unzip the downloaded file
[root@localhost mydata]# mkdir nginx
[root@localhost mydata]# ll
总用量 0
drwxrwxrwx. 5 root root 47 4月 29 09:34 elasticsearch
drwxr-xr-x. 5 root root 41 7月 14 2023 mysql
drwxr-xr-x. 2 root root 6 5月 7 14:03 nginx
drwxr-xr-x. 4 root root 30 7月 15 2023 redis
- rm -rf *.zip
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf/:/etc/nginx \
-d nginx:1.10
- Change the permissions of the analyzer
[root@localhost html]# mkdir es
[root@localhost html]# ls
es index.html
[root@localhost html]# cd es
[root@localhost es]# touch fenci.txt
[root@localhost es]# vim fenci.txt
[root@localhost es]#
- Confirm that the analyzer is installed
尚硅谷
乔碧罗
[root@localhost config]# cd /mydata/elasticsearch/plugins/ik/config
[root@localhost config]# ls
extra_main.dic extra_single_word_full.dic extra_stopword.dic main.dic quantifier.dic suffix.dic
extra_single_word.dic extra_single_word_low_freq.dic IKAnalyzer.cfg.xml preposition.dic stopword.dic surname.dic
[root@localhost config]# vi IKAnalyzer.cfg.xml
Seeing ik in the list means the ik analyzer is installed
- Restart elasticsearch
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer 扩展配置</comment>
<!--用户可以在这里配置自己的扩展字典 -->
<entry key="ext_dict"></entry>
<!--用户可以在这里配置自己的扩展停止词字典-->
<entry key="ext_stopwords"></entry>
<!--用户可以在这里配置远程扩展字典 -->
<entry key="remote_ext_dict">http://192.168.119.127/es/fenci.txt</entry>
<!--用户可以在这里配置远程扩展停止词字典-->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
4.4.2、Test the analyzer
Using the default analyzer
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer 扩展配置</comment>
<!--用户可以在这里配置自己的扩展字典 -->
<entry key="ext_dict"></entry>
<!--用户可以在这里配置自己的扩展停止词字典-->
<entry key="ext_stopwords"></entry>
<!--用户可以在这里配置远程扩展字典 -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!--用户可以在这里配置远程扩展停止词字典-->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
Observe the result:
[root@localhost config]# docker restart elasticsearch
elasticsearch
POST _analyze
{
"analyzer": "ik_max_word",
"text": "乔碧罗殿下"
}
Output:
{
"tokens" : [
{
"token" : "乔碧罗",
"start_offset" : 0,
"end_offset" : 3,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "殿下",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 1
}
]
}
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.4.2</version>
</dependency>
Output:
<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
4.4.3、Install nginx
Create an nginx folder under the mydata directory
package com.atguigu.gulimall.search.config;
import org.apache.http.HttpHost;
import org.elasticsearch.client.*;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestOperations;
/**
* @Description: GulimallElasticSearchConfig
* @Author: WangTianShun
* @Date: 2020/11/1 10:00
* @Version 1.0
*
* 1. Add the dependency
* 2. Write the configuration and register a RestHighLevelClient in the container
* 3. Use it by following the API documentation
*/
@Configuration
public class GulimallElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
// builder.addHeader("Authorization", "Bearer " + TOKEN);
// builder.setHttpAsyncResponseConsumerFactory(
// new HttpAsyncResponseConsumerFactory
// .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
COMMON_OPTIONS = builder.build();
}
@Bean
public RestHighLevelClient restHighLevelClient() {
RestClientBuilder builder = RestClient.builder(new HttpHost("192.168.43.125", 9200, "http"));
return new RestHighLevelClient(builder);
}
}
- Start any nginx instance, just to copy out its configuration
@Data
class User{
private String userName;
private String gender;
private Integer age;
}
/**
* Test storing data in es
*/
@Test
public void indexData() throws IOException {
IndexRequest indexRequest = new IndexRequest("users");
indexRequest.id("1");
// indexRequest.source("userName","zhangsan","age",18,"gender","男");
User user = new User();
user.setUserName("张三");
user.setAge(13);
user.setGender("男");
String jsonString = JSON.toJSONString(user);
indexRequest.source(jsonString, XContentType.JSON);
// Execute the operation
IndexResponse index = client.index(indexRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);
// Extract the useful response data
System.out.println(index);
}
- Copy the configuration from the container to the current directory
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 22, // 22 documents matched
"relation": "eq"
},
"max_score": 3.7952394,
"hits": [{
"_index": "bank",
"_type": "account",
"_id": "210",
"_score": 3.7952394,
"_source": {
"account_number": 210,
"balance": 33946,
"firstname": "Cherry",
"lastname": "Carey",
"age": 24,
"gender": "M",
"address": "539 Tiffany Place",
"employer": "Martgo",
"email": "cherrycarey@martgo.com",
"city": "Fairacres",
"state": "AK"
}
},
.... // remaining hits omitted
]
}
}
- Rename the folder and move this conf under /mydata/nginx
@Data
@ToString
static class Account {
private int account_number;
private int balance;
private String firstname;
private String lastname;
private int age;
private String gender;
private String address;
private String employer;
private String email;
private String city;
private String state;
}
/**
* Test a search request
* Complex search: in bank, find the age distribution, average age, and average balance of everyone whose address contains mill
*/
@Test
public void searchData() throws IOException {
// 1. Create the search request
SearchRequest searchRequest = new SearchRequest();
// 1.1. Specify the index
searchRequest.indices("bank");
// 1.2. Specify the DSL, i.e. the search conditions
// The SearchSourceBuilder holds the conditions
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
searchRequest.source(sourceBuilder);
// 1.2.1. Search conditions
// sourceBuilder.query();
// sourceBuilder.from();
// sourceBuilder.size();
// sourceBuilder.aggregation();
sourceBuilder.query(QueryBuilders.matchQuery("address", "mill"));
// 1.2.2. Aggregate by the distribution of age values
TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("ageAgg").field("age").size(10);
sourceBuilder.aggregation(termsAggregationBuilder);
// 1.2.3. Compute the average balance
AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("balanceAvg").field("balance");
sourceBuilder.aggregation(avgAggregationBuilder);
System.out.println("检索条件" + sourceBuilder.toString());
// 2. Execute the search
SearchResponse search = client.search(searchRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);
// 3. Analyze the result (search)
System.out.println(search.toString());
// 3.1. Get all the documents that were found
SearchHits hits = search.getHits();
SearchHit[] searchHits = hits.getHits();
for (SearchHit hit : searchHits) {
String sourceAsString = hit.getSourceAsString();
Account account = JSON.parseObject(sourceAsString, Account.class);
System.out.println(account);
}
// 3.2. Get the aggregation results of this search
Aggregations aggregations = search.getAggregations();
for (Aggregation aggregation : aggregations.asList()) {
String name = aggregation.getName();
System.out.println("当前聚合的名字" + name);
}
Terms ageAgg = aggregations.get("ageAgg");
for (Terms.Bucket bucket : ageAgg.getBuckets()) {
String keyAsString = bucket.getKeyAsString();
System.out.println("年龄" + keyAsString);
}
Avg balanceAvg = aggregations.get("balanceAvg");
System.out.println("平均薪资" + balanceAvg.getValue());
}
Create a new nginx folder and move conf into it
- Remove the original container
Stop the original container:
检索条件{"query":{"match":{"address":{"query":"mill","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}},"aggregations":{"ageAgg":{"terms":{"field":"age","size":10,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}},"balanceAvg":{"avg":{"field":"balance"}}}}
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":5.4032025,"hits":[{"_index":"bank","_type":"account","_id":"970","_score":5.4032025,"_source":{"account_number":970,"balance":19648,"firstname":"Forbes","lastname":"Wallace","age":28,"gender":"M","address":"990 Mill Road","employer":"Pheast","email":"forbeswallace@pheast.com","city":"Lopezo","state":"AK"}},{"_index":"bank","_type":"account","_id":"136","_score":5.4032025,"_source":{"account_number":136,"balance":45801,"firstname":"Winnie","lastname":"Holland","age":38,"gender":"M","address":"198 Mill Lane","employer":"Neteria","email":"winnieholland@neteria.com","city":"Urie","state":"IL"}},{"_index":"bank","_type":"account","_id":"345","_score":5.4032025,"_source":{"account_number":345,"balance":9812,"firstname":"Parker","lastname":"Hines","age":38,"gender":"M","address":"715 Mill Avenue","employer":"Baluba","email":"parkerhines@baluba.com","city":"Blackgum","state":"KY"}},{"_index":"bank","_type":"account","_id":"472","_score":5.4032025,"_source":{"account_number":472,"balance":25571,"firstname":"Lee","lastname":"Long","age":32,"gender":"F","address":"288 Mill Street","employer":"Comverges","email":"leelong@comverges.com","city":"Movico","state":"MT"}}]},"aggregations":{"lterms#ageAgg":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":38,"doc_count":2},{"key":28,"doc_count":1},{"key":32,"doc_count":1}]},"avg#balanceAvg":{"value":25208.0}}}
GulimallSearchApplicationTests.Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
GulimallSearchApplicationTests.Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
GulimallSearchApplicationTests.Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
GulimallSearchApplicationTests.Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
当前聚合的名字ageAgg
当前聚合的名字balanceAvg
年龄38
年龄28
年龄32
Remove the original container:
PUT product
{
"mappings":{
"properties": {
"skuId":{
"type": "long"
},
"spuId":{
"type": "keyword"
},
"skuTitle": {
"type": "text",
"analyzer": "ik_smart"
},
"skuPrice": {
"type": "keyword"
},
"skuImg":{
"type": "keyword",
"index": false,
"doc_values": false
},
"saleCount":{
"type":"long"
},
"hasStock": {
"type": "boolean"
},
"hotScore": {
"type": "long"
},
"brandId": {
"type": "long"
},
"catalogId": {
"type": "long"
},
"brandName": {
"type": "keyword",
"index": false,
"doc_values": false
},
"brandImg":{
"type": "keyword",
"index": false,
"doc_values": false
},
"catalogName": {
"type": "keyword",
"index": false,
"doc_values": false
},
"attrs": {
"type": "nested",
"properties": {
"attrId": {
"type": "long"
},
"attrName": {
"type": "keyword",
"index": false,
"doc_values": false
},
"attrValue": {
"type": "keyword"
}
}
}
}
}
}
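- Create the new Nginx container by running the following command
Before running the command, create the html and logs folders and make sure the directory structure looks like this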
PUT my_index/_doc/1
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
- Set nginx to start on boot
{
"group" : "fans",
"user.first" : [ "alice", "john" ],
"user.last" : [ "smith", "white" ]
}
- Create the file "/mydata/nginx/html/index.html" to test whether it can be accessed
{
"my_index" : {
"mappings" : {
"properties" : {
"group" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"user" : {
"properties" : {
"first" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"last" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
Visit: http://<IP of the nginx host>:80/index.html
http://192.168.119.127/index.html
Create an es folder under the html directory and create a fenci.txt file in it, to be used later by the es custom dictionary
GET my_index/_search
{
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "Smith" }}
]
}
}
}
{
"took" : 49,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.5753642,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
}
]
}
}
Access test
http://192.168.119.127/es/fenci.txt
4.4.4、Custom dictionary
Edit IKAnalyzer.cfg.xml in /mydata/elasticsearch/plugins/ik/config
PUT my_index
{
"mappings": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
PUT my_index/_doc/1
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
The original XML
{
"my_index" : {
"mappings" : {
"properties" : {
"group" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"user" : {
"type" : "nested",
"properties" : {
"first" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"last" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
After editing, restart the elasticsearch container; otherwise the change does not take effect.
GET my_index/_search
{
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "Smith" }}
]
}
}
}
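After the update, es applies the new tokenization only to newly added data. Existing data is not re-tokenized. To re-tokenize existing data, you need to run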
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
Test
Result
5、SpringBoot integration with ElasticSearch
5.1、Import the dependency
The version here must match the installed ELK version.
The ELK version managed by spring-boot-dependencies is 6.8.7
It needs to be changed to 7.6.2 in the project
Write the configuration class
5.2、Write the test class
1)Test saving data
Index API | Java REST Client [7.17] | Elastic
Console output
Before the test:
After the test:
2)Test retrieving data
Search API | Java REST Client [7.17] | Elastic
Search for everyone whose address contains mill, and compute the age distribution, the average age, and the average salary
Java implementation
Compare the printed query conditions and results with the earlier ElasticSearch query statement and its results.
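6、Miscellaneous
6.1、ELK
ELK is a stack built from, but not limited to, three open-source products: Elasticsearch (es for short), Logstash, and Kibana. Together they form a complete solution for data ingestion (Logstash), search and analysis (Elasticsearch), and data visualization (Kibana), which is why it is also called the ELK stack.
The course introduces each of the three components in detail, Elasticsearch most of all, because it is the core of ELK. Starting from the internals of es, it covers documents, indices, search, aggregations, and clusters, and uses search and aggregation examples to show what es can do. Logstash is presented through how it collects data and ships it to a specified destination; Kibana through charting and data visualization.
For ELK, see: ELK 高级搜索,深度详解 ElasticStack 技术栈 - 上篇
6.2、Kibana console shortcuts
ctrl+home: jump to the start of the document;
ctrl+end: jump to the end of the document.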
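二、Mall business - product listing
0、Mall architecture diagram
Only listed products can be shown on the site.
Listed products must be searchable.
Page
1、Product mapping
Analysis: when a product is listed, should es store the sku or the spu?
- Searching by name requires full-text search on the sku title
- Searching by specification uses attributes that are shared at the spu level, so they are the same for every sku under one spu
- Browsing by category id lists spus directly, and the user can switch between them.
- If we store the full sku information in es (including the spu attributes), there are too many fields.
- If we store the spu together with the skus it contains, search is also convenient. But the skus are child objects of the spu, which requires the nested model in es, and that performs somewhat worse.
- So we have to trade off between storage and search performance.
- If we split the storage, with spu and attr in one index and sku in a separate index, the following problem can arise.
- Searching for a product name such as "手机" (phone) matches many spus. To work out all the attributes related to those spus we need a second query, which means sending every spu_id over the network. With 10,000 ids at 4 bytes each that is 10,000 × 4 B = 40 KB per request, and the cost grows linearly: one million ids would be 4 MB per request, and 4 GB across 1,000 concurrent search requests. Transfers of that size block for a long time and the business request cannot proceed.
- Hence the design below. This is where a document store differs from a relational database: use a wide-table design and do not worry about normal forms.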
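The transfer-size estimate in the list above is simple multiplication, and a few lines of Java make it easy to try other numbers. This is only a sketch: the class and method names are made up for illustration, and 4 bytes per id is the assumption taken from the text (a Java `Long` id would be 8 bytes).

```java
class IdTransferEstimate {
    // Bytes needed to send idCount ids of bytesPerId bytes each, for concurrentRequests requests.
    static long totalBytes(long idCount, long bytesPerId, long concurrentRequests) {
        return idCount * bytesPerId * concurrentRequests;
    }

    public static void main(String[] args) {
        // 10,000 ids, one request: 40,000 bytes (about 40 KB)
        System.out.println(totalBytes(10_000, 4, 1));
        // one million ids, 1,000 concurrent requests: 4,000,000,000 bytes (about 4 GB)
        System.out.println(totalBytes(1_000_000, 4, 1_000));
    }
}
```

Whatever the exact numbers, the second round trip scales with the number of matching spus, which is the cost the wide-table design avoids.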
Add the product attribute mapping to ES
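index: defaults to true. When set to false, the field is not indexed; it still appears in search results but cannot be used as a search condition.
doc_values: defaults to true. When set to false, the field cannot be used for sorting, aggregations, or scripts, which saves disk space.
You can also set doc_values to true and index to false, so that a field cannot be searched but can still be used for sorting, aggregations, and scripts.
Summary of the storage model for spu in es
If every sku stores the specification attributes, storage is redundant, because all skus under one spu share the same specifications. But putting the specifications in a separate index causes large data transfers at search time and can congest the network. We therefore choose the first storage model, trading space for time.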
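2、Listing details
Listing puts back-office products into es so that they can be searched and queried:
- hasStock: whether the sku is in stock. A listed product is assumed to be in stock by default; es only needs updating when it runs out of stock
- once stock is replenished, es must be updated again as well
- hotScore is the popularity value. We only simulate it using the click rate: the value is updated once clicks have grown past a threshold.
- Unlisting removes the item from es and updates the status in MySQL
Listing steps:
- First create the product index in es using the mapping above.
- On clicking "list", query all the sku information and save it to es
- Once es reports a successful save, update the listing status in the database
3、Data consistency
- When a product goes out of stock, update the stock information in es
- When a product is back in stock, update es as well
4、Array flattening in ES
On "nested", see: Nested datatype | Elasticsearch Guide [7.6] | Elastic
How ES flattens arrays:
Flattening of object arrays: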
Arrays of inner object fields do not work the way you might expect. Lucene has no concept of inner objects, so Elasticsearch flattens the object hierarchy into a simple list of field names and values. For example, the following document:
PUT my_index/_doc/1
{
  "group" : "fans",
  "user" : [
    {
      "first" : "John",
      "last" : "Smith"
    },
    {
      "first" : "Alice",
      "last" : "White"
    }
  ]
}
Internally it is converted into a document that looks like this:
{
  "group" : "fans",
  "user.first" : [ "alice", "john" ],
  "user.last" : [ "smith", "white" ]
}
Query the mapping of my_index
GET my_index/_mapping
{
  "my_index" : {
    "mappings" : {
      "properties" : {
        "group" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "user" : {
          "properties" : {
            "first" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "last" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}
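The user.first and user.last fields are flattened into multi-value fields, and the association between alice and white is lost. A query for alice AND smith will therefore match this document incorrectly.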
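The false match can be reproduced without Elasticsearch. The sketch below is plain Java, not the ES API: `flattenedMatch` imitates the default object mapping by collecting every `first` and every `last` into separate multi-value sets, while `nestedMatch` imitates nested semantics by requiring both conditions to hold inside the same inner object. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FlattenDemo {
    // One inner object of the "user" array.
    static final class User {
        final String first;
        final String last;
        User(String first, String last) { this.first = first; this.last = last; }
    }

    // Default object mapping: each sub-field becomes one multi-valued field,
    // so which "first" belonged to which "last" is no longer recorded.
    static boolean flattenedMatch(List<User> users, String first, String last) {
        Set<String> firsts = new HashSet<>();
        Set<String> lasts = new HashSet<>();
        for (User u : users) {
            firsts.add(u.first.toLowerCase());
            lasts.add(u.last.toLowerCase());
        }
        return firsts.contains(first.toLowerCase()) && lasts.contains(last.toLowerCase());
    }

    // Nested semantics: both conditions must hold within the same inner object.
    static boolean nestedMatch(List<User> users, String first, String last) {
        for (User u : users) {
            if (u.first.equalsIgnoreCase(first) && u.last.equalsIgnoreCase(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("John", "Smith"), new User("Alice", "White"));
        System.out.println(flattenedMatch(users, "Alice", "Smith")); // true: the false positive
        System.out.println(nestedMatch(users, "Alice", "Smith"));    // false
        System.out.println(nestedMatch(users, "Alice", "White"));    // true
    }
}
```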
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last": "Smith" }}
      ]
    }
  }
}
What we want is a user with user.first="Alice" and user.last="Smith". No such user exists, yet the query still returns a hit:
{
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "group" : "fans",
          "user" : [
            {
              "first" : "John",
              "last" : "Smith"
            },
            {
              "first" : "Alice",
              "last" : "White"
            }
          ]
        }
      }
    ]
  }
}
Delete the "my_index" index
DELETE my_index
Recreate the my_index index, mapping user as nested
PUT my_index
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested"
      }
    }
  }
}
Re-insert the data
PUT my_index/_doc/1
{
  "group" : "fans",
  "user" : [
    {
      "first" : "John",
      "last" : "Smith"
    },
    {
      "first" : "Alice",
      "last" : "White"
    }
  ]
}
View the mapping of my_index; the type of user is now "nested"
GET my_index/_mapping
{
  "my_index" : {
    "mappings" : {
      "properties" : {
        "group" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "user" : {
          "type" : "nested",
          "properties" : {
            "first" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "last" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}
Querying again for user.first="Alice" and user.last="Smith" now returns no data
GET my_index/_search
{
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "Smith" }}
]
}
}
}
Query result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
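Compare the my_index mapping before and after the change:
5、Implementing the product listing endpoint
Listing a product requires saving the spu information in es and updating the spu status. Because SpuInfoEntity does not correspond to the index data model, we create a dedicated vo for data transfer
API doc: Product system - 20、Product listing
POST /product/spuinfo/{spuId}/up
Request parameters
Paged data
Response data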
{
"msg": "success",
"code": 0
}
Result
Add the class "com.atguigu.common.to.es.SkuEsModel" with the following code:
package com.atguigu.common.to.es;
import lombok.Data;
import java.math.BigDecimal;
import java.util.List;
@Data
public class SkuEsModel {
private Long skuId;
private Long spuId;
private String skuTitle;
private BigDecimal skuPrice;
private String skuImg;
private Long saleCount;
private boolean hasStock;
private Long hotScore;
private Long brandId;
private Long catalogId;
private String brandName;
private String brandImg;
private String catalogName;
// The nested class is named Attrs because SpuInfoServiceImpl refers to SkuEsModel.Attrs
private List<Attrs> attrs;
@Data
public static class Attrs {
private Long attrId;
private String attrName;
private String attrValue;
}
}
Write the product listing endpoint
Modify the class "com.atguigu.gulimall.product.controller.SpuInfoController" as follows:
@PostMapping("spuinfo/{spuId}/up")
public R spuUp(@PathVariable("spuId") Long spuId){
spuInfoService.up(spuId);
return R.ok();
}
Modify the interface "com.atguigu.gulimall.product.service.SpuInfoService" as follows:
/**
* 商品上架
*
* @param spuId
*/
void up(Long spuId);
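Because all skus under one spu share the same specification attributes, we move the specification query ahead of the per-sku loop and run it only once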
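The "query once, reuse for every sku" step can be sketched in isolation. The code below is a simplified stand-in, not the project's classes: `AttrValue` is a hypothetical replacement for `ProductAttrValueEntity`, and the list of searchable ids is passed in directly instead of coming from `attrService.selectSearchAttrs`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SearchAttrFilter {
    // Hypothetical stand-in for ProductAttrValueEntity, reduced to three fields.
    static final class AttrValue {
        final long attrId;
        final String attrName;
        final String attrValue;
        AttrValue(long attrId, String attrName, String attrValue) {
            this.attrId = attrId;
            this.attrName = attrName;
            this.attrValue = attrValue;
        }
    }

    // Keeps only the attributes whose id is in the searchable set.
    // A HashSet makes each membership check O(1) instead of scanning a list.
    static List<AttrValue> searchableAttrs(List<AttrValue> baseAttrs, Collection<Long> searchAttrIds) {
        Set<Long> idSet = new HashSet<>(searchAttrIds);
        return baseAttrs.stream()
                .filter(a -> idSet.contains(a.attrId))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<AttrValue> base = Arrays.asList(
                new AttrValue(1, "CPU", "A14"),
                new AttrValue(2, "Packaging", "Box"),
                new AttrValue(3, "Screen", "6.1 inch"));
        // Computed once per spu...
        List<AttrValue> shared = searchableAttrs(base, Arrays.asList(1L, 3L));
        // ...and reused for every sku of that spu.
        for (int sku = 1; sku <= 2; sku++) {
            System.out.println("sku " + sku + " -> " + shared.size() + " searchable attrs");
        }
    }
}
```

Sharing one list across all skus is safe here only because the list is never modified after it is built.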
Modify the class "com.atguigu.gulimall.product.service.impl.SpuInfoServiceImpl" as follows:
@Override
public void up(Long spuId) {
// 1、查出当前spuId对应的sku信息,品牌名字
List<SkuInfoEntity> skus = skuInfoService.getSkuBySpuId(spuId);
List<Long> skuIdList = skus.stream().map(SkuInfoEntity::getSkuId).collect(Collectors.toList());
// 2.1、发送远程调用,库存系统查询是否有库存
Map<Long, Boolean> stockMap = null;
try {
R r = wareFeignService.getSkusHasStock(skuIdList);
TypeReference<List<SkuHasStockVo>> typeReference = new TypeReference<List<SkuHasStockVo>>() {
};
stockMap = r.getData(typeReference).stream().collect(Collectors.toMap(SkuHasStockVo::getSkuId, SkuHasStockVo::getHasStock));
} catch (Exception e) {
log.error("库存服务查询异常,原因:", e);
}
// 2.4、查询当前sku的所有可以被用来检索的规格属性
List<ProductAttrValueEntity> baseAttrs = productAttrValueService.baseAttrListForSpu(spuId);
List<Long> attrIds = baseAttrs.stream().map(attr -> attr.getAttrId()).collect(Collectors.toList());
List<Long> searchAttrIds = attrService.selectSearchAttrs(attrIds);
Set<Long> idSet = new HashSet<>(searchAttrIds);
List<SkuEsModel.Attrs> attrsList = baseAttrs.stream().filter(item -> idSet.contains(item.getAttrId())).map(item -> {
SkuEsModel.Attrs attrs1 = new SkuEsModel.Attrs();
BeanUtils.copyProperties(item, attrs1);
return attrs1;
}).collect(Collectors.toList());
// 2、封装每个sku的信息
Map<Long, Boolean> finalStockMap = stockMap;
List<SkuEsModel> upProducts = skus.stream().map(sku -> {
// 组装需要的数据
SkuEsModel esModel = new SkuEsModel();
BeanUtils.copyProperties(sku, esModel);
esModel.setSkuPrice(sku.getPrice());
esModel.setSkuImg(sku.getSkuDefaultImg());
// 2.1、是否有库存 hasStock,hotScore
if (finalStockMap == null) {
esModel.setHasStock(true);
} else {
esModel.setHasStock(finalStockMap.get(sku.getSkuId()));
}
// 2.2、热度评分。0
esModel.setHotScore(0L);
// 2.3、查询品牌和分类的名字信息
BrandEntity brand = brandService.getById(esModel.getBrandId());
esModel.setBrandName(brand.getName());
esModel.setBrandImg(brand.getLogo());
CategoryEntity category = categoryService.getById(esModel.getCatalogId());
esModel.setCatalogName(category.getName());
// 2.4、设置检索属性
esModel.setAttrs(attrsList);
System.out.println("======================esModel" + esModel);
return esModel;
}).collect(Collectors.toList());
// 3、将数据发送给es进行保存
R r = searchFeignService.productStatusUp(upProducts);
System.out.println("=========================" + r);
if (r.getCode() == 0) {
//远程调用成功
// 3.1、修改当前spu的状态
System.out.println("修改当前spu的状态");
baseMapper.updateSpuStatus(spuId, ProductConstant.StatusEnum.SPU_UP.getCode());
} else {
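// Remote call failed
// TODO 3.2、Repeated calls? Interface idempotency; retry mechanism
/**
 * Feign call flow:
 * 1、Build the request data, converting the object to json
 *    RequestTemplate template = buildTemplateFromArgs.create(argv);
 * 2、Send the request (a successful response is decoded)
 *    executeAndDecode(template)
 * 3、Executing the request has a retry mechanism
 *    while (true) {
 *        try {
 *            return executeAndDecode(template);
 *        } catch (RetryableException e) {
 *            // returns normally to allow a retry, or rethrows when retries are exhausted
 *            retryer.continueOrPropagate(e);
 *            continue;
 *        }
 *    }
 */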
}
}
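The retry loop described in the comment at the end of `up` can be sketched as a standalone helper. This is not Feign's code: `callWithRetry` and its `maxAttempts` parameter are hypothetical, and the attempt counter stands in for the decision that `retryer.continueOrPropagate` makes.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Runs the call up to maxAttempts times and rethrows the last failure
    // once the attempts are used up.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int attempt = 0;
        while (true) {
            attempt++;
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e; // propagate: no attempts left
                }
                // otherwise continue with the next loop iteration
            }
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("temporary failure");
            }
            return "ok";
        }, 3);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```

Retrying is only safe when the remote operation is idempotent. In this project the search service indexes each document under its skuId (`indexRequest.id(model.getSkuId().toString())`), so a repeated save overwrites the same documents rather than duplicating them.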
1、Look up the sku information and the brand name for the current spuId
Modify the interface "com.atguigu.gulimall.product.service.SkuInfoService" as follows:
/**
* 查出当前spuId对应的sku信息
*
* @param spuId
* @return
*/
List<SkuInfoEntity> getSkuBySpuId(Long spuId);
Modify the class "com.atguigu.gulimall.product.service.impl.SkuInfoServiceImpl" as follows:
@Override
public List<SkuInfoEntity> getSkuBySpuId(Long spuId) {
List<SkuInfoEntity> list = this.list(new QueryWrapper<SkuInfoEntity>().eq("spu_id", spuId));
return list;
}
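2、Assemble the information for each sku
2.1、Make a remote call to the inventory service to check whether each sku is in stock
Modify the interface "com.atguigu.gulimall.product.feign.WareFeignService" as follows: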
package com.atguigu.gulimall.product.feign;
import com.atguigu.common.utils.R;
import com.atguigu.gulimall.product.vo.SkuHasStockVo;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import java.util.List;
@FeignClient("gulimall-ware")
public interface WareFeignService {
/**
* 1、R设计的时候可以加上泛型
* 2、直接返回我们想要的结果
* 3、自己封装返回结果
* @param skuIds
* @return
*/
@PostMapping("/ware/waresku/hasStock")
R getSkusHasStock(@RequestBody List<Long> skuIds);
}
Modify the class "com.atguigu.gulimall.ware.controller.WareSkuController" as follows:
/**
* 查询sku是否有库存
*/
@PostMapping("hasStock")
public R getSkusHasStock(@RequestBody List<Long> skuIds){
// sku_id, stock
List<SkuHasStockVo> vos = wareSkuService.getSkusHasStock(skuIds);
return R.ok().setData(vos);
}
Modify the interface "com.atguigu.gulimall.ware.service.WareSkuService" as follows:
List<SkuHasStockVo> getSkusHasStock(List<Long> skuIds);
Modify the implementation class of WareSkuService (WareSkuServiceImpl) as follows:
@Override
public List<SkuHasStockVo> getSkusHasStock(List<Long> skuIds) {
List<SkuHasStockVo> collect = skuIds.stream().map(skuId -> {
SkuHasStockVo vo = new SkuHasStockVo();
// 查询sku的总库存量
Long count = baseMapper.getSkuStock(skuId);
vo.setSkuId(skuId);
vo.setHasStock(count == null ? false : count > 0);
return vo;
}).collect(Collectors.toList());
return collect;
}
Modify the interface "com.atguigu.gulimall.ware.dao.WareSkuDao" as follows:
Long getSkuStock(Long skuId);
Modify the mapper file "com.atguigu.gulimall.ware.dao.WareSkuDao.xml" as follows:
<select id="getSkuStock" resultType="java.lang.Long">
select sum(stock - stock_locked) from wms_ware_sku where sku_id=#{sku_id}
</select>
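The null check in `getSkusHasStock` exists because SQL `SUM` over zero rows returns NULL, which MyBatis maps to a null `Long`. The sketch below mimics that behaviour in memory so the three cases (in stock, fully locked, no rows) can be seen side by side. `WareSkuRow` and both method names are hypothetical stand-ins for the `wms_ware_sku` table and the mapper call.

```java
import java.util.Arrays;
import java.util.List;

class StockCheck {
    // Hypothetical in-memory stand-in for one row of wms_ware_sku.
    static final class WareSkuRow {
        final long skuId;
        final long stock;
        final long stockLocked;
        WareSkuRow(long skuId, long stock, long stockLocked) {
            this.skuId = skuId;
            this.stock = stock;
            this.stockLocked = stockLocked;
        }
    }

    // Mirrors "select sum(stock - stock_locked) ... where sku_id = ?":
    // returns null when no row matches, like SUM over an empty result set.
    static Long availableStock(List<WareSkuRow> rows, long skuId) {
        Long sum = null;
        for (WareSkuRow r : rows) {
            if (r.skuId == skuId) {
                long available = r.stock - r.stockLocked;
                sum = (sum == null) ? available : sum + available;
            }
        }
        return sum;
    }

    // Same rule as the service: null or non-positive means out of stock.
    static boolean hasStock(Long available) {
        return available != null && available > 0;
    }

    public static void main(String[] args) {
        List<WareSkuRow> rows = Arrays.asList(
                new WareSkuRow(1, 10, 3),   // warehouse A: 7 available
                new WareSkuRow(1, 5, 5),    // warehouse B: 0 available
                new WareSkuRow(2, 4, 4));   // everything locked
        System.out.println(hasStock(availableStock(rows, 1))); // true
        System.out.println(hasStock(availableStock(rows, 2))); // false
        System.out.println(hasStock(availableStock(rows, 3))); // false (no rows)
    }
}
```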
2.4、Query all specification attributes of the current sku that can be used for search
Modify the interface "com.atguigu.gulimall.product.service.AttrService" as follows:
/**
* 在指定的所有属性集合里面,挑出检索属性
*
* @param attrIds
* @return
*/
List<Long> selectSearchAttrs(List<Long> attrIds);
Modify the class "com.atguigu.gulimall.product.service.impl.AttrServiceImpl" as follows:
@Override
public List<Long> selectSearchAttrs(List<Long> attrIds) {
return baseMapper.selectSearchAttrIds(attrIds);
}
3、Send the data to es to be saved
Modify the interface "com.atguigu.gulimall.product.feign.SearchFeignService" as follows:
package com.atguigu.gulimall.product.feign;
import com.atguigu.common.to.es.SkuEsModel;
import com.atguigu.common.utils.R;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import java.util.List;
@FeignClient("gulimall-search")
public interface SearchFeignService {
@PostMapping("search/save/product")
public R productStatusUp(@RequestBody List<SkuEsModel> skuEsModels);
}
Create gulimall-search
1、Add the pom
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.5.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>com.atguigu.gulimall</groupId>
<artifactId>gulimall-search</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>gulimall-search</name>
<description>ElasticSearch检索服务</description>
<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
<dependencies>
<!--导入es的rest-high-level-client-->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.4.2</version>
</dependency>
<dependency>
<groupId>com.auguigu.gulimall</groupId>
<artifactId>gulimall-commom</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.junit.vintage</groupId>
<artifactId>junit-vintage-engine</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
2、Edit the configuration file (the settings below are in properties format)
spring.cloud.nacos.discovery.server-addr=127.0.0.1:8848
spring.application.name=gulimall-search
server.port=12000
3、Add the main application class
package com.atguigu.gulimall.search;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
@EnableDiscoveryClient
@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class GulimallSearchApplication {
public static void main(String[] args) {
SpringApplication.run(GulimallSearchApplication.class, args);
}
}
4、Configure ElasticSearch
Modify the class "com.atguigu.gulimall.search.config.GulimallElasticSearchConfig" as follows:
package com.atguigu.gulimall.search.config;
import org.apache.http.HttpHost;
import org.elasticsearch.client.*;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestOperations;
/**
* 1、导入依赖
* 2、编写配置,给容器中注入一个RestHighLevelClient
* 3、参照API操作
*/
@Configuration
public class GulimallElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
// builder.addHeader("Authorization", "Bearer " + TOKEN);
// builder.setHttpAsyncResponseConsumerFactory(
// new HttpAsyncResponseConsumerFactory
// .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
COMMON_OPTIONS = builder.build();
}
@Bean
public RestHighLevelClient restHighLevelClient() {
RestClientBuilder builder = RestClient.builder(new HttpHost("192.168.43.125", 9200, "http"));
return new RestHighLevelClient(builder);
}
}
Modify the class "com.atguigu.gulimall.search.controller.ElasticSaveController" as follows:
package com.atguigu.gulimall.search.controller;
import com.atguigu.common.constant.ProductConstant;
import com.atguigu.common.exception.BizCodeEnume;
import com.atguigu.common.to.es.SkuEsModel;
import com.atguigu.common.utils.R;
import com.atguigu.gulimall.search.service.ProductSaveService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;
@RequestMapping("/search/save")
@RestController
@Slf4j
public class ElasticSaveController {
@Autowired
ProductSaveService productSaveService;
/**
* 上架商品
*/
@PostMapping("/product")
public R productStatusUp(@RequestBody List<SkuEsModel> skuEsModels) {
boolean b;
try {
b = productSaveService.productStatusUp(skuEsModels);
} catch (Exception e) {
// Pass the exception as the last argument without a {} placeholder so SLF4J logs the stack trace
log.error("ElasticSaveController商品上架错误", e);
return R.error(BizCodeEnume.PRODUCT_UP_EXCEPTION.getCode(), BizCodeEnume.PRODUCT_UP_EXCEPTION.getMsg());
}
if (!b) {
return R.ok();
} else {
return R.error(BizCodeEnume.PRODUCT_UP_EXCEPTION.getCode(), BizCodeEnume.PRODUCT_UP_EXCEPTION.getMsg());
}
}
}
Modify the interface "com.atguigu.gulimall.search.service.ProductSaveService" as follows:
package com.atguigu.gulimall.search.service;
import com.atguigu.common.to.es.SkuEsModel;
import java.io.IOException;
import java.util.List;
public interface ProductSaveService {
boolean productStatusUp(List<SkuEsModel> skuEsModels) throws IOException;
}
Modify the class "com.atguigu.gulimall.search.constant.EsConstant" as follows:
package com.atguigu.gulimall.search.constant;
public class EsConstant {
public static final String PRODUCT_INDEX = "product"; // index that holds the sku data in es
}
Modify the class "com.atguigu.gulimall.search.service.impl.ProductSaveServiceImpl" as follows:
package com.atguigu.gulimall.search.service.impl;
import com.alibaba.fastjson.JSON;
import com.atguigu.common.to.es.SkuEsModel;
import com.atguigu.gulimall.search.config.GulimallElasticSearchConfig;
import com.atguigu.gulimall.search.constant.EsConstant;
import com.atguigu.gulimall.search.service.ProductSaveService;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
@Slf4j
@Service
public class ProductSaveServiceImpl implements ProductSaveService {
@Autowired
RestHighLevelClient restHighLevelClient;
@Override
public boolean productStatusUp(List<SkuEsModel> skuEsModels) throws IOException {
// 保存到es
// 1、给es中建立索引。product,建立好映射关系
// 2、给es中保存这些数据
// BulkRequest bulkRequest, RequestOptions options
BulkRequest bulkRequest = new BulkRequest();
for (SkuEsModel model : skuEsModels) {
// 1、构造保存请求
IndexRequest indexRequest = new IndexRequest(EsConstant.PRODUCT_INDEX);
indexRequest.id(model.getSkuId().toString());
String jsonString = JSON.toJSONString(model);
indexRequest.source(jsonString, XContentType.JSON);
bulkRequest.add(indexRequest);
}
BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, GulimallElasticSearchConfig.COMMON_OPTIONS);
// TODO 如果批量错误
boolean b = bulk.hasFailures();
List<String> collect = Arrays.stream(bulk.getItems()).map(item -> item.getId()).collect(Collectors.toList());
log.info("商品上架完成:{},返回数据:{}", collect, bulk.toString());
return b;
}
}
3.1、Update the status of the current spu
Modify the interface "com.atguigu.gulimall.product.dao.SpuInfoDao" as follows:
void updateSpuStatus(@Param("spuId") Long spuId,@Param("code") int code);
Modify the mapper file "com.atguigu.gulimall.product.dao.SpuInfoDao.xml" as follows:
<update id="updateSpuStatus">
update pms_spu_info set publish_status=#{code},update_time=NOW() where id =#{spuId}
</update>