ElasticSearch

一. Introduction to ElasticSearch

1.1 Background

  1. Running search features over massive amounts of data with MySQL is far too slow.
  2. Even when the keyword the user types is not exact, matching data should still be found.
  3. The matched keyword should be highlighted (for example shown in red) in the results.

1.2 What is ES

ES is a full-text search framework written in Java and built on Lucene. It implements distributed full-text search and exposes a RESTful web interface; official client APIs are also provided for many languages.

Lucene: the underlying full-text search library.

Distributed: emphasizes horizontal scalability (running as a cluster).

Full-text search: an analyzer (it tokenizes text into terms that are stored in a term dictionary; at query time the keyword is looked up in the term dictionary to find matching content) plus inverted-index lookup.

RESTful web interface: you only need to send an HTTP request; the HTTP method and the parameters it carries determine which operation is executed.

1.3 ES vs. Solr

1. Solr is faster than ES when querying static data, but its query speed drops sharply on near-real-time data, while ES stays roughly the same.
2. Solr clustering relies on Zookeeper for coordination; ES supports clustering natively and needs no third-party framework.

1.4 Inverted Index
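An inverted index maps every term produced by the analyzer to the list of documents (doc ids) that contain that term. At query time the keyword is analyzed the same way, each resulting term is looked up in the term dictionary, and the doc-id lists are merged to produce the hits. A minimal Java sketch of the idea (the class and method names are illustrative only; they are not part of Lucene or ES):

import java.util.*;

/** Toy inverted index: term -> set of document ids (illustration only). */
public class InvertedIndexDemo {

    private final Map<String, Set<Integer>> index = new HashMap<>();

    /** "Analyze" a document by splitting on whitespace and record each term. */
    public void addDocument(int docId, String content) {
        for (String term : content.toLowerCase().split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    /** Look the keyword up in the term dictionary and return the matching doc ids. */
    public Set<Integer> search(String keyword) {
        return index.getOrDefault(keyword.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexDemo idx = new InvertedIndexDemo();
        idx.addDocument(1, "elasticsearch is a search engine");
        idx.addDocument(2, "mysql is a relational database");
        System.out.println(idx.search("search"));   // [1]
        System.out.println(idx.search("database")); // [2]
    }
}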

二. Installing ElasticSearch

2.1 Installing ES & Kibana
version: '3.1'
services:
  elasticsearch:
    image: daocloud.io/library/elasticsearch:6.5.4
    restart: always
    container_name: elasticsearch
    environment:
      - TZ=Asia/Shanghai                      # time zone
      - "cluster.name=elasticsearch"          # cluster name
      - "discovery.type=single-node"          # start as a single node
      - "ES_JAVA_OPTS=-Xms256m -Xmx256m"      # JVM heap size
    ports:
      - 9200:9200
  kibana:
    image: daocloud.io/library/kibana:6.5.4
    restart: always
    container_name: kibana
    ports:
      - 5601:5601
    environment:
      - ELASTICSEARCH_URL=http://192.168.192.130:9200
    depends_on:
      - elasticsearch

2.2 Installing the IK Analyzer

Download URL for the IK analyzer: https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip

Installation steps (restart elasticsearch after the installation finishes):
docker ps
docker exec -it 69 bash    # enter the elasticsearch container (69 = prefix of the container id)
cd bin
./elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip

Test:
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "吴超最牛逼"
}

三. ElasticSearch Basic Operations

The structure of ES

Index

An ES service can hold multiple indices (Index).

By default each index is split into five primary shards, and each primary shard has at least one replica shard.

A replica shard provides failover for its primary and can also serve search requests, sharing the query load.

A replica shard is never placed on the same node as its primary.

Type

Multiple types (Type) can be created under one index.

How types are created differs between ES versions.

Document (doc)

Multiple documents (doc) can be created under one type.

A document is comparable to a row in a MySQL table.

Field

A document can contain multiple fields (field).

This is comparable to a row in a MySQL table having multiple columns.

3.2 RESTful syntax for operating ES

GET requests:

Query an index: http://ip:port/index

Query a specific document: http://ip:port/index/type/doc_id

POST requests:

Query documents: http://ip:port/index/type/_search - a JSON body can be added to describe the query conditions

Update a document: http://ip:port/index/type/doc_id/_update - a JSON body can be added to describe the fields to modify

PUT requests:

Create an index: http://ip:port/index/ - a JSON body can be added to specify the index settings, types and structure

Create an index and specify the mapping of its documents: http://ip:port/index/type/_mappings

DELETE requests:

Delete an index: http://ip:port/index/

Delete a specific document: http://ip:port/index/type/doc_id

3.3 Index operations

3.3.1 Create an index
# Create an index
PUT /person
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}

3.3.2 View an index

# View index information
GET /person

3.3.3 Delete an index

# Delete the index
DELETE /person

3.4 Field types available in ES

String types

text: generally used for full-text search (for example a product description); the field is analyzed (tokenized).

keyword: the field is not analyzed.

Numeric types:

  • long
  • integer
  • short
  • double
  • float
  • half_float: half the precision of float
  • scaled_float: a floating-point value backed by a long and a scaling factor, e.g. the long 345 with a scaling factor of 100 represents 3.45

Date type

  • date: a concrete format can be specified for the date

Boolean type

  • boolean: represents true and false

Binary type:

  • binary: a Base64-encoded string

Range types

  • long_range: you do not store a single value but a range, queried with gt, lt, gte, lte
  • float_range
  • integer_range
  • double_range
  • date_range
  • ip_range

Geo type

  • geo_point: stores a latitude/longitude point

IP type

  • ip: stores IPv4 and IPv6 addresses

Other data types are listed in the official documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/6.5/mapping-types.html

3.5 Create an index and specify its data structure (fields)
# Create an index and specify its data structure
PUT /book
{
  "settings": {
    "number_of_shards": 5,    # number of primary shards
    "number_of_replicas": 1   # number of replica shards
  },
  "mappings": {               # the data structure
    "novel": {                # type name
      "properties": {         # the document's fields
        "name": {             # field name
          "type": "text",                # field type
          "analyzer": "ik_max_word",     # analyzer to use
          "index": true,                 # the field can be used as a query condition
          "store": false                 # whether to store the field separately
        },
        "auther": {
          "type": "keyword"
        },
        "count": {
          "type": "long"
        },
        "onsale": {
          "type": "date",
          # accepted formats for this date field
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        },
        "descr": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
    }
  }
}

3.6 Document operations

A document is uniquely identified in the ES service by the combination of ‘_index’, ‘_type’ and ‘_id’; this combination determines whether an operation is an insert or an update.

3.6.1 Create a document

‘Auto-generated id’

# Add a document; the id is generated automatically
POST /book/novel
{
"name": "盘龙",
"auther": "我是西红柿",
"count": 1000000,
"onsale": "2000-01-01",
"descr": "哈哈哈哈哈呵呵呵呵嘻嘻嘻"
}

‘Manually specified id’

# Add a document with a manually specified id
PUT /book/novel/1
{
"name": "红楼梦",
"auther": "曹雪芹",
"count": 10000000,
"onsale": "1980-01-01",
"descr": "红学经典"
}

3.6.2 Update a document

‘Full-replacement update’

# Full-replacement update: if the id exists the document is overwritten, otherwise it is created
PUT /book/novel/1
{
"name": "红楼梦",
"auther": "曹雪芹",
"count": 30000000,
"onsale": "1980-01-01",
"descr": "红学经典"
}

‘Partial update with doc’

# Partial update based on doc
POST /book/novel/1/_update
{
  "doc": {
    # the fields to modify and their new values
    "descr": "红学,经典,红学经典"
  }
}

3.6.3 Delete a document

‘Delete a document by id’

# Delete a document by id
DELETE /book/novel/ag4__nQBx-0Rtxi8Bcsw

四. Operating ElasticSearch from Java

4.1 Connecting to ES from Java

Create a Maven project.

Add the dependencies:

  1. elasticsearch
  2. elasticsearch high-level API (REST High Level Client)
  3. lombok
  4. JUnit (the coordinates for these last two are shown in the extra snippet after the pom excerpt below)
<!-- elasticsearch dependencies -->
<!-- 1. elasticsearch -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>6.5.4</version>
</dependency>

<!-- 2. elasticsearch high-level API (REST High Level Client) -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.5.4</version>
</dependency>
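The dependency list above also mentions lombok and JUnit, which the pom excerpt does not show. A plausible addition (the version numbers below are assumptions; use whatever your project already standardizes on). The test classes further down also rely on spring-boot-starter-test for @SpringBootTest and SpringRunner, so that dependency is assumed as well:

<!-- 3. lombok (assumed version) -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.12</version>
    <scope>provided</scope>
</dependency>

<!-- 4. JUnit (assumed version) -->
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
</dependency>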

Connect to ES and create a test class:
package space.wuchao.distribution.common.utils;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;

/**
* @author wuchao
* @class ESClient
* @module ElasticSearch
* @blame wuchao
* @since 2020/10/6 23:35
*/
public class ESClient {
public static RestHighLevelClient getClient(){

// Create an HttpHost pointing at the ES node
HttpHost httpHost = new HttpHost("192.168.192.130", 9200);

// Build a RestClientBuilder
RestClientBuilder clientBuilder = RestClient.builder(httpHost);

// Create the RestHighLevelClient
RestHighLevelClient client = new RestHighLevelClient(clientBuilder);

// Return the client
return client;
}
}
package space.wuchao.distribution.elasticsearch;

import org.elasticsearch.client.ElasticsearchClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import space.wuchao.distribution.common.utils.ESClient;

/**
* @author wuchao
* @class Demo1
* @module Demo1
* @blame wuchao
* @since 2020/10/6 23:41
*/

@RunWith(SpringRunner.class)
@SpringBootTest
public class Demo1 {

@Test
public void testConnect(){
RestHighLevelClient client = ESClient.getClient();
System.out.println(client+"OK!");
}
}

4.2 Index operations

4.2.1 Create an index from Java and specify its data structure (fields)
package space.wuchao.distribution.elasticsearch;

import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import space.wuchao.distribution.common.utils.ESClient;

import java.io.IOException;

/**
* @author wuchao
* @class Demo2
* @module index
* @blame wuchao
* @since 2020/10/22 15:23
*/

@RunWith(SpringRunner.class)
@SpringBootTest
public class Demo2CreateIndex {

String index = "person";
String type = "man";
RestHighLevelClient client = ESClient.getClient();

@Test
public void createIndex() throws IOException {

// Index settings
Settings.Builder settings = Settings.builder()
.put("number_of_shards",5)
.put("number_of_replicas",1);
// Index mappings (data structure)
XContentBuilder mappings =
JsonXContent.contentBuilder()
.startObject()
.startObject("properties")
.startObject("name")
.field("type","text")
.field("analyzer","ik_max_word")
.field("index",true)
.field("store",false)
.endObject()
.startObject("age")
.field("type","integer")
.endObject()
.startObject("birthday")
.field("type","date")
.field("format","yyyy-MM-dd")
.endObject()
.endObject()
.endObject();
// Wrap the settings and mappings in the CreateIndexRequest
CreateIndexRequest request = new CreateIndexRequest(index)
.settings(settings)
.mapping(type,mappings);
// Execute the create-index call through the client
CreateIndexResponse response =
client.indices().create(request, RequestOptions.DEFAULT);
System.out.println("response"+response.toString());

}
}

4.2.2 Check whether an index exists, delete an index

‘Check whether an index exists’
/**
 * Check whether the index exists
 * @throws IOException on I/O errors
 */
@Test
public void exists() throws IOException {
    // 1. Build the request
    GetIndexRequest request = new GetIndexRequest();
    request.indices(index);

    // 2. Execute it with the client
    Boolean response = client.indices().exists(request, RequestOptions.DEFAULT);

    // 3. Print the result
    System.out.println(response);
}

‘Delete an index’
/**
 * Delete the index
 * @throws IOException on I/O errors
 */
@Test
public void deleteIndex() throws IOException {
    // 1. Build the request
    DeleteIndexRequest request = new DeleteIndexRequest();
    request.indices(index);

    // 2. Execute it with the client
    AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);

    // 3. Print the result
    System.out.println(response);
}

4.3 Document operations

4.3.1 Add a document
/**
 * Add a document
 * @throws IOException on I/O errors
 */
@Test
public void createdDoc() throws IOException {
    // 1. Build the JSON string for the document
    Person person = new Person(1, "wuchao", 28, new Date());
    String json = mapper.writeValueAsString(person);

    // 2. Build the request object (index, type, document id)
    IndexRequest request = new IndexRequest(index, type, person.getId().toString());
    request.source(json, XContentType.JSON);

    // 3. Execute it with the client
    IndexResponse response = client.index(request, RequestOptions.DEFAULT);

    // 4. Print the result
    System.out.println(response.getResult().toString());
}
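The test above and the bulk examples below rely on a Person entity and a Jackson ObjectMapper field (ObjectMapper mapper = new ObjectMapper();) that the notes do not show. A plausible sketch of the entity, assuming Lombok is on the classpath and matching the person/man mapping created earlier (name: text, age: integer, birthday: date with format yyyy-MM-dd); treat the package and class as assumptions:

package space.wuchao.distribution.common.entity;   // assumed package

import com.fasterxml.jackson.annotation.JsonFormat;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.Date;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class Person {

    private Integer id;
    private String name;
    private Integer age;

    // Serialize the date the way the index mapping expects it
    @JsonFormat(pattern = "yyyy-MM-dd")
    private Date birthday;
}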

4.3.2 Update a document
/**
 * Update a document
 * @throws IOException on I/O errors
 */
@Test
public void updatedDoc() throws IOException {
    // 1. Create a map describing the fields to modify
    Map<String, Object> doc = new HashMap<>();
    doc.put("name", "吴超");
    String docId = "1";

    // 2. Create the request object and attach the partial document
    UpdateRequest request = new UpdateRequest(index, type, docId);
    request.doc(doc);

    // 3. Execute it with the client
    UpdateResponse response = client.update(request, RequestOptions.DEFAULT);

    // 4. Print the result
    System.out.println(response.getResult().toString());
}

4.3.3 Delete a document
/**
 * Delete a document
 * @throws IOException on I/O errors
 */
@Test
public void deletedDoc() throws IOException {
    String docId = "1";

    // 1. Create the request object
    DeleteRequest request = new DeleteRequest(index, type, docId);

    // 2. Execute it with the client
    DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);

    // 3. Print the result
    System.out.println(response.getResult().toString());
}

4.4 Bulk document operations

4.4.1 Bulk insert
/**
 * Create documents in bulk
 * @throws IOException on I/O errors
 */
@Test
public void bulkCreate() throws IOException {
    // 1. Build several JSON documents
    Person p1 = new Person(1, "张三", 23, new Date());
    Person p2 = new Person(2, "李四", 24, new Date());
    Person p3 = new Person(3, "王五", 25, new Date());

    String json1 = mapper.writeValueAsString(p1);
    String json2 = mapper.writeValueAsString(p2);
    String json3 = mapper.writeValueAsString(p3);

    // 2. Add them all to a single BulkRequest
    BulkRequest request = new BulkRequest();
    request.add(new IndexRequest(index, type, p1.getId().toString()).source(json1, XContentType.JSON));
    request.add(new IndexRequest(index, type, p2.getId().toString()).source(json2, XContentType.JSON));
    request.add(new IndexRequest(index, type, p3.getId().toString()).source(json3, XContentType.JSON));

    // 3. Execute it with the client
    BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);

    // 4. Print the result
    System.out.println(response.toString());
}

4.4.2 Bulk delete
/**
 * Delete documents in bulk
 * @throws IOException on I/O errors
 */
@Test
public void bulkDelete() throws IOException {
    // 1. Add all the deletes to a single BulkRequest
    BulkRequest request = new BulkRequest();
    request.add(new DeleteRequest(index, type, "1"));
    request.add(new DeleteRequest(index, type, "2"));
    request.add(new DeleteRequest(index, type, "3"));

    // 2. Execute it with the client
    BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);

    // 3. Print the result
    System.out.println(response.toString());
}

五. ElasticSearch Practice

The index

‘Create the index’
String index = "sms-logs-index";
String type = "sms_logs_type";
RestHighLevelClient client = ESClient.getClient();

ObjectMapper mapper = new ObjectMapper();

/**
* Create the index
* @throws IOException on I/O errors
*/
@Test
public void createdIndex() throws IOException {
// 1. Index settings

Settings.Builder builder = Settings.builder()
.put("number_of_shards", 5)
.put("number_of_replicas", 1);
// 2. Index mappings

XContentBuilder mapping = JsonXContent.contentBuilder()
.startObject()
.startObject("properties")
.startObject("createDate")
.field("type", "text")
.endObject()
.startObject("smsContent")
.field("type", "text")
.field("analyzer", "ik_max_word")
.endObject()
.startObject("sendDate")
.field("type", "date")
.field("format", "yyyy-MM-dd HH:mm:ss")
.endObject()
.startObject("longCode")
.field("type", "text")
.endObject()
.startObject("mobile")
.field("type", "text")
.endObject()
.startObject("corpName")
.field("type", "text")
.field("analyzer", "ik_max_word")
.endObject()
.startObject("state")
.field("type", "integer")
.endObject()
.startObject("operatorId")
.field("type", "integer")
.endObject()
.startObject("province")
.field("type", "text")
.endObject()
.startObject("ipAddr")
.field("type", "ip")
.endObject()
.startObject("replyTotal")
.field("type", "integer")
.endObject()
.startObject("fee")
.field("type", "integer")
.endObject()
.endObject()
.endObject();
// 3. Wrap settings and mappings in the request

CreateIndexRequest request = new CreateIndexRequest(index)
.settings(builder)
.mapping(type, mapping);
// 4. Execute with the client
CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
// 5. Print the result
System.out.println(response.toString());
}
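The doc-creation example below constructs SmsLogIndex objects that the notes never show. A plausible sketch of the entity, with field names taken from the mapping above and Lombok assumed (treat it as an assumption, not the author's original class):

package space.wuchao.distribution.common.entity;   // assumed package

import com.fasterxml.jackson.annotation.JsonFormat;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.Date;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class SmsLogIndex {

    private Integer id;
    private String createDate;

    // sendDate is mapped as a date with format yyyy-MM-dd HH:mm:ss
    @JsonFormat(pattern = "yyyy-MM-dd HH:mm:ss", timezone = "GMT+8")
    private Date sendDate;

    private String smsContent;
    private String longCode;
    private String mobile;
    private String corpName;
    private Integer state;
    private Integer operatorId;
    private String province;
    private String ipAddr;
    private Integer replyTotal;
    private Integer fee;
}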

‘Create docs and add data to the index’
/**
* Create docs and add data to the index
* @throws IOException on I/O errors
*/
@Test
public void createdDoc() throws IOException {
// 1. Build the JSON documents
SmsLogIndex smsLogIndex = new SmsLogIndex(
1,"2018-1-1 23:59:59", new Date(), "【途虎养车】亲爱的灯先生,您的爱车已经购买",
"106900000009","13811448740","途虎养车",1,1,
"河北省","103.36.136.28",10,3);
SmsLogIndex smsLogIndex1 = new SmsLogIndex(
2,"2018-1-1 23:59:59", new Date(), "【滴滴打车】亲爱的灯先生,您的滴滴打车已经到位",
"906900000010","16619920279","滴滴打车",1,1,
"北京市","221.208.176.25",20,6);
SmsLogIndex smsLogIndex2 = new SmsLogIndex(
3,"2018-1-1 23:59:59", new Date(), "【阿里巴巴】亲爱的灯先生,您的商品已经购买",
"506900000011","13511116665","阿里巴巴",1,1,
"杭州市","103.36.136.28",30,9);
String json = mapper.writeValueAsString(smsLogIndex);
String json1 = mapper.writeValueAsString(smsLogIndex1);
String json2 = mapper.writeValueAsString(smsLogIndex2);

// 2. Build the bulk request
BulkRequest request = new BulkRequest();
request.add(new IndexRequest(index,type,smsLogIndex.getId().toString())
.source(json, XContentType.JSON));
request.add(new IndexRequest(index,type,smsLogIndex1.getId().toString())
.source(json1, XContentType.JSON));
request.add(new IndexRequest(index,type,smsLogIndex2.getId().toString())
.source(json2, XContentType.JSON));
// 3. Execute with the client
BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
// 4. Print the result
System.out.println(response.toString());
}

六. ElasticSearch Query Types

6.1 term & terms queries

6.1.1 term query

A term query is an exact query: the keyword is not analyzed before searching; it is looked up in the term dictionary exactly as it was given to produce the result.

  • JSON format

# term query

POST /sms-logs-index/sms-logs-type/_search
{
"from": 0,
"size": 5,
"query": {
"term": {
"province": {
"value": "北京市"
}
}
}
}
  • Java code
  /**
* term查询
* @throws IOException 异常
*/
@Test
public void termQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.设置查询条件
SearchSourceBuilder builder = new SearchSourceBuilder();
builder.from(0);
builder.size(5);
builder.query(QueryBuilders.termQuery("province","北京市"));

request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
Map<String, Object> result = hit.getSourceAsMap();
System.out.println(result);
}
}

When running a term query, if the target field's type is text rather than keyword, the data in the doc has already been analyzed, so the keyword must exactly match one of the analyzed terms to get a hit. Even when the field's type is keyword, a keyword that is not exactly equal to the doc's value returns nothing.

‘Example’

# corpName is a text field, so the keyword "滴滴打车" does not return any result
POST /sms-logs-index/sms-logs-type/_search
{
"from": 0,
"size": 5,
"query": {
"term": {
"corpName": {
"value": "滴滴打车"
}
}
}
}

‘Example 2:’

# even for a keyword field, a keyword that is not exactly equal to the doc's value returns nothing
POST /sms-logs-index/sms-logs-type/_search
{
"from": 0,
"size": 5,
"query": {
"term": {
"province": {
"value": "北京"
}
}
}
}

6.1.2 terms query

terms works the same way as term: the keyword is not analyzed; it is matched directly against the term dictionary to find the corresponding docs.

terms is used when one field should match any of several values:

term: where province = "北京"

terms: where province = "北京" or province = "另一个值"

‘JSON format’

# terms query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"terms": {
"province": [
"北京市",
"杭州市"
]
}
}
}

‘Java implementation’
/**
* terms查询
* @throws IOException 异常
*/
@Test
public void termsQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.查询条件

List<String> list = new ArrayList<>();
list.add("北京市");
list.add("杭州市");
SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.termsQuery("province",list));
request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果
for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.toString());
}
}

6.2 match queries

A match query is a high-level query: it chooses how to search based on the data type of the queried field.

  • If the target field holds a date or a number, match converts the query string into a date or numeric value.
  • If the field is not analyzed (keyword), match does not analyze the query keyword either.
  • If the field is analyzed (text), match analyzes the query keyword and matches the resulting terms against the term dictionary.

Under the hood a match query is really several term queries whose results are combined.

6.2.1 match_all query

‘JSON format’

# match_all query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"match_all": {}
}
}

‘Java implementation’
/**
* match_all查询
* @throws IOException 异常
*/
@Test
public void matchAllQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.size(20); // number of hits to return; ES returns 10 by default
builder.query(QueryBuilders.matchAllQuery());

request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.toString());
}
}

6.2.2 match query

‘JSON format’

# match query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"match": {
"smsContent": "亲爱的灯先生"
}
}
}

‘Java implementation’
/**
* match 查询
* @throws IOException 异常
*/
@Test
public void matchQuery() throws IOException {

// 1.query对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.matchQuery("smsContent","亲爱的灯先生"));
request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.toString());
}
}

6.2.3 Boolean match query

Matches several words against one field, connected with and or or.

‘JSON format’

# boolean match query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"match": {
"smsContent": {
"query": "亲爱的 爱车",
"operator": "and"#即包括亲爱的又包括爱车
}
}
}
}

‘or query’
POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"match": {
"smsContent": {
"query": "亲爱的 爱车",
"operator": "or"
}
}
}
}

‘Java implementation’
/**
* 布尔match查询
* @throws IOException 异常
*/
@Test
public void boolMatchQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.指定查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.matchQuery("smsContent", "亲爱的 爱车")
.operator(Operator.AND)); // AND or OR controls how the multiple terms are combined
request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.toString());
}
}

6.2.4 multi_match query

A match query targets a single field; multi_match targets several fields with one query text.

‘JSON format’

# multi_match query
POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"multi_match": {
"query": "亲爱的 135",
"fields": ["smsContent", "mobile"]
}
}
}

‘Java implementation’
/**
* multi_match查询
* @throws IOException 异常
*/
@Test
public void multiMatchQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);
// 2.设置查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders
.multiMatchQuery("亲爱的 1235", "smsContent","mobile"));

request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.toString());
}
}

6.3 Other queries

6.3.1 id query

An id query is similar to MySQL's where id = ?

‘JSON format’

# id query
GET /sms-logs-index/sms-logs-type/2

‘Java implementation’
/**
* id 查询
* @throws IOException 异常
*/
@Test
public void findById() throws IOException {

// 1.创建request对象

GetRequest request = new GetRequest(index, type, "2");
// 2.client执行

GetResponse response = client.get(request, RequestOptions.DEFAULT);
// 3.返回查询结果

System.out.println(response.toString());
}

6.3.2 ids query

An ids query is similar to MySQL's where id in (id1, id2, id3, ...)

‘JSON format’

# ids query
POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"ids": {
"values": ["1", "2"]
}
}
}

‘Java implementation’
/**
* ids查询
* @throws IOException 异常
*/
@Test
public void findByIds() throws IOException {

// 1.创建request查询

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
// ids are added with addIds
builder.query(QueryBuilders.idsQuery().addIds("1", "2"));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.3.3 prefix query

A prefix query finds documents whose field value starts with the given keyword.

‘JSON format’

# prefix query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"prefix": {
"corpName": {
"value": "滴滴"
}
}
}
}

‘Java implementation’
/**
* prefix查询
* @throws IOException 异常
*/
@Test
public void prefixQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.prefixQuery("corpName", "滴滴"));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.3.4 fuzzy query

A fuzzy query takes an imprecise keyword and lets ES match roughly similar content. The results are not entirely stable.

prefix_length specifies how many leading characters must be correct.

‘JSON format’
POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"fuzzy": {
"smsContent": {
"value": "亲爱得",
"prefix_length": 2 #指定前几个字不能错
}
}
}
}

‘Java implementation’
/**
* fuzzy查询
* @throws IOException 异常
*/
@Test
public void fuzzyQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.fuzzyQuery("smsContent", "亲爱得")
.prefixLength(2));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.3.5 wildcard query

A wildcard query works like MySQL's like: the keyword may contain the wildcard * and the single-character placeholder ?

‘JSON format’

# wildcard query
# wildcard *

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"wildcard": {
"smsContent": {
"value": "途*"
}
}
}
}

# single-character placeholder ?

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"wildcard": {
"smsContent": {
"value": "养?"
}
}
}
}

‘Java implementation’
/**
* wild_card查询 通配符查询
* @throws IOException 异常
*/
@Test
public void wildCardQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.wildcardQuery("smsContent", "途*"));
// builder.query(QueryBuilders.wildcardQuery("smsContent", "养?"));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.3.6 range query

A range query applies greater-than / less-than bounds to a field (numbers, dates and similar types).

‘JSON format’

# range query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"range": {
"fee": {
"gte": 3, # "gt": 3, # "lt": 7,
"lte": 7
}
}
}
}

‘Java implementation’
/**
* range查询 范围查询 只针对数值类型数据
* @throws IOException 异常
*/
@Test
public void rangeQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.rangeQuery("fee").gte(3).lte(7));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {

System.out.println(hit.getSourceAsMap());
}
}

6.3.7 regexp query

A regexp query matches content against a regular expression you write.

prefix, fuzzy, wildcard and regexp queries are relatively slow; avoid them when query performance matters.

‘JSON format’

# regexp query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"regexp": {
"mobile": "135[0-9]{8}"
}
}
}

‘Java implementation’
/**
* regexp查询 正则表达式查询
* @throws IOException 异常
*/
@Test
public void regexpQuery() throws IOException {

// 1.request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.regexpQuery("mobile", "135[0-9]{8}"));

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {

System.out.println(hit.getSourceAsMap());
}
}

6.4 Deep paging: the Scroll query

ES limits from+size paging: from + size may not exceed 10,000.

How it works:

How from+size retrieves data in ES:

  1. The user's keyword is analyzed.
  2. The resulting terms are looked up in the term dictionary to get the matching doc ids.
  3. The data is fetched from every shard, which takes a relatively long time.
  4. The results are sorted by score according to the scoring rules.
  5. Based on from, part of the retrieved data is discarded.
  6. The result is returned.

How Scroll+size retrieves data in ES:

  1. The user's keyword is analyzed.
  2. The resulting terms are looked up in the term dictionary to get the matching doc ids.
  3. The matching doc ids are kept in an ES search context.
  4. size documents are fetched, and the ids that were fetched are removed from the context.
  5. For the next page, the remaining ids in the context are used; this repeats until the context is exhausted.
  6. Steps 4 and 5 loop, returning the results page by page.

Scroll queries do not see real-time (newly indexed) data.

‘JSON format’

# Scroll query: return the first page, keep the scroll context in ES and set how long the scroll_id stays alive

POST /sms-logs-index/sms-logs-type/_search?scroll=1m
{
"query": {
"match_all": {}
},
"size": 2,
"sort": [
{
"fee": {
"order": "desc"
}
}
]
}

# Fetch the next page with the scroll_id
POST /_search/scroll
{
"scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAA2bqFjk1TC1lbHJ3U0pXRVV1NkRGYVllZUEAAAAAAANm7hY5NUwtZWxyd1NKV0VVdTZERmFZZWVBAAAAAAADZusWOTVMLWVscndTSldFVXU2REZhWWVlQQAAAAAAA2bsFjk1TC1lbHJ3U0pXRVV1NkRGYVllZUEAAAAAAANm7RY5NUwtZWxyd1NKV0VVdTZERmFZZWVB",
"scroll": "1m"
}

# Delete all data kept in the scroll context
DELETE /_search/scroll/DnF1ZXJ5VGhlbkZldGNoBQAAAAAAA2bqFjk1TC1lbHJ3U0pXRVV1NkRGYVllZUEAAAAAAANm7hY5NUwtZWxyd1NKV0VVdTZERmFZZWVBAAAAAAADZusWOTVMLWVscndTSldFVXU2REZhWWVlQQAAAAAAA2bsFjk1TC1lbHJ3U0pXRVV1NkRGYVllZUEAAAAAAANm7RY5NUwtZWxyd1NKV0VVdTZERmFZZWVB

‘Java implementation’
/**
* Scroll 查询
* @throws IOException 异常
*/
@Test
public void scrollQuery() throws IOException {

// 1.创建SearchRequest请求

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.指定Scorll信息、Scroll存活时间

request.scroll(TimeValue.timeValueMinutes(1L));

// 3.指定查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.size(2);
builder.sort("fee", SortOrder.DESC);
builder.query(QueryBuilders.matchAllQuery());

request.source(builder);

// 4.client执行,查询首页数据

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 5.返回结果 scrollId source

String scrollId = response.getScrollId();
System.out.println("----------首页----------");
for (SearchHit hit : response.getHits().getHits()) {

System.out.println(hit.getSourceAsMap());
}

// 6.循环---创建SearchScrollRequest 查询下一页数据

while (true) {
SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
//5.1 设置scroll存活时间
scrollRequest.scroll(TimeValue.timeValueMinutes(2L));

//5.2 执行下一页查询
SearchResponse scroll = client.scroll(scrollRequest, RequestOptions.DEFAULT);

//5.3 获取下一页数据
SearchHit[] hits = scroll.getHits().getHits();

//5.4 If the next page still has data, print it; otherwise break out of the loop
if (hits != null && hits.length > 0){
System.out.println("----------下一页----------");
for (SearchHit hit : hits) {
System.out.println(hit.getSourceAsMap());
}
}else {
System.out.println("----------结束----------");
break;
}
}

//7.清除scroll上下文数据

ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
clearScrollRequest.addScrollId(scrollId);
ClearScrollResponse clearScrollResponse =
client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
System.out.println("删除scroll:"+clearScrollResponse.isSucceeded());
}

6.5 delete-by-query

Deletes the documents matched by a query; internally it finds matching documents one at a time and deletes them.

If that would delete most of the data in the index, delete-by-query is not recommended; instead create a new index and copy only the data you want to keep into it.

‘JSON format’
# delete-by-query

POST /sms-logs-index/sms-logs-type/_delete_by_query
{
"query": {
"range": {
"fee": {
"lt": 4
}
}
}
}

‘Java implementation’
/**
* deleteByQuery 删除查询出的数据
* @throws IOException 异常
*/
@Test
public void DeleteByQuery() throws IOException {

// 1.创建request对象

DeleteByQueryRequest request = new DeleteByQueryRequest(index);
request.types(type);

// 2.添加条件

request.setQuery(QueryBuilders.rangeQuery("fee").lt(4));

// 3.client 执行

BulkByScrollResponse response =
client.deleteByQuery(request, RequestOptions.DEFAULT);

// 4.返回结果
System.out.println(response.toString());
}

6.6 Compound queries

6.6.1 bool query

A compound query that combines several query conditions with boolean logic:

  • must: conditions combined with must all have to match, i.e. AND
  • must_not: none of these conditions may match, i.e. NOT
  • should: conditions combined with should behave like OR

‘JSON format’

# compound query: bool query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"province": {
"value": "北京市"
}
}
},
{
"term": {
"province": {
"value": "杭州市"
}
}
}
],
"must_not": [
{
"match": {
"operatorId": "2"
}
}
],
"must": [
{
"match": {
"smsContent": "亲爱的"
}
},
{
"match": {
"smsContent": "灯先生"
}
}
]
}
}
}

‘Java implementation’
/**
* Compound bool query: combines several conditions with must, must_not and should
* @throws IOException 异常
*/
@Test

public void BoolQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加条件

SearchSourceBuilder builder = new SearchSourceBuilder();
/*
builder.query(QueryBuilders.boolQuery()
.should(QueryBuilders.termQuery("province", "北京市"))
.should(QueryBuilders.termQuery("province", "杭州市"))
.mustNot(QueryBuilders.matchQuery("operatorId", 2))
.must(QueryBuilders.matchQuery("smsContent", "亲爱的"))
.must(QueryBuilders.matchQuery("smsContent", "灯先生")));

*/

BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.should(QueryBuilders.termQuery("province", "北京市"));
boolQueryBuilder.should(QueryBuilders.termQuery("province", "杭州市"));
boolQueryBuilder.mustNot(QueryBuilders.matchQuery("operatorId", 2));
boolQueryBuilder.must(QueryBuilders.matchQuery("smsContent", "亲爱的"));
boolQueryBuilder.must(QueryBuilders.matchQuery("smsContent", "灯先生"));

builder.query(boolQueryBuilder);

request.source(builder);
// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回查询结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.6.2 boosting query

A boosting query lets us influence the score of the query results:

  • positive: only documents that match the positive query end up in the result set
  • negative: documents that match positive and also match negative get their score lowered
  • negative_boost: the lowering factor, which must be less than 1.0

How the score is calculated at query time:

  • The more often the search keyword appears in a document, the higher the score.
  • The shorter the matched document content, the higher the score.
  • The search keyword is analyzed as well; the more of its terms match in the term dictionary, the higher the score.

‘JSON format’

# compound query: boosting query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"boosting": {
"positive": {
"match": {
"smsContent": "亲爱的"
}
},
"negative": {
"match": {
"smsContent": "阿里巴巴"
}
},
"negative_boost": 0.5
}
}
}

‘Java implementation’
    /**
* boosting query: influences the score of the query results
* positive: the query condition
* negative: lowers the score of docs that also match this condition
* negative_boost: the lowering factor, normally less than 1
* @throws IOException 异常
*/
@Test
public void boostingQuery() throws IOException {

// 1.创建request对象

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
/* builder.query(QueryBuilders.boostingQuery(
QueryBuilders.matchQuery("smsContent", "亲爱的"),
QueryBuilders.matchQuery("smsContent", "阿里巴巴"),
).negativeBoost(0.5f));*/
BoostingQueryBuilder boostingQueryBuilder = new BoostingQueryBuilder(
QueryBuilders.matchQuery("smsContent", "亲爱的"),
QueryBuilders.matchQuery("smsContent", "阿里巴巴")
);
boostingQueryBuilder.negativeBoost(0.5f);
builder.query(boostingQueryBuilder);

request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回查询结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getScore());
}
}
}

6.7 filter query

query: documents are matched against your conditions, a relevance score is computed, the results are sorted by that score, and nothing is cached.

filter: documents are matched against your conditions without computing a score; frequently used filters are cached.

‘JSON format’

# filter query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"smsContent": "阿里巴巴"
}
},{
"range": {
"fee": {
"lte": 9
}
}
}
]
}
}
}

‘Java implementation’
    /**
* filter query: returns the docs matching the conditions without computing or sorting by score; frequently used filters are cached
* @throws IOException 异常
*/
@Test
public void filterQuery() throws IOException {

// 1.创建request请求

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
boolQuery.filter(QueryBuilders.termQuery("smsContent", "阿里巴巴"));
boolQuery.filter(QueryBuilders.rangeQuery("fee").lte(9));

/* builder.query(QueryBuilders.boolQuery().filter(
QueryBuilders.termQuery("smsContent", "阿里巴巴"),
QueryBuilders.rangeQuery("fee").lte(9),
));*/
builder.query(boolQuery);
request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getSourceAsMap());
}
}

6.8 Highlight queries

The keyword the user typed is shown back in the results with special styling, so the user can see why a result was matched.

The highlighted data is itself a field of the doc; it is returned separately in a highlight section and does not affect the original result.

ES provides a highlight property at the same level as query:

  • fragment_size: how many characters of the highlighted fragment to return; the default is 100
  • pre_tags: the opening tag to insert, e.g. <font color='red'>
  • post_tags: the closing tag to insert, e.g. </font>

‘JSON format’

# highlight query

POST /sms-logs-index/sms-logs-type/_search
{
"query": {
"term": {
"smsContent": {
"value": "亲爱的"
}
}
},
"highlight": {
"fields": {
"smsContent": {}
},
"pre_tags": "<font color='red'>",
"post_tags": "</font>",
"fragment_size": "10"
}
}

‘Java implementation’
RestHighLevelClient client = ESClient.getClient();

String index = "sms-logs-index";
String type = "sms-logs-type";

/**
* highLight 查询 高亮查询
* @throws IOException 异常
*/
@Test
public void hightLightQuery() throws IOException {

// 1.request对象创建

SearchRequest request = new SearchRequest(index);
request.types(type);

// 2.添加查询条件

SearchSourceBuilder builder = new SearchSourceBuilder();
builder.query(QueryBuilders.termsQuery("smsContent", "亲爱的"));

HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("smsContent",6);
highlightBuilder.preTags("<font color='red'>");
highlightBuilder.postTags("</font>");

builder.highlighter(highlightBuilder);
request.source(builder);

// 3.client执行

SearchResponse response = client.search(request, RequestOptions.DEFAULT);

// 4.返回结果

for (SearchHit hit : response.getHits().getHits()) {
System.out.println(hit.getHighlightFields());
}
}

6.9 Aggregation queries

ES aggregations are more powerful than MySQL's; ES offers many different ways of computing statistics over the data.

‘RESTful syntax’
POST /index/type/_search
{
  "aggs": {
    "agg_name": {
      "agg_type": {
        "property": "value"
      }
    }
  }
}

6.9.1 Distinct-count query

Distinct counting, i.e. the cardinality aggregation:

  • First, the values of the specified field across the returned documents are de-duplicated.
  • Then the number of distinct values is counted.

‘JSON format’

# aggregation: distinct count (cardinality)

POST /sms-logs-index/sms-logs-type/_search
{
"aggs": {
"agg": {
"cardinality": {
"field": "province"
}
}
}
}

‘Java implementation’
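The notes stop before the Java version of this aggregation. A minimal sketch that follows the pattern of the earlier test methods (it assumes the same index, type and client fields, AggregationBuilders from org.elasticsearch.search.aggregations and Cardinality from org.elasticsearch.search.aggregations.metrics.cardinality):

/**
 * Distinct-count (cardinality) aggregation
 * @throws IOException on I/O errors
 */
@Test
public void cardinalityQuery() throws IOException {

    // 1. Build the request object
    SearchRequest request = new SearchRequest(index);
    request.types(type);

    // 2. Add the aggregation: distinct count of the province field
    SearchSourceBuilder builder = new SearchSourceBuilder();
    builder.aggregation(AggregationBuilders.cardinality("agg").field("province"));
    request.source(builder);

    // 3. Execute with the client
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);

    // 4. Read the aggregation result
    Cardinality agg = response.getAggregations().get("agg");
    System.out.println("distinct provinces: " + agg.getValue());
}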

