当前位置：首页 > news >正文

教育教学网站建设万江做网站的公司

news 2025/10/14 16:58:29

教育教学网站建设,万江做网站的公司,商务定制网站,搜索引擎seo外包1 elasticsearch概述 1.1 elasticsearch简介官网: https://www.elastic.co/ ElasticSearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎#xff0c;基于RESTful web接口。Elasticsearch是用Java开发的#xff0c;并作为Apache许可条款下的…1 elasticsearch概述 1.1 elasticsearch简介官网: https://www.elastic.co/ ElasticSearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎基于RESTful web接口。Elasticsearch是用Java开发的并作为Apache许可条款下的开放源码发布是当前流行的企业级搜索引擎。 Elastic官方宣布Elasticsearch进入Version 8在速度、扩展、高相关性和简单性方面开启了一个全新的时代。说明Elasticsearch 8最低jdk版本要求jdk17当前我们选择Elasticsearch版本Elasticsearch8.5.3 DB-Engines Ranking - popularity ranking of database management systems 1.2 Elasticsearch的特性实时理论上数据从写入Elasticsearch到数据可以被搜索只需要1秒左右的时间实现准实时的数据索引和查询。分布式、可扩展天生的分布式的设计数据分片对于应用层透明扩展性良好可以轻易的进行节点扩容支持上百甚至上千的服务器节点支持PB级别的数据存储和搜索。稳定可靠 Elasticsearch的分布式、数据冗余特性提供更加可靠的运行机制且经过大型互联网公司众多项目使用可靠性得到验证。高可用数据多副本、多节点存储单节点的故障不影响集群的使用。 Rest API Elasticsearch提供标准的Rest API这使得所有支持Rest API的语言都能够轻易的使用Elasticsearch具备多语言通用的支持特性易于使用。Elasticsearch Version 8以后去除了以前Transport API、High-Level API、Low-Level API统一标准的Rest API这将使得Elasticsearch更加容易使用原来被诟病的API混乱问题终于得到完美解决。高性能 Elasticsearch底层构建基于Lucene具备强大的搜索能力即便是PB级别的数据依然能够实现秒级的搜索。多客户端支持支持Java、Python、Go、PHP、Ruby等多语言客户端还支持JDBC、ODBC等客户端。安全支持提供单点登录SSO、加密通信、集群角色、属性的访问控制支持审计等功能在安全层面上还支持集成第三方的安全组件在Version 8以后默认开启了HTTPS大大简化了安全上的配置。 1.3 Elasticsearch应用场景搭建日志系统日志系统应该是Elasticsearch使用最广泛的场景之一了Elasticsearch支持海量数据的存储和查询特别适合日志搜索场景。广泛使用的ELK套件(Elasticsearch、Logstash、Kibana)是日志系统最经典的案例使用Logstash和Beats组件进行日志收集Elasticsearch存储和查询应用日志Kibana提供日志的可视化搜索界面。搭建数据分析系统 Elasitcsearch支持数据分析例如强大的数据聚合功能通过搭配Kibana提供诸如直方图、统计分组、范围聚合等方便使用的功能能够快速实现一些数据报表等功能。在数字化转型的大行其道的当下需要从海量数据中发现数据的规律从而做出一定的决策Elasticsearch一定是最适合的解决方案之一。搭建搜索系统 Elasticsearch为搜索而生用于搭建全文搜索系统是自然而然的事情它能够提供快速的索引和搜索功能还有相关的评分功能、分词插件等支持丰富的搜索特性可以用于搭建大型的搜索引擎更加常用语实现站内搜索例如银行App、购物App等站内商品、服务搜索。构建海量数据业务系统即席查询服务目前大量的需要支持事务的系统使用MySQL作为数据库但随着业务的开展数据量会越来越大而MySQL的性能会越来越差虽然可以通过分库分表的方案进行解决但是操作比较复杂而且往往每隔一段时间就需要进行扩展且代码需要配合修改。这种情况下可以将数据从MySQL同步到Elasticsearch针对实时性要求不太高或者主要查询历史数据且数据量比较大的场景使用Elasticsearch提供查询而对需要事务实时控制的即时数据还是通过MySQL存储和查询。作为独立数据库系统 Elasticsearch本身提供了数据持久化存储的能力并且提供了增删改查的功能在某些应用场景下可以直接当做数据库系统来使用既提供了存储能力又能够同时具备搜索能力整体技术架构会比较简单例如博客系统、评论系统。需要注意的是Elasticsearch不支持事务且写入的性能相对关系型数据库稍弱所有需要使用事务的场景都不能将Elasticsearch当做唯一的数据库系统这使得这种使用场景很少见。 1.4 全文搜索引擎 Google百度类的网站搜索它们都是根据网页中的关键字生成索引我们在搜索的时候输入关键字它们会将该关键字即索引匹配到的所有网页返回还有常见的项目中应用日志的搜索等等。对于这些非结构化的数据文本关系型数据库搜索不是能很好的支持。一般传统数据库全文检索都实现的很鸡肋因为一般也没人用数据库存文本字段。进行全文检索需要扫描整个表如果数据量大的话即使对SQL的语法优化也收效甚微。建立了索引但是维护起来也很麻烦对于 insert 和 update 操作都会重新构建索引。这里说到的全文搜索引擎指的是目前广泛应用的主流搜索引擎。它的工作原理是计算机索引程序通过扫描文章中的每一个词对每一个词建立一个索引指明该词在文章中出现的次数和位置当用户查询时检索程序就根据事先建立的索引进行查找并将查找的结果反馈给用户的检索方式。这个过程类似于通过字典中的检索字表查字的过程。 1.5 lucene介绍 Lucene是Apache软件基金会Jakarta项目组的一个子项目提供了一个简单却强大的应用程式接口能够做全文索引和搜寻。在Java开发环境里Lucene是一个成熟的免费开源工具。就其本身而言Lucene是当前以及最近几年最受欢迎的免费Java信息检索程序库。但Lucene只是一个提供全文搜索功能类库的核心工具包而真正使用它还需要一个完善的服务框架搭建起来进行应用。目前市面上流行的搜索引擎软件主流的就两款Elasticsearch和Solr,这两款都是基于Lucene搭建的可以独立部署启动的搜索引擎服务软件。由于内核相同所以两者除了服务器安装、部署、管理、集群以外对于数据的操作修改、添加、保存、查询等等都十分类似。 1.6 倒排索引倒排索引步骤: 数据根据词条进行分词同时记录文档索引位置将词条相同的数据化进行合并对词条进行排序搜索过程: 先将搜索词语进行分词分词后再倒排索引列表查询文档位置(docId)。根据docId查询文档数据。 1.7 elasticsearch、solr对比 ElasticSearch vs Solr 总结 es基本是开箱即用非常简单。Solr安装略微复杂。Solr 利用 Zookeeper 进行分布式管理而 Elasticsearch 自身带有分布式协调管理功能。Solr 支持更多格式的数据比如JSON、XML、CSV而 Elasticsearch 仅支持json文件格式。Solr 是传统搜索应用的有力解决方案但 Elasticsearch 更适用于新兴的实时搜索应用。现在很多互联网应用都是要求实时搜索的所以我们选择了elasticsearch。 2 elasticSearch的安装详见《软件环境安装》 3 elasticsearch核心概念 3.1 es对照数据库 3.2 索引(Index) 一个索引就是一个拥有几分相似特征的文档的集合。比如说你可以有一个客户数据的索引另一个产品目录的索引还有一个订单数据的索引。一个索引由一个名字来标识必须全部是小写字母并且当我们要对这个索引中的文档进行索引、搜索、更新和删除的时候都要使用到这个名字。在一个集群中可以定义任意多的索引。能搜索的数据必须索引这样的好处是可以提高查询速度比如新华字典前面的目录就是索引的意思目录可以提高查询速度。 Elasticsearch索引的精髓一切设计都是为了提高搜索的性能。 3.3 类型(Type) 在一个索引中你可以定义一种或多种类型。一个类型是你的索引的一个逻辑上的分类/分区其语义完全由你来定。通常会为具有一组共同字段的文档定义一个类型。不同的版本类型发生了不同的变化版本Type5.x支持多种type6.x只能有一种type7.x默认不再支持自定义索引类型默认类型为_doc8.x默认类型为_doc 3.4 文档(Document) 一个文档是一个可被索引的基础信息单元也就是一条数据比如你可以拥有某一个客户的文档某一个产品的一个文档当然也可以拥有某个订单的一个文档。文档以JSONJavascript Object Notation格式来表示而JSON是一个到处存在的互联网数据交互格式。在一个index/type里面你可以存储任意多的文档。 3.5 字段(Field) 相当于是数据表的字段对文档数据根据不同属性进行的分类标识。 3.6 映射(Mapping) mapping是处理数据的方式和规则方面做一些限制如某个字段的数据类型、默认值、分析器、是否被索引等等。这些都是映射里面可以设置的其它就是处理ES里面数据的一些使用规则设置也叫做映射按着最优规则处理数据对性能提高很大因此才需要建立映射并且需要思考如何建立映射才能对性能更好。 4 Elasticsearch 基础功能参考文档https://www.elastic.co/guide/en/elasticsearch/reference/8.5/elasticsearch-intro.html 我们在Kibana前面已经安装过软件给大家演示基本操作详见《软件环境安装》 4.1 分词器官方提供的分词器有这么几种: Standard、Letter、Lowercase、Whitespace、UAX URL Email、Classic、Thai等中文分词器可以使用第三方的比如IK分词器。前面我们已经安装过了。 IK分词器: POST _analyze {analyzer: ik_smart,text: 我是中国人 } 结果: {tokens: [{token: 我,start_offset: 0,end_offset: 1,type: CN_CHAR,position: 0},{token: 是,start_offset: 1,end_offset: 2,type: CN_CHAR,position: 1},{token: 中国人,start_offset: 2,end_offset: 5,type: CN_WORD,position: 2}] }POST _analyze {analyzer: ik_max_word,text: 我是中国人 } 结果: {tokens: [{token: 我,start_offset: 0,end_offset: 1,type: CN_CHAR,position: 0},{token: 是,start_offset: 1,end_offset: 2,type: CN_CHAR,position: 1},{token: 中国人,start_offset: 2,end_offset: 5,type: CN_WORD,position: 2},{token: 中国,start_offset: 2,end_offset: 4,type: CN_WORD,position: 3},{token: 国人,start_offset: 3,end_offset: 5,type: CN_WORD,position: 4}] }Standard分词器: POST _analyze {analyzer: standard,text: 我是中国人 }结果: {tokens: [{token: 我,start_offset: 0,end_offset: 1,type: IDEOGRAPHIC,position: 0},{token: 是,start_offset: 1,end_offset: 2,type: IDEOGRAPHIC,position: 1},{token: 中,start_offset: 2,end_offset: 3,type: IDEOGRAPHIC,position: 2},{token: 国,start_offset: 3,end_offset: 4,type: IDEOGRAPHIC,position: 3},{token: 人,start_offset: 4,end_offset: 5,type: IDEOGRAPHIC,position: 4}] }4.2 索引操作 ES 软件的索引可以类比为 MySQL 中表的概念创建一个索引类似于创建一个表 4.2.1 创建索引语法: PUT /{索引名称} PUT /my_index结果: {acknowledged : true,shards_acknowledged : true,index : my_index }4.2.2 查看所有索引 GET /_cat/indices?v4.2.3 查看单个索引语法: GET /{索引名称} GET /my_index 结果: {my_index: {aliases: {},mappings: {},settings: {index: {routing: {allocation: {include: {_tier_preference: data_content}}},number_of_shards: 1,provided_name: my_index,creation_date: 1693294063006,number_of_replicas: 1,uuid: kYMuXUZQRumMGqHoV0fDJw,version: {created: 8050099}}}} }4.2.4 删除索引语法: DELETE /{索引名称} DELETE /my_index 结果: {acknowledged : true }4.3 文档操作文档是 ES 软件搜索数据的最小单位, 不依赖预先定义的模式所以可以将文档类比为表的一行JSON类型的数据。我们知道关系型数据库中要提前定义字段才能使用在Elasticsearch中对于字段是非常灵活的有时候我们可以忽略该字段或者动态的添加一个新的字段。 4.3.1 创建文档语法: PUT /{索引名称}/{类型}/{id} { jsonbody } 在创建数据时需要指定唯一性标识那么请求范式 POSTPUT 都可以 PUT /my_index/_doc/1 {title: 小米手机,category: 小米,images: http://www.gulixueyuan.com/xm.jpg,price: 3999 }返回结果: {_index: my_index,_id: 1,_version: 3,_seq_no: 2,_primary_term: 1,found: true,_source: {title: 小米手机,category: 小米,images: http://www.gulixueyuan.com/xm.jpg,price: 3999} }4.3.2 查看文档语法:GET /{索引名称}/{类型}/{id} GET /my_index/_doc/1 结果: {_index : my_index,_type : _doc,_id : 1,_version : 1,_seq_no : 0,_primary_term : 1,found : true,_source : {title : 小米手机,category : 小米,images : http://www.gulixueyuan.com/xm.jpg,price : 3999} }4.3.3 查询所有文档语法: GET /{索引名称}/_search GET /my_index/_search结果: {took: 941,timed_out: false,_shards: {total: 1,successful: 1,skipped: 0,failed: 0},hits: {total: {value: 1,relation: eq},max_score: 1,hits: [{_index: my_index,_id: 1,_score: 1,_source: {title: 小米手机,category: 小米,images: http://www.gulixueyuan.com/xm.jpg,price: 3999}}]} }4.3.4 修改文档语法: PUT /{索引名称}/{类型}/{id} { jsonbody } PUT /my_index/_doc/1 {title: 小米手机,category: 小米,images: http://www.gulixueyuan.com/xm.jpg,price: 4500 }4.3.5 修改局部属性语法: POST /{索引名称}/_update/{docId} { “doc”: { “属性”: “值” } } 注意这种更新只能使用post方式。 POST /my_index/_update/1 {doc: {price: 4500} }4.3.6 删除文档语法: DELETE /{索引名称}/{类型}/{id} DELETE /my_index/_doc/1 结果: {_index: my_index,_id: 1,_version: 5,result: deleted,_shards: {total: 2,successful: 1,failed: 0},_seq_no: 6,_primary_term: 1 }4.4 映射mapping 创建数据库表需要设置字段名称类型长度约束等索引库也一样需要知道这个类型下有哪些字段每个字段有哪些约束信息这就叫做映射(mapping)。 4.4.1 查看映射语法: GET /{索引名称}/_mapping GET /my_index/_mapping 结果: {my_index: {mappings: {properties: {category: {type: text,fields: {keyword: {type: keyword,ignore_above: 256}}},images: {type: text,fields: {keyword: {type: keyword,ignore_above: 256}}},price: {type: long},title: {type: text,fields: {keyword: {type: keyword,ignore_above: 256}}}}}} }4.4.2 动态映射在关系数据库中需要事先创建数据库然后在该数据库下创建数据表并创建表字段、类型、长度、主键等最后才能基于表插入数据。而Elasticsearch中不需要定义Mapping映射即关系型数据库的表、字段等在文档写入 Elasticsearch时会根据文档字段自动识别类型这种机制称之为动态映射。映射规则对应: 数据对应的类型null字段不添加true|flaseboolean字符串text数值long小数float日期date 4.4.3 静态映射静态映射是在Elasticsearch中也可以事先定义好映射即手动映射包含文档的各字段类型、分词器等这称为静态映射。 #删除原创建的索引 DELETE /my_index#创建索引并同时指定映射关系和分词器等。 PUT /my_index {mappings: {properties: {title: {type: text,index: true,store: true,analyzer: ik_max_word,search_analyzer: ik_smart},category: {type: keyword,index: true,store: true},images: {type: keyword,index: true,store: true},price: {type: integer,index: true,store: true}}} }结果: {acknowledged : true,shards_acknowledged : true,index : my_index }type分类如下: 字符串text(支持分词)和 keyword(不支持分词)。text该类型被用来索引长文本在创建索引前会将这些文本进行分词转化为词的组合建立索引允许es来检索这些词text类型不能用来排序和聚合。keyword该类型不能分词可以被用来检索过滤、排序和聚合keyword类型不可用text进行分词模糊检索。数值型long、integer、short、byte、double、float日期型date布尔型boolean特殊数据类型nested 4.4.4 nested 介绍 nested类型是一种特殊的对象object数据类型(specialised version of the object datatype )允许对象数组彼此独立地进行索引和查询。 demo 建立一个普通的index 如果linux 中有这个my_comment_index 先删除DELETE /my_comment_index 步骤1建立一个索引存储博客文章及其所有评论 PUT my_comment_index/_doc/1 {title: 狂人日记,body: 《狂人日记》是一篇象征性和寓意很强的小说当时鲁迅对中国国民精神的麻木愚昧颇感痛切。,comments: [{name: 张三,age: 34,rating: 8,comment: 非常棒的文章,commented_on: 30 Nov 2023},{name: 李四,age: 38,rating: 9,comment: 文章非常好,commented_on: 25 Nov 2022},{name: 王五,age: 33,rating: 7,comment: 手动点赞,commented_on: 20 Nov 2021}] }如上所示所以我们有一个文档描述了一个帖子和一个包含帖子上所有评论的内部对象评论。但是Elasticsearch搜索中的内部对象并不像我们期望的那样工作。步骤2 : 执行查询 GET /my_comment_index/_search {query: {bool: {must: [{match: {comments.name: 李四}},{match: {comments.age: 34}}]}} }查询结果居然正常的响应结果了原因分析comments 字段默认的数据类型是Object故我们的文档内部存储为 { “title”: [ 狂人日记], “body”: [ 《狂人日记》是一篇象征性和寓意很强的小说当时… ], “comments.name”: [ 张三, 李四, 王五 ], “comments.comment”: [ 非常棒的文章,文章非常好,王五,… ], “comments.age”: [ 33, 34, 38 ], “comments.rating”: [ 7, 8, 9 ] } 我们可以清楚地看到comments.name和comments.age之间的关系已丢失。这就是为什么我们的文档匹配李四和34的查询。步骤3删除当前索引 DELETE /my_comment_index步骤4建立一个nested 类型的comments字段映射为nested类型而不是默认的object类型 PUT my_comment_index {mappings: {properties: {comments: {type: nested }}} }PUT my_comment_index/_doc/1 {title: 狂人日记,body: 《狂人日记》是一篇象征性和寓意很强的小说当时鲁迅对中国国民精神的麻木愚昧颇感痛切。,comments: [{name: 张三,age: 34,rating: 8,comment: 非常棒的文章,commented_on: 30 Nov 2023},{name: 李四,age: 38,rating: 9,comment: 文章非常好,commented_on: 25 Nov 2022},{name: 王五,age: 33,rating: 7,comment: 手动点赞,commented_on: 20 Nov 2021}] }重新执行步骤1使用nested 查询 GET /my_comment_index/_search {query: {nested: {path: comments,query: {bool: {must: [{match: {comments.name: 李四}},{match: {comments.age: 34}}]}}}} }结果发现没有返回任何的文档这是何故当将字段设置为nested 嵌套对象将数组中的每个对象索引为单独的隐藏文档这意味着可以独立于其他对象查询每个嵌套对象。文档的内部表示 { { “comments.name”: [ 张三], “comments.comment”: [ 非常棒的文章 ], “comments.age”: [ 34 ], “comments.rating”: [ 9 ] }, { “comments.name”: [ 李四], “comments.comment”: [ 文章非常好 ], “comments.age”: [ 38 ], “comments.rating”: [ 8 ] }, { “comments.name”: [ 王五], “comments.comment”: [手动点赞], “comments.age”: [ 33 ], “comments.rating”: [ 7 ] }, { “title”: [ 狂人日记 ], “body”: [ 《狂人日记》是一篇象征性和寓意很强的小说当时鲁迅对中国… ] } } 每个内部对象都在内部存储为单独的隐藏文档。这保持了他们的领域之间的关系。 5 DSL高级查询 5.1 DSL概述 Query DSL概述: Domain Specific Language(领域专用语言)Elasticsearch提供了基于JSON的DSL来定义查询。 DSL概览: 准备数据: PUT /my_index/_doc/1 {id:1,title:华为笔记本电脑,category:华为,images:http://www.gulixueyuan.com/xm.jpg,price:5388}PUT /my_index/_doc/2 {id:2,title:华为手机,category:华为,images:http://www.gulixueyuan.com/xm.jpg,price:5500}PUT /my_index/_doc/3 {id:3,title:VIVO手机,category:vivo,images:http://www.gulixueyuan.com/xm.jpg,price:3600}5.2 DSL查询 5.2.1 查询所有文档 match_all: POST /my_index/_search {query: {match_all: {}} }结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : 1.0,hits : [{_index : my_index,_type : _doc,_id : 1,_score : 1.0,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}},{_index : my_index,_type : _doc,_id : 2,_score : 1.0,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}},{_index : my_index,_type : _doc,_id : 3,_score : 1.0,_source : {id : 3,title : VIVO手机,category : vivo,images : http://www.gulixueyuan.com/xm.jpg,price : 3600}}]} }5.2.2 匹配查询(match) match: POST /my_index/_search {query: {match: {title: 华为智能手机}} }结果: {took : 3,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 2,relation : eq},max_score : 0.5619608,hits : [{_index : my_index,_type : _doc,_id : 2,_score : 0.5619608,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}},{_index : my_index,_type : _doc,_id : 1,_score : 0.35411233,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}}]} }5.2.3 多字段匹配 POST /my_index/_search {query: {multi_match: {query: 华为智能手机,fields: [title,category]}} }结果: {took: 1,timed_out: false,_shards: {total: 1,successful: 1,skipped: 0,failed: 0},hits: {total: {value: 3,relation: eq},max_score: 1.5619192,hits: [{_index: my_index,_id: 2,_score: 1.5619192,_source: {id: 2,title: 华为手机,category: 华为,images: http://www.gulixueyuan.com/xm.jpg,price: 5500}},{_index: my_index,_id: 1,_score: 1.489748,_source: {id: 1,title: 华为笔记本电脑,category: 华为,images: http://www.gulixueyuan.com/xm.jpg,price: 5388}},{_index: my_index,_id: 3,_score: 0.59518534,_source: {id: 3,title: VIVO手机,category: vivo,images: http://www.gulixueyuan.com/xm.jpg,price: 3600}}]} }5.2.4 关键字精确查询 term:关键字不会进行分词。相当于where title ?; POST /my_index/_search {query: {term: {title: {value: 华为手机}}} }结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 0,relation : eq},max_score : null,hits : [ ]} }5.2.6 多关键字精确查询 where title in () POST /my_index/_search {query: {terms: {title: [华为手机,华为]}} }结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 2,relation : eq},max_score : 1.0,hits : [{_index : my_index,_type : _doc,_id : 1,_score : 1.0,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}},{_index : my_index,_type : _doc,_id : 2,_score : 1.0,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}}]} }5.2.7 范围查询范围查询使用range。 gte: 大于等于lte: 小于等于gt: 大于lt: 小于 POST /my_index/_search {query: {range: {price: {gte: 3000,lte: 5000}}} } 结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 1,relation : eq},max_score : 1.0,hits : [{_index : my_index,_type : _doc,_id : 3,_score : 1.0,_source : {title : VIVO手机,category : vivo}}]} }5.2.8 指定返回字段 query同级增加_source进行过滤。 POST /my_index/_search {query: {terms: {title: [华为手机,华为]}},_source: [title,category] }5.2.9 组合查询 bool 各条件之间有and,or或not的关系 must: 各个条件都必须满足所有条件是and的关系should: 各个条件有一个满足即可即各条件是or的关系must_not: 不满足所有条件即各条件是not的关系filter: 与must效果等同但是它不计算得分效率更高点。 must 各个条件都必须满足所有条件是and的关系 POST /my_index/_search {query: {bool: {must: [{match: {title: 华为}},{range: {price: {gte: 3000,lte: 5400}}}]}} } 结果: {took: 1,timed_out: false,_shards: {total: 1,successful: 1,skipped: 0,failed: 0},hits: {total: {value: 1,relation: eq},max_score: 1.2923405,hits: [{_index: my_index,_id: 1,_score: 1.2923405,_source: {id: 1,title: 华为笔记本电脑,category: 华为,images: http://www.gulixueyuan.com/xm.jpg,price: 5388}}]} }should 各个条件有一个满足即可即各条件是or的关系 POST /my_index/_search {query: {bool: {should: [{match: {title: 华为}},{range: {price: {gte: 3000,lte: 5000}}}]}} }结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : 1.0,hits : [{_index : my_index,_type : _doc,_id : 3,_score : 1.0,_source : {id : 3,title : VIVO手机,category : vivo,images : http://www.gulixueyuan.com/xm.jpg,price : 3600}},{_index : my_index,_type : _doc,_id : 2,_score : 0.5619608,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}},{_index : my_index,_type : _doc,_id : 1,_score : 0.35411233,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}}]} }如果should和must同时存在他们之间是and关系 POST /my_index/_search {query: {bool: {should: [{match: {title: 华为}},{range: {price: {gte: 3000,lte: 5000}}}],must: [{match: {title: 华为}},{range: {price: {gte: 3000,lte: 5000}}}]}} }结果: {took : 1,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 0,relation : eq},max_score : null,hits : [ ]} }must_not 不满足所有条件即各条件是not的关系 POST /my_index/_search {query: {bool: {must_not: [{match: {title: 华为}},{range: {price: {gte: 3000,lte: 5000}}}]}} } 结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 0,relation : eq},max_score : null,hits : [ ]} }filter 与must效果等同但是它不计算得分效率更高点。 _score的分值为0 在Elasticsearch中_score 字段代表每个文档的相关性分数relevance score。这个分数用于衡量一个文档与特定查询的匹配程度它是基于搜索查询的条件和文档的内容来计算的。相关性分数越高表示文档与查询的匹配度越高排名也越靠前。 POST /my_index/_search {query: {bool: {filter: [{match: {title: 华为}}]}} }结果: {took : 1,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 2,relation : eq},max_score : 0.0,hits : [{_index : my_index,_type : _doc,_id : 1,_score : 0.0,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}},{_index : my_index,_type : _doc,_id : 2,_score : 0.0,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}}]} }5.2.10 聚合查询聚合允许使用者对es文档进行统计分析类似与关系型数据库中的group by当然还有很多其他的聚合例如取最大值、平均值等等。 max POST /my_index/_search {query: {match_all: {}{}},size: 0, aggs: {max_price: {max: {field: price}}} }结果: {took : 1,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {max_price : {value : 5500.0}} }min POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {min_price: {min: {field: price}}} }结果: {took : 12,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {max_price : {value : 3600.0}} }avg POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {avg_price: {avg: {field: price}}} } 结果: {took : 12,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {avg_price : {value : 4829.333333333333}} }sum POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {sum_price: {sum: {field: price}}} } 结果: {took : 3,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {sum_price : {value : 14488.0}} }stats POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {stats_price: {stats: {field: price}}} } 结果: {took : 20,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {stats_price : {count : 3,min : 3600.0,max : 5500.0,avg : 4829.333333333333,sum : 14488.0}} }terms 桶聚合相当于sql中的group by语句 POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {groupby_category: {terms: {field: category,size: 10}}} } 结果: {took : 16,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {groupby_category : {doc_count_error_upper_bound : 0,sum_other_doc_count : 0,buckets : [{key : 华为,doc_count : 2},{key : vivo,doc_count : 1}]}} }还可以对桶继续下钻计算每个品牌对应的平均值是多少 POST /my_index/_search {query: {match_all: {}},size: 0, aggs: {groupby_category: {terms: {field: category,size: 10},aggs: {avg_price: {avg: {field: price}}}}} } 结果: {took : 2,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : null,hits : [ ]},aggregations : {groupby_category : {doc_count_error_upper_bound : 0,sum_other_doc_count : 0,buckets : [{key : 华为,doc_count : 2,avg_price : {value : 5444.0}},{key : vivo,doc_count : 1,avg_price : {value : 3600.0}}]}} }5.2.11 排序 POST /my_index/_search {query: {bool: {must: [{match: {title: 华为}}]}},sort: [{price: {order: asc}},{_score: {order: desc}}] } 结果: {took : 0,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 2,relation : eq},max_score : null,hits : [{_index : my_index,_type : _doc,_id : 1,_score : 0.35411233,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388},sort : [5388,0.35411233]},{_index : my_index,_type : _doc,_id : 2,_score : 0.5619608,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500},sort : [5500,0.5619608]}]} }5.2.12 分页查询分页的两个关键属性:from、size。 from: 当前页的起始索引默认从0开始。 from (pageNum - 1) * sizesize: 每页显示多少条 POST /my_index/_search {query: {match_all: {}},from: 0,size: 2 } 结果: {took : 3,timed_out : false,_shards : {total : 1,successful : 1,skipped : 0,failed : 0},hits : {total : {value : 3,relation : eq},max_score : 1.0,hits : [{_index : my_index,_type : _doc,_id : 1,_score : 1.0,_source : {id : 1,title : 华为笔记本电脑,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5388}},{_index : my_index,_type : _doc,_id : 2,_score : 1.0,_source : {id : 2,title : 华为手机,category : 华为,images : http://www.gulixueyuan.com/xm.jpg,price : 5500}}]} }5.2.13 高亮显示无检索不高亮 # 检索数据 GET /my_index/_search {query: {match: {title: 华为手机}},highlight: {fields: {title: {}},pre_tags: [font color:#e4393c],post_tags: [/font]} } 效果 {took: 2,timed_out: false,_shards: {total: 1,successful: 1,skipped: 0,failed: 0},hits: {total: {value: 3,relation: eq},max_score: 1.996705,hits: [{_index: my_index,_id: 2,_score: 1.996705,_source: {id: 2,title: 华为手机,category: 华为,images: http://www.gulixueyuan.com/xm.jpg,price: 5500},highlight: {title: [font color:#e4393c华/fontfont color:#e4393c为/fontfont color:#e4393c手/fontfont color:#e4393c机/font]}},{_index: my_index,_id: 3,_score: 1.100845,_source: {id: 3,title: VIVO手机,category: vivo,images: http://www.gulixueyuan.com/xm.jpg,price: 3600},highlight: {title: [VIVOfont color:#e4393c手/fontfont color:#e4393c机/font]}},{_index: my_index,_id: 1,_score: 0.78038335,_source: {id: 1,title: 华为笔记本电脑,category: 华为,images: http://www.gulixueyuan.com/xm.jpg,price: 5388},highlight: {title: [font color:#e4393c华/fontfont color:#e4393c为/font笔记本电脑]}}]} }6 Java Api操作ES 6.1 Elasticsearch Java API Client 官方文档https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/8.5/installation.html 6.1.1 搭建项目 1、创建项目elasticsearch_demo 2、导入pom.xml ?xml version1.0 encodingUTF-8? project xmlnshttp://maven.apache.org/POM/4.0.0 xmlns:xsihttp://www.w3.org/2001/XMLSchema-instancexsi:schemaLocationhttp://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsdmodelVersion4.0.0/modelVersionparentgroupIdorg.springframework.boot/groupIdartifactIdspring-boot-starter-parent/artifactIdversion3.0.5/versionrelativePath/ !-- lookup parent from repository --/parentgroupIdcom.atguigu/groupIdartifactIdelasticsearch_demo/artifactIdversion0.0.1-SNAPSHOT/versionpropertiesjava.version17/java.version/propertiesdependenciesdependencygroupIdorg.springframework.boot/groupIdartifactIdspring-boot-starter-web/artifactId/dependencydependencygroupIdco.elastic.clients/groupIdartifactIdelasticsearch-java/artifactIdversion8.5.3/version/dependencydependencygroupIdcom.fasterxml.jackson.core/groupIdartifactIdjackson-databind/artifactIdversion2.12.3/version/dependencydependencygroupIdorg.springframework.boot/groupIdartifactIdspring-boot-starter-test/artifactIdscopetest/scope/dependency/dependenciesbuildpluginsplugingroupIdorg.springframework.boot/groupIdartifactIdspring-boot-maven-plugin/artifactId/plugin/plugins/build/project6.1.2 配置连接在启动类配置es连接 package com.atguigu.elasticsearch;import co.elastic.clients.elasticsearch.ElasticsearchClient; import co.elastic.clients.json.jackson.JacksonJsonpMapper; import co.elastic.clients.transport.ElasticsearchTransport; import co.elastic.clients.transport.rest_client.RestClientTransport; import org.apache.http.Header; import org.apache.http.HttpHost; import org.apache.http.auth.AuthScope; import org.apache.http.auth.UsernamePasswordCredentials; import org.apache.http.impl.client.BasicCredentialsProvider; import org.apache.http.message.BasicHeader; import org.elasticsearch.client.RestClient; import org.springframework.boot.SpringApplication; import org.springframework.boot.autoconfigure.SpringBootApplication; import org.springframework.context.annotation.Bean;SpringBootApplication public class ElasticsearchDemoApplication {public static void main(String[] args) {SpringApplication.run(ElasticsearchDemoApplication.class, args);}Beanpublic ElasticsearchClient buildElasticsearchClient() {BasicCredentialsProvider credsProv new BasicCredentialsProvider();credsProv.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(elastic, 111111));RestClient restClient RestClient.builder(HttpHost.create(http://139.198.127.41:9200)).setHttpClientConfigCallback(hc - hc.setDefaultCredentialsProvider(credsProv)).build();// Create the transport with a Jackson mapperElasticsearchTransport transport new RestClientTransport(restClient, new JacksonJsonpMapper());// And create the API clientElasticsearchClient esClient new ElasticsearchClient(transport);return esClient;} }6.1.3 测试查询 package com.atguigu.elasticsearch_demo;import co.elastic.clients.elasticsearch.ElasticsearchClient; import co.elastic.clients.elasticsearch.core.SearchResponse; import org.junit.jupiter.api.Test; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.boot.test.context.SpringBootTest;import java.io.IOException;SpringBootTest class ElasticsearchDemoApplicationTests {Autowiredprivate ElasticsearchClient elasticsearchClient;Testvoid contextLoads() throws IOException {// 创建查询请求对象// SearchRequest.Builder request new SearchRequest.Builder();// // QueryBuilders.queryString()// SearchRequest searchRequest request.index(my_index).query(q - {// return q.match(f - {// return f.field(title).query(华为);// });// }).build();// // 获取到查询结果集对象// SearchResponseObject search elasticsearchClient.search(searchRequest, Object.class);// System.out.println(search);SearchResponseObject search elasticsearchClient.search(s - s.index(my_index).query(f - f.match(f1 - f1.field(title).query(华为))), Object.class);// 遍历获取数据for (HitObject hit : search.hits().hits()) {System.out.println(hit.source());}} }打印结果 {id2, title华为手机, category华为, imageshttp://www.gulixueyuan.com/xm.jpg, price5500} {id1, title华为笔记本电脑, category华为, imageshttp://www.gulixueyuan.com/xm.jpg, price5388}

查看全文

http://www.yingshimen.cn/news/92851/