Multi-field query deduplication and filtering with elasticsearch_dsl, explained (script)
  TEZNKK3IfmPf, 2023-11-14

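For single-field deduplication in Elasticsearch, see the post "ElasticSearch单字段查询去重详解" on the CSDN blog of IT之一小佬.

For multi-field deduplication in Elasticsearch, see the post "ElasticSearch多字段查询去重过滤详解" on the CSDN blog of IT之一小佬.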

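This post explains in detail how to deduplicate on multiple fields with elasticsearch_dsl. The sample data is the same as in the single-field post referenced above.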


1. Querying by a condition

Sample code:

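from elasticsearch_dsl import connections, Search, Q

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
# Match on the 'provience' field (field name as used throughout this post)
q = Q('match', provience='北京')
res = s.query(q)
# Iterating a Search executes it; by default only the first 10 hits come back
for data in res:
    print(data.to_dict())

# count() sends a separate count request, so it covers every matching document
print("Found %d documents in total" % res.count())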

Output: [result screenshot]

2. Multi-field deduplication with a script_fields script

Sample code:

from elasticsearch_dsl import connections, Search, Q

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
q = Q('match', provience='北京')
# Alternative: concatenate the two values directly, without labels
# res = s.query(q).script_fields(age_gender_aggs={'script': {'lang': 'painless', 'source': "doc['age'].value + doc['gender'].value"}})
# Build one combined "age:...,gender:..." string per hit as a script field
res = s.query(q).script_fields(age_gender_aggs={'script': {'lang': 'painless', 'source': "'age:' + doc['age'].value + ',gender:' + doc['gender'].value"}})

count = 0
for data in res:
    print(data.to_dict(), type(data.to_dict()))
    count += 1
# count is the number of hits returned by this request (10 by default)
print("Returned %d hits" % count)

Output: [result screenshot]
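
To make the Painless expression in section 2 easier to picture, here is a minimal pure-Python sketch of the same string building. The records in `sample_docs` and the helper name `combined_key` are made up for illustration and are not the post's actual data; the sketch only mirrors the concatenation, it does not talk to Elasticsearch.

```python
# Hypothetical records standing in for hits from the person_info index.
sample_docs = [
    {"name": "A", "age": 20, "gender": "male"},
    {"name": "B", "age": 20, "gender": "male"},
    {"name": "C", "age": 22, "gender": "female"},
]


def combined_key(doc):
    """Mirror of the script: 'age:' + age + ',gender:' + gender."""
    return "age:" + str(doc["age"]) + ",gender:" + str(doc["gender"])


# One combined key per record, in the same order as the records.
keys = [combined_key(d) for d in sample_docs]
for key in keys:
    print(key)
```

Records A and B share the same key, which is what later makes them collapse into a single entry when the keys are deduplicated.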

3. Multi-field deduplication with script_fields, returning only selected fields

Sample code:

from elasticsearch_dsl import connections, Search, Q

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
q = Q('match', provience='北京')
# Add the combined script field and restrict _source to the listed fields
res = s.query(q)\
    .script_fields(age_gender_aggs={'script': {'lang': 'painless', 'source': "'age:' + doc['age'].value + ',gender:' + doc['gender'].value"}})\
    .source(['name', 'age', 'gender', 'address'])

count = 0
for data in res:
    print(data.to_dict(), type(data.to_dict()))
    count += 1
# count is the number of hits returned by this request (10 by default)
print("Returned %d hits" % count)

Output: [result screenshot]

4. Multi-field deduplication with script_fields, returning all fields

Sample code:

from elasticsearch_dsl import connections, Search, Q

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
q = Q('match', provience='北京')
# An empty source list keeps every _source field alongside the script field
res = s.query(q)\
    .script_fields(age_gender_aggs={'script': {'lang': 'painless', 'source': "'age:' + doc['age'].value + ',gender:' + doc['gender'].value"}})\
    .source([])\
    .execute()  # optional: iterating a Search executes it implicitly

count = 0
for data in res:
    print(data.to_dict(), type(data.to_dict()))
    count += 1
# count is the number of hits returned by this request (10 by default)
print("Returned %d hits" % count)

Output: [result screenshot]
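
For readers who think in terms of the raw search API, sections 3 and 4 correspond roughly to a request body like the one built below. This is a hand-written sketch, not output captured from the library: `build_body` is a hypothetical helper, and the exact body elasticsearch_dsl sends can be checked by printing `Search.to_dict()`.

```python
import json

SCRIPT = "'age:' + doc['age'].value + ',gender:' + doc['gender'].value"


def build_body(source_fields=None):
    """Hand-built search body with a script field (illustrative helper)."""
    body = {
        "query": {"match": {"provience": "北京"}},
        "script_fields": {
            "age_gender_aggs": {"script": {"lang": "painless", "source": SCRIPT}}
        },
    }
    # Only add _source when the caller passes a list, as .source(...) does.
    if source_fields is not None:
        body["_source"] = source_fields
    return body


# Section 3: selected fields. Section 4: an empty list, used there for all fields.
selected = build_body(["name", "age", "gender", "address"])
everything = build_body([])
print(json.dumps(selected, ensure_ascii=False, indent=2))
```

The only difference between the two sections is the value of `_source`; the query and the script field stay the same.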

5. Counting distinct multi-field combinations with script_fields

Sample code:

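from elasticsearch_dsl import connections, Search, Q

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
q = Q('match', provience='北京')
res = s.query(q).script_fields(age_gender_aggs={'script': {'lang': 'painless', 'source': "doc['age'].value + doc['gender'].value"}})

# Collect the string form of each hit, then deduplicate on the client with a set.
# Only the hits returned by this request are considered (10 by default);
# slice the Search, e.g. res[0:1000], to cover more documents.
lst = []
for data in res:
    print(data.to_dict(), type(data.to_dict()))
    lst.append(str(data.to_dict()))
unique = set(lst)
print(unique)
print("Found %d distinct combinations" % len(unique))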

Output: [result screenshot]
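
The set-based counting in section 5 can be tried without a cluster. The sketch below uses made-up combined keys (`hit_keys` is hypothetical data, one string per hit) and `collections.Counter` to show both the number of distinct combinations and how many hits fall into each one. It also illustrates, as a general caution, why plain concatenation without a separator can be ambiguous when both values are strings.

```python
from collections import Counter

# Hypothetical script-field values, one per hit (age + gender concatenated).
hit_keys = ["20male", "20male", "22female", "22male", "22female"]

# A set keeps one copy of each combination; Counter also keeps the frequency.
distinct = set(hit_keys)
per_combination = Counter(hit_keys)

print("distinct combinations:", len(distinct))
for key, n in per_combination.most_common():
    print(key, n)

# Two different string pairs can concatenate to the same key...
ambiguous = ("ab" + "c") == ("a" + "bc")
# ...while a separator between the parts keeps them apart.
separated = ("ab" + "," + "c") == ("a" + "," + "bc")
print(ambiguous, separated)
```

With a numeric age and a textual gender this ambiguity is unlikely to matter, but the labelled form from section 2 avoids the question altogether.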

6. Counting distinct multi-field combinations with a script inside an aggregation

Sample code:

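from elasticsearch_dsl import connections, Search, Q, A

# Connect to Elasticsearch
es = connections.create_connection(hosts=['192.168.124.49:9200'], timeout=20)
print(es)

s = Search(using=es, index='person_info')
q = Q('match', provience='北京')
search = s.query(q)
# cardinality is a metric aggregation, so attach it with metric() rather than bucket().
# The script builds one combined value per document; cardinality counts distinct values
# (the count is approximate for large numbers of distinct values).
search.aggs.metric('age_gender_agg',
                   A('cardinality', script={'lang': 'painless', 'source': "doc['age'].value + doc['gender'].value"}))
ret = search.execute()
print(ret)
print(ret.aggregations.age_gender_agg)
# Number of distinct age/gender combinations among the matching documents
print(ret.aggregations.age_gender_agg.value)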

Output: [result screenshot]
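
As a rough companion to section 6, the sketch below writes out by hand the kind of request body the aggregation corresponds to and shows where the distinct count sits in the response. Two things here are assumptions and not from the original post: the `"size": 0` setting (added on the assumption that only the aggregation is wanted) and the `mock_response` fragment, whose value 3 is invented purely to show the shape.

```python
import json

# Hand-written equivalent of the DSL aggregation in section 6 (a sketch).
agg_body = {
    "size": 0,  # assumption: hits are not needed, only the aggregation
    "query": {"match": {"provience": "北京"}},
    "aggs": {
        "age_gender_agg": {
            "cardinality": {
                "script": {
                    "lang": "painless",
                    "source": "doc['age'].value + doc['gender'].value",
                }
            }
        }
    },
}

# Made-up response fragment in the shape a cardinality aggregation returns.
mock_response = {"aggregations": {"age_gender_agg": {"value": 3}}}

# The distinct count is read from aggregations -> <agg name> -> value.
distinct_count = mock_response["aggregations"]["age_gender_agg"]["value"]
print(json.dumps(agg_body, ensure_ascii=False, indent=2))
print("distinct age/gender combinations:", distinct_count)
```

Unlike the client-side set in section 5, this count is computed on the server over all matching documents, not only over the hits returned in one page.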

References:

Retrieve selected fields from a search | Elasticsearch Guide [8.5] | Elastic

API Documentation — Elasticsearch DSL 7.2.0 documentation
