ORCA优化器浅析——Exploration and Implementation Apply CXform Phase
  5b99XfAwWKiH 2023年11月02日 44 0


GPORCA通过规则对表达式做转换,规则分为两类:Exploration,是对逻辑表达式做等价变换的;Implementation,是把逻辑操作符转换为物理操作符。Except the types of operator generated during Exploration (Logical => Logical) and Implementation (Logical => Physical) Phase, the process is similar. There are no Transform / CXform for Optimization Phase, so have excluded it from the below diagram. Will talk about Optimization phase a little later.
由于Exploration和Implementation流程一致,因此合在一起描述主要流程:(本篇文章仅关注第一步和第二步)

  1. Get the Group Expression to optimize
  2. Get the CXform Applicable for it
  3. Xform Pattern directs the Expected Child Group Expression to be extracted from the MEMO for that transform (CXform).
exploration xform pattern code snippet:
CXformExpandNAryJoinDPv2::CXformExpandNAryJoinDPv2(CMemoryPool *mp)
 : CXformExploration(
       // pattern
       GPOS_NEW(mp) CExpression(mp, GPOS_NEW(mp) CLogicalNAryJoin(mp),
                  GPOS_NEW(mp) CExpression(mp, GPOS_NEW(mp) CPatternMultiLeaf(mp)),
                  GPOS_NEW(mp) CExpression(mp, GPOS_NEW(mp) CPatternTree(mp))))
{}
Above exploration pattern represents the below tree shape
+-CLogicalNAryJoin
  |--CPatternMultiLeaf
  +--CPatternTree

implementation xform pattern code snippet:
CXformImplementUnionAll::CXformImplementUnionAll(CMemoryPool *mp)
:  // pattern
CXformImplementation(GPOS_NEW(mp) CExpression(
          mp, GPOS_NEW(mp) CLogicalUnionAll(mp),
                          GPOS_NEW(mp) CExpression(mp, GPOS_NEW(mp) CPatternMultiLeaf(mp))))
{}
Above implementation pattern represents the below tree shape
+--CLogicalUnionAll
   +--CPatternMultiLeaf
  1. Construct the expression tree by pulling out Group Expression from the MEMO matching the pattern. If there is no match, that’s fine, it just means that the transform is not applicable in this case
  2. Apply the ::Transform method of the xform to produce the resulting expressions
  3. Insert the resulting expression back to the MEMO
  4. Newly inserted expressions will be eligible to be pulled if they have PxfsCandidates to be applied, and will go through Step 1 -> 6
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
                              │    6.Insert the resulting Expressions Back to the MEMO                                                                                                                                       │
                              │                                                                                                                                                                                              │
                              │                                                                                                                                                                                              │
                              ▼                                                                                                                                                                                              │
                                                                                                                                                                                                                             │
┌───────────────────────────────┐                                                                                                                                                                                            │
│MEMO                           │                                                                                                                                                                                            │
│                               │                                                                                                                                                                                            │
│                               │                                                                                                                                                                                            │
│  ROOT                         │ 4.Extract the Group Expressions from the MEMO based on                                                                                                                                     │
│ ┌───────────────────────────┐ │    the Xform Pattern                                                                                                                                                                       │
│ │ Group N                   │ │ ◄─────────────────────────────────────────────────────────────────────┐                                                                                                                    │
│ │ Group Expression 1[x x]   │ │                                                                       │                                                                                                                    │
│ │                           ├─┼──────┐                                                                │                                                                                                                    │
│ └───────────────────────────┘ │      │                                                                │                                                                                                                    │
│     .                         │      │                                                                │                                                                                                                    │
│     .                         │      │                                                                │                                                                                                                    │
│     .                         │      │                                                                │                                                                   Possible Number of Outputs >= 0                  │
│ ┌───────────────────────────┐ │      │1. Get the Group Expression                                     │                                                                                                                    │
│ │ Group 2                   │ │      │                                                                │                                                                                                                    │
│ │ Group Expression 2[x x x] │ │      │                                                                │                                                                   ┌─────────────────────────────────────────────┐  │
│ └───────────────────────────┘ │      ▼                                                                │                                                               ┌──►│ Output Logical / Physical Expression Tree 1 ├──┤
│                               │  ┌──────────────────────────────────┐      ┌──────────────────────────┴────────────────────────────┐   ┌───────────────┐              │   └─────────────────────────────────────────────┘  │
│ ┌───────────────────────────┐ │  │                                  │      │                            Xform Pattern              │   │4.Constructed  │              │                                                    │
│ │ Group 1                   │ │  │  Group Expression 2 - Xform 1    ├─────►│  Group Expression [x x x]  C<Operator><Identifier>    │   │   Logical     │     ┌────────┴───────────┐                                        │
│ │ Group Expression 3[x x x] │ │  │                                  │      │                                      +-CPatternLeaf   ├──►│     or        ├────►│   5.Apply Xform    │                                        │       
│ │                           │ │  │  Group Expression 2 - Xform 2    │      │                                      +-CPatternLeaf   │   │   Physical    │     └────────┬───────────┘                                        │
│ └───────────────────────────┘ │  │                                  │      │                                      +-CPatternTree   │   │   Expression  │              │                                                    │
│                               │  │  Group Expression 2 - Xform 3    │      │3.Take the 1st GE - Xform combination and lookup the   │   │               │              │   ┌─────────────────────────────────────────────┐  │
└───────────────────────────────┘  │  (All Combination for GE  )      │      │     child GEs from MEMO based on the Xform Pattern    │   │               │              └──►│ Output Logical / Physical Expression Tree 2 ├──┘
                                   └──────────────────────────────────┘      └──────────────┬────────────────────────────────────────┘   └───────────────┘                  └─────────────────────────────────────────────┘
┌───────────────────────────────┐      ▲                             ▲                      │
│ Xform PxfsCandidates for      │      │                             │                      │
│ operator of Group Expression  │      │                             │                      │
│                               │      │2. Get Exploration           │                      │
│  Exploration:                 │      │          or                 │                      │
│   Xform 1                     │      │   Implementation            └──────────────────────┘
│   Xform 2                     │      │         Xform                   7. Take the next combination
│   Xform 3                     │      │    for the Group
│                               │      │     Expression
│  Implementation:              │ ◄────┘
│   Xform 4                     │
│   Xform 5                     │
│   Xform n                     │
│                               │
└───────────────────────────────┘

CJobGroupExpressionImplementation和CJobGroupExpressionExploration

ORCA优化器浅析——Exploration and Implementation Apply CXform Phase_sed


如上图所示导出红线的虚线直角框图就是CJobGroupExpressionImplementation和CJobGroupExpressionExploration任务进行Apply CXform(Get the Group Expression to optimize和Get the CXform Applicable for it)的流程。获取Group Expression就无需解释,其实就是这些任务中的m_pgexpr成员。而Get the CXform Applicable for it流程中,首先需要获取Group Expression中的操作副的规则位图 【每个操作符,都存有自己需要的规则ID构成的规则的子集,这里面包括了逻辑变换和实现变换的规则】,如下代码中的PxfsCandidates就是获取操作符的规则位图,从下图中的CLogicalExternalGet代码中可以看出其规则位图为xform_set->ExchangeSet(cxform::ExfExternalGet2ExternalScan);然后需要和逻辑变换集合或实现变换集合取与【Exploration阶段,拿操作符的规则集和逻辑变换集合取与,得到操作符这个阶段需要的规则;Implementation阶段,拿操作符的规则集和实现变换集合取与,得到操作符这个阶段需要的规则集】,如下图代码CXformFactory::Pxff()->PxfsExploration()CXformFactory::Pxff()->PxfsImplementation()就是获取CXformFactory中的m_pxfsExploration和m_pxfsImplementation,它们分别为bitset of exploration xforms和bitset of implementation xforms。

void CJobGroupExpressionExploration::ScheduleApplicableTransformations(CSchedulerContext *psc) {
	// get all applicable xforms
	COperator *pop = m_pgexpr->Pop();
	CXformSet *xform_set = CLogical::PopConvert(pop)->PxfsCandidates(psc->GetGlobalMemoryPool());

	// intersect them with required xforms and schedule jobs
	xform_set->Intersection(CXformFactory::Pxff()->PxfsExploration());
	xform_set->Intersection(psc->Peng()->PxfsCurrentStage());
	ScheduleTransformations(psc, xform_set);
	xform_set->Release();

	SetXformsScheduled();
}

void CJobGroupExpressionImplementation::ScheduleApplicableTransformations(CSchedulerContext *psc){
	// get all applicable xforms
	COperator *pop = m_pgexpr->Pop();
	CXformSet *xform_set = CLogical::PopConvert(pop)->PxfsCandidates(psc->GetGlobalMemoryPool());

	// intersect them with required xforms and schedule jobs
	xform_set->Intersection(CXformFactory::Pxff()->PxfsImplementation());
	xform_set->Intersection(psc->Peng()->PxfsCurrentStage());
	ScheduleTransformations(psc, xform_set);
	xform_set->Release();

	SetXformsScheduled();
}

ORCA优化器浅析——Exploration and Implementation Apply CXform Phase_sed_02


总结概述一下就是:Exploration阶段,拿操作符的规则集和逻辑变换集合取与,得到操作符这个阶段需要的规则;Implementation阶段,拿操作符的规则集和实现变换集合取与,得到操作符这个阶段需要的规则集。

ORCA优化器浅析——Exploration and Implementation Apply CXform Phase_Express_03


GPORCA通过规则工程管理规则集,包括Exploration和Implementation。上图中,0和1有8列,表示规则集。为了简单,我们假设CLogicalGet需要的规则ID是4。LogicalGet所在的行,左边的规则集记录了LogicalGet需要的规则。可以看到第四位被设置为1,可以知道LogicalGet需要ID为4的规则进行转换。Exploration所在的行的规则集,表示所有的规则的集合。5,6,7,8位被设置为1,所以Exploration的规则集是ID为{5,6,7,8}的规则集。同样的,Implementation的规则集为{1,2,3,4}

如下代码即是从上述流程中获取交集的xform规则(即是CXform类的子类)集合中遍历CXform类,为表达式m_pgexpr应用对应的xform规则创建CJobTransformation任务。

//---------------------------------------------------------------------------
//	@function:
//		CJobGroupExpression::ScheduleTransformations
//	@doc:
//		Schedule transformation jobs for the given set of xforms
//---------------------------------------------------------------------------
void CJobGroupExpression::ScheduleTransformations(CSchedulerContext *psc, CXformSet *xform_set) {	
	CXformSetIter xsi(*(xform_set)); // iterate on xforms
	while (xsi.Advance()){
		CXform *pxform = CXformFactory::Pxff()->Pxf(xsi.TBit());
		CJobTransformation::ScheduleJob(psc, m_pgexpr, pxform, this);
	}
}

CJobTransformation

如下为CJobTransformation任务实际执行的函数,其从类成员中获取到表达式和规则,并调用表达式的Transform方法实现【Xform Pattern directs the Expected Child Group Expression to be extracted from the MEMO for that transform (CXform);Construct the expression tree by pulling out Group Expression from the MEMO matching the pattern. If there is no match, that’s fine, it just means that the transform is not applicable in this case;Apply the ::Transform method of the xform to produce the resulting expressions】,最终Insert the resulting expression back to the MEMO。

CJobTransformation::EEvent CJobTransformation::EevtTransform(CSchedulerContext *psc, CJob *pjOwner) {	
	CJobTransformation *pjt = PjConvert(pjOwner); // get a job pointer
	CMemoryPool *pmpGlobal = psc->GetGlobalMemoryPool(); CMemoryPool *pmpLocal = psc->PmpLocal();
	CGroupExpression *pgexpr = pjt->m_pgexpr; CXform *pxform = pjt->m_xform;

    ULONG ulElapsedTime = 0; ULONG ulNumberOfBindings = 0;
	// insert transformation results to memo
	CXformResult *pxfres = GPOS_NEW(pmpGlobal) CXformResult(pmpGlobal);	
	pgexpr->Transform(pmpGlobal, pmpLocal, pxform, pxfres, &ulElapsedTime, &ulNumberOfBindings);
	psc->Peng()->InsertXformResult(pgexpr->Pgroup(), pxfres, pxform->Exfid(), pgexpr, ulElapsedTime, ulNumberOfBindings);
	pxfres->Release();
	
	return eevCompleted;
}

https://github.com/greenplum-db/gpdb/wiki https://mp.weixin.qq.com/s?__biz=MzA4NjQzNjUyOA==&mid=2650571780&idx=1&sn=1b14e76622cb70b6dd008b4927df8c5f&chksm=87c09bdbb0b712cd8228fab3f2f44fa2aecab0ece6f5d9cdb199efb1ca34bdce58d8f46b92bd&scene=27


【版权声明】本文内容来自摩杜云社区用户原创、第三方投稿、转载,内容版权归原作者所有。本网站的目的在于传递更多信息,不拥有版权,亦不承担相应法律责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@moduyun.com

  1. 分享:
最后一次编辑于 2023年11月08日 0

暂无评论

推荐阅读
  bzUvzvVq9oY1   2023年11月02日   46   0   0 数据类型json数据库
  JiJ96DoSHEh4   2023年11月13日   28   0   0 分隔符字段sed
  JiJ96DoSHEh4   2023年11月13日   174   0   0 上传文件列表sed
  20xfzlOvosRH   2023年12月05日   29   0   0 mysql数据库
  JiJ96DoSHEh4   2023年11月19日   26   0   0 bashbcsed
5b99XfAwWKiH
最新推荐 更多