×
技术社区 >  技术博客 >  OceanBase 索引创建进度查询指南

OceanBase 索引创建进度查询指南

做 OceanBase 运维或开发的同学,一定遇到过这样的场景:
执行了索引创建语句后,看着控制台一动不动,不知道索引到底建到哪了?是还在扫描数据,还是已经进入排序阶段?会不会中途失败了?

尤其是面对大数据量表,索引创建可能要耗时几小时甚至更久,“盲等”不仅浪费时间,还容易错过问题排查时机。

今天就给大家整理一份「OceanBase 索引创建进度查询指南」,新手也能轻松上手!

一、可用的监控视图概览

OceanBase 提供了多个视图来查询 DDL(含索引创建)的执行进度:

视图 用途 数据来源
GV$SESSION_LONGOPS 推荐首选。实时进度,包含行数、子阶段信息 DDL 任务的 collect_longops_stat() 动态生成
__all_virtual_ddl_task_status DDL 任务状态(持久化),包含状态码和原始 DDL 语句 __all_ddl_task_status 系统表
__all_virtual_table 查看索引可用状态 Schema 元数据
GV$SQL_PLAN_MONITOR 深入分析排序线程级别的执行统计 SQL 执行引擎
__all_virtual_ddl_diagnose_info DDL 诊断信息 聚合 task_status + error_message
__all_virtual_ddl_error_message DDL 错误详情 DDL 执行错误记录

二、快速判断索引是否创建完成

方法 1:查看索引状态(最简单)

SELECT table_id, table_name, index_status
FROM oceanbase.__all_virtual_table
WHERE table_name like '__idx%索引名%';

index_status 含义:

含义
1 INDEX_STATUS_UNAVAILABLE — 构建中
2 INDEX_STATUS_AVAILABLE可用(创建完成)
3 INDEX_STATUS_UNIQUE_CHECKING — 唯一性校验中

示例:

__idx_3749302_idx_longtext 索引在创建中,index_status = 1;

__idx_3749302_idx_longtext 索引已构建完成, index_status = 2;

方法 2:查看 DDL 任务状态

SELECT task_id, status, ret_code, ddl_stmt_str
FROM oceanbase.__all_virtual_ddl_task_status
WHERE tenant_id = 
  AND status NOT IN (99, 100);  -- 排除已完成/已失败

status 关键值(源码 ObDDLTaskStatus 枚举,定义于 src/share/ob_ddl_common.h):

状态 说明
0 PREPARE 准备阶段
2 WAIT_TRANS_END 等待事务结束
3 REDEFINITION 数据回填(最耗时阶段)
4 VALIDATE_CHECKSUM 校验数据一致性
6 TAKE_EFFECT 索引生效
99 FAIL 失败
100 SUCCESS 成功

查询示例:

方法 3:查看 LONGOPS 是否还有记录

SELECT COUNT(*) FROM oceanbase.GV$SESSION_LONGOPS;

无记录则所有 DDL 长操作已完成。


三、查看详细执行进度

3.1 GV$SESSION_LONGOPS(推荐)

SELECT SID, OPNAME, TARGET, ELAPSED_SECONDS, TIME_REMAINING, MESSAGE
FROM oceanbase.GV$SESSION_LONGOPS;

输出示例:

-- ROW_SCANNED 阶段输出示例
ocp_monitor@[oceanbase]>SELECT SID, OPNAME, TARGET, ELAPSED_SECONDS, TIME_REMAINING, MESSAGE FROM oceanbase.GV$SESSION_LONGOPS;
+-----+--------------+---------+-----------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SID | OPNAME       | TARGET  | ELAPSED_SECONDS | TIME_REMAINING | MESSAGE                                                                                                                                                                                                                                                                  |
+-----+--------------+---------+-----------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  -1 | create index | 3749303 |              21 |              0 | TENANT_ID: 1002, TASK_ID: 74233447, STATUS: REPLICA BUILD, SCANNING, PARALLELISM: 1, ROW_COUNT_INFO:{ ROW_SCANNED: 4315904, ROW_SORTED: 0, ROW_INSERTED: 0 }, SCAN_INFO:{ SCAN_TIME_ELAPSED: 21.000s, MAX_THREAD_ROW_SCANNED: 4315904, MIN_THREAD_ROW_SCANNED: 4315904 } |
+-----+--------------+---------+-----------------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.376 sec)


-- ROW_SORTED 阶段输出示例
ocp_monitor@[oceanbase]>SELECT SID, OPNAME, TARGET, ELAPSED_SECONDS, TIME_REMAINING, MESSAGE FROM oceanbase.GV$SESSION_LONGOPS;
+-----+--------------+---------+-----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SID | OPNAME       | TARGET  | ELAPSED_SECONDS | TIME_REMAINING | MESSAGE                                                                                                                                                                                                                                                                                                                                                                                                    |
+-----+--------------+---------+-----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  -1 | create index | 3749303 |              51 |              0 | TENANT_ID: 1002, TASK_ID: 74233447, STATUS: REPLICA BUILD, SORT_PHASE1, PARALLELISM: 1, SORT_PHASE1_THREAD_NUM: 1, ROW_COUNT_INFO:{ ROW_SCANNED: 6402493, ROW_SORTED: 709701, ROW_INSERTED: 0 }, SORT_PHASE1_PROGRESS_INFO:{ SORT_PHASE1_TIME_ELAPSED: 19.139s, SORT_PHASE1_PROGRESS: 11.08%, SORT_PHASE1_TIME_REMAINING: 153.521s }, SLOWEST_THREAD_INFO:{ THREAD_ID: 2572679, SORTED_ROW_COUNT: 709701 } |
+-----+--------------+---------+-----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.318 sec)


-- ROW_INSERTED 阶段输出示例:
ocp_monitor@[oceanbase]>SELECT SID, OPNAME, TARGET, ELAPSED_SECONDS, TIME_REMAINING, MESSAGE FROM oceanbase.GV$SESSION_LONGOPS;
+-----+--------------+---------+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SID | OPNAME       | TARGET  | ELAPSED_SECONDS | TIME_REMAINING | MESSAGE                                                                                                                                                                                                                                                                                                                                                                           |
+-----+--------------+---------+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  -1 | create index | 3749303 |             206 |              0 | TENANT_ID: 1002, TASK_ID: 74233447, STATUS: REPLICA BUILD, INSERT, PARALLELISM: 1, INSERT_THREAD: 1, ROW_COUNT_INFO:{ ROW_SCANNED: 6402493, ROW_SORTED: 6402493, ROW_INSERTED: 1842400 }, INSERT_PROGRESS_INFO:{ INSERT_TIME_ELAPSED: 23.460s, INSERT_PROGRESS: 28.78%, INSERT_TIME_REMAINING: 58.064s }, SLOWEST_THREAD_INFO:{ THREAD_ID: 2572679, INSERTED_ROW_COUNT: 1842400 } |
+-----+--------------+---------+-----------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.324 sec)

关键字段解读:

字段 含义
ROW_SCANNED 已扫描的源表行数(扫描完成后不再变化)
ROW_SORTED 排序操作累计处理行数(注意:不等于已排序的唯一行数
ROW_INSERTED 已写入索引表的行数
ELAPSED_SECONDS 已执行时间(秒)
TIME_REMAINING 预估剩余时间(秒),该字段未实现,
PARALLELISM 并行度
MESSAGE SCAN_INFO/SORT_PHASE1_PROGRESS_INFO/INSERT_PROGRESS_INFO 有各个阶段的详细信息,该阶段的已执行时间、进度和线程信息等。

四、索引创建的完整执行流程

索引创建在 REDEFINITION(status=3)阶段完成数据回填,内部分为以下子阶段:

SCAN(扫描源表)
  │
  ▼
SORT_PHASE1(内存排序,生成有序块写入临时文件)
  │
  ▼
SORT_PHASE2(外部归并排序,可能多轮)
  │
  ▼
INSERT(将排序结果写入索引表)
  │
  ▼
VALIDATE_CHECKSUM → TAKE_EFFECT(校验 + 生效)

源码位置:src/rootserver/ddl_task/ob_index_build_task.cppObIndexBuildTask::collect_longops_stat()


五、ROW_SORTED 的正确理解

5.1 ROW_SORTED 可能大于 ROW_SCANNED

这是正常现象。源码(src/share/ob_ddl_common.cpp)中:

// 显示的 ROW_SORTED = row_sorted_ + row_merge_sorted_
row_scanned_, row_sorted_ + row_merge_sorted_, row_inserted_file_
  • row_sorted_:Phase1 内存排序处理的行数(≈ N)
  • row_merge_sorted_:Phase2 归并排序额外处理的行数

归并排序每轮都会处理全部 N 行,如果有 R 轮,则 row_merge_sorted_ ≈ N × R


六、常用 SQL 速查

查看所有正在执行的 DDL 任务

SELECT task_id, tenant_id, ddl_type, status, trace_id,
       gmt_create, gmt_modified
FROM oceanbase.__all_virtual_ddl_task_status
WHERE status NOT IN (99, 100);

查看索引创建实时进度

SELECT now(), SID, TARGET AS index_table_id,
       ELAPSED_SECONDS, TIME_REMAINING, MESSAGE
FROM oceanbase.GV$SESSION_LONGOPS
WHERE OPNAME = 'create index';

查看索引是否已可用

SELECT table_id, table_name, index_status
FROM oceanbase.__all_virtual_table
WHERE table_name like '__idx%索引名%';

查看排序线程级详情

SELECT PROCESS_NAME, PLAN_LINE_ID, PLAN_OPERATION,
       OUTPUT_ROWS,
       OTHERSTAT_1_VALUE AS row_sorted,
       OTHERSTAT_2_VALUE AS merge_sort_round,
       OTHERSTAT_6_VALUE AS dump_size_bytes,
       OTHERSTAT_7_VALUE AS row_count,
       OTHERSTAT_8_VALUE AS expected_round_count,
       FIRST_CHANGE_TIME, LAST_CHANGE_TIME
FROM GV$SQL_PLAN_MONITOR
WHERE TRACE_ID = ''
  AND PLAN_OPERATION = 'PHY_SORT'
ORDER BY PROCESS_NAME;

> trace_id 可从 __all_virtual_ddl_task_statusGV$SESSION_LONGOPS 获取。

查看 DDL 失败原因

SELECT * FROM oceanbase.__all_virtual_ddl_error_message
WHERE task_id = ;

常见失败原因:磁盘空间不足


七、关键源码文件索引

功能 源码路径
DDL 任务状态枚举 src/share/ob_ddl_common.hObDDLTaskStatus
索引构建任务 & longops 采集 src/rootserver/ddl_task/ob_index_build_task.cppcollect_longops_stat()
进度消息生成(详细 & 简化) src/share/ob_ddl_common.cppgenerate_session_longops_message()
ROW_SORTED 计算逻辑 src/share/ob_ddl_common.cppcalculate_merge_sort_info()
外部排序实现 & expected_rounds src/sql/engine/sort/ob_sort_op_impl.cppsort() 方法
归并路数 k 的确定 src/sql/engine/sort/ob_sort_op_impl.cppbuild_ems_heap()
MAX_MERGE_WAYS / BLOCK_SIZE src/sql/engine/sort/ob_sort_op_impl.h / src/sql/engine/basic/ob_chunk_datum_store.h
DDL 任务状态表定义 src/share/inner_table/ob_inner_table_schema.301_350.cpp
LONGOPS 虚拟表实现 src/observer/virtual_table/ob_all_virtual_long_ops_status.cpp
DDL 诊断视图实现 src/observer/virtual_table/ob_all_virtual_ddl_diagnose_info.cpp

八、注意事项与最佳实践

  1. 并行度选择:可以在索引创建时间和集群负载之间选择合适的并行度。
  2. ROW_SORTED 不能直接算百分比:需要结合 expected_rounds 才能计算准确进度。直接用 ROW_SORTED / ROW_SCANNED 在 SORT_PHASE2 阶段会超过 100%。
  3. 排序行长 ≠ 表行长:估算排序内存用量时,应使用索引列 + 主键列的未压缩大小,而非表级别的压缩后平均行长。

精选推荐