[GaussDB] Using gdb to Locate a GaussDB Package Compilation Error


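Background

During a migration from Oracle to GaussDB, the application developers created their modified package in GaussDB. The creation produced no ERROR and no WARNING, but compiling invalid objects afterwards failed. We knew which package failed to compile, but it was over ten thousand lines long and contained dozens of procedures, and the only error message was ERROR: Failed to query the 323 type in the cache. There was no context and not even a line number, so there was no way to tell where the problem was.

Basic troubleshooting

  • Dropped the package, recreated it, and compiled again: same result
  • Restarted the database to clear the global PL/SQL cache and compiled again: same result

This means the problem is unrelated to the cache and most likely unrelated to other dependent objects, so for now the investigation focuses on this package alone.

For someone who cannot use gdb, the only way to investigate is to bisect the package: delete procedures from it until the error changes before and after removing a particular procedure, while paying attention to the dependencies between the procedures. That is how it was done at the time, and it did eventually reveal the cause, but it took a very long time, and resolving the dependencies required editing the code by hand.

So is there a way to quickly pinpoint which procedure is at fault?
Of course: debug the kernel directly with gdb, because when a Gauss-family database compiles a package, it compiles each procedure and function in it one by one.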

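Preparing for gdb debugging

Since the characteristics of the package that triggers this error are already known, the following demonstration uses a minimal reproduction:

Test case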

create package pkg_test_4 is
procedure p1(i1 in varchar2,i2 out varchar2,i3 out varchar2);
end;  
/
create package body pkg_test_4 is
procedure p1(i1 in varchar2,i2 in varchar2,i3 out varchar2) is begin null; end;
end;  
/
alter package pkg_test_4 compile;

Result

gaussdb=# alter package pkg_test_4 compile;
gaussdb=# create or replace package pkg_test_4 is
gaussdb$# procedure p1(i1 in varchar2,
gaussdb$#              i2 out varchar2,
gaussdb$#              i3 out varchar2);
gaussdb$# end;
gaussdb$# /
CREATE PACKAGE
gaussdb=# create or replace package body pkg_test_4 is
gaussdb$# procedure p1(i1 in varchar2,
gaussdb$#              i2 in varchar2,
gaussdb$#              i3 out varchar2) is
gaussdb$#              begin
gaussdb$#                null;
gaussdb$#              end;
gaussdb$# end;
gaussdb$# /
CREATE PACKAGE BODY
gaussdb=# alter package pkg_test_4 compile;
ERROR:  Failed to query the 323 type in the cache.
gaussdb=#

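One key point when using gdb to track down a problem: the problem should ideally be reliably reproducible. Otherwise gdb cannot catch the error as it happens, and the problem is hard to analyze.

Also, before starting gdb, be sure to have the symbol files for the exact version in place. A simple way is to extract the bin and lib directories from the symbol package directly into GaussDB's bin and lib directories.

When I analyzed MogDB problems in the past, our kernel developers taught me to set a breakpoint with b errstart if elevel>19 to stop on every error of level ERROR or above, but this trick no longer seems to work in GaussDB: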

(gdb) b errstart if elevel>19
No symbol "elevel" in current context.
(gdb)

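A plain b errstart does break, but it breaks constantly and the process can never really run, because this function is called even when nothing has gone wrong and nearly every thread passes through it frequently. See the error level codes in the openGauss source, which include even INFO and NOTICE: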

/* Error level codes */
#define DEBUG5 10 /* Debugging messages, in categories of decreasing detail. */
#define DEBUG4 11
#define DEBUG3 12
#define DEBUG2 13
#define DEBUG1 14 /* used by GUC debug_* variables */
#define LOG 15 /* Server operational messages; sent only to server log by default. */
#define COMMERROR 16 /* Client communication problems; same as LOG for server reporting, but never sent to client. */
#define INFO 17 /* Messages specifically requested by user (eg VACUUM VERBOSE output); always sent to client regardless of client_min_messages, but by default not sent to server log. */
#define NOTICE 18 /* Helpful messages to users about query operation; sent to client and server log by default. */
#define WARNING 19 /* Warnings.  NOTICE is for expected messages like implicit sequence creation by SERIAL. WARNING is for unexpected messages. */
#define ERROR 20 /* user error - abort transaction; return to known state */
#define VERBOSEMESSAGE 9 /* indicates to show verbose info for CN and DNs; for DNs means to send info back to CN */
/* Save ERROR value in PGERROR so it can be restored when Win32 includes
 * modify it.  We have to use a constant rather than ERROR because macros
 * are expanded only when referenced outside macros.
 */
#ifdef WIN32
#define PGERROR 20
#endif
#define FATAL 21 /* fatal error - abort process */
#define PANIC 22 /* take down the other backends with me */
/* MAKE_SQLSTATE('P', '1', '0' , '0', '0')=96 */
#define CUSTOM_ERRCODE_P1 96
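
To make the condition elevel > 19 concrete, here is a small Python sketch that copies the level codes from the listing above into a dictionary and filters them the same way a conditional breakpoint would. It assumes GaussDB uses the same numeric values as the openGauss source quoted here, which the article does not verify; the names LEVELS and levels_hit are made up for this illustration.

```python
# Level codes copied from the openGauss listing above.
# Assumption: GaussDB uses the same numeric values.
LEVELS = {
    "VERBOSEMESSAGE": 9,
    "DEBUG5": 10, "DEBUG4": 11, "DEBUG3": 12, "DEBUG2": 13, "DEBUG1": 14,
    "LOG": 15, "COMMERROR": 16, "INFO": 17, "NOTICE": 18, "WARNING": 19,
    "ERROR": 20, "FATAL": 21, "PANIC": 22,
}


def levels_hit(threshold=19):
    """Return the names of the levels for which `elevel > threshold` is true."""
    return [name for name, code in LEVELS.items() if code > threshold]


print(levels_hit())  # ['ERROR', 'FATAL', 'PANIC']
```

Everything at WARNING or below is filtered out by the condition, which is what makes a conditional breakpoint usable on a busy server where errstart is called constantly.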

Let's see where b errstart actually sets its breakpoints:

(gdb) b errstart
Breakpoint 1 at 0x564904871b20: errstart. (3 locations)
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   <MULTIPLE>
1.1                         y   0x0000564904871b20 in errstart(int, char const*, int, char const*, char const*) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:4108
1.2                         y   0x00007f6b9cd5e5b0 <errstart(int, char const*, int, char const*, char const*)@plt>
1.3                         y   0x00007f6b9d03eb30 <errstart(int, char const*, int, char const*, char const*)@plt>
(gdb)

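According to this breakpoint information, errstart is at line 4108 of elog.cpp. That is suspicious, because in both openGauss and MogDB the errstart function sits much earlier in the file, somewhere around line two hundred.
Since the source code is not available, the only option is to disassemble the function and look for clues: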

(gdb) disassemble /m errstart
Dump of assembler code for function errstart(int, char const*, int, char const*, char const*):
238     /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp: No such file or directory.
   0x0000564904871b25 <+5>:     push   %rbp
   0x0000564904871b26 <+6>:     mov    %rsp,%rbp
   0x0000564904871b29 <+9>:     push   %r15
   0x0000564904871b2b <+11>:    push   %r14
   0x0000564904871b2d <+13>:    push   %r13
   0x0000564904871b2f <+15>:    push   %r12
   0x0000564904871b31 <+17>:    mov    %rsi,%r14
   0x0000564904871b34 <+20>:    push   %rbx
   0x0000564904871b35 <+21>:    mov    %edi,%ebx
   0x0000564904871b37 <+23>:    sub    $0x58,%rsp
   0x0000564904871b3e <+30>:    mov    %edx,-0x64(%rbp)
   0x0000564904871b41 <+33>:    mov    %rcx,-0x70(%rbp)
   0x0000564904871b45 <+37>:    mov    %r8,-0x78(%rbp)
239     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp
240     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp

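The first line that appears for this function is actually 238, and judging from the register operations (saving the callee-saved registers and setting up the stack frame), this should be the function entry. In other words, the function is actually defined at line 238, not at line 4108. I also observed that elevel at line 4108 is always optimized out and its value cannot be read; only at line 238 can the value of elevel be observed.
So in GaussDB, to break on errors of level ERROR and above, the breakpoint should be set as: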

b elog.cpp:238 if (elevel > 19)

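You can prepare the following commands in advance and paste them in as soon as gdb attaches, to minimize how long the process stays paused: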

b elog.cpp:238 if (elevel > 19) 
handle SIGUSR1 nostop noprint
handle SIGUSR2 nostop noprint
handle SIGPIPE nostop
set pagi off
set print elements 300
continue
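
As an alternative to copy and paste, the same commands can be stored in a file and handed to gdb with its -x option, so that they run as soon as gdb has attached. The Python sketch below only writes such a file and builds the command line; it does not start gdb. The file name gaussdb_error_bp.gdb and the helper names are made up for this illustration, set pagi off is spelled out as set pagination off, and line 238 only applies to the exact GaussDB build examined in this article.

```python
import os
import tempfile

# The commands from the listing above. Line 238 is specific to the build
# analyzed in this article and will differ on other versions.
GDB_COMMANDS = """\
b elog.cpp:238 if (elevel > 19)
handle SIGUSR1 nostop noprint
handle SIGUSR2 nostop noprint
handle SIGPIPE nostop
set pagination off
set print elements 300
continue
"""


def write_command_file(directory):
    """Write the gdb commands to a file (hypothetical name) and return its path."""
    path = os.path.join(directory, "gaussdb_error_bp.gdb")
    with open(path, "w") as f:
        f.write(GDB_COMMANDS)
    return path


def gdb_argv(pid, command_file):
    """Build the argument list that attaches to `pid` and runs the command file."""
    return ["gdb", "-p", str(pid), "-x", command_file]


with tempfile.TemporaryDirectory() as workdir:
    command_file = write_command_file(workdir)
    # Print the command line instead of running it, so nothing is attached here.
    print(" ".join(gdb_argv(3864702, command_file)))
```

Because the file ends with continue, the server resumes right after the breakpoint is installed, which keeps the pause short.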

Starting the gdb session

First find the process ID with ps -ef |grep gaussdb,
then attach with gdb -p <PID>:

[gaussdb506@ky10-sp3 ~]$ ps -ef |grep gaussdb
root      426694  426551  0 09:19 pts/0    00:00:00 su - gaussdb506
gaussdb+  426699  426694  0 09:19 pts/0    00:00:00 -bash
gaussdb+  427027  426699  0 09:19 pts/0    00:00:00 ps -ef
gaussdb+  427028  426699  0 09:19 pts/0    00:00:00 grep gaussdb
og_last+ 3231792       1  1 Jul22 ?        05:58:52 /opt/og_lastest/openGauss-server/dest/bin/gaussdb
og700rc1 3508544       1  1 Aug04 ?        00:33:58 /opt/og700rc1/app/bin/gaussdb -D /opt/og700rc1/data -M primary
gaussdb+ 3864702       1 29 Aug05 ?        06:33:50 /data/gaussdb506/app/bin/gaussdb
[gaussdb506@ky10-sp3 ~]$ gdb -p 3864702
GNU gdb (GDB) KylinOS 9.2-3.p01.ky10
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-kylin-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3864702
[New LWP 3864703]
[New LWP 3864749]
... (output omitted)
[New LWP 4081944]
warning: File "/usr/lib64/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
--Type <RET> for more, q to quit, c to continue without paging--
To enable execution of this file add
        add-auto-load-safe-path /usr/lib64/libthread_db-1.0.so
line to your configuration file "/home/gaussdb506/.gdbinit".
To completely disable this security protection add
        set auto-load safe-path /
line to your configuration file "/home/gaussdb506/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
        info "(gdb)Auto-loading safe path"
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
warning: File "/usr/lib64/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
0x00007f6bbc53c849 in poll () from /usr/lib64/libc.so.6
(gdb) b elog.cpp:238 if (elevel > 19)
Breakpoint 1 at 0x564904871b25: file /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp, line 238.
(gdb) handle SIGUSR1 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) handle SIGUSR2 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR2       No        No      Yes             User defined signal 2
(gdb) handle SIGPIPE nostop
Signal        Stop      Print   Pass to program Description
SIGPIPE       No        Yes     Yes             Broken pipe
(gdb) set pagi off
(gdb) set print elements 300
(gdb) continue
Continuing.
[New LWP 427599]
[New LWP 427600]
[LWP 427599 exited]
[New LWP 427601]
[LWP 427601 exited]
[New LWP 427602]
[LWP 427602 exited]
[LWP 427600 exited]

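Once [New LWP xxxxxx] lines keep appearing, gaussdb is running normally again.
Next, open a client connection and run the test SQL above. The session hangs on the alter package pkg_test_4 compile; statement, and the gdb window stops printing [New LWP xxxxxx] lines and reports that the breakpoint was hit: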

[Switching to LWP 3864766]
Thread 16 "TPLworker" hit Breakpoint 1, errstart (elevel=20, filename=0x564908abfcc8 "format_type.cpp", lineno=216, funcname=0x564908abfdd0 <format_type_internal(unsigned int, int, bool, bool, bool)::__func__> "format_type_internal", domain=0x5649087e1004 "plpgsql-9.2") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:238
238     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp
(gdb)

Then type bt to view the call stack:

(gdb) bt
#0  errstart (elevel=20, filename=0x564908abfcc8 "format_type.cpp", lineno=216, funcname=0x564908abfdd0 <format_type_internal(unsigned int, int, bool, bool, bool)::__func__> "format_type_internal", domain=0x5649087e1004 "plpgsql-9.2") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:238
#1  0x00005649043cc8ae in format_type_internal (type_oid=323, typemod=-1, typemod_given=<optimized out>, allow_invalid=<optimized out>, include_nspname=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/format_type.cpp:212
#2  0x00005649045b82e8 in format_procedure (procedure_oid=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/regproc.cpp:473
#3  0x0000564904a2d53d in do_compile (fcinfo=0x7f6a520460c0, proc_tup=0x7f69b7c375a0, func=0x7f6a39c64050, compile_func_head_info=0x7f6a52046740, for_validator=true, hashkey=0x7f6a52045d50) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:921
#4  0x0000564904a35f3b in gsplsql_compile (fcinfo=0x7f6a520460c0, compile_func_head_info=0x7f6a52046740, for_validator=true, isRecompile=false, func_runtime_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:3106
#5  0x0000564906c64eeb in plpgsql_validator (fcinfo=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/plpgsql/src/pl_handler.cpp:1481
#6  0x00005649048a56cb in OidFunctionCall4Coll (function_id=10790, collation=0, arg1=97664, arg2=0, arg3=0, arg4=140094619281216, is_null=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/fmgr/fmgr.cpp:2512
#7  0x0000564904a55363 in gsplsql_func_in_pkg_compile (pkg=0x7f6a455ec050) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:1210
#8  0x0000564904a571fc in gsplsql_pkg_init (pkg=0x7f6a455ec050, isCreate=false, isSpec=false, ret_pkg_runtime=0x7f6a52046a18, is_need_compile_func=true, pkg_debug_query_string=<optimized out>, old_pkg_runtime=0x7f6a39a48050) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:1545
#9  0x0000564904a584a3 in gsplsql_pkg_compile (pkg_oid=97663, for_validator=true, is_spec=false, is_create=false, is_recompile=true, pkg_runtime_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:956
#10 0x0000564905100dc6 in recompile_single_package (package_oid=97663, is_spec=false) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:329
#11 0x0000564905101212 in recompile_package_by_oid (pkg_oid=97663, recompile_invalid_pkg=false) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:416
#12 0x0000564905101262 in recompile_package (stmt=0x7f6a3f4954c0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:437
#13 0x0000564905431abe in sqlcmd_standard_process_utility (parse_tree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, is_top_level=<optimized out>, dest=0x56490a6e0720 <donothingDR>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:6813
#14 0x00007f6b9cd9b759 in gsaudit_ProcessUtility_hook (parsetree=0x7f6a3f4954c0, queryString=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, isTopLevel=<optimized out>, dest=0x56490a6e0720 <donothingDR>, sentToRemote=<optimized out>, completionTag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/security_plugin/security_policy_plugin.cpp:856
#15 0x00005649059a0f52 in audit_process_utility (parsetree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=<optimized out>, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", is_ctas=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/audit/security_auditfuncs.cpp:1512
#16 0x000056490543c71d in sqlcmd_process_utility (parse_tree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:1974
#17 0x000056490541d83f in PortalRunUtility (portal=0x7f6a48878050, utilityStmt=0x7f6a3f4954c0, isTopLevel=true, dest=0x56490a6e0720 <donothingDR>, completionTag=0x7f6a5204a430 "") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2140
#18 0x000056490541f0be in PortalRunMulti (portal=0x7f6a48878050, isTopLevel=true, dest=0x56490a6e0720 <donothingDR>, altdest=0x56490a6e0720 <donothingDR>, completionTag=0x7f6a5204a430 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2326
#19 0x00005649054232dc in PortalRun (portal=0x7f6a48878050, count=200, isTopLevel=true, dest=0x7f6a3f4a84d0, altdest=0x7f6a3f4a84d0, completionTag=0x7f6a5204a430 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:1501
#20 0x00005649054158af in exec_execute_message (max_rows=200, portal_name=0x7f6a3f4a8050 "") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:7071
#21 gs_process_command (firstchar=<optimized out>, input_message=<optimized out>, send_ready_for_query=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:12314
#22 0x000056490541b9c0 in PostgresMain (argc=<optimized out>, argv=0x7f6a49e45b20, dbname=<optimized out>, username=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:11313
#23 0x000056490539f2df in backend_run (port=0x7f6a5204a890) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:12482
#24 0x00005649053de1b0 in gauss_db_worker_thread_main<(knl_thread_role)2> (arg=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:19086
#25 0x000056490539f39a in internal_thread_func (args=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:20196
#26 0x00007f6bbc60ff1b in ?? () from /usr/lib64/libpthread.so.0
#27 0x00007f6bbc547320 in clone () from /usr/lib64/libc.so.6

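In format_type_internal, where the error is raised, we see type_oid=323, which is indeed the number in the error message. But a number as small as 323 clearly cannot be a user-defined type, because small OIDs are reserved for the system. There is certainly a bug here, but without the source code it is hard to find its cause, and the main goal of this session is to find the faulty procedure.

In format_type_internal and format_procedure, the useful arguments are all shown as <optimized out>, which means the compiler optimized these variables away and their values cannot be inspected. So move on to the next frame, do_compile, and print a few of its arguments: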

(gdb) f 3
#3  0x0000564904a2d53d in do_compile (fcinfo=0x7f6a520460c0, proc_tup=0x7f69b7c375a0, func=0x7f6a39c64050, compile_func_head_info=0x7f6a52046740, for_validator=true, hashkey=0x7f6a52045d50) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:921
921     /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp: No such file or directory.
(gdb) p *fcinfo
$1 = {flinfo = 0x7f6a52046020, context = 0x0, resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 0, arg = 0x7f6a39a4ca50, argnull = 0x0, argTypes = 0x0, prealloc_arg = {0 <repeats 20 times>}, prealloc_argnull = {false <repeats 20 times>}, prealloc_argTypes = {0 <repeats 20 times>}, argVector = 0x0, refcursor_data = {argCursor = 0x0, returnCursor = 0x0, return_number = 0}, out_tmtype = 0 '\000', out_decimals = 0 '\000', udfInfo = {UDFArgsHandlerPtr = 0x0, UDFResultHandlerPtr = 0x0, udfMsgBuf = 0x0, msgReadPtr = 0x0, argBatchRows = 0, allocRows = 0, arg = 0x0, null = 0x0, result = 0x0, resultIsNull = 0x0, valid_UDFArgsHandlerPtr = false}, swinfo = {sw_econtext = 0x0, sw_exprstate = 0x0, sw_is_flt_frame = false}, out_typmode = 0x0, fn_typmode = 0, plfunc_exec_mode = 0, plfunc_exec_state = 0x0, args_done = 0x0, prealloc_args_done = {0 <repeats 20 times>}, arginfo = {{in_tmtype = 0 '\000', in_decimals = 0 '\000', argTypModes = 0, set_enum_typeoid = 0}}}
(gdb) p *proc_tup
Attempt to dereference a generic pointer.
(gdb) p *func
$2 = {type = T_PLpgSQL_FUNCTION, fn_oid = 97664, pkg_oid = 97663, namespaceOid = 2200, fn_owner = 16728, fn_input_collation = 0, fn_signature = 0x0, fn_searchpath = 0x0, namespace_searchpath = 0x0, fn_hashkey = 0x0, fn_cxt = 0x7f6a6a9599d0, fn_rettype = 0, fn_rettyplen = 0, glc_func_life = 1, fn_rettypioparam = 0, fn_retbyval = false, fn_retistuple = false, fn_retset = false, fn_readonly = false, out_param_varno = -1, found_varno = 0, fn_nallargs = 0, argmods = 0x0, argtypes = 0x0, sql_cursor_found_varno = 0, sql_notfound_varno = 0, sql_isopen_varno = 0, sql_rowcount_varno = 0, sql_bulk_exceptions_varno = 0, sqlcode_varno = 0, sqlstate_varno = 0, sqlerrm_varno = 0, new_varno = 0, old_varno = 0, tg_name_varno = 0, tg_when_varno = 0, tg_level_varno = 0, tg_op_varno = 0, tg_relid_varno = 0, tg_relname_varno = 0, tg_table_name_varno = 0, tg_table_schema_varno = 0, tg_nargs_varno = 0, tg_argv_varno = 0, retvarno = 0, guc_stat = 5, use_count = 0, resolve_option = GSPLSQL_RESOLVE_COLUMN, ndatums = 0, datums = 0x0, datum_need_free = 0x0, action = 0x0, goto_labels = 0x0, invalItems = 0x0, saved_unique_id = 4294967295, nPlaceHolders = 0, placeholders = 0x0, cur_estate = 0x0, tg_relation = 0x0, debug = 0x0, ns_top = 0x0, is_private = false, fn_is_trigger = false, pre_parse_trig = false, is_autonomous = false, is_inline_handler = false, is_valid = true, is_plpgsql_func_with_outparam = false, need_skip_process_autonm_pkg = false, remembered_by_resowner = false, typeList = 0x0, namespace_name = 0x0, expr_list = 0x0, fn_retinput = {fn_addr = 0x0, fn_oid = 0, fn_nargs = 0, fn_strict = false, fn_retset = false, fn_extra = 0x0, fn_mcxt = 0x0, fn_expr = 0x0, fn_rettype = 0, fn_rettypemod = 0, fnName = '\000' <repeats 63 times>, fnLibPath = 0x0, vec_fn_addr = 0x0, vec_fn_cache = 0x0, genericRuntime = 0x0, max_length = 0, fn_languageId = 0, fn_stats = 0 '\000', fn_fenced = false, fn_volatile = 0 '\000', decimals = 0 '\000'}, glc_status = {m_type = GLC_FUNCTION_OBJ, m_location = 
GLC_OBJECT_IN_SESSION_WAIT_REMOVE, m_glc_object_state = GLC_OBJECT_IS_VALID, m_refcount = 1}, expired_cell = {dle_next = 0x0, dle_prev = 0x0, dle_val = 0x7f6a39c64050, dle_list = 0x7f6a48876ea0}, compiled_dlist_elem = {dle_next = 0x0, dle_prev = 0x0, dle_val = 0x7f6a39c64050, dle_list = 0x0}, parent_pro_ndatum = 0, subparam = 0x0, fn_nargs = 0, copiable_size = 0, deep_datums = 0x0, deep_ndatums = 0, cursor_datums = 0x0, cursor_ndatums = 0, placeholder_datums = 0x0, placeholder_ndatums = 0, fn_argvarnos = 0x0, depend_info_list = 0x0, plan_total_mem_size = 0, block_level = 0x0}
(gdb)
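
The struct dump above is long, and only two of its fields matter for locating the procedure: fn_oid and pkg_oid. As a minimal sketch, assuming the output of p *func has been saved as text, the following Python pulls out those two values and builds the catalog lookup used later in this article. The function names extract_oids and lookup_sql are made up for this illustration.

```python
import re


def extract_oids(gdb_output):
    """Return the first fn_oid and pkg_oid found in a `p *func` dump.

    The dump also contains a nested fn_retinput.fn_oid further along;
    re.search returns the first match, which is the top-level field.
    """
    oids = {}
    for field in ("fn_oid", "pkg_oid"):
        match = re.search(r"\b%s = (\d+)" % field, gdb_output)
        if match:
            oids[field] = int(match.group(1))
    return oids


def lookup_sql(fn_oid):
    """Build the pg_proc / gs_package lookup for the given function oid."""
    return (
        "select proname,g.pkgname from pg_proc p,gs_package g "
        "where p.oid=%d and g.oid=p.propackageid;" % fn_oid
    )


# Shortened excerpt of the dump shown above.
sample = (
    "$2 = {type = T_PLpgSQL_FUNCTION, fn_oid = 97664, pkg_oid = 97663, "
    "namespaceOid = 2200, fn_retinput = {fn_addr = 0x0, fn_oid = 0}}"
)
oids = extract_oids(sample)
print(oids)  # {'fn_oid': 97664, 'pkg_oid': 97663}
print(lookup_sql(oids["fn_oid"]))
```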

In func we can see fn_oid=97664, which means the object being compiled is the one whose oid in pg_proc is 97664. So type q to quit gdb, go back to the client, and query:

gaussdb=# select proname,g.pkgname from pg_proc p,gs_package g where p.oid=97664 and g.oid=p.propackageid;
 proname |  pkgname
---------+------------
 p1      | pkg_test_4
(1 row)

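The oid corresponds to p1 in the package pkg_test_4, so the failure must occur while compiling p1.
With that, the faulty procedure is identified directly. A quick look at the package specification and body shows that one parameter's IN/OUT mode does not match, yet GaussDB raised no error when the package was created...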
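
Since the database itself did not catch the mismatch, a rough check of this kind can also be done on the source text before creating the package. The following Python sketch is not a PL/SQL parser and is not part of any GaussDB tooling: it uses simple regular expressions, assumes parameter types contain no commas or parentheses (so number(10,2) would confuse it), ignores functions and overloaded procedures, and treats a parameter without an explicit mode as IN. All names in it are made up for this illustration.

```python
import re

# "procedure name(args)" as it appears in both the specification and the body.
PROC_RE = re.compile(r"procedure\s+(\w+)\s*\((.*?)\)", re.IGNORECASE | re.DOTALL)
# One parameter: its name, then an optional mode keyword.
PARAM_RE = re.compile(r"^\s*(\w+)\s+(?:(in\s+out|inout|in|out)\s+)?", re.IGNORECASE)


def param_modes(source):
    """Return {procedure: [(param, mode), ...]} for a package spec or body."""
    result = {}
    for name, args in PROC_RE.findall(source):
        modes = []
        for arg in args.split(","):
            match = PARAM_RE.match(arg)
            if match:
                # A missing mode keyword is treated as IN.
                mode = " ".join((match.group(2) or "in").lower().split())
                modes.append((match.group(1).lower(), mode))
        result[name.lower()] = modes
    return result


def mode_mismatches(spec_src, body_src):
    """List (procedure, param, spec_mode, body_mode) where the modes differ."""
    spec, body = param_modes(spec_src), param_modes(body_src)
    problems = []
    for proc, spec_params in spec.items():
        for (pname, smode), (_, bmode) in zip(spec_params, body.get(proc, [])):
            if smode != bmode:
                problems.append((proc, pname, smode, bmode))
    return problems


# The test case from this article.
SPEC = """create package pkg_test_4 is
procedure p1(i1 in varchar2,i2 out varchar2,i3 out varchar2);
end;"""
BODY = """create package body pkg_test_4 is
procedure p1(i1 in varchar2,i2 in varchar2,i3 out varchar2) is begin null; end;
end;"""

print(mode_mismatches(SPEC, BODY))  # [('p1', 'i2', 'out', 'in')]
```

On a package of ten thousand lines, a text-level check like this is only a first pass, but it is much cheaper than bisecting the package by hand.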

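Other Gauss-family databases

The same code does not raise an error in openGauss 7.0.0 RC1, and the package can even be called normally. A look at the data dictionary shows that the parameter modes from the package body are the ones that take effect. This is also a bug: no strict check is performed.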

openGauss=# create package pkg_test_4 is
openGauss$# procedure p1(i1 in varchar2,
openGauss$#              i2 out varchar2,
openGauss$#              i3 out varchar2);
openGauss$# end pkg_test_4;
openGauss$# /
CREATE PACKAGE
openGauss=# create package body pkg_test_4 is
openGauss$# procedure p1(i1 in varchar2,
openGauss$#              i2 in varchar2,
openGauss$#              i3 out varchar2) is
openGauss$#              begin
openGauss$#                  null;
openGauss$#              end;
openGauss$# end pkg_test_4;
openGauss$# /
CREATE PACKAGE BODY
openGauss=# alter package pkg_test_4 compile;
ALTER PACKAGE
openGauss=# call pkg_test_4.p1(null,null,null);
 i3
----

(1 row)

openGauss=#

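MogDB 5.2.0, by contrast, raises an error as soon as the package body is created: it correctly detects that the procedure declared in the package specification is not defined in the package body.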

MogDB=# create package pkg_test_4 is
MogDB$# procedure p1(i1 in varchar2,
MogDB$#              i2 out varchar2,
MogDB$#              i3 out varchar2);
MogDB$# end pkg_test_4;
MogDB$# /
CREATE PACKAGE
MogDB=# create package body pkg_test_4 is
MogDB$# procedure p1(i1 in varchar2,
MogDB$#              i2 in varchar2,
MogDB$#              i3 out varchar2) is
MogDB$#              begin
MogDB$#                  null;
MogDB$#              end;
MogDB$# end pkg_test_4;
MogDB$# /
ERROR:  Function definition not found: p1
MogDB=#

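Summary

The direct cause of ERROR: Failed to query the 323 type in the cache. was a procedure whose parameter IN/OUT modes differ between the package and the package body. The customer's code was indeed wrong, but the root cause is still a database bug: this abnormal case is not caught by any check.

To dig into problems encountered with Chinese domestic databases, knowing how to use gdb is essential. I have taken part in many domestic database PoCs and have seen engineers from various database vendors use gdb at customer sites to locate problems. Although most of the top-ranked domestic databases are by now running stably across many industries, it is still worth checking whether bugs lurk in some inconspicuous corners.

  • Author: DarkAthena
  • Link: https://www.darkathena.top/archives/Debugging-GaussDB-Locating-Package-Compilation-Errors-with-GDB
  • License: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 3.0. Please credit the source when reposting.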
