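Author: Hong Bin (洪斌)
Arguably the biggest advantage of MySQL is that you can learn its internal logic by stepping through the source in a debugger. No question or doubt stays hidden once you debug the code, and doing so deepens your understanding of the database kernel. Doesn't it feel like having everything under control?
A Mac has always been my main machine, and the few times I tried gdb on it the experience was poor. Two problems stood out: when gdb loads a MySQL built from source, Reading symbols takes a long time and CPU usage spikes (lldb handles symbols more efficiently), and gdb cannot recognize the symbols of the Mac system libraries. lldb has neither problem and is smooth to use.
LLDB aims to be a next-generation, high-performance debugger platform. Since Xcode 5 it has completely replaced gdb, and its integration with Xcode gives a very friendly visual debugging workflow, whereas gdb has always lacked a good GUI front end.
It also has these features:
- High performance and efficient memory use
- Strong multithreaded debugging support
- A plugin architecture that can be scripted and extended in Python
- Full support for C, C++, and Objective-C
- Multi-platform support: Mac OS X, iOS, Linux, FreeBSD, Windows
gdb is like an "elderly, seasoned veteran", and lldb is like an "energetic youngster who keeps up with the times". The same goes for the people behind them: Richard Stallman, the soul of the free software movement, and Chris Lattner, who founded the LLVM project that lldb belongs to and carried the banner for Apple's Swift. Both are legends. Respect!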
Richard Stallman
Chris Lattner
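Building MySQL from source
Before debugging MySQL you need a mysqld binary with complete symbol information. Official releases are usually stripped and are not compiled with debug enabled, so they lack symbol information and you cannot see the corresponding source while debugging. I therefore build from the MySQL source repository, where you can check out any branch and build any version, for example: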
$ git clone https://github.com/mysql/mysql-server.git
$ cd mysql-server
$ git checkout mysql-5.7.17
$ cd BUILD; cmake .. -DWITH_DEBUG=1 -DWITH_BOOST=/usr/local/Cellar/boost@1.59/1.59.0/ -DWITH_UNIT_TESTS=off
$ make
$ make install DESTDIR="/Users/hongbin/mysql"
$ git clean -df
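The cmake options used above can be kept in one place and reused across builds. Here is a minimal Python sketch that assembles the same command line. The helper name `cmake_command` is made up for illustration, and the Boost path is a placeholder that depends on your machine.

```python
import shlex


def cmake_command(source_dir, options):
    """Build a cmake argument list from a dict of -D options."""
    args = ["cmake", source_dir]
    for key, value in options.items():
        args.append(f"-D{key}={value}")
    return args


# Options taken from the build commands above.
options = {
    "WITH_DEBUG": "1",                      # debug build with full symbols
    "WITH_BOOST": "/path/to/boost_1_59_0",  # hypothetical placeholder path
    "WITH_UNIT_TESTS": "off",
}
command = cmake_command("..", options)
print(shlex.join(command))
```

Printing the command before running it makes it easy to check that WITH_DEBUG is really set, since a non-debug build will not show source lines in lldb.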
On Linux, install these packages first:
$ sudo yum -y install gcc gcc-c++ gcc-g77 autoconf automake zlib* flex* libxml* ncurses-devel libmcrypt* libtool-ltdl-devel* make cmake readline-devel
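Debugging with LLDB
Mac ships with lldb; on Linux you need to install it yourself.
First start lldb, load the binary built from source, and pass the MySQL configuration file to launch MySQL.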
(lldb) file /Users/hongbin/bin/mysqld
(lldb) process launch -- --defaults-file=/Users/hongbin/.my.cnf
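Setting good breakpoints is an essential debugging skill, and lldb offers several flexible ways to do it.
For example, set a breakpoint by function name; pressing tab completes the function name.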
(lldb) br set -n do_comm
Available completions:
XA_prepare_log_event::do_commit(THD*)
Xid_log_event::do_commit(THD*)
do_command(THD*)
(lldb) br set -n do_command
Breakpoint 3: where = mysqld`do_command(THD*) + 15 at sql_parse.cc:874, address = 0x0000000100be53ef
Or by file name and line number.
(lldb) br s -f mysql
Available completions:
mysqld.cc
mysqld_thd_manager.cc
mysqld_daemon.cc
mysql_malloc_service.c
mysql_string_service.c
(lldb) br s -f mysqld.cc -l 6973
Breakpoint 5: where = mysqld`mysql_init_variables() + 47 at mysqld.cc:6973, address = 0x0000000100da789f
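Typing the same setup commands every session gets tedious. As a sketch, the commands shown so far can be written to a command file and replayed with `lldb -s`. The helper name `build_lldb_script` is an illustration, not part of lldb, and the paths are the ones from this article.

```python
import os
import tempfile


def build_lldb_script(binary, defaults_file, functions, file_lines):
    """Return lldb commands that load mysqld, set breakpoints, and launch it."""
    lines = [f"file {binary}"]
    for name in functions:
        lines.append(f"breakpoint set -n {name}")
    for source_file, line_number in file_lines:
        lines.append(f"breakpoint set -f {source_file} -l {line_number}")
    lines.append(f"process launch -- --defaults-file={defaults_file}")
    return lines


script_lines = build_lldb_script(
    "/Users/hongbin/bin/mysqld",
    "/Users/hongbin/.my.cnf",
    functions=["do_command"],
    file_lines=[("mysqld.cc", 6973)],
)

# Write the commands to a temporary file that lldb can read with -s.
fd, script_path = tempfile.mkstemp(suffix=".lldb")
with os.fdopen(fd, "w") as handle:
    handle.write("\n".join(script_lines) + "\n")

print(f"lldb -s {script_path}")
```

The breakpoints are set before `process launch` so that they are already in place when mysqld starts.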
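Forgot which breakpoints you set? List them all. pending means no location has been found for that breakpoint yet. (In gdb, set breakpoint pending off turns pending breakpoints off; that is gdb syntax, not an lldb command.)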
(lldb) br l
Current breakpoints:
3: name = 'do_command', locations = 1, resolved = 1, hit count = 1
3.1: where = mysqld`do_command(THD*) + 15 at sql_parse.cc:874, address = 0x0000000100be53ef, resolved, hit count = 1
4: file = 'select_lex_visitor.cc', line = 300, exact_match = 0, locations = 0 (pending)
5: file = 'mysqld.cc', line = 6973, exact_match = 0, locations = 1, resolved = 1, hit count = 0
5.1: where = mysqld`mysql_init_variables() + 47 at mysqld.cc:6973, address = 0x0000000100da789f, resolved, hit count = 0
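When the list gets long, the pending entries are easy to miss. A small sketch, assuming the listing has been copied out of lldb as text in the format shown above, that picks out the ids of pending breakpoints (the function name `pending_breakpoints` is hypothetical):

```python
import re

# A few lines copied from the "br l" output above.
SAMPLE = """\
3: name = 'do_command', locations = 1, resolved = 1, hit count = 1
4: file = 'select_lex_visitor.cc', line = 300, exact_match = 0, locations = 0 (pending)
5: file = 'mysqld.cc', line = 6973, exact_match = 0, locations = 1, resolved = 1, hit count = 0
"""


def pending_breakpoints(listing):
    """Return the ids of breakpoints whose summary line ends with (pending)."""
    ids = []
    for line in listing.splitlines():
        # Top-level entries look like "4: ..."; location lines like "3.1: ..." are skipped.
        match = re.match(r"\s*(\d+):", line)
        if match and line.rstrip().endswith("(pending)"):
            ids.append(int(match.group(1)))
    return ids


print(pending_breakpoints(SAMPLE))
```

In this session the only pending entry is breakpoint 4, whose file and line did not resolve to any location in the binary.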
Want to delete a breakpoint? Just do it.
(lldb) br de 4
1 breakpoints deleted; 0 breakpoint locations disabled.
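When a breakpoint is hit the process is suspended so you can look around, and lldb reports the current thread id, the frame id, and the stop reason.
Type c to continue and wait for the next breakpoint hit.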
(lldb) c
Process 34336 resuming
Process 34336 stopped
* thread #28, stop reason = breakpoint 3.1
frame #0: 0x0000000100be53ef mysqld`do_command(thd=0x00000001040efa00) at sql_parse.cc:874
871 bool return_value;
872 int rc;
873 const bool classic=
-> 874 (thd->get_protocol()->type() == Protocol::PROTOCOL_TEXT ||
875 thd->get_protocol()->type() == Protocol::PROTOCOL_BINARY);
876
877 NET *net= NULL;
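To step through the code from a breakpoint, type n (source-level step over), which is an alias for thread step-over; ni (instruction-level step over) is an alias for thread step-inst-over.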
(lldb) n
Process 34336 stopped
* thread #29, stop reason = step over
frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
877 NET *net= NULL;
878 enum enum_server_command command;
879 COM_DATA com_data;
-> 880 DBUG_ENTER("do_command");
881
882 /*
883 indicator of uninitialized lex => normal flow of errors handling
Knowing the current value of a variable is essential when debugging. Type p to print the variable you want to inspect.
(lldb) p com_data
(COM_DATA) $8 = {
com_init_db = (db_name = , length = 4491639808)
com_refresh = (options = '\x06')
com_shutdown = (level = 6)
com_kill = (id = 72057594037927942)
com_set_option = (opt_command = 6)
com_stmt_execute = (stmt_id = 72057594037927942, flags = 4491639808, params = "\b\n\xffffffd8\x01\x01", params_length = 123145543216640)
com_stmt_fetch = (stmt_id = 72057594037927942, num_rows = 4491639808)
com_stmt_send_long_data = (stmt_id = 72057594037927942, param_number = 196672512, longdata = "\b\n\xffffffd8\x01\x01", length = 123145543216640)
com_stmt_prepare = (query = , length = 196672512)
com_stmt_close = (stmt_id = 6)
com_stmt_reset = (stmt_id = 6)
com_query = (query = , length = 196672512)
com_field_list = (table_name = , table_name_length = 196672512, query = "\b\n\xffffffd8\x01\x01", query_length = 240905728)
}
You can do this too. lldb thoughtfully applies a fix-it and rewrites -> to . for you.
(lldb) p com_data->com_query
(COM_QUERY_DATA) $9 = (query = , length = 196672512)
Fix-it applied, fixed expression was:
com_data.com_query
(lldb) p com_data.com_query
(COM_QUERY_DATA) $10 = (query = , length = 196672512)
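The odd numbers in the `p com_data` output are easier to read once you notice that the members overlap: the output looks like a union printed before it has been filled in, so every member is a different view of the same bytes. The sketch below reproduces three of the printed values from one 8-byte pattern. It assumes a little-endian x86-64 machine and that these members start at offset 0, neither of which is stated in the output itself.

```python
import struct

# com_kill.id as printed by lldb: 72057594037927942 == 0x0100000000000006.
raw = struct.pack("<Q", 72057594037927942)

as_u64 = struct.unpack("<Q", raw)[0]      # com_kill.id (64-bit view)
as_u32 = struct.unpack("<I", raw[:4])[0]  # com_shutdown.level (32-bit view)
as_u8 = raw[0]                            # com_refresh.options (1-byte view)

print(raw.hex())
print(as_u64, as_u32, as_u8)
```

The 32-bit and 1-byte views both give 6, matching `com_shutdown = (level = 6)` and `com_refresh = (options = '\x06')`. The values are leftover stack contents, which is expected here because execution stopped at line 880, before com_data is assigned.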
When debugging a multithreaded program it is important to see which threads exist. List them all:
(lldb) th list
Process 34336 stopped
thread #1: tid = 0x152a4ff, 0x00007fff9318d19e libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread'
thread #2: tid = 0x152a516, 0x00007fff9318cd96 libsystem_kernel.dylib`kevent + 10
thread #3: tid = 0x152a518, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #4: tid = 0x152a519, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #5: tid = 0x152a51a, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #6: tid = 0x152a51b, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #7: tid = 0x152a51c, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #8: tid = 0x152a51d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #9: tid = 0x152a51e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #10: tid = 0x152a51f, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #11: tid = 0x152a520, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #12: tid = 0x152a521, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #13: tid = 0x152a522, 0x00007fff9318c47e libsystem_kernel.dylib`__write_nocancel + 10
thread #14: tid = 0x152a525, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #15: tid = 0x152a526, 0x00007fff9318bc22 libsystem_kernel.dylib`__psynch_mutexwait + 10
thread #16: tid = 0x152a527, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #17: tid = 0x152a528, 0x00007fff931eebf3 libsystem_malloc.dylib`default_zone_free_definite_size + 58
thread #18: tid = 0x152a529, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #19: tid = 0x152a52a, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #20: tid = 0x152a52b, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #21: tid = 0x152a52c, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #22: tid = 0x152a52d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #23: tid = 0x152a52e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #24: tid = 0x152a52f, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #25: tid = 0x152a530, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #26: tid = 0x152a53b, 0x00007fff9318c1fe libsystem_kernel.dylib`__sigwait + 10
thread #27: tid = 0x152a53d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #28: tid = 0x152a62e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
* thread #29: tid = 0x152b38e, 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880, stop reason = step over
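With this many threads it helps to group them by what they are doing. A minimal sketch, assuming the `th list` output has been saved as text, that counts threads per top function (only a few lines of the listing above are included as sample data):

```python
import re
from collections import Counter

# A subset of the thread list above, copied verbatim.
SAMPLE = """\
  thread #1: tid = 0x152a4ff, 0x00007fff9318d19e libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread'
  thread #3: tid = 0x152a518, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #4: tid = 0x152a519, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
  thread #15: tid = 0x152a526, 0x00007fff9318bc22 libsystem_kernel.dylib`__psynch_mutexwait + 10
* thread #29: tid = 0x152b38e, 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880, stop reason = step over
"""

# thread id, then "module`function"; the function name stops at the first
# character that is not part of an identifier.
THREAD_RE = re.compile(r"thread #(\d+): tid = \S+ \S+ [^`]+`([A-Za-z_][\w:]*)")


def count_by_function(listing):
    """Count threads by the function at the top of their stack."""
    counts = Counter()
    for line in listing.splitlines():
        match = THREAD_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


counts = count_by_function(SAMPLE)
for function, total in counts.most_common():
    print(f"{total:3d}  {function}")
```

Most mysqld threads sit in `__psynch_cvwait`, that is, they are idle and waiting on a condition variable, so the few threads in other functions are usually the interesting ones.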
Now select the thread you want to inspect.
(lldb) th se 29
* thread #29, stop reason = step over
frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
877 NET *net= NULL;
878 enum enum_server_command command;
879 COM_DATA com_data;
-> 880 DBUG_ENTER("do_command");
881
882 /*
883 indicator of uninitialized lex => normal flow of errors handling
What is the call stack of the current thread? bt prints the backtrace, another command you will use all the time.
(lldb) bt
* thread #29, stop reason = step over
* frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
frame #1: 0x0000000100d7f7e0 mysqld`::handle_connection(arg=0x000000010b73b980) at connection_handler_per_thread.cc:300
frame #2: 0x000000010155698c mysqld`::pfs_spawn_thread(arg=0x000000010b73bfe0) at pfs.cc:2188
frame #3: 0x00007fff9327693b libsystem_pthread.dylib`_pthread_body + 180
frame #4: 0x00007fff93276887 libsystem_pthread.dylib`_pthread_start + 286
frame #5: 0x00007fff9327608d libsystem_pthread.dylib`thread_start + 13
To see the call stacks of all threads:
(lldb) bt all
To look at a particular frame:
(lldb) fr s 1
frame #1: 0x0000000100d7f7e0 mysqld`::handle_connection(arg=0x000000010b73b980) at connection_handler_per_thread.cc:300
297 {
298 while (thd_connection_alive(thd))
299 {
-> 300 if (do_command(thd))
301 break;
302 }
303 end_connection(thd);
To see all arguments and local variables of the current frame:
(lldb) fr v
(void *) arg = 0x000000010b73b980
(Global_THD_manager *) thd_manager = 0x000000010b829200
(Connection_handler_manager *) handler_manager = 0x0000000103d000c0
(Channel_info *) channel_info = 0x000000011bbb2b70
(bool) pthread_reused = true
(THD *) thd = 0x000000010bb8fc00
(PSI_thread *) psi = 0x0000000108f1c180
To see the global variables of the current source file, so easy!
(lldb) tar v
Global variables for /Users/hongbin/workbench/mysql-server/sql/conn_handler/connection_handler_per_thread.cc in /Users/hongbin/mysql/bin/mysqld:
(Error_log_throttle) create_thd_err_log_throttle = {
Log_throttle = (window_end = 0, window_size = 60000000, count = 0, summary_template = "Error log throttle: %10lu 'Can't create thread to handle new connection' error(s) suppressed")
log_summary = 0x00000001009efda0
}
(ulong) Per_thread_connection_handler::max_blocked_pthreads = 9
(mysql_mutex_t) Per_thread_connection_handler::LOCK_thread_cache = {
m_mutex = {
global = (__sig = 1297437786, __opaque = "")
mutex = (__sig = 1297437786, __opaque = "")
file = 0x0000000101bd31a1 "/Users/hongbin/workbench/mysql-server/sql/conn_handler/connection_handler_per_thread.cc"
line = 145
count = 0
thread = 0x0000000000000000
}
m_psi = 0x0000000108dcc600
}
To see the disassembled instructions:
(lldb) di -n get_instance
mysqld`Global_THD_manager::get_instance:
0x100db3780 <+0>: pushq %rbp
0x100db3781 <+1>: movq %rsp, %rbp
0x100db3784 <+4>: leaq 0x10d195d(%rip), %rax ; Global_THD_manager::thd_manager
0x100db378b <+11>: cmpq $0x0, (%rax)
0x100db378f <+15>: setne %cl
0x100db3792 <+18>: xorb $-0x1, %cl
0x100db3795 <+21>: testb $0x1, %cl
0x100db3798 <+24>: jne 0x100db37a3 ; <+35> at mysqld_thd_manager.h:98
0x100db379e <+30>: jmp 0x100db37c2 ; <+66> at mysqld_thd_manager.h:98
0x100db37a3 <+35>: leaq 0xe298c4(%rip), %rdi ; "get_instance"
0x100db37aa <+42>: leaq 0xe42de5(%rip), %rsi ; "/Users/hongbin/workbench/mysql-server/sql/mysqld_thd_manager.h"
0x100db37b1 <+49>: movl $0x62, %edx
0x100db37b6 <+54>: leaq 0xe2a24f(%rip), %rcx ; "thd_manager != __null"
0x100db37bd <+61>: callq 0x1016cb0d6 ; symbol stub for: __assert_rtn
0x100db37c2 <+66>: jmp 0x100db37c7 ; <+71> at mysqld_thd_manager.h:98
0x100db37c7 <+71>: leaq 0x10d191a(%rip), %rax ; Global_THD_manager::thd_manager
0x100db37ce <+78>: movq (%rax), %rax
0x100db37d1 <+81>: popq %rbp
0x100db37d2 <+82>: retq
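The two `leaq ...(%rip), %rax` instructions in this listing use different displacements yet carry the same comment, `Global_THD_manager::thd_manager`. That follows from how RIP-relative addressing works on x86-64: the displacement is added to the address of the next instruction. A small worked check using only the addresses printed above (the helper name is made up for illustration):

```python
def rip_relative_target(next_instruction_address, displacement):
    """Effective address of a RIP-relative operand on x86-64."""
    return next_instruction_address + displacement


# leaq 0x10d195d(%rip) at 0x100db3784; the next instruction is at 0x100db378b.
first = rip_relative_target(0x100DB378B, 0x10D195D)
# leaq 0x10d191a(%rip) at 0x100db37c7; the next instruction is at 0x100db37ce.
second = rip_relative_target(0x100DB37CE, 0x10D191A)

print(hex(first), hex(second))

# movl $0x62, %edx loads the line number that is passed to __assert_rtn.
assert_line = 0x62
print(assert_line)
```

Both operands resolve to the same address, the static `thd_manager` pointer, and 0x62 is 98, the line in mysqld_thd_manager.h that the listing's comments point at.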
To see all symbols in mysqld:
(lldb) image dump symtab mysqld
To find out what a given memory address belongs to:
(lldb) image lookup -a 0x0000000100d86667
To let each thread stop at breakpoints independently of the others, turn on non-stop mode.
(lldb) settings set target.non-stop-mode true
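These basic commands are much like their gdb counterparts, and the overall experience is smooth and matches what users are used to. The one pain point is that there is no built-in pager, so scrolling through long output is awkward. There is plenty more to dig into. Have fun!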