GNU Debugger Tutorial [GDB walkthrough]


GDB is an almost unavoidable part of life for a C/C++ programmer, white-box tester or reverse engineer. You might need it in a cybersecurity course or for the pleasure of hacking/cracking things on your own, and most people in those roles have probably used it at least once. This GNU Debugger Tutorial is a quick overview: a rough collection of basic info, commands and useful details related to GDB, in the form of a short walkthrough.

Debugging Basics

Debugging is the process of finding and resolving defects or problems within a computer program, whether a simple functional bug or a security issue. It involves interactive debugging, control flow analysis, unit testing, integration testing, etc. There is a vast number of debuggers out there:

  • GNU Debugger (GDB)
  • Intel Debugger (IDB)
  • OllyDbg
  • SoftICE
  • WinDBG

Here we’ll focus on GDB. We’ll use this sample program, saved as dbg.c:

 #include <stdio.h>
 int main(void)
 {
   printf("Hello to a Debugging world!\n");
   return (0);
 }

Compile it:

$ gcc dbg.c -o dbg

and run it:

$ ./dbg
Hello to a Debugging world!

Now, if you try and jump into a debugger:

$ gdb ./dbg

You’ll most likely see a message like “No debugging symbols found in dbg”. Debugging symbols describe variables, functions, source lines, etc. They can be embedded in the binary at compile time or kept in a separate file. GCC emits them with the -g option (-ggdb produces debug info tuned for GDB). Common debug symbol formats:

  • DWARF
  • Stabs
  • XCOFF
  • COFF

So, to get rid of the previous “No debugging symbols found” message, include the -ggdb option while compiling:

$ gcc -ggdb dbg.c -o dbg
$ gdb ./dbg
...
Reading symbols from dbg...
(gdb)

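You could also add the -Wall option to enable most of the commonly useful compiler warnings. Now we’re ready. To print source code, use list (the listing below comes from a slightly extended dbg.c that adds a global variable, globVar, and a local variable, localVar, which later sections refer to):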

(gdb) list 1
1    #include <stdio.h>
2    int globVar;
3    int main(void)
4    {
5      int localVar=3;
6      printf("Hello to a Debugging world!\n");
7      return (0);
8    }

Symbol Files

The source code is not part of the symbol file; GDB reads it from the source file itself, so without the source file the list command is useless. Listing functions:

(gdb) info functions

 All defined functions:
 File dbg.c:
 3:    int main(void);          -> PART OF DBG SYMBOLS

 Non-debugging symbols:
 0x00001000  _init
 0x00001030  puts@plt
 0x00001040  __libc_start_main@plt
 0x00001050  __cxa_finalize@plt
 0x00001060  _start
 0x000010a0  __x86.get_pc_thunk.bx
 0x000010b0  deregister_tm_clones
 0x000010f0  register_tm_clones
 0x00001140  __do_global_dtors_aux
 0x00001190  frame_dummy
 0x00001195  __x86.get_pc_thunk.dx
 0x000011d5  __x86.get_pc_thunk.ax
 0x000011e0  __libc_csu_init
 0x00001240  __libc_csu_fini
 0x00001241  __x86.get_pc_thunk.bp
 0x00001248  _fini

Checking the source files from which the symbols have been read in:

(gdb) info sources
 Source files for which symbols have been read in:
 /root/TEST_AREA/dbg.c, 
 /usr/lib/gcc/i686-linux-gnu/8/include/stddef.h, 
 /usr/include/i386-linux-gnu/bits/types.h, 
 /usr/include/i386-linux-gnu/bits/types/struct_FILE.h, 
 /usr/include/i386-linux-gnu/bits/types/FILE.h, /usr/include/stdio.h, 
 /usr/include/i386-linux-gnu/bits/sys_errlist.h
 Source files for which symbols will be read in on demand:

Listing global and static variables (this doesn’t include local ones):

(gdb) info variables

 All defined variables:
2:    int globVar;

 Non-debugging symbols:
 0x00002000  _fp_hw
 0x00002004  _IO_stdin_used
 0x00002024  __GNU_EH_FRAME_HDR
 0x000021a8  __FRAME_END__
 0x00003ef4  __frame_dummy_init_array_entry
 0x00003ef4  __init_array_start
 0x00003ef8  __do_global_dtors_aux_fini_array_entry
 0x00003ef8  __init_array_end
 0x00003efc  _DYNAMIC
 0x00004000  _GLOBAL_OFFSET_TABLE_
 0x00004014  __data_start
 0x00004014  data_start
 0x00004018  __dso_handle
 0x0000401c  __TMC_END__
 0x0000401c  __bss_start
 0x0000401c  _edata
 0x0000401c  completed
 0x00004020  _end

To list local variables we have to name a specific scope. Type info scope at the (gdb) prompt and press Tab; you should be presented with a list of available scopes. In our case, we’re looking at the main function:

(gdb) info scope main
 Scope for main:
 Symbol localVar is a complex DWARF expression:
      0: DW_OP_breg5 -12 [$ebp]
 , length 4.

You can use additional tools from binutils, such as objcopy or objdump, to extract or inspect symbols.

ObjCopy

objcopy copies and translates object files. It can add or remove sections and symbols, but it’s most commonly used to change the file format (for example, ELF/COFF to S-Record or Intel HEX). To extract the debug symbols into a separate file:

$ objcopy --only-keep-debug dbg dbg_symbols

To remove debug symbols and everything else that isn’t needed from the binary:

$ strip --strip-debug --strip-unneeded dbg

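Stripping is particularly useful and recommended when deploying your software: you shouldn’t ship symbols or other unnecessary info. If you need the symbols later, you can link the stripped binary to the separate symbol file (this adds a .gnu_debuglink section that points to dbg_symbols rather than copying the symbols back in):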

$ objcopy --add-gnu-debuglink=dbg_symbols dbg

or you can load the symbols explicitly within GDB:

(gdb) symbol-file dbg_symbols

ObjDump

objdump is a handy tool for inspecting the contents of an object or executable file. It can display symbol tables and section headers, disassemble code, list the members of archives/libraries, etc. It’s worthwhile to play around with. The examples below use the object file dbg.o (produced with something like gcc -c dbg.c).

File header (-f):

$ objdump -f dbg.o 

dbg.o:     file format elf32-i386
architecture: i386, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000

Section headers (-h):

$ objdump -h dbg.o 

dbg.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .group        00000008  00000000  00000000  00000034  2**2
                  CONTENTS, READONLY, GROUP, LINK_ONCE_DISCARD
  1 .text         00000046  00000000  00000000  0000003c  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .data         00000000  00000000  00000000  00000082  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000000  00000000  00000000  00000082  2**0
                  ALLOC
  4 .rodata       0000001c  00000000  00000000  00000082  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .text.__x86.get_pc_thunk.ax 00000004  00000000  00000000  0000009e  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  6 .comment      0000001e  00000000  00000000  000000a2  2**0
                  CONTENTS, READONLY
  7 .note.GNU-stack 00000000  00000000  00000000  000000c0  2**0
                  CONTENTS, READONLY
  8 .eh_frame     00000060  00000000  00000000  000000c0  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

Print all headers (-x):

$ objdump -x dbg.o 

dbg.o:     file format elf32-i386
dbg.o
architecture: i386, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .group        00000008  00000000  00000000  00000034  2**2
                  CONTENTS, READONLY, GROUP, LINK_ONCE_DISCARD
  1 .text         00000046  00000000  00000000  0000003c  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .data         00000000  00000000  00000000  00000082  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  3 .bss          00000000  00000000  00000000  00000082  2**0
                  ALLOC
  4 .rodata       0000001c  00000000  00000000  00000082  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .text.__x86.get_pc_thunk.ax 00000004  00000000  00000000  0000009e  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  6 .comment      0000001e  00000000  00000000  000000a2  2**0
                  CONTENTS, READONLY
  7 .note.GNU-stack 00000000  00000000  00000000  000000c0  2**0
                  CONTENTS, READONLY
  8 .eh_frame     00000060  00000000  00000000  000000c0  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
00000000 l    df *ABS*	00000000 dbg.c
00000000 l    d  .text	00000000 .text
00000000 l    d  .data	00000000 .data
00000000 l    d  .bss	00000000 .bss
00000000 l    d  .rodata	00000000 .rodata
00000000 l    d  .text.__x86.get_pc_thunk.ax	00000000 .text.__x86.get_pc_thunk.ax
00000000 l    d  .note.GNU-stack	00000000 .note.GNU-stack
00000000 l    d  .eh_frame	00000000 .eh_frame
00000000 l    d  .comment	00000000 .comment
00000000 l    d  .group	00000000 .group
00000004       O *COM*	00000004 globVar
00000000 g     F .text	00000046 main
00000000 g     F .text.__x86.get_pc_thunk.ax	00000000 .hidden __x86.get_pc_thunk.ax
00000000         *UND*	00000000 _GLOBAL_OFFSET_TABLE_
00000000         *UND*	00000000 puts


RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000013 R_386_PC32        __x86.get_pc_thunk.ax
00000018 R_386_GOTPC       _GLOBAL_OFFSET_TABLE_
00000028 R_386_GOTOFF      .rodata
00000030 R_386_PLT32       puts


RELOCATION RECORDS FOR [.eh_frame]:
OFFSET   TYPE              VALUE 
00000020 R_386_PC32        .text
00000054 R_386_PC32        .text.__x86.get_pc_thunk.ax

Disassemble (-d):

$ objdump -d dbg.o 

dbg.o:     file format elf32-i386


Disassembly of section .text:

00000000 <main>:
   0:	8d 4c 24 04          	lea    0x4(%esp),%ecx
   4:	83 e4 f0             	and    $0xfffffff0,%esp
   7:	ff 71 fc             	pushl  -0x4(%ecx)
   a:	55                   	push   %ebp
   b:	89 e5                	mov    %esp,%ebp
   d:	53                   	push   %ebx
   e:	51                   	push   %ecx
   f:	83 ec 10             	sub    $0x10,%esp
  12:	e8 fc ff ff ff       	call   13 <main+0x13>
  17:	05 01 00 00 00       	add    $0x1,%eax
  1c:	c7 45 f4 03 00 00 00 	movl   $0x3,-0xc(%ebp)
  23:	83 ec 0c             	sub    $0xc,%esp
  26:	8d 90 00 00 00 00    	lea    0x0(%eax),%edx
  2c:	52                   	push   %edx
  2d:	89 c3                	mov    %eax,%ebx
  2f:	e8 fc ff ff ff       	call   30 <main+0x30>
  34:	83 c4 10             	add    $0x10,%esp
  37:	b8 00 00 00 00       	mov    $0x0,%eax
  3c:	8d 65 f8             	lea    -0x8(%ebp),%esp
  3f:	59                   	pop    %ecx
  40:	5b                   	pop    %ebx
  41:	5d                   	pop    %ebp
  42:	8d 61 fc             	lea    -0x4(%ecx),%esp
  45:	c3                   	ret    

Disassembly of section .text.__x86.get_pc_thunk.ax:

00000000 <__x86.get_pc_thunk.ax>:
   0:	8b 04 24             	mov    (%esp),%eax
   3:	c3                   	ret

Print the full contents of all sections (-s):

$ objdump -s dbg.o 

dbg.o:     file format elf32-i386

Contents of section .group:
 0000 01000000 07000000                    ........        
Contents of section .text:
 0000 8d4c2404 83e4f0ff 71fc5589 e5535183  .L$.....q.U..SQ.
 0010 ec10e8fc ffffff05 01000000 c745f403  .............E..
 0020 00000083 ec0c8d90 00000000 5289c3e8  ............R...
 0030 fcffffff 83c410b8 00000000 8d65f859  .............e.Y
 0040 5b5d8d61 fcc3                        [].a..          
Contents of section .rodata:
 0000 48656c6c 6f20746f 20612044 65627567  Hello to a Debug
 0010 67696e67 20776f72 6c642100           ging world!.    
Contents of section .text.__x86.get_pc_thunk.ax:
 0000 8b0424c3                             ..$.            
Contents of section .comment:
 0000 00474343 3a202844 65626961 6e20382e  .GCC: (Debian 8.
 0010 332e302d 31392920 382e332e 3000      3.0-19) 8.3.0.  
Contents of section .eh_frame:
 0000 14000000 00000000 017a5200 017c0801  .........zR..|..
 0010 1b0c0404 88010000 30000000 1c000000  ........0.......
 0020 00000000 46000000 00440c01 00471005  ....F....D...G..
 0030 02750044 0f037578 06100302 757c71c1  .u.D..ux....u|q.
 0040 0c010041 c341c543 0c040400 10000000  ...A.A.C........
 0050 50000000 00000000 04000000 00000000  P...............

You can also print the symbol table (-t, or -T for the dynamic symbol table), relocation entries (-r, or -R for dynamic relocations), or restrict output to a single section (-j). There are enough options to keep you occupied. We’re not going to dive deeper at this point, but there’s plenty of documentation to find your way around (--help, man, etc.).

Analyzing symbols with Nm

nm examines binary files (libraries, objects, executables) and displays their symbol tables.

$ nm dbg
0000401c B __bss_start
0000401c b completed.6887
         w __cxa_finalize@@GLIBC_2.1.3
00004014 D __data_start
00004014 W data_start
000010b0 t deregister_tm_clones
00001140 t __do_global_dtors_aux
00003ef8 d __do_global_dtors_aux_fini_array_entry
00004018 D __dso_handle
00003efc d _DYNAMIC
0000401c D _edata
00004024 B _end
00001258 T _fini
00002000 R _fp_hw
00001190 t frame_dummy
00003ef4 d __frame_dummy_init_array_entry
000021a8 r __FRAME_END__
00004000 d _GLOBAL_OFFSET_TABLE_
00004020 B globVar
         w __gmon_start__
00002024 r __GNU_EH_FRAME_HDR
00001000 t _init
00003ef8 d __init_array_end
00003ef4 d __init_array_start
00002004 R _IO_stdin_used
         w _ITM_deregisterTMCloneTable
         w _ITM_registerTMCloneTable
00001250 T __libc_csu_fini
000011f0 T __libc_csu_init
         U __libc_start_main@@GLIBC_2.0
00001199 T main
         U puts@@GLIBC_2.0
000010f0 t register_tm_clones
00001060 T _start
0000401c D __TMC_END__
000011df T __x86.get_pc_thunk.ax
00001251 T __x86.get_pc_thunk.bp
000010a0 T __x86.get_pc_thunk.bx
00001195 T __x86.get_pc_thunk.dx

Each line has the form: <VA> <Symbol Type> <Symbol Name>

Symbol types (not complete):

  • A : Absolute Symbol
  • a : Local absolute Symbol
  • B : Uninitialized Data Section (BSS)
  • b : Local bss Symbol
  • D : Initialized Data Section
  • d : Local Data Symbol
  • N : Debugging Symbol
  • T : Text Section
  • U : Undefined Symbol

A lowercase letter marks a local symbol, an uppercase letter an external (global) one. Combine nm with grep for specific searches, and check the man page for extensive info on options.

System Call Tracing with STrace

strace is a diagnostic and debugging tool that helps you understand how your program interacts with the OS. It intercepts and records the system calls made by the program, provides info on passed arguments and return values, shows memory-related calls, supports filtering, etc. In case you don’t have it, install it (on Debian-based systems): $ apt install strace. Some useful options:

  • -t : Timestamps for each step
  • -r : Relative Timestamp
  • -e [open | write | socket | connect | send | recv…] : trace specific syscall
  • -p <PID> : trace specific PID
  • -c : statistics/summary on program exit

Like strace, GDB can also attach to a running process:

$ gdb --quiet
(gdb) attach <PID>

or

$ gdb -p <PID>

Breakpoints in GDB

A breakpoint is an intentional stopping point (a “pause”) in a program, used to examine values (registers, memory, etc.) and behaviour at that point. You can create breakpoints before running the program or while it is running. Setting a breakpoint in GDB:

(gdb) break [ function_name | linenumber | *address ]
Breakpoint 1 at 0x11b5: file dbg.c, line 5.

List current breakpoints:

(gdb) info break
Num     Type           Disp Enb Address    What
1       breakpoint     keep y   0x004011b5 in main at dbg.c:5
        breakpoint already hit 1 time

Delete breakpoint:

(gdb) del [ breakpoint number ]

Disable breakpoint:

(gdb) dis [ breakpoint number ]  

Enable breakpoint:

(gdb) en [ breakpoint number ]   

Ignore a breakpoint a certain number of times:

(gdb) ignore [ breakpoint number ] [ count ]

Break at a line relative to the current one:

(gdb) b +offset

Break at a function or line in a specific file:

(gdb) b filename:[ function | line ]

Run until the current function returns:

(gdb) fin

Show the current call stack and the current frame (with its line number):

(gdb) where
(gdb) frame

Looking at the previous code:

(gdb) list 1

 1    #include <stdio.h>
 2    int globVar;
 3    int main(void)
 4    {
 5      int localVar=3;
 6      printf("Hello to a Debugging world!\n");
 7      return (0);
 8    }

We could set a breakpoint on main:

(gdb) break main
Breakpoint 1 at 0x11b5: file dbg.c, line 5.

Running the program would pause at that breakpoint:

(gdb) run
Starting program: /root/TEST_AREA/dbg

Breakpoint 1, main () at dbg.c:5
5 int localVar=3;

At that moment you could inspect the current state of the program, registers, memory, etc. To continue execution:

(gdb) [ c | continue ]

To step one source line (entering any function that is called on it):

(gdb) step

To step a single machine instruction:

(gdb) stepi

In certain situations (loops, comparisons, etc.), you might be interested in setting a condition under which the breakpoint applies. For instance:

(gdb) condition <BREAKPOINT_NUMBER> [ PROGRAM_VARIABLE | CONVENIENCE_VARIABLE | Register] == 5
(gdb) condition <BREAKPOINT_NUMBER> $eax != 0

GDB Convenience Vars and Func/Proc Calls

GDB provides convenience variables that you can use within GDB to hold values and refer to them later. They are prefixed with “$”:

(gdb) set $v = 10
(gdb) print $v
(gdb) set $dyn = (char *)malloc(15)
(gdb) call strcpy($dyn, argv[1])
(gdb) call functionName($v, $dyn)

Not much to it, but they’re worth mentioning. Note that calling functions such as malloc or strcpy requires a running process.

Cracking vs Debug Symbols

When cracking things, you might be interested in what a binary gives away. The strings command, which prints the sequences of printable characters in a file, might reveal private or secret info in poorly coded programs. Runtime analysis in GDB lets you dig up info on scopes, functions, variables, etc.:

(gdb) info [ variables | functions ]
(gdb) info scope <FunctionName>

Disassembling Binary

The default disassembly style is AT&T. Most of us have probably got used to Intel style, so you can change the format with the disassembly-flavor setting:

(gdb) set disassembly-flavor intel

Syntax: disassemble [ function | addr | start,end ]. Example:

(gdb) disassemble main
 Dump of assembler code for function main:
    0x00001199 <+0>:    lea    0x4(%esp),%ecx
    0x0000119d <+4>:    and    $0xfffffff0,%esp
    0x000011a0 <+7>:    pushl  -0x4(%ecx)
    0x000011a3 <+10>:    push   %ebp
    0x000011a4 <+11>:    mov    %esp,%ebp
    0x000011a6 <+13>:    push   %ebx
    0x000011a7 <+14>:    push   %ecx
    0x000011a8 <+15>:    sub    $0x10,%esp
    0x000011ab <+18>:    call   0x11df <__x86.get_pc_thunk.ax>
    0x000011b0 <+23>:    add    $0x2e50,%eax
    0x000011b5 <+28>:    movl   $0x3,-0xc(%ebp)
...
(gdb) set disassembly-flavor intel
(gdb) disassemble main
 Dump of assembler code for function main:
    0x00001199 <+0>:    lea    ecx,[esp+0x4]
    0x0000119d <+4>:    and    esp,0xfffffff0
    0x000011a0 <+7>:    push   DWORD PTR [ecx-0x4]
    0x000011a3 <+10>:    push   ebp
    0x000011a4 <+11>:    mov    ebp,esp
    0x000011a6 <+13>:    push   ebx
    0x000011a7 <+14>:    push   ecx
    0x000011a8 <+15>:    sub    esp,0x10
    0x000011ab <+18>:    call   0x11df <__x86.get_pc_thunk.ax>
    0x000011b0 <+23>:    add    eax,0x2e50
    0x000011b5 <+28>:    mov    DWORD PTR [ebp-0xc],0x3

If you disassemble while the program is running (here, stopped at the breakpoint):

(gdb) disassemble main
...
    0x004011b0 <+23>:    add    $0x2e50,%eax
 => 0x004011b5 <+28>:    movl   $0x3,-0xc(%ebp)
    0x004011bc <+35>:    sub    $0xc,%esp
...

you might notice the “=>” marker. It shows the current position of the EIP register. If you check the EIP value, you will see 0x4011b5. Fetching the instructions “manually” shows the same thing:

(gdb) x/3i 0x4011b5
 => 0x4011b5 <main+28>:    movl   $0x3,-0xc(%ebp)
    0x4011bc <main+35>:    sub    $0xc,%esp
    0x4011bf <main+38>:    lea    -0x1ff8(%eax),%edx

We’re not going to go further into disassembling at this point.

Final words

We didn’t cover subjects such as ARM and 64-bit system conventions, Android and iPhone application debugging, cracking, etc. The focus of this GNU Debugger Tutorial was more on GDB itself, so we’re going to leave those for another occasion. We’ll try to expand this post as time goes by.

Random Commands

You need to get comfortable with GDB, its commands and overall management, and with finding your way around. There are a lot of sources out there you could use, a web search being the obvious place to start. You should definitely rely on the built-in GDB help:

(gdb) help x

 Examine memory: x/FMT ADDRESS.
 ADDRESS is an expression for the memory address to examine.
 FMT is a repeat count followed by a format letter and a size letter.
 Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
   t(binary), f(float), a(address), i(instruction), c(char), s(string)
   and z(hex, zero padded on the left).
 Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).
 The specified number of objects of the specified size are printed
 according to the format.  If a negative number is specified, memory is
 examined backward from the address.

Defaults for format and size letters are those previously used.
 Default count is 1.  Default address is following last thing printed
 with this command or "print".

ASCII details can come in handy:

$ man ascii

Print the current EAX register value:

(gdb) print $eax
$10 = 72
(gdb) print /c $eax
$11 = 72 'H'

Print a program argument (argv[0] is the program name):

(gdb) print argv[0] 
(gdb) x/s argv[0]

Get the variable address:

(gdb) p &<VARIABLE_NAME>

Changing a value in memory or a variable:

(gdb) set {char} <0xADR> = 'X'
(gdb) set {int} <0xADR> = 66
(gdb) set var <VARIABLE> = <VALUE>

Quickly checking where you are:

(gdb) disas $eip

Conclusion

We left many parts of GDB uncovered in this GNU Debugger Tutorial, but it’s a start. We’ve shown some basic control and management of gdb. The approach might not be ideal (raw, without too many examples), but it should let you handle some things on your own. Getting used to GDB might take some time, but once you get the hang of it, you’ll see. As mentioned, we’ll try to expand this article with examples and additional info.