My code works, I don’t know why.

The King Has Donkey Ears

Loading Data Object File Contents in C on Linux (Continued)


Motivation

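The previous post discussed loading data object file contents in C on Linux. It mentioned that once the data is converted into an object file, the object contains three symbols: _binary_objfile_start, _binary_objfile_end, and _binary_objfile_size.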

Object symbol

The objcopy man page (man objcopy) says:

   -B bfdarch
   --binary-architecture=bfdarch
       Useful when transforming a architecture-less input file into an object file.  In this case the output architecture can be set to bfdarch.  This
       option will be ignored if the input file has a known bfdarch.  You can access this binary data inside a program by referencing the special
       symbols that are created by the conversion process.  These symbols are called _binary_objfile_start, _binary_objfile_end and
       _binary_objfile_size.  e.g. you can transform a picture file into an object file and then access it in your code using these symbols.

Test: the segmentation fault version

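In short, when a file with no architecture of its own is converted into an object file, its data can be accessed through _binary_objfile_start, _binary_objfile_end, and _binary_objfile_size. So I modified the test program a little to print what these symbols contain.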

test_obj.c
#include <stdio.h>
#include <string.h>

#define MAX_CHARS (32)

extern char _binary_my_data_txt_start;
extern char _binary_my_data_txt_end;
extern int _binary_my_data_txt_size;

int main(int argc, char **argv)
{
    char line[MAX_CHARS] = {0};
    char *my_data_start_addr = (char *) &_binary_my_data_txt_start;
    int str_len = (char) *my_data_start_addr - 0x30; /* 0x30 => '0' */

    if (str_len > MAX_CHARS) {
        str_len = MAX_CHARS;
    }

    strncpy(line, my_data_start_addr + 1, str_len);

    printf("_binary_my_data_txt_start: %c\n", _binary_my_data_txt_start);
    printf("_binary_my_data_txt_end: %c\n", _binary_my_data_txt_end);
    printf("_binary_my_data_txt_size: %d\n", _binary_my_data_txt_size);
    printf("Read string is %s\n", line);

    return 0;
}

The test file is:

my_data.txt
5Hello

Running it causes a segmentation fault:

Output
$ ./test_obj
_binary_my_data_txt_start: 5
_binary_my_data_txt_end:
Segmentation fault (core dumped)

Analysis

gdb shows that the crash happens on the line that accesses _binary_my_data_txt_size. We can examine this variable further.

gdb output
_binary_my_data_txt_start: 5
_binary_my_data_txt_end:

Program received signal SIGSEGV, Segmentation fault.
main (argc=1, argv=0x7fffffffe1d8) at test_obj.c:24
24        printf("_binary_my_data_txt_size: %d\n", _binary_my_data_txt_size);
(gdb) p _binary_my_data_txt_size
Cannot access memory at address 0x7

Huh? The variable's address is 0x7? Let me change the test file by adding three characters and run gdb again.

my_data.txt
8HelloABC
gdb output
(gdb) p _binary_my_data_txt_size
Cannot access memory at address 0xa

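The address moved from 0x7 to 0xa, that is, from 7 to 10: it grew by exactly the three characters added. These numbers also match the byte counts of the two test files once the trailing newline is included. This makes me suspect that the symbol is not a variable but a value. So we can change

  • printf("_binary_my_data_txt_size: %d\n", _binary_my_data_txt_size);

to

  • printf("_binary_my_data_txt_size: %p\n", &_binary_my_data_txt_size);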
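If the size symbol is indeed a value carried in the symbol's address, as the gdb output suggests, the size can be recovered as an integer in two ways. The following is a minimal sketch, not the original test program: the helper names size_from_symbol_address and size_from_range are hypothetical, and the code assumes the start and end symbols bracket the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the size symbol has no storage behind it.
 * Its "address" is the size itself, so convert the address to an
 * integer instead of dereferencing it. */
size_t size_from_symbol_address(const void *symbol_address)
{
    return (size_t) (uintptr_t) symbol_address;
}

/* Hypothetical helper: the same size computed from the two data
 * symbols, assuming end points just past the last data byte. */
size_t size_from_range(const char *start, const char *end)
{
    assert(end >= start);
    return (size_t) (end - start);
}
```

With the real symbols linked in, the calls would look like size_from_symbol_address(&_binary_my_data_txt_size) and size_from_range(&_binary_my_data_txt_start, &_binary_my_data_txt_end). The second form does not depend on how the size symbol is resolved, which may make it the safer choice.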

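Another interesting point is that the character at _binary_my_data_txt_end is not C but \0, while the character just before _binary_my_data_txt_end is \n. Opening my_data.txt in ghex shows that the last byte of data is indeed \n. Where the \0 comes from, though, is still unclear.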
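One possible reading, which this post does not confirm: the end symbol marks the address just past the last data byte, so whatever is stored there is not part of the file. Under that assumption, the data is the half-open range [start, end) and a reader should never touch the byte at end. The sketch below reads the length-prefixed string within those bounds; read_prefixed_string is a hypothetical helper, not part of the original test program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the length-prefixed string stored in
 * [start, end) into out and returns the number of characters copied.
 * The first byte is an ASCII digit giving the length, as in "5Hello\n". */
size_t read_prefixed_string(const char *start, const char *end,
                            char *out, size_t out_cap)
{
    assert(out_cap > 0);
    out[0] = '\0';
    if (start >= end || *start < '0' || *start > '9') {
        return 0;               /* empty data or no length digit */
    }

    size_t len = (size_t) (*start - '0');
    size_t available = (size_t) (end - (start + 1)); /* bytes after the digit */

    if (len > available) {
        len = available;        /* never read at or past end */
    }
    if (len > out_cap - 1) {
        len = out_cap - 1;      /* leave room for the terminator */
    }

    memcpy(out, start + 1, len);
    out[len] = '\0';
    return len;
}
```

Unlike the strncpy call in test_obj.c, this version always terminates the output buffer, even when the length digit exceeds the buffer size.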
