Habr has many articles about ELF files. People explain their structure, assemble them by hand and automatically, edit them, encrypt them, and much more. I, in turn, would like to share a case that I find interesting and that introduced me, all at once and in practice, to many aspects of low-level programming:
compilation of programs,
a kind of reverse engineering and porting of runtime libraries,
the structure of Windows and Linux executable files,
assembling and editing such files manually
These and some other aspects, along with many non-standard moves, come up in the course of porting a compiler. The part of the work that deals with executable files (and ELF in particular) seemed the most interesting to me, so it is the leitmotif of the article.
This article is not an exhaustive tutorial, but it may interest readers who work in one or more of the areas above and want to discover (or brush up on) non-standard approaches to problems in them.
In the course of work we need:
gcc compiler, ld linker, gdb debugger
utilities from binutils (readelf, strip, hexdump)
a basic understanding of the Portable Executable (PE) and ELF formats
Pascal
The subject is the Bero TinyPascal Compiler (BTPC from here on) 2016, a Pascal compiler by Benjamin Rosseaux. It is written in Pascal (Delphi 7-XE7 and FreePascal >= 3) and targets 32-bit Windows.
github . , , :
BTPC ( btpc.exe) self-contained - - ~3 . . - , , , . , , .
Pascal', , - -, , -, . , . " ".
Seg fault
BTPC - PE (Portable Executable, Microsoft), .. "ELF Windows". , ELF, , - .
, PE 2
Portable Executable (PE) — , , Microsoft Windows. PE , , PE- . , API- ..
PE COFF Unix. «» PE — ELF ( Linux Unix) Mach-O ( Mac OS X).
, , Pascal, , , -:
-
- "-"
- "" PE-
PE- , "" .
- . , (Runtime Library, RTL).
Runtime Library 2
RTL- - CRT C/C++. , , , .. , , .
- pre-start post-exit "", main' . , , main, ( ), main .
PE- RTL 9 :
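RTLHalt - terminates the program
RTLWriteChar - writes a char to stdout
RTLWriteInteger - writes an integer to stdout
RTLWriteLn - writes a line break to stdout
RTLReadChar - reads a character from STDIN into EAX
RTLReadInteger - reads an integer from STDIN into EAX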
RTLReadLn — STDIN EOF ( )
RTLEOF — EAX EOF. 0 — .
RTLEOLN - sets DL to 1 if the next character is \n, and to 0 otherwise
- , :
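.ENTRYPOINT
JMP StubEntryPoint # jump over the RTL code
RTLHalt:
... # body of RTLHalt
RTLWriteChar:
...
... # the remaining RTL functions
RTLFunctionTable: # table of RTL function addresses
DD OFFSET RTLHalt
DD OFFSET RTLWriteChar
DD OFFSET RTLWriteInteger
...
StubEntryPoint:
INVOKE HeapAlloc ... # allocate memory on the heap
MOV ESI, OFFSET RTLFunctionTable # ESI now points to the function table
ProgramEntryPoint: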
, RTL, :
StubEntryPoint
ESI
… , ProgramEntryPoint !
, , BTPC ProgramEntryPoint.
, 2 - .
: btpc.pas rtl.asm . btpc.pas blob, :
{ }
procedure EmitStubCode;
begin
OutputCodeDataSize := 0;
OutputCodeString(#77#90#82#195#66#101#82#111#94#102#114#0#80#69#0#0#76#1#1#0#0#0#0#0#...
OutputCodeString(#0#0#0#0#0#0#0#0#0#0#0#0#0#0#16#0#0#0#16#0#0#143##16#0#0#0#0...
OutputCodeString(#0#0#0#0#0#0#0#0#0#0#255#255#255#255#40#16#0#0#53#0#0#0...
OutputCodeString(#101#110#106#97#109#105#110#32#39#66#101#82#111#...
OutputCodeDataSize := 1423;
end;
{ }
- Pascal- , rtl.asm.
, PE .
, :
RTL ( ) , PE- ( , - )
PE- - ( , 255 )
, - NOP
- ( )
BTPC :
stdin
, -
, -
( - PE-)
, "" , ,
PE32
: - . , , . , , ( ). : , , , .
: 4 , 156 . 100 . , 4 , 100 .
, RTL, Excagena, - PE-. / , Excagena , .
-, PE. , , ProgramEntryPoint - , .
- - PE-. - :
(OptionalHeader.SizeOfCode)
(SectionTable.VirtualSize)
(OptionalHeader.SizeOfImage),
- , , :
{ Update the code size }
CodeSize := OutputCodeGetInt32($29) + (OutputCodeDataSize - CodeStart);
OutputCodePutInt32($29, CodeSize);
{ Read the section alignment }
SectionAlignment := OutputCodeGetInt32($45);
{ Round the section's virtual size up to the alignment }
SectionVirtualSize := CodeSize;
Value := CodeSize mod SectionAlignment;
SectionVirtualSize := SectionVirtualSize + (SectionAlignment - Value);
OutputCodePutInt32($10d, SectionVirtualSize);
{ Update the image size }
OutputCodePutInt32($5d, SectionVirtualSize + OutputCodeGetInt32($39));
, $29, $45, $115 …
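The size patching above boils down to reading a 32-bit field at a fixed offset, rounding a size up to the section alignment, and writing it back. Below is a minimal Python sketch of that idea, not the BTPC code itself. The helper names are hypothetical, and the offsets in the usage example are taken from the snippet only as sample values. Note that the Pascal formula adds a full extra alignment unit when the size is already a multiple of the alignment; `pad_like_snippet` reproduces that behaviour and `align_up` shows the usual rounding for comparison.

```python
import struct


def get_int32(image: bytearray, offset: int) -> int:
    """Read a little-endian 32-bit value (the role OutputCodeGetInt32 plays)."""
    return struct.unpack_from("<I", image, offset)[0]


def put_int32(image: bytearray, offset: int, value: int) -> None:
    """Write a little-endian 32-bit value (the role OutputCodePutInt32 plays)."""
    struct.pack_into("<I", image, offset, value)


def pad_like_snippet(size: int, alignment: int) -> int:
    """Same arithmetic as the Pascal snippet: size + (alignment - size mod alignment)."""
    return size + (alignment - size % alignment)


def align_up(size: int, alignment: int) -> int:
    """Conventional rounding: exact multiples of the alignment stay unchanged."""
    return (size + alignment - 1) // alignment * alignment


def patch_virtual_size(image: bytearray, size_offset: int,
                       alignment_offset: int, code_size: int) -> int:
    """Read the alignment from the image, store the rounded size, return it."""
    alignment = get_int32(image, alignment_offset)
    virtual_size = align_up(code_size, alignment)
    put_int32(image, size_offset, virtual_size)
    return virtual_size


if __name__ == "__main__":
    image = bytearray(0x120)          # synthetic buffer, not a real PE file
    put_int32(image, 0x45, 0x1000)    # sample alignment value
    print(hex(patch_virtual_size(image, 0x10D, 0x45, 0x1234)))
```

The two rounding functions agree for every size that is not already aligned, so the difference only shows up in the exact-multiple case.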
, - Linux x64 . , :
RTL Linux x64
ELF-
ELF-,
- 3
- "" WinApi .
, , . . - , Win32 API, :
ReadCharBuffer: DB 0x90
ReadCharBytesRead: DB 0x90,0x8D,0x40,0x00
ReadCharEx:
PUSHAD
INVOKE ReadFile, DWORD PTR StdHandleInput, OFFSET ReadCharBuffer, 1, OFFSET ReadCharBytesRead, BYTE 0
TEST EAX, EAX
SETZ AL
OR BYTE PTR IsEOF, AL
CMP DWORD PTR [ReadCharBytesRead], 0
SETZ AL
OR BYTE PTR IsEOF, AL
POPAD
RET
, , . , - " " - ReadCharBuffer ReadCharBytesRead. , …
- , , . ( - , - , - ), .
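On 64-bit Linux, system calls are made with the syscall instruction. The first arguments are passed in RDI, RSI, and RDX; RAX carries the system call number on entry and the return value on exit.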
x64 pusha, pushad, popa, popad. pushall, popall, . - bss.
, - data.
:
.section .data # -
ReadCharBuffer:
.byte 0x3c
.section .text # -
ReadCharEx:
PUSHALL # - bss
XORQ %RAX, %RAX # syscall #0: read(int fd, void *buf, size_t count)
XORQ %RDI, %RDI # fd : 0 == stdin
MOVQ $ReadCharBuffer, %RSI # buf : ReadCharBuffer
MOVQ $1, %RDX # count : 1 byte
SYSCALL
CMPQ $0, %RAX # did read return 0 bytes?
SETZ %BL # BL = 1 if it did
ORB %BL, (IsEOF) # accumulate the result in the EOF flag
POPALL
RET
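The logic of the ReadCharEx routine above can be restated in a few lines of Python: read a single byte from a descriptor and treat a zero-length result as end of input. This is only an illustrative sketch; `read_char_ex` and `demo` are hypothetical names, and a pipe stands in for stdin so the example is self-contained.

```python
import os


def read_char_ex(fd: int):
    """Read one byte from fd and report whether end of input was reached.

    Mirrors the assembly: read(fd, buf, 1), then EOF is flagged when the
    call returns 0 bytes.
    """
    data = os.read(fd, 1)
    return data, len(data) == 0


def demo():
    """Feed one byte through a pipe, close the writer, and read twice."""
    read_end, write_end = os.pipe()
    os.write(write_end, b"A")
    os.close(write_end)               # no more data will arrive
    first = read_char_ex(read_end)    # gets the byte
    second = read_char_ex(read_end)   # hits end of input
    os.close(read_end)
    return first, second


if __name__ == "__main__":
    print(demo())
```

As in the assembly, only the "zero bytes read" case is treated as EOF; error handling is left out of the sketch.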
, - . , - , Guard Page.
, , Linux , , , Segmentation fault. , , . , . Guard Page . , , , , .
:
$ gcc -c rtl64.s
$ ld rtl64.o -g --output rtl64
ELF - BTPC . .
, BTPC - EmitByte:
procedure OCPopESI;
begin
EmitByte($5e);
LastOutputCodeValue := locPopESI;
end;
procedure OCMovzxEAXAL;
begin
EmitByte($0f);
EmitByte($b6);
EmitByte($c0);
LastOutputCodeValue := locMovzxEAXAL;
end;
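The two procedures above follow one pattern: append fixed opcode bytes to the output and record which instruction was emitted last. Here is a small Python sketch of that pattern. The `CodeBuffer` class and its method names are hypothetical stand-ins, not BTPC identifiers; only the byte values and the `loc...` tags come from the snippet.

```python
class CodeBuffer:
    """Collects machine-code bytes and remembers the last emitted instruction."""

    def __init__(self):
        self.code = bytearray()
        self.last_output_code_value = None

    def emit_byte(self, value: int) -> None:
        """Append a single byte to the output."""
        self.code.append(value & 0xFF)

    def oc_pop_esi(self) -> None:
        """POP ESI is the single byte 0x5E."""
        self.emit_byte(0x5E)
        self.last_output_code_value = "locPopESI"

    def oc_movzx_eax_al(self) -> None:
        """MOVZX EAX, AL is the byte sequence 0F B6 C0."""
        for value in (0x0F, 0xB6, 0xC0):
            self.emit_byte(value)
        self.last_output_code_value = "locMovzxEAXAL"


if __name__ == "__main__":
    buffer = CodeBuffer()
    buffer.oc_pop_esi()
    buffer.oc_movzx_eax_al()
    print(buffer.code.hex(), buffer.last_output_code_value)
```

With this structure, porting to another architecture means changing the byte values inside each emitter while the callers stay the same.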
, , x64. - ,
1 -
MOV R10, [R12 + R13]
, (, ) "Intel 64 and IA-32 Architectures Developer’s Manual"
. - – R10 "R?".
MOV i8086+. "r/m →R?". , 0x8B.
R10 4 , 3 Rn ModR/M.
"" " REX".
R10 R REX Rn ModR/M.
ModR/M 32- , R/M ModR/M = 100, Mod = 00. SIB.
R12, 1, X REX SIB.
SIB , [#Base + #Index2^(Scale)]. Base R12. 3 = 100 3 Base SIB.
(Index) SIB 3 R13 = 101.
(1 + 1 ), Scale SIB = 00 2^(Scale) = 2^0 = 1.
R13 B REX. SIB.
REX: "0100" + "W:1" + "R:1" + "X:1" + "B:1" = 01001111, hex 0x4F. REX .
ModR/M: "Mod:00" + "Rn:010" + "R/M:100" = 00010100, 0x14.
SIB: "Scale:00" + "Index:101" + "Base:100" = 00101100, 0x2C.
MOV R10, [R12 + R13]
0x4F 0x8B 0x14 0x2C
.
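The bit-by-bit assembly of this instruction can be checked mechanically. The sketch below builds the REX, opcode, ModR/M and SIB bytes for `MOV dst, [base + index]` with a scale of 1 and no displacement. The function name is hypothetical and registers are given by their numbers (R10 = 10, R12 = 12, R13 = 13). In the REX prefix, bit R extends the ModR/M register field, bit X extends the SIB index and bit B extends the SIB base; in this example all three are 1. The sketch rejects the two register choices that need different encodings rather than handling them.

```python
def encode_mov_r64_base_index(dst: int, base: int, index: int) -> bytes:
    """Encode MOV dst, [base + index*1] for 64-bit registers numbered 0..15."""
    for reg in (dst, base, index):
        if not 0 <= reg <= 15:
            raise ValueError("register numbers must be in 0..15")
    if index == 4:
        raise ValueError("RSP cannot serve as an index register")
    if base & 7 == 5:
        raise ValueError("RBP/R13 as base needs a displacement; not covered here")

    # REX: 0100 W R X B, with W = 1 for a 64-bit operand
    rex = 0x40 | (1 << 3) | ((dst >> 3) << 2) | ((index >> 3) << 1) | (base >> 3)
    opcode = 0x8B                                    # MOV r64, r/m64
    modrm = (0b00 << 6) | ((dst & 7) << 3) | 0b100   # R/M = 100: a SIB byte follows
    sib = (0b00 << 6) | ((index & 7) << 3) | (base & 7)  # scale 00 means factor 1
    return bytes([rex, opcode, modrm, sib])


if __name__ == "__main__":
    print(encode_mov_r64_base_index(10, 12, 13).hex(" "))
```

For the registers from the example the function returns the same four bytes, 4F 8B 14 2C.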
, PE- , , . , BTPC 70 .
: , , .
.
ELF'
, , BTPC - , , "" , , , . .
, , - ELF' - ELF'.
, . 2 :
ELF64 PE32
PE32 ,
readelf' rtl64:
$ readelf --section-headers rtl64
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         00000000004000b0  000000b0
       0000000000000317  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         00000000006003c7  000003c7
       00000000000000bf  0000000000000000  WA       0     0     1
  [ 3] .symtab           SYMTAB           0000000000000000  00000488
       0000000000000408  0000000000000018           4    39     8
  [ 4] .strtab           STRTAB           0000000000000000  00000890
       0000000000000248  0000000000000000           0     0     1
  [ 5] .shstrtab         STRTAB           0000000000000000  00000ad8
       0000000000000027  0000000000000000           0     0     1
- ELF' 6 !:
( )
— text
— data
— shstrtab (Section header string table)
symtab
strtab
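The section count that readelf prints comes from a few fields of the ELF header. The Python sketch below unpacks a little-endian ELF64 header and returns the fields that describe the section header table. The function names are hypothetical, and the header in the usage example is synthetic: the section count and string table index follow the readelf listing above, while the table offset is an invented sample value.

```python
import struct

# Little-endian Elf64_Ehdr: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def section_table_info(image: bytes) -> dict:
    """Return the header fields that locate and size the section header table."""
    fields = ELF64_HEADER.unpack_from(image, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    return {
        "e_shoff": fields[6],       # file offset of the section header table
        "e_shentsize": fields[11],  # size of one section header
        "e_shnum": fields[12],      # number of section headers
        "e_shstrndx": fields[13],   # index of the section name string table
    }


def make_sample_header(shoff: int, shnum: int, shstrndx: int) -> bytes:
    """Build a synthetic ELF64 header for demonstration purposes only."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return ELF64_HEADER.pack(ident, 2, 62, 1, 0x4000B0, 64, shoff,
                             0, 64, 56, 2, 64, shnum, shstrndx)


if __name__ == "__main__":
    print(section_table_info(make_sample_header(0xB00, 6, 5)))
```

On a real file the same function can be applied to the first 64 bytes read in binary mode.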
, . - . ProgramEntryPoint , " ".
? . , 4 , , ELF'.
ld (--nostdlib, --strip-all):
$ ld rtl64.o -g --output rtl64-min -nostdlib --strip-all
ELF 2 - 1.4 . :
$ readelf --section-headers rtl64-min
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         00000000004000b0  000000b0
       0000000000000317  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         00000000006003c7  000003c7
       00000000000000bf  0000000000000000  WA       0     0     1
  [ 3] .shstrtab         STRTAB           0000000000000000  00000486
       0000000000000017  0000000000000000           0     0     1
2 . , shstrtab . , binutils strip, . shstrtab :
$ strip -R shstrtab rtl64-min
$ readelf --section-headers rtl64-min
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         00000000004000b0  000000b0
       0000000000000317  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         00000000006003c7  000003c7
       00000000000000bf  0000000000000000  WA       0     0     1
  [ 3] .shstrtab         STRTAB           0000000000000000  00000486
       0000000000000017  0000000000000000           0     0     1
Shstrtab . , :
$ hexdump -C rtl64-min
00000480  40 00 00 00 00 00 00 2e  73 68 73 74 72 74 61 62  |@.......shstrtab|
00000490  00 2e 74 65 78 74 00 2e  64 61 74 61 00 00 00 00  |..text..data....|
000004a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
… ! . , .shstrtab . , , - , .
, ( ). . , .
, :
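ENTRY(_start) /* entry point symbol */
SECTIONS
{
. = 0x4000b0; /* set the location counter before the data sections */
.data : { *(.data) }
.bss : { *(.bss) *(COMMON) }
. = 0x6000d3; /* move the location counter before the code */
.text : { *(.text) } /* the code section goes last */
}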
ld - . , , ld --verbose
- , , . .
. :
$ ld rtl64.o -g --output rtl64-custom-ld -T linkerScript.ld -nostdlib --strip-all
$ readelf --section-headers rtl64-custom-ld
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .data             PROGBITS         00000000006000b0  000000b0
       00000000000000bf  0000000000000000  WA       0     0     1
  [ 2] .text             PROGBITS         0000000000a00170  00000170
       0000000000000317  0000000000000000  AX       0     0     1
  [ 3] .shstrtab         STRTAB           0000000000000000  00000487
       0000000000000017  0000000000000000           0     0     1
, ! text, , data.
, , . gdb, - , , , , .
, - gdb . , …
, , , symtab strtab , . , - - ( , .. )
, . , , . stackoverflow , , . :
, , :
Field | Position in the file
Elf_hdr.e_shoff | 0x28
Text_phdr.p_filesz | sizeof(Elf_hdr) + sizeof(p_hdr) + 0x20
Text_phdr.p_memsz | sizeof(Elf_hdr) + sizeof(p_hdr) + 0x28
Text_shdr.sh_size | Elf_hdr.e_shoff + sizeof(injection) + 2*sizeof(s_hdr) + 0x20
Shstrtab_shdr.sh_offs | Elf_hdr.e_shoff + sizeof(injection) + 3*sizeof(s_hdr) + 0x18
Symtab_shdr.sh_offs | Elf_hdr.e_shoff + sizeof(injection) + 4*sizeof(s_hdr) + 0x18
Strtab_shdr.sh_offs | Elf_hdr.e_shoff + sizeof(injection) + 5*sizeof(s_hdr) + 0x18
: sizeof(injection). .
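Patching the fields from the table can be sketched as a loop that adds the injected size to a 64-bit value at each listed position. This is a rough Python illustration under one stated assumption: that every listed field simply grows by sizeof(injection). The function names are hypothetical. The structure sizes are the standard ELF64 ones (64 bytes for the file header, 56 for a program header), and the buffer in the usage example is synthetic rather than a real ELF file.

```python
import struct

ELF64_EHDR_SIZE = 64   # sizeof(Elf_hdr) for ELF64
ELF64_PHDR_SIZE = 56   # sizeof(p_hdr) for ELF64
E_SHOFF_POS = 0x28     # position of e_shoff inside the file header


def read_u64(image: bytearray, position: int) -> int:
    """Read a little-endian 64-bit field."""
    return struct.unpack_from("<Q", image, position)[0]


def bump_u64(image: bytearray, position: int, delta: int) -> None:
    """Add delta to the little-endian 64-bit field at position."""
    struct.pack_into("<Q", image, position, read_u64(image, position) + delta)


def header_relative_positions() -> list:
    """The first three rows of the table, which do not depend on e_shoff."""
    text_phdr = ELF64_EHDR_SIZE + ELF64_PHDR_SIZE
    return [E_SHOFF_POS, text_phdr + 0x20, text_phdr + 0x28]


def apply_injection(image: bytearray, positions, injection_size: int) -> None:
    """Assumption: each listed field grows by the size of the injected code."""
    for position in positions:
        bump_u64(image, position, injection_size)


if __name__ == "__main__":
    image = bytearray(256)                      # synthetic stand-in for the stub
    struct.pack_into("<Q", image, E_SHOFF_POS, 0x488)
    apply_injection(image, header_relative_positions(), 0x40)
    print(hex(read_u64(image, E_SHOFF_POS)))
```

The section header rows of the table would be handled the same way once their positions are computed from e_shoff and the injected size.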
, , ELF', - , , , .
, , "", "" "", .
. , ELF, , . - . , RTL.
.
- bootstrapping
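# Build the cross-compiler with the original Windows compiler
$ btpc.exe < btpc64.pas > btpcCrossWin.exe
# Build the Linux compiler with the cross-compiler
$ btpcCrossWin.exe < btpc64.pas > btpc64Linux
# Finally, the "clean" step: the Linux compiler compiles itself on Linux
$ btpc64Linux < btpc64.pas > btpc64Check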
Pascal, Linux x64, .
, , . :
BTPC
, Pascal Windows
(RTL)
RTL ELF
, , , . , - . , ,
P.S.
, , . , , 9 . , . . , , , . , ( " ?") . , , , "", , ("" ).
I would also like to thank my academic advisor, Alexander Konovalov.
As noted at the beginning, this article was not meant to explain the structure of ELF files, to teach crazy tricks with them, or to show how to port programs. But I hope that, through a real example, it showed how non-standard the solutions to ordinary problems can be, what interesting things turn up along the way, and what discoveries one can make for oneself. And maybe it will encourage someone to take the first step toward the next uncharted but exciting challenge.