Creating a processor module for Ghidra, using V8 bytecode as an example

Last year our team needed to analyze V8 bytecode. At the time there were no ready-made tools that could recover such code and make it easy to navigate, so we decided to write a processor module for the Ghidra framework. Thanks to the features of the language used to describe instructions, we ended up with not only a readable instruction listing but also C-like decompiler output. This article continues a series of materials (1, 2) about our plugin for Ghidra.





Several months passed between writing the processor module and writing this article. During that time the SLEIGH specification did not change, and the module described here works on Ghidra versions 9.1.2 through 9.2.2, which were released over the last six months.





The language's capabilities are now described fairly well on ghidra.re and in the documentation bundled with Ghidra; these materials are worth reading before you write your own modules. The processor modules shipped by the framework's developers also make excellent examples, especially if you know the architecture they describe.





According to the documentation, processor modules for Ghidra are written in SLEIGH, a language derived from SLED (Specification Language for Encoding and Decoding) and developed specifically for Ghidra. It translates machine code into p-code, the intermediate language Ghidra uses to build decompiled code. As a language meant for describing processor instructions, SLEIGH has many limitations, which can nevertheless be worked around with the p-code injection mechanism implemented in Java code.





github. , SLEIGH . , p-code, . The Ghidra Book: The Definitive Guide.





Development is done in Eclipse with two plugins that ship with Ghidra: GhidraDev and GhidraSleighEditor. The module is created as a Ghidra Module Project named v8_bytecode.





, , The Ghidra Book: The Definitive Guide. .





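  • *.cspec - compiler specification.
  • *.ldefs - language definition; it references the compiled *.sla file and the processor and compiler specifications.
  • *.pspec - processor specification.
  • *.opinion - constraints used by loaders when choosing a language.
  • *.slaspec, *.sinc - the processor instruction description written in SLEIGH.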





Building the project produces a .sla file, which is compiled from the slaspec file.





, , , , . . .





V8

Jsc-, , c JavaScript Node.Js 8.16.0 bytenode ( Node.Js, npm). , bytenode Node.js . , jsc js:





Node.js , . , ( bytecode-register.cc, bytecode-register.h). v8 Node.js:





, aX , . .





 โ€” <this>, aX โ€” , , rN โ€” , . 1- , 2- Wide- 4- ExtraWide-. Wide- :
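
As a rough illustration of the operand widths mentioned above (1 byte normally, 2 bytes with a Wide prefix, 4 bytes with an ExtraWide prefix), here is a minimal Python sketch. The prefix byte values, the little-endian layout, the sample opcode and all helper names are assumptions made for this example and are not taken from the module; check bytecodes.h of the V8 version you analyze.

```python
import struct

# Assumed prefix opcodes (hypothetical values for this sketch).
WIDE_PREFIX = 0x00
EXTRA_WIDE_PREFIX = 0x01

# struct format for a signed little-endian operand of each width in bytes.
_SIGNED_FORMATS = {1: "<b", 2: "<h", 4: "<i"}


def operand_scale(code, offset):
    """Return (operand width in bytes, offset of the real opcode)."""
    first = code[offset]
    if first == WIDE_PREFIX:
        return 2, offset + 1
    if first == EXTRA_WIDE_PREFIX:
        return 4, offset + 1
    return 1, offset


def read_signed_operand(code, offset, scale):
    """Read one signed little-endian operand that is `scale` bytes wide."""
    return struct.unpack_from(_SIGNED_FORMATS[scale], code, offset)[0]


# 0x03 is a hypothetical opcode that takes one signed immediate operand.
narrow = bytes([0x03, 0xFE])
wide = bytes([WIDE_PREFIX, 0x03, 0xFE, 0xFF])
for code in (narrow, wide):
    scale, opcode_offset = operand_scale(code, 0)
    print(scale, read_signed_operand(code, opcode_offset + 1, scale))
```

Both encodings carry the same immediate; only the operand width differs.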





Node.js v8 .





, SLEIGH , . , 124  rN 125  aX. , . :





, Node.js - . (X aX) . , , .





, , .





CSPEC

, cspec-, github. :





Ghidra .  โ€” , , . Ghidra SLEIGH , Intel x86, . , , . , , .





, :





  • Compiler Specific P-code Interpretation;
  • Compiler Datatype Organization (the <data_organization> tag);
  • Compiler Scoping and Memory Access (the <global> tag);
  • Compiler Special Purpose Registers (the <stackpointer> tag);
  • Parameter Passing (the <default_proto> tag).





, , .





<data_organization> <stackpointer> ; <prototype> <default_proto>, . : <input>, <output>, <unaffected>.





, aX. . , register. . , , , , . (space="register") <input>, , 0x14000 (0x14000 , , *.slaspec aX).





(acc), <output>. , , . <unaffected>, , , .





, <global> register 0x2000.





LDEFS

 โ€” .ldefs. : ( le), (*.sla, *.pspec,*.cspec), id , Ghidra. - , Node.js, , <language>, *.ldefs , Ghidra.





, , .





PSPEC

( .pspec). processor_spec.rxg ( Ghidra ). - . , .





, ( <processor_spec> ).





SLASPEC

SLEIGH .slaspec.





. , , .





, ( register ram), define space,  โ€” define register. offset , , . size. , *.cspec , .





(https://ghidra.re/courses/languages/html/sleigh_constructors.html) , ,  โ€” . SLEIGH , , , ยซ ยป. 5 .





  1. Table Header
  2. Display Section
  3. Bit Pattern Sections
  4. Disassembly Actions Section
  5. Semantics Actions Section





, .





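  1. Table Header: the name of the table the constructor belongs to (it is left empty for the root instruction table).
  2. Display Section: how the instruction is shown in the Ghidra listing.
  3. Bit Pattern Section: the constraints on token fields that the bytes must satisfy for this constructor to be selected.
  4. Disassembly Actions Section: computations in square brackets that are performed during disassembly.
  5. Semantics Actions Section: the p-code that describes what the instruction does.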





( instruction), , .





, , . , , . . , ( ), .





, :





  • ^ โ€” / , ;





  • โ€œโ€ โ€” , , ;





  • , ;





  • ( - , ,  #, ).





. , . , . . , :





tokenMaxSize  8. , - . , , , . : start- endBitNumX 0 tokenMaxSize-1 startBitNumX <= endBitNumX.
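
The constraints on token fields named above (a token size that is a multiple of 8, and bit numbers between 0 and tokenMaxSize-1 with startBitNumX <= endBitNumX) can be sketched in Python as follows. This only models how a field is cut out of a token; the function name and the `signed` flag are hypothetical, and bit 0 is assumed to be the least significant bit.

```python
def extract_field(token, token_size, start_bit, end_bit, signed=False):
    """Extract bits start_bit..end_bit (inclusive) from a token value."""
    if token_size <= 0 or token_size % 8 != 0:
        raise ValueError("token size must be a positive multiple of 8")
    if not 0 <= start_bit <= end_bit <= token_size - 1:
        raise ValueError("need 0 <= start_bit <= end_bit <= token_size - 1")
    width = end_bit - start_bit + 1
    value = (token >> start_bit) & ((1 << width) - 1)
    # Optionally reinterpret the field as a two's complement number.
    if signed and value & (1 << (width - 1)):
        value -= 1 << width
    return value


# An 8-bit token split into two 4-bit fields, and read whole as a signed byte.
print(hex(extract_field(0xA7, 8, 4, 7)))
print(hex(extract_field(0xA7, 8, 0, 3)))
print(extract_field(0xFB, 8, 0, 7, signed=True))
```

Several fields may overlap the same bits of one token, which is why the same byte can be viewed either as an opcode, a register number or an immediate.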





v8 , . , , "&" "|".





: , , , ยซยป , .





, . , . v8, ( Wide- ExtraWide- , , ). :





, op , Illegal Nop, :





"0xa7" Ghidra Illegal, . unimpl. , , . Nop , , . Nop Node.js , SwitchOnSmiNoFeedback,





: LdaSmi, (acc ), AddSmi, c .





bytecodes.h Node.js, operand, . , (. AddSmi).





- LdaSmi [-02]. , , disassembly action ( , ).





AddSmi , op, , ";" operand. . , . , , (, , ).





";" , , , ( ), .





PCode ยซ ยป Ghidra. - , p-code.





v8 , lda, . acc . , acc, , .





return, , , :





, , . Mul, , .





ยซ ยป , , , ยซยป . kReg 8 . attach variables 0b 11111111b ( kReg) . , , , 0xfb (11111011b), kReg r0.
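
The mapping from a one-byte kReg value to a register name, which attach variables expresses as a 256-entry list, can be modelled in Python. The text states that 0xfb corresponds to r0 and mentions 124 rN registers; the sketch additionally assumes that r1, r2, ... follow at 0xfa, 0xf9, ... in descending order. It does not cover the aX registers, and the names used are hypothetical.

```python
FIRST_LOCAL_OPERAND = 0xFB   # stated in the text: 0xfb is r0
LOCAL_REGISTER_COUNT = 124   # the text mentions 124 rN registers


def local_register_name(operand):
    """Map a one-byte kReg operand to an rN name, or None if it is not an rN."""
    if not 0 <= operand <= 0xFF:
        raise ValueError("operand must fit in one byte")
    # Assumption: register indices grow as the operand value decreases.
    index = FIRST_LOCAL_OPERAND - operand
    if 0 <= index < LOCAL_REGISTER_COUNT:
        return f"r{index}"
    return None


# The equivalent of an attach list: one entry per possible field value.
ATTACH_TABLE = [local_register_name(value) for value in range(256)]

print(ATTACH_TABLE[0xFB], ATTACH_TABLE[0xFA], ATTACH_TABLE[0x80])
```

Under this assumption the rN registers occupy the values 0x80 through 0xfb, which is exactly 124 entries.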





, kReg , :





interpreter-generator.cc Node.js. kReg , Table Header src.  โ€” export. p-code, export , ยซยป src. Ghidra .





, :





goto

. SLEIGH goto. , kUImm, . disassembly action rel. inst_start SLEIGH .





SLEIGH . , ( ), , ( , p-code ), .





ยซยป dest. *[ram]:4 rel , 4  rel. rel ram. ยซ*ยป SLEIGH , ( Dynamic References).





[ram] ( ), . p-code, ram.
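
The arithmetic behind the rel value, computed in a disassembly action from kUImm and inst_start, can be sketched in Python. The sketch assumes the operand is an unsigned forward offset counted from the start of the jump instruction; `code_size` and the function name are hypothetical and only serve to reject targets outside the function body.

```python
def jump_target(inst_start, k_uimm, code_size):
    """Return the absolute offset a relative jump lands on."""
    if k_uimm < 0:
        raise ValueError("kUImm is treated as unsigned in this sketch")
    rel = inst_start + k_uimm
    if not 0 <= rel < code_size:
        raise ValueError(f"jump target {rel:#x} is outside the code")
    return rel


# A jump at offset 0x10 with operand 0x0c in a 0x40-byte function body.
print(hex(jump_target(0x10, 0x0C, 0x40)))
```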





JumpIfFalse - . SLEIGH goto. js False , , pspec , , . , .





inst_start . TestGreaterThan, goto (<true> ) inst_next. : , , . .





goto inst_next . , "s>", . .
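
SLEIGH distinguishes signed comparison operators such as "s>" from their unsigned counterparts, and the choice changes the result whenever the top bit of an operand is set. A small Python sketch of the difference, assuming 4-byte values; the helper names are hypothetical.

```python
def to_signed(value, size_bytes=4):
    """Interpret the low size_bytes of value as a two's complement number."""
    bits = size_bytes * 8
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def unsigned_greater(a, b, size_bytes=4):
    """Model of the unsigned comparison a > b."""
    mask = (1 << (size_bytes * 8)) - 1
    return (a & mask) > (b & mask)


def signed_greater(a, b, size_bytes=4):
    """Model of the signed comparison a s> b."""
    return to_signed(a, size_bytes) > to_signed(b, size_bytes)


minus_one = 0xFFFFFFFF  # -1 stored in four bytes
print(unsigned_greater(minus_one, 1), signed_greater(minus_one, 1))
```

With the unsigned operator, -1 would compare as greater than 1, so a comparison bytecode modelled that way would take the wrong branch for negative numbers.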





. .





, (. ), . v8, . , 4  CallProperty2 , . :CallProperty2 kReg, kReg, kReg, kReg, [kIdx] Sleigh . - :





, , . , callable, receiver, arg1 arg2 - attach :





kReg .  โ€” .





CallProperty2 , call [callable];, . v8 aX ( cspec). , , (, , sinc-, x86). , . Ghidra, . , - , . :





( : sp , ) CallUndefinedReceiver1:





, , java-. , , , SLEIGH. p-code .





, , . , acc , , . , , ( CallVariadicCallOther ยซ ยป ). define pcodeop OperationName , .





p-code- : callotherfixup cspec- .





java- , :





. bytenode jsc- js:





jsc- Ghidra. - , Ghidra , eclipse , . : sleigh .





, . .





, . 010.  D  F, . :





( SLEIGH), . , ( SLEIGH cpool) LdaGlobal. ( ):





, , JavaScript, , .slaspec ( .sinc). , p-code, , p-code. p-code .





v8 , / . . , , .





, , : ForInPrepare r9, r10!3. , , , , , .





,

. . , ARM: ( , - ).





, . , . . , , , , .





, CallProperty , ยซยป , , . , : rangeSrc rangeDst. rangeSrc โ€” , , rangeDst ยซยป . rangeDst , : aX rX .





. ยซ=ยป, , disassembly action. - . , , , aX, rX, . : , , , .





. . , (contextreg ).





, , ( ), . counter offStart , .





, .





, , , - disassembly action. rangeSrc, , disassembly action offStart, - counter. "{".





, v8 range_size: , . rangeSrc .
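
What a register range stands for can be shown with a short Python sketch: a first register plus range_size expands into a run of registers. The sketch assumes the range covers consecutive registers with increasing indices; the function name and the `prefix` parameter are hypothetical.

```python
def expand_register_range(first_index, range_size, prefix="r"):
    """List the register names covered by a (first register, size) range."""
    if first_index < 0 or range_size < 0:
        raise ValueError("index and size must be non-negative")
    return [f"{prefix}{first_index + i}" for i in range(range_size)]


# A range of three registers starting at r10.
print(expand_register_range(10, 3))
```

SLEIGH has no loops, so the module cannot iterate like this directly; the sketch only describes the result that the rangeSrc and rangeDst tables are built to produce.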





  rangeDst 5  .





  • a0 counter 0 ( ).





  • r0 counter 0 ( ).





  • offStart a0, disassembly action counter , offStart , rangedst1.





  • offStart r0, disassembly action counter offStart , rangedst1.





  • , rangedst1( , , , ).





. rangeDstN, N - , , aN/rN.





. rangeSrc , , rangeDst - , . epsilon, .





rangeDst, rangeDst1, rangeDst2, . , github. , rangeDst rangeDstX, ,  โ€” , .





, . "&" "|".





CallProperty :





:





, , CallVariadicCallOther. github java- p-code. p-code call ( Node.js, ,  โ€” ). slaspec, , , :





, :





rangeDst ( r7 ) , console.log(1,2,3,4,5,6). bytenode . 0x167,  โ€” 0x18b.





, , , , - ( , , , ).





, rangeDst , ( , 2 4 ):





, : , . , , . , , , . , , . , SLEIGH.





Node.js , , .





:





:





  1. https://ghidra.re/courses/languages/html/sleigh.html - SLEIGH documentation.
  2. https://github.com/NationalSecurityAgency/ghidra/tree/master/Ghidra/Framework/SoftwareModeling/data/languages - reference material for *.cspec, *.pspec, *.opinion, *.ldefs.
  3. https://spinsel.dev/2020/06/17/ghidra-brainfuck-processor-1.html - a brainfuck processor module for Ghidra.
  4. https://github.com/PositiveTechnologies/ghidra_nodejs - the Ghidra plugin repository.







