How to decrypt car firmware in unknown format



Toyota distributes its firmware in an undocumented format. My customer, who owns a car of this brand, showed me a firmware file that begins with the text header below, followed by lines of 32 hexadecimal digits each. The owner and other tinkerers would like to be able to check what is inside before installing the firmware: load it into a disassembler and see what it does.



CALIBRATIONΓͺXi ΒΊ
attach.att
ÓÏ[Format]
Version=4

[Vehicle]
Number=0
DateOfIssue=2019-08-26
VehicleType=GUN1**
EngineType=1GD-FTV,2GD-FTV
VehicleName=IMV
ModelYear=15-
ContactType=CAN
KindOfECU=0
NumberOfCalibration=1

[CPU01]
CPUImageName=3F0S7300.xxz
FlashCodeName=
NewCID=3F0S7300
LocationID=0002000100070720
CPUType=87
NumberOfTargets=3
01_TargetCalibration=3F0S7200
01_TargetData=3531464734383B3A
02_TargetCalibration=3F0S7100
02_TargetData=3747354537494A39
03_TargetCalibration=3F0S7000
03_TargetData=3732463737463B4A

3F0S7300forIMV.txt ΒΈNiΒΆm5A56001000820EE13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E2030133E2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E2030133E2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E20911381959FAB0EE9000
81C9E03ADE35CEEEEFC5CF8DE9AC0910
38C2E031DE35CEEEEFC8CF87E95C0920
...










For this particular firmware, he also had a dump of its contents:



0000: 80 07 80 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0010: 80 07 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0020: 00 00 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0030: 80 07 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0040: 80 07 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0050: 80 07 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0060: 00 00 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0070: 80 07 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00
0080: E0 07 60 01 2A 06 00 FF │ 00 00 0A 58 EA FF 20 00
0090: FF 57 40 00 EB 51 B2 05 │ 80 07 48 01 E0 FF 20 00
...


As you can see, the dump contains nothing even close to the strings of hexadecimal digits in the firmware file. The question arises: in what format is the firmware distributed, and how can it be decrypted? The owner of the car entrusted me with this task.



Repeating fragments



Let's take a closer look at those hexadecimal lines:

5A56001000820EE13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E2030133E2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E2030133E2030133E20301
33E2030133C20EF13FE2030133E20301
33E2030133E20911381959FAB0EE9000
81C9E03ADE35CEEEEFC5CF8DE9AC0910
38C2E031DE35CEEEEFC8CF87E95C0920
...

We see eight repetitions of a sequence of three E2030133 blocks, which closely resemble the first eight lines of the dump, each of which ends in 12 zero bytes. Three conclusions can be drawn immediately:



  1. The first five bytes, 5A56001000, are some kind of header that does not affect the contents of the dump;
  2. The rest of the content is encrypted in blocks of 4 bytes, with identical blocks in the dump corresponding to identical blocks in the file:
    • E2030133 → 00000000
    • 820EE13F → 80078000
    • C20EF13F → 80070000
    • E2091138 → E0076001
    • 1959FAB0 → 2A0600FF
    • EE900081 → 00000A58
    • C9E03ADE → EAFF2000
  3. This is clearly not XOR encryption but something more complex; at the same time, similar blocks in the dump correspond to similar blocks in the file. For example, the one-bit change 80078000 → 80070000 corresponds to the two-bit change 820EE13F → C20EF13F.
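The third conclusion can be checked mechanically. The sketch below uses only pairs from the list above; the helper name `changed_bits` is made up for illustration. If the encryption were an XOR with a fixed 32-bit key, every pair would yield the same value of file XOR dump, so the set of candidate keys would contain a single element.

```python
# Pairs (file block, dump block) copied from the list above.
pairs = [
    (0xE2030133, 0x00000000),
    (0x820EE13F, 0x80078000),
    (0xC20EF13F, 0x80070000),
    (0xE2091138, 0xE0076001),
]

# A fixed XOR key would make file ^ dump identical for every pair.
keys = {f ^ d for f, d in pairs}
print(len(keys), [hex(k) for k in sorted(keys)])

def changed_bits(a, b):
    """Return the positions (0 = least significant) of the bits that differ."""
    x = a ^ b
    return [n for n in range(32) if (x >> n) & 1]

# Compare two similar dump blocks and the file blocks they map to.
print("dump:", changed_bits(0x80078000, 0x80070000))
print("file:", changed_bits(0x820EE13F, 0xC20EF13F))
```

Every pair produces a different candidate key, which rules out a fixed XOR, while the second part lists exactly which bit positions move on each side.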


Correspondences between blocks



Let's get a list of all pairs (file block, dump block), and look for patterns in it:



$ xxd -r -p firmware.txt decoded

$ python
>>> f = open('decoded','rb')
>>> data=f.read()[5:]  # skip the five-byte header 5A56001000 so that the blocks line up with the dump
>>> words=[data[i:i+4] for i in range(0,4096,4)]
>>> f = open('dump','rb')
>>> data=f.read()[:4096]
>>> reference=[data[i:i+4] for i in range(0,4096,4)]
>>> list(zip(words,reference))[:3]
[(b'\x82\x0e\xe1?', b'\x80\x07\x80\x00'), (b'\xe2\x03\x013', b'\x00\x00\x00\x00'), (b'\xe2\x03\x013', b'\x00\x00\x00\x00')]
>>> dict(zip(words,reference))
{b'\x82\x0e\xe1?': b'\x80\x07\x80\x00', b'\xe2\x03\x013': b'\x00\x00\x00\x00', b'\xc2\x0e\xf1?': b'\x80\x07\x00\x00', ...}
>>> decode=dict(zip((w.hex() for w in words), (r.hex() for r in reference)))
>>> decode
{'820ee13f': '80078000', 'e2030133': '00000000', 'c20ef13f': '80070000', ...}
>>> sorted(decode.items())
[('00beb5ff', '4c07a010'), ('02057139', '0000f00f'), ('03ef5ed0', '50ff710f'), ...]


This is what the first pairs look like in the sorted list:



00beb5ff → 4c07a010
02057139 → 0000f00f
03ef5ed0 → 50ff710f \ change in bit 24 in the dump changes bits 8, 10, 24-27 in the file
04ef5bd0 → 51ff710f <
0408ed38 → 14002d06 \
05f92ed7 → ffffd087 |
0a5d22bb → f602dffe  > change in bit 25 in the dump changes bits 11, 25-27 in the file
0a62f9a9 → e10f5761 |
0acdc6e4 → a25d2c06 /
0aef53d0 → 53ff710f <
0aef5cd0 → 52ff710f / change in bit 24 in the dump changes bits 8-11 in the file
0bdebd6f → 4c57a410
0d0c7fec → 0064ffff
0d0fe57f → 18402c57
0d8fa4d0 → bfff88ff
0ee882d7 → eafd7f00
1001c5c6 → 6c570042 \
1008d238 → 42003e06  > change in bit 1 in the dump changes bits 0, 3, 16-19 in the file
100ec5cf → 6c570040 /
109ec58f → 6c070050
10e1ebdf → 62ff6008
10ec4cdd → dafd4c07
119f0f8f → 08006d57
11c0feee → 2c5f0500
120ff07e → 20420452
125ef13e → 20f600c8
125fc14e → 60420032
126f02af → 02006d67
1281d09f → 400f3488
1281d19f → 400f3088
12a6d0bb → 40073498
12a6d1bb → 40073098 \
12aed0bf → 40073490  > change in bit 3 in the dump changes bits 2 and 19 in the file
12aed1bf → 40073090 /> change in bit 10 in the dump changes bit 8 in the file
12c3f1ea → 20560001 \
12c9f1ea → 20560002 / change in bits 0 and 1 in the dump changes bits 17 and 19 in the file
...


Indeed, the following patterns are visible:



  • Changes to bits 0-3 in the dump change bits 0-3 and 16-19 in the file (mask 000F000F)
  • Changes to bits 24-25 in the dump change bits 8-11 and 24-27 in the file (mask 0F000F00)


An obvious hypothesis is that each group of 4 bits in the dump affects the same 4 bits in both 16-bit halves of the 32-bit block.
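Before regrouping anything, this hypothesis can be spot-checked on the neighbouring pairs from the sorted list. The following is a minimal sketch, with the helper name `touched_masks` invented for illustration: for two pairs whose dump blocks differ, it reports which of the four nibble masks the difference falls into on the dump side and on the file side.

```python
MASKS = (0xF000F000, 0x0F000F00, 0x00F000F0, 0x000F000F)

def touched_masks(diff):
    """Return the nibble masks that contain at least one set bit of diff."""
    return [m for m in MASKS if diff & m]

# ((file1, dump1), (file2, dump2)) taken from the lists above.
examples = [
    ((0x03EF5ED0, 0x50FF710F), (0x04EF5BD0, 0x51FF710F)),
    ((0x12A6D1BB, 0x40073098), (0x12AED1BF, 0x40073090)),
    ((0x12C3F1EA, 0x20560001), (0x12C9F1EA, 0x20560002)),
    ((0x1001C5C6, 0x6C570042), (0x100EC5CF, 0x6C570040)),
    ((0x12AED0BF, 0x40073490), (0x12AED1BF, 0x40073090)),
    ((0x820EE13F, 0x80078000), (0xC20EF13F, 0x80070000)),
]

for (f1, d1), (f2, d2) in examples:
    dump_side = touched_masks(d1 ^ d2)
    file_side = touched_masks(f1 ^ f2)
    print([f"{m:08X}" for m in dump_side], "->", [f"{m:08X}" for m in file_side])
```

In each of these examples the file-side difference stays inside the same mask as the dump-side difference, which is what the hypothesis predicts. A handful of pairs is of course only supporting evidence, not a proof.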



To check, let's "cut off" the most significant 4 bits in each half-block, and see what pairs we get:



>>> ints=[int.from_bytes(w, 'big') for w in words]
>>> [hex(i) for i in ints][:3]
['0x820ee13f', '0xe2030133', '0xe2030133']
>>> scrambled=[((i & 0xf000f000) >> 12, (i & 0x0f000f00) >> 8, (i & 0x00f000f0) >> 4, (i & 0x000f000f)) for i in ints]
>>> scrambled=[tuple(((i >> 16) << 4) | (i & 15) for i in q) for q in scrambled]
>>> scrambled[:3]
[(142, 33, 3, 239), (224, 33, 3, 51), (224, 33, 3, 51)]
>>> [tuple(hex(i) for i in q) for q in scrambled][:3]
[('0x8e', '0x21', '0x3', '0xef'), ('0xe0', '0x21', '0x3', '0x33'), ('0xe0', '0x21', '0x3', '0x33')]
>>> [b''.join(bytes([i]) for i in q) for q in scrambled][:3]
[b'\x8e!\x03\xef', b'\xe0!\x033', b'\xe0!\x033']
>>> decode=dict(zip((b''.join(bytes([i]) for i in q).hex() for q in scrambled), (r.hex() for r in reference)))
>>> sorted(decode.items())
[('025efd97', 'ffffd087'), ('02a25bdb', 'f602dffe'), ('053eedf0', '50ff710f'), ...]
>>> decode=dict(zip((b''.join(bytes([i]) for i in q[1:]).hex() for q in scrambled), (r.hex()[1:4]+r.hex()[5:8] for r in reference)))
>>> sorted(decode.items())
[('018d90', '0f63ff'), ('020388', '200e06'), ('050309', 'c03000'), ...]


After rearranging the subblocks by 4 bits in the sorting key, the correspondences between pairs of subblocks become even more explicit:



018d90 → 0f63ff
020388 → 200e06    \
050309 → c03000 \   |  xx0xxx0x → xx0xxx3x
05030e → c0f000  |  |
05036e → c06000  | /
050c16 → c57042  |
050cef → c57040  |
05971e → c88007   >  xCxxx0xx → x0xxx5xx
0598ef → c07050  |
05bfef → c07010  |
05db59 → c9000f  |
05ed0e → cff000 <
060ecc → 264fff  |
065ba7 → 205fff  |
0bed1f → 2ff008 <|
0bfd15 → 2ff086  |
0cedcd → afdc07 <|
10f2e7 → e06a7e   >  xxFxxx0x → xxExxxDx
118d5a → 9fdfff  | \
13032b → 40010a  |  >  xxFxxxFx → xx8xxxDx
148d3d → fff6fc  | /
16b333 → f00e30  |
16ed15 → fffe06 /
1b63e6 → 52e883
1c98ff → 400b57 \
1d4d97 → aff1b7  |  xx00xx57 → xx9Fxx8F
1ece0e → c5f500  |
1f98ff → 800d57 /
20032f → 00e400 \
200398 → 007401  |
2007fe → 042452  |
2020ef → 057490  |
206284 → 067463   >  x0xxx4xx → x2xxx0xx
20891f → 00f488  |
20ab6b → 007498  | \
20abef → 007490  | /  xx0xxx9x → xxAxxxBx
20ed1d → 0ff404  |
20fb6e → 0064c0 /
21030e → 00f000 \
21032a → 00b008  |
210333 → 000000  |
210349 → 00c008  |
21034b → 003007  |
210359 → 00000f  |
210388 → 000006   >  x00xx00x → x20xx13x
21038b → 00300b  |
210398 → 007001  |
2103c6 → 007004  |
2103d2 → 008000  |
2103e1 → 008009  |
2103ef → 007000 /
...


Correspondences between subblocks



The above list shows the following matches:



  • For the mask 0F000F00:
    • x0xxx0xx in dump → x2xxx1xx in file
    • x0xxx4xx in dump → x2xxx0xx in file
    • xCxxx0xx in dump → x0xxx5xx in file
  • For the mask 00F000F0:
    • xx0xxx0x in dump → xx0xxx3x in file
    • xx0xxx5x in dump → xx9xxx8x in file
    • xx0xxx9x in dump → xxAxxxBx in file
    • xxFxxx0x in dump → xxExxxDx in file
    • xxFxxxFx in dump → xx8xxxDx in file
  • For the mask 000F000F:
    • xxx0xxx7 in dump → xxxFxxxF in file
    • xxx7xxx0 in dump → xxxExxxF in file
    • xxx7xxx1 in dump → xxx9xxx8 in file


We can conclude that each 32-bit block in the dump is split into four 8-bit values, one per mask, and each value is replaced using a lookup table specific to that mask. The contents of these four tables appear fairly random, but let's try to extract all of them from our file:



>>> ref_ints=[int.from_bytes(w, 'big') for w in reference]
>>> ref_scrambled=[((i & 0xf000f000) >> 12, (i & 0x0f000f00) >> 8, (i & 0x00f000f0) >> 4, (i & 0x000f000f)) for i in ref_ints]
>>> ref_scrambled=[tuple(((i >> 16) << 4) | (i & 15) for i in q) for q in ref_scrambled]
>>> decode=dict(zip((b''.join(bytes([i]) for i in q).hex() for q in scrambled), (b''.join(bytes([i]) for i in q).hex() for q in ref_scrambled)))
>>> sorted(decode.items())
[('025efd97', 'fdf0f8f7'), ('02a25bdb', 'fd6f0f2e'), ('053eedf0', '5701f0ff'), ...]
>>> decode=[dict(zip((bytes([q[byte]]).hex() for q in scrambled), (bytes([q[byte]]).hex() for q in ref_scrambled))) for byte in range(4)]
>>> decode
[{'8e': '88', 'e0': '00', 'cf': '80', 'e1': 'e6', '1f': '20', 'c3': 'e2', ...}, {'21': '00', '9a': 'a0', 'e0': '0a', '5e': 'f0', '5d': 'b2', 'c0': '08', ...}, {'03': '00', '5b': '0f', '98': '05', 'ed': 'f0', 'ce': '50', 'd6': '51', ...}, {'ef': '70', '33': '00', '98': '71', '90': '6f', '01': '08', '0e': 'f0', ...}]
>>> decode=[dict(zip((q[byte] for q in scrambled), (q[byte] for q in ref_scrambled))) for byte in range(4)]
>>> decode
[{142: 136, 224: 0, 207: 128, 225: 230, 31: 32, 195: 226, 62: 244, 200: 235, ...}, {33: 0, 154: 160, 224: 10, 94: 240, 93: 178, 192: 8, 135: 2, 62: 1, 120: 26, ...}, {3: 0, 91: 15, 152: 5, 237: 240, 206: 80, 214: 81, 113: 16, 185: 2, 179: 3, ...}, {239: 112, 51: 0, 152: 113, 144: 111, 1: 8, 14: 240, 249: 21, 110: 96, 241: 47, ...}]


When the lookup tables are ready, the decryption code is quite simple:



>>> def _decode(x):
...   scrambled = ((x & 0xf000f000) >> 12, (x & 0x0f000f00) >> 8, (x & 0x00f000f0) >> 4, (x & 0x000f000f))
...   decoded = tuple(decode[i][((v >> 16) << 4) | (v & 15)] for i, v in enumerate(scrambled))
...   unscrambled = tuple(((i >> 4) << 16) | (i & 15) for i in decoded)
...   return (unscrambled[0] << 12) | (unscrambled[1] << 8) | (unscrambled[2] << 4) | (unscrambled[3])
...
>>> hex(_decode(0x00beb5ff))
'0x4c07a010'
>>> hex(_decode(0x12aed1bf))
'0x40073090'
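The interactive session above can be condensed into a self-contained sketch. It is an illustration under stated assumptions, not the exact code used for the analysis: the function names (`split`, `join`, `build_tables`, `decode_word`) are made up here, and the tables are built only from the seven pairs listed earlier, so they are far from complete. Building the tables also doubles as a consistency check: if the same file value ever mapped to two different dump values, the lookup-table hypothesis would be wrong.

```python
# (file block, dump block) pairs copied from the list near the start.
KNOWN_PAIRS = [
    (0xE2030133, 0x00000000), (0x820EE13F, 0x80078000),
    (0xC20EF13F, 0x80070000), (0xE2091138, 0xE0076001),
    (0x1959FAB0, 0x2A0600FF), (0xEE900081, 0x00000A58),
    (0xC9E03ADE, 0xEAFF2000),
]
SHIFTS = (12, 8, 4, 0)  # one per mask: F000F000, 0F000F00, 00F000F0, 000F000F

def split(word):
    """Regroup a 32-bit word into four bytes, one per nibble mask.

    Each byte holds the nibble from the high 16-bit half in its upper four
    bits and the matching nibble from the low half in its lower four bits.
    """
    return tuple((((word >> (16 + s)) & 0xF) << 4) | ((word >> s) & 0xF)
                 for s in SHIFTS)

def join(parts):
    """Inverse of split(): put the four bytes back into a 32-bit word."""
    word = 0
    for s, value in zip(SHIFTS, parts):
        word |= ((value >> 4) << (16 + s)) | ((value & 0xF) << s)
    return word

def build_tables(pairs):
    """Build one file-to-dump lookup table per mask, rejecting conflicts."""
    tables = [{}, {}, {}, {}]
    for file_word, dump_word in pairs:
        for table, key, value in zip(tables, split(file_word), split(dump_word)):
            if table.setdefault(key, value) != value:
                raise ValueError(f"conflicting entries for key {key:#04x}")
    return tables

def decode_word(tables, word):
    """Decode one 32-bit block; raises KeyError for values not seen yet."""
    return join(tables[i][v] for i, v in enumerate(split(word)))

tables = build_tables(KNOWN_PAIRS)
for file_word, dump_word in KNOWN_PAIRS:
    print(f"{file_word:08X} -> {decode_word(tables, file_word):08X}")
```

With only seven pairs, most file blocks will raise `KeyError`; feeding in all pairs from the first 4096 bytes, as in the session above, fills the tables much further, but values that never occur in the known dump remain unknown.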


Firmware header



At the very beginning, before the encrypted data, there was a five-byte header: 5A56001000. The first two bytes, the signature 'ZV', indicate that the LZF format is used; they are followed by the compression method (0x00, no compression) and the length (0x1000 bytes).
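A header of this shape could be parsed as in the sketch below. The uncompressed case (method 0x00 followed by a 16-bit big-endian length) follows directly from the bytes above. The compressed case (method 0x01 followed by the compressed and uncompressed lengths) is an assumption based on the block layout written by the `lzf` command-line tool from liblzf, and has not been checked against this firmware. The function name `parse_lzf_header` is hypothetical.

```python
import struct

def parse_lzf_header(data):
    """Return (method, stored_length, unpacked_length, header_size)."""
    if data[:2] != b"ZV":
        raise ValueError("missing 'ZV' signature")
    method = data[2]
    if method == 0:
        # Uncompressed block: a single 16-bit big-endian length.
        (length,) = struct.unpack_from(">H", data, 3)
        return method, length, length, 5
    if method == 1:
        # Assumed layout: compressed length, then uncompressed length.
        stored, unpacked = struct.unpack_from(">HH", data, 3)
        return method, stored, unpacked, 7
    raise ValueError(f"unknown method {method:#04x}")

print(parse_lzf_header(bytes.fromhex("5A56001000")))
```

For the header in this file, the result is method 0 with a length of 4096 bytes and a header size of 5, which matches the five bytes that had to be skipped before the encrypted blocks.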



The owner of the car, who gave me the files for analysis, confirmed that LZF compressed data are also found in the firmware. Fortunately, the implementation of LZF is open source and fairly simple, so along with my analysis, he managed to satisfy his curiosity about the contents of the firmware. Now he can make changes to the code - for example, auto-start the engine when the temperature drops below a predetermined level in order to use the car in the harsh Russian winter.





