Reverse engineering of an unknown microcontroller








Background ...



As part of my work on reverse engineering eInk electronic price tags, I ran into an interesting problem. One company (Samsung Electro-Mechanics / SoluM) switched from a third-party chip whose origin I had been able to identify (the Marvell 88MZ100) to a new chip, which it began using in its next generation of price tags.



It looked like their own chip, developed in-house for this very purpose. Reverse engineering something like that looked like a hopeless cause. A friend gave me a few price tags with these chips to tinker with. They turned out to come in two types: one with a segmented e-ink display and one with a conventional graphical e-ink display. The main chip is the same in both models, so I started with the segmented-display device, since it is simpler and makes an unknown system easier to figure out. It was not entirely clear where to start, but those are always the most interesting problems!



Study





It is foolish to try to solve a crossword puzzle without reading the clues. It is just as foolish to reverse engineer a device without first collecting all the information already available about it. So what do we know going in? The wireless protocol is probably the same as before, since no company wants to migrate its customers to a new protocol or support two protocols at once during a slow migration. The old protocol was ZigBee-like at 2.4 GHz, so the new one probably is too. Here is a photo of the board from both sides.





So what do we see? First, a cool example of cost optimization. They laminated the e-ink screen right onto the PCB! Who needs a conductive glass back panel when there is a PCB? The front panel is made of conductive plastic. But it is not important.



Two antennas are visible; judging by their size, both are for 2.4 GHz. That is as expected, since the previous-generation devices also had two 2.4 GHz antennas. We see two chips, a big one and a small one. The large one (marked "SEM9010") has a lot of traces going to the display and none to the antennas. Evidently this is the display controller.



The small one (marked "SEM9110") appears to be the brain responsible for everything else. It is connected to the antennas, the timing crystal, and the test points that are obviously there for factory programming.



There are 12 pads here: one is connected to the positive terminal of the battery, one to ground, and the purpose of the other 10 is a mystery. Searching online for the chip's name turns up nothing useful, so it is definitely their own part. But who designs their own chip for such a simple application? Maybe it is just a rebrand? Time to dig in!



Curiously, Google Image Search helped here. This tool sometimes comes in handy for reverse engineering. In this case, it leads us to this gem (archived copy here for posterity). It is a StackExchange question asking how these electronic shelf labels work. The question is interesting because the printed circuit board in the posted photo looks almost identical to ours. The chips are exactly the same too, but their markings are different! That board was probably made before SoluM started rebranding these chips.



The chip that I assumed to be the display controller is labeled SSD1623L2. Indeed, it is a segmented e-ink display controller that supports up to 96 segments. Searching online, I find the pre-release version 0.1 datasheet (archived copy here for posterity). That's good! If we know how to talk to this chip, we will recognize the code that talks to it as soon as we see it.



The main microcontroller turns out to be a ZBS242. Okay. I am not familiar with this microcontroller. A bit more searching leads us to this link (archived copy here for posterity), which also mentions the same StackExchange answer. The page is in Korean, but it shows that this chip has an 8051 core and a fairly predictable set of peripherals: UART, SPI, I2C, ADC, DAC, comparator, temperature sensor, 5-channel PWM, 3-channel triac controller, IR transmitter, key scan function, RF-wake function, antenna diversity, and a ZigBee-compatible radio and MAC. The picture shows that there is also an internal 32 kHz RC oscillator and states that the chip can consume as little as 1 uA in sleep mode. I think this is the company that made our chip for Samsung. Interesting ...



An image search also shows that the SEM9110 chip that puzzled us has been photographed up close as well (archived copy here for posterity). It is identified there as a ZBS243. I guess that means there is a whole family of chips here: the ZBS24x. Really interesting.



We have a lead!





Opening up another segmented tag, we get more good news: the programming header is labeled in clear, legible gold letters! The header appears to carry SPI, UART, a reset pin, power, ground, and a pin called "test", probably used to enter a factory test mode. Curiouser and curiouser.





Logically, the first member of the hypothetical ZBS24x family would be called "ZBS240". Maybe searching for that will turn up something interesting? Searching for "ZBS240" and filtering out the noise, we find another interesting page in Korean (archived copy here for posterity). It looks like this company makes custom gang programmers to order. Looking around their website, we find a manual (archived copy here for posterity) for their programming device, and we can even download the PC utility that works with it. The utility even includes a tool for updating the firmware on the device. I checked whether that would reveal how to program the chip, but the firmware turned out to be encrypted. The PC-side utility apparently just sends data over a USB serial port, so there is no useful information there either. Sad ...



After searching a little more, we find an even more interesting page (archived copy here for posterity). What is this? Is it for sale?!? Surely not anymore, right? I emailed the company just in case. Silence ... In desperation, I asked a friend from Hong Kong if he knew anyone in Korea who could contact these guys, since their website says they only accept payment by transfer from a Korean bank. I was amazed when he wrote back and said that he could indeed get me this device through an intermediary he had found in Korea! A few days later, the device arrived by DHL!



It can be talked to!



How to talk to it



It works! I can read the chip and write to it. It took me a while to get to know the programming tool. The chip apparently has 64KB of flash memory and a 1KB "information block", which I believe is used to store calibration values, MAC addresses, and the like. Armed with the wonderful Saleae Logic logic analyzer, I captured some traces of the programmer doing its job. You can download my findings here. In this archive you will find traces of reading, erasing and writing the INFOBLOCK and CODE spaces. In fact, the protocol is VERY simple! The clock frequency can be anything from 100 kHz to 8 MHz.



ISP protocol: the bare essentials



It all starts with setting the lines to the required state: SCLK low, MOSI high, RESET high, SS high. This state is held for 20 ms. Then RESET goes low for 32 ms. Then at least 4 clock cycles are sent on the SCLK line at 500 kHz. Then there is another 10 ms delay before RESET is driven high again. After that, wait 100 ms before starting communication. From then on, any number of transactions can be performed. A few basic rules: there must be at least 5 us between SS going low and the start of a byte, at least 2 us between the end of the byte and SS going high, and SS must stay high for at least 2.5 us. So each byte is sent like this: SS goes low, the byte is sent in SPI mode 0, SS goes high. Yes, SS toggles for every byte.
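
The entry sequence described above can be written down as a small table, which is handy when porting it to whatever drives the pins. This is only a sketch: the names (`isp_step`, `isp_entry`, the `SET_*` / `WAIT_US` opcodes) are mine, not from the programmer, and the 1 us high / 1 us low split of each 500 kHz clock pulse is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of the entry sequence: drive a pin or wait. (Hypothetical names.) */
enum isp_op { SET_SCLK, SET_MOSI, SET_RESET, SET_SS, WAIT_US };

struct isp_step {
    enum isp_op op;
    uint32_t    arg;    /* pin level for SET_*, microseconds for WAIT_US */
};

/* One SCLK pulse at 500 kHz, assumed to be 1 us high and 1 us low. */
#define ISP_CLOCK_PULSE \
    { SET_SCLK, 1 }, { WAIT_US, 1 }, { SET_SCLK, 0 }, { WAIT_US, 1 }

static const struct isp_step isp_entry[] = {
    /* idle state, held for 20 ms */
    { SET_SCLK, 0 }, { SET_MOSI, 1 }, { SET_RESET, 1 }, { SET_SS, 1 },
    { WAIT_US, 20000 },
    /* RESET low for 32 ms */
    { SET_RESET, 0 }, { WAIT_US, 32000 },
    /* four clock pulses (the minimum) while RESET is still low */
    ISP_CLOCK_PULSE, ISP_CLOCK_PULSE, ISP_CLOCK_PULSE, ISP_CLOCK_PULSE,
    /* 10 ms more, release RESET, then 100 ms before the first byte */
    { WAIT_US, 10000 }, { SET_RESET, 1 }, { WAIT_US, 100000 },
};

#define ISP_ENTRY_STEPS (sizeof(isp_entry) / sizeof(isp_entry[0]))

/* Bounds-checked accessor for a driver loop. */
const struct isp_step *isp_entry_step(size_t i)
{
    assert(i < ISP_ENTRY_STEPS);
    return &isp_entry[i];
}

/* Total time the sequence spends waiting, in microseconds. */
uint32_t isp_entry_total_us(void)
{
    uint32_t total = 0;
    size_t i;

    for (i = 0; i < ISP_ENTRY_STEPS; i++)
        if (isp_entry[i].op == WAIT_US)
            total += isp_entry[i].arg;
    return total;
}
```

A driver would walk the table with `isp_entry_step()` and map each opcode to its own GPIO and delay routines. Keeping the timings in data makes it easy to stretch them if a particular setup turns out to need more margin.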



All transactions are three to four bytes long. The first byte indicates the type of transaction, and its lowest bit specifies the direction: zero means a write to the device, one means a read from the device. Commands 0x02 / 0x03 are used to initiate a communication session. The programmer sends a three-byte write, 02 BA A5, and then reads the value back: it sends the read command and "address", 03 BA, and then sends FF while receiving A5. If this works, then communication is established.
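
As a sketch of that handshake, the two transactions can be built and checked like this. The function and macro names are made up for illustration, and the code only prepares and checks byte buffers: clocking them out (one byte per SS pulse) is left to whatever SPI routine you have. What the device returns during the first two bytes of a read is not specified in the text, so only the last byte is checked.

```c
#include <assert.h>
#include <stdint.h>

#define ISP_CMD_HELLO    0x02   /* bit 0 clear: write to the device */
#define ISP_DIR_READ     0x01   /* bit 0 set: read from the device */
#define ISP_HELLO_ADDR   0xBA
#define ISP_HELLO_VALUE  0xA5
#define ISP_DUMMY        0xFF   /* sent while the device answers */

/* Write half of the handshake: 02 BA A5. */
void isp_hello_write(uint8_t tx[3])
{
    assert(tx);
    tx[0] = ISP_CMD_HELLO;
    tx[1] = ISP_HELLO_ADDR;
    tx[2] = ISP_HELLO_VALUE;
}

/* Read half of the handshake: 03 BA FF. */
void isp_hello_read(uint8_t tx[3])
{
    assert(tx);
    tx[0] = ISP_CMD_HELLO | ISP_DIR_READ;
    tx[1] = ISP_HELLO_ADDR;
    tx[2] = ISP_DUMMY;
}

/* rx[] holds the bytes clocked in during the read half.
 * The device's answer arrives while the dummy byte is sent. */
int isp_hello_ok(const uint8_t rx[3])
{
    assert(rx);
    return rx[2] == ISP_HELLO_VALUE;
}
```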



Commands 0x12 / 0x13 are used to write / read special function registers (SFRs) in the CPU (I worked this out later, but the order does not matter here). To select INFOBLOCK, SFR 0xD8 must be set to 0x80; to select the main flash area, it must be set to 0x00. To write the value vv to register rr, the SPI data is 12 rr vv. To make sure the value was written, it can be read back by first sending the read command and "address", 13 rr, after which the master sends FF while receiving vv.



Reading the flash memory is easy. This uses 0x09, a four-byte command. After the command byte, the address is sent, high byte first, then the low byte. Then the master sends FF while receiving the byte that was read. Yes, really: a separate command is required to read each byte. Writing is easy too. This uses command 0x08, also a four-byte command. After the command byte, the address is sent, high byte first, then the low byte, and then the byte to be written. A separate command is also required to write each byte. Be sure to erase before writing. Erasing INFOBLOCK takes a single 4-byte sequence: 48 00 00 00. The main flash memory is erased with 88 00 00 00.
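
Put together, the command set described so far fits in a handful of frame builders. Again, this is a sketch with invented names; it only fills byte buffers in the documented order and leaves the actual SPI transfer to your own code.

```c
#include <assert.h>
#include <stdint.h>

enum isp_cmd {
    ISP_FLASH_WRITE = 0x08,
    ISP_FLASH_READ  = 0x09,
    ISP_SFR_WRITE   = 0x12,
    ISP_SFR_READ    = 0x13,
    ISP_ERASE_INFO  = 0x48,
    ISP_ERASE_MAIN  = 0x88
};

#define ISP_SFR_AREA  0xD8   /* 0x80 selects INFOBLOCK, 0x00 the main flash */
#define ISP_DUMMY     0xFF   /* sent while the device answers */

/* 12 D8 80 or 12 D8 00: choose which area the flash commands address. */
void isp_frame_select_area(uint8_t tx[3], int infoblock)
{
    assert(tx);
    tx[0] = ISP_SFR_WRITE;
    tx[1] = ISP_SFR_AREA;
    tx[2] = infoblock ? 0x80 : 0x00;
}

/* 09 hi lo FF: the byte at addr comes back during the last byte. */
void isp_frame_flash_read(uint8_t tx[4], uint16_t addr)
{
    assert(tx);
    tx[0] = ISP_FLASH_READ;
    tx[1] = (uint8_t)(addr >> 8);
    tx[2] = (uint8_t)(addr & 0xFF);
    tx[3] = ISP_DUMMY;
}

/* 08 hi lo vv: program one (previously erased) byte. */
void isp_frame_flash_write(uint8_t tx[4], uint16_t addr, uint8_t val)
{
    assert(tx);
    tx[0] = ISP_FLASH_WRITE;
    tx[1] = (uint8_t)(addr >> 8);
    tx[2] = (uint8_t)(addr & 0xFF);
    tx[3] = val;
}

/* 48 00 00 00 erases INFOBLOCK, 88 00 00 00 erases the main flash. */
void isp_frame_erase(uint8_t tx[4], int infoblock)
{
    assert(tx);
    tx[0] = infoblock ? ISP_ERASE_INFO : ISP_ERASE_MAIN;
    tx[1] = 0x00;
    tx[2] = 0x00;
    tx[3] = 0x00;
}
```

Dumping the whole main flash then amounts to selecting the area once and issuing `isp_frame_flash_read()` for each of the 65536 addresses.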



So now you know enough to trivially program your ZBS24x!



Get to work!







Primer for 8051



If you are already familiar with the 8051, you can safely skip this section.



The 8051 is an old microcontroller designed by Intel back in antiquity. It is a terrible hassle to work with, but it is still used quite often because it is cheap to license (in fact, it is free). What's the trouble? The 8051 has several separate memory spaces. CODE is the memory area for code. Its maximum size is 64KB (16-bit address). In most modern designs this is flash memory. Code can read bytes from it using a special instruction, movc ("MOVe from Code"). XRAM is "external" memory, that is, external to the core. You can store things in it, but it is not much good for anything else: the only operations that can be performed on this memory are reads and writes. Its maximum size is 64KB (16-bit address). How does an 8-bit CPU address memory with a 16-bit address? Very slowly, as it turns out. The movx instruction ("MOVe to / from eXternal") accesses this type of memory, but how do you specify a 16-bit address? For this there is a special register called DPTR ("Data PoinTeR"), which is also used by the movc instruction. DPTR consists of a high register, DPH, and a low register, DPL. By writing half of the address to each of them, you can address external memory and code memory. As you might guess, this gets tedious quickly: to copy a block from external memory to external memory, for example, you have to keep shuffling values in and out of DPL and DPH. For this reason, some of the more advanced 8051 variants have multiple DPTR registers, but not all do, and not all of them implement it the same way.



Intel added a faster way to access a subset of external memory. The idea is to use registers R0 and R1 as pointer registers. But they are 8 bits wide, so where do the other 8 address bits come from? From the P2 register (which also controls port 2 for the GPIO pins). Obviously, this gets in the way of using port 2 for ... you know ... GPIO. There are ways to mitigate this, but that is a topic for another time. So the amount of memory available this way is limited to 256 bytes (unless you change P2 dynamically, which you probably do not want to do). This memory is usually called PDATA. These accesses also use the movx instruction. Next up we have the SFRs: various configuration registers used to set up peripherals. This memory area can only be accessed directly. That is, the address must be encoded in the instruction itself; access through a pointer register is not possible. There are 128 bytes of SFRs. The following table lists the SFRs defined by the standard 8051. The gray boxes mark SFRs whose bits can be accessed individually using bit instructions. This is useful for setting port pins atomically, for enabling / disabling interrupt sources, or for checking status flags.



Internal memory on the 8051 is a little tricky. On all modern 8051s it is 256 bytes. The upper 128 bytes, 0x80-0xff, are accessible only indirectly through registers R0 and R1, but, unlike external memory, we can do more than read and write here. We can increment (inc), decrement (dec), add (add), and perform most of the other expected operations. In fact, ALL of internal RAM can be accessed indirectly through these pointer registers. The lowest 128 bytes, 0x00-0x7f, are also accessible directly (the address is encoded in the instruction itself, just as with SFRs). The 16 bytes in the range 0x20-0x2f are also bit-addressable using bit instructions. This is a convenient place to store boolean variables. The lowest 32 bytes, 0x00-0x1f, make up 4 banks of registers R0...R7. The status register PSW has bits that select which bank is currently in use, but in practice, since internal memory is usually scarce, most code uses only one bank.



The 8051 is primarily a single-operand machine. That is, most operations use the accumulator as one of the sources and, possibly, as the destination. Registers can also be used for many (but not all) operations, and some operations allow indirect access to internal RAM, as described above. The stack is full-ascending (a push increments the stack pointer first and then stores) and is addressed by an SFR called sp. It lives only in internal RAM, so its maximum size is limited to 256 bytes, and in reality it is much smaller.



Every 8051 ROM image starts with a vector table containing jumps to the startup code and to the interrupt handlers. On the 8051, historically, the reset vector is at 0x0000, and the interrupt vectors start at address 0x0003 and follow every 8 bytes. Since the reti instruction is only used to return from interrupts, it makes it easy to tell whether a given function is an interrupt handler.
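
To make the vector layout concrete, here is a small host-side helper that finds interrupt handlers in a firmware dump. It is a sketch under one assumption: that each vector holds an `ljmp` (opcode 0x02 followed by the target's high and low byte), which is common but not guaranteed, since a compiler may also emit a short jump or place a tiny handler right in the 8-byte slot. The function names are mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_LJMP 0x02    /* ljmp addr16: 02 hi lo */
#define OP_RETI 0x32    /* return from interrupt */

/* Address of the vector for interrupt number irq: 0x0003, 0x000B, ... */
uint16_t irq_vector_addr(uint8_t irq)
{
    return (uint16_t)(0x0003u + 8u * irq);
}

/* If the vector for irq holds an ljmp, store its target and return 1.
 * Returns 0 if the vector is outside the image or holds something else. */
int irq_handler_addr(const uint8_t *rom, size_t len, uint8_t irq,
                     uint16_t *target)
{
    size_t v = irq_vector_addr(irq);

    assert(rom && target);
    if (v + 3 > len || rom[v] != OP_LJMP)
        return 0;
    *target = (uint16_t)((rom[v + 1] << 8) | rom[v + 2]);
    return 1;
}
```

Once a target is known, scanning forward from it for `OP_RETI` is a quick way to confirm that the function really is an interrupt handler.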



Put all of that in your C compiler's pipe and smoke it!



A good C compiler for this architecture exists: Keil's C51. But it is not cheap. There is also an open source compiler: SDCC. It is so-so, but free. While working on this project I found only two serious bugs in it, both of which I had to work around; not bad at all for an open source project.



Let's start the analysis



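// Bit-bang one byte out of P1.0. SDCC passes val in DPL.
void prvTxBitbang(u8 val)
                  __naked {
  __asm__(
    "  setb  PSW.5       \n"  // PSW.5 remembers whether IRQs were on
    "  jbc   _EA, 00004$ \n"  // if EA was set: clear it and skip ahead
    "  clr   PSW.5       \n"
    "00004$:             \n"
    "  clr   C           \n"  // build the frame in DPH:DPL, start bit (0) first
    "  mov   A, DPL      \n"
    "  rlc   A           \n"
    "  mov   DPL, A      \n"
    "  mov   A, #0xff    \n"  // high bits after the data
    "  rlc   A           \n"
    "  mov   DPH, A      \n"
    "  mov   B, #11      \n"  // 11 bit times: start, 8 data, 2 high
    "00001$:             \n"
    "  mov   A, DPH      \n"  // shift the frame right; the next bit lands in C
    "  rrc   A           \n"
    "  mov   DPH, A      \n"
    "  mov   A, DPL      \n"
    "  rrc   A           \n"
    "  mov   DPL, A      \n"
    "  jnc   00002$      \n"
    "  setb  _P1_0       \n"
    "  sjmp  00003$      \n"
    "00002$:             \n"
    "  clr   _P1_0       \n"
    "  nop               \n"  // pad so both branches take the same time
    "  nop               \n"
    "00003$:             \n"
    "  nop               \n"  // stretch the bit period
    "  nop               \n"
    "  nop               \n"
    "  djnz  B, 00001$   \n"
    "  mov   C, PSW.5    \n"  // restore the saved IRQ state
    "  mov   _EA, C      \n"
    "  ret               \n"
  );  }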

      
      





It's easy to start with the GPIO configuration. As a rule, you will come across several matching bits being set or cleared in several registers in a row. This makes sense: to set up a pin you usually have to choose between its special function and GPIO, make it an input or an output, and then set or read its value. You should come across this kind of code very early on. Let's see what we have ... we find that the standard registers P0, P1 and P2 really are used the way you would expect GPIO registers to be used. By looking at which registers are written around them and what then happens to the bits in them (whether they are read (input) or written (output)), we can guess that registers AD, AE and AF select the "function": pins whose bits are set there are not used as GPIO, and every pin that is actually used as GPIO has its bit in one of these registers cleared first. I named them PxFUNC, where x is the port number. Next, we can conclude that B9, BA and BB control the direction. Whenever a bit is set in one of them, the corresponding GPIO is only ever read, and when the bit is cleared, the corresponding GPIO is only ever written. I named them PxDIR, where x is the port number. So now, in theory, I could control the GPIOs. If only I knew which of them does what ...
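
The inferred PxFUNC / PxDIR semantics can be modeled on a PC like this. It is a sketch: `struct port_regs` is a stand-in that I made up so the logic can run on a host, whereas on the chip the three fields would be the SFRs Px, PxFUNC (AD / AE / AF) and PxDIR (B9 / BA / BB). Writing the level before switching the direction is my own choice to avoid a glitch, not something observed in the firmware.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for one port's registers (hypothetical layout). */
struct port_regs {
    uint8_t data;   /* Px:     pin levels */
    uint8_t func;   /* PxFUNC: 1 = special function, 0 = GPIO */
    uint8_t dir;    /* PxDIR:  1 = input, 0 = output */
};

/* Make a pin a GPIO output and drive it to the given level. */
void gpio_output(struct port_regs *p, uint8_t pin, int level)
{
    uint8_t bit;

    assert(p && pin < 8);
    bit = (uint8_t)(1u << pin);
    p->func &= (uint8_t)~bit;           /* GPIO, not special function */
    if (level)
        p->data |= bit;
    else
        p->data &= (uint8_t)~bit;
    p->dir &= (uint8_t)~bit;            /* cleared bit = output */
}

/* Make a pin a GPIO input. */
void gpio_input(struct port_regs *p, uint8_t pin)
{
    uint8_t bit;

    assert(p && pin < 8);
    bit = (uint8_t)(1u << pin);
    p->func &= (uint8_t)~bit;
    p->dir |= bit;                      /* set bit = input */
}

/* Hand a pin over to its peripheral function (for example the UART). */
void gpio_function(struct port_regs *p, uint8_t pin)
{
    assert(p && pin < 8);
    p->func |= (uint8_t)(1u << pin);
}
```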



I decided to simply try them all one after another until I found the one that controls the "TEST" pad on the programming header, or maybe the URX and UTX pads. And indeed ... I found that port 1 pin 0 (P1.0) is "TEST", P0.6 is "UTX", and P0.7 is "URX". A GPIO you can control makes life easier, but there is only so much debugging you can do by toggling GPIOs before it gets old. I have had plenty of practice with that!



We have printf!



I used this function to turn the "TEST" pad into a regular 8n1 serial port by bit-banging, and captured the output with my logic analyzer. I tweaked it until it produced a baud rate that my USB-to-serial cable could handle. I already had an 8051 assembly implementation of printf. Within an hour I was printing complex debug strings from this improvised serial port. Not a bad start, and definitely what you need in order to make progress efficiently!
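
For readers who do not want to trace the assembly, here is a host-side model of the frame that the bit-bang routine shifts out, as I read it from the listing: 11 bit slots, sent least significant bit first, consisting of a low start bit, the eight data bits, and two high slots (the stop bit plus one extra idle bit time). The function names are mine and the model says nothing about timing.

```c
#include <assert.h>
#include <stdint.h>

#define BITBANG_FRAME_BITS 11   /* matches the loop count in the assembly */

/* The 11-bit frame for one byte; bit 0 is sent first.
 * bit 0 = start (0), bits 1..8 = data LSB first, bits 9..10 = high. */
uint16_t bitbang_frame(uint8_t val)
{
    return (uint16_t)(0x0600u | ((uint16_t)val << 1));
}

/* Line level during bit slot 0..10 of the frame. */
int bitbang_level(uint8_t val, unsigned slot)
{
    assert(slot < BITBANG_FRAME_BITS);
    return (bitbang_frame(val) >> slot) & 1;
}
```

For example, the letter 'A' (0x41) produces the frame 0x682: a low start slot, then 1 0 0 0 0 0 1 0 for the data, then two high slots.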



At this point I dumped the values of all the SFRs, to get at least some idea of what they hold. There were still some obstacles to further exploration. For a start, the watchdog timer (WDT) seemed to be enabled by default and reset the chip after one second of execution, so every experiment had to fit into a second or less. I did not know how to control the WDT yet, so I put up with this limitation for a while. Still, one second is a lot of cycles!



Expanding access



Now that I could reliably execute code and print results, I decided to figure out where the clock controls are. Almost all microcontrollers have at least one register that controls various speeds (at least the CPU speed) and another that controls the clocks (or resets) of the various modules. They are usually easy to spot: the first is typically written VERY early during boot and hardly touched afterwards (if at all). The second typically has a bit set (clock enable) or cleared just before a peripheral is configured. We do not know where the various peripherals are configured, but a group of SFRs with similar numbers usually belongs to one peripheral. So let's look. One register definitely fits this description: B7. We see that bits are set in it one at a time just before several SFRs with similar numbers are written, and cleared after accesses to those SFRs stop. We also see that it is initially written with 0x2F, so some peripherals are enabled up front. Since the bits appear to be set before what we take to be peripheral initialization, I will call this register CLKEN. I played with changing bits in this register, and nothing seemed to happen when they were cleared. That makes sense, since I was not using any peripherals.



Another register written nearby (well-written code usually sets up all the clocks together) and never rewritten afterwards is 8E. It is written with 0x21. I suspected that it might be related to speed. I experimented. The 4 least significant bits apparently have no effect at all, so I have no idea why they are set to 0b0001, but the next three bits change the CPU speed quite significantly (as far as I could tell from how my bit-banged UART slowed down). The most significant bit seemed to change the frequency slightly; I assumed that it switches between the internal RC oscillator and the external crystal. The three bits that I took for a frequency divider appear to set the clock speed to 16 MHz / (1 + n), where n is the value of those three bits. I named this register CLKSPEED. So the highest speed is achieved with the value 0x01, and the lowest with 0xf1.
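
As a quick sanity check of that formula, the following sketch decodes a CLKSPEED value into a nominal CPU frequency. The helper names are invented. The most significant bit (the suspected RC / crystal switch) is ignored here, and the low nibble is kept at the 0b0001 that the stock firmware writes, since its purpose is unknown.

```c
#include <assert.h>
#include <stdint.h>

#define CLK_BASE_HZ        16000000UL
#define CLKSPEED_DIV_SHIFT 4
#define CLKSPEED_DIV_MASK  0x07u
#define CLKSPEED_LOW_BITS  0x01u   /* what the stock firmware writes */

/* Nominal CPU clock for a CLKSPEED value: 16 MHz / (1 + n). */
uint32_t clkspeed_to_hz(uint8_t clkspeed)
{
    uint32_t n = (clkspeed >> CLKSPEED_DIV_SHIFT) & CLKSPEED_DIV_MASK;

    return CLK_BASE_HZ / (1 + n);
}

/* CLKSPEED value for a divider field n in 0..7. */
uint8_t clkspeed_for_divider(unsigned n)
{
    assert(n <= CLKSPEED_DIV_MASK);
    return (uint8_t)((n << CLKSPEED_DIV_SHIFT) | CLKSPEED_LOW_BITS);
}
```

By this reading, the 0x21 that the firmware writes at boot corresponds to n = 2, or roughly 5.33 MHz.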







Making Timers Work



Many manufacturers bolt all sorts of things onto the 8051, so there is very little standardization here. However, most leave the standard 8051 peripherals, such as timer 0 and timer 1, alone. Note that this is not a hard rule. For example, TI changes the timers significantly in its CC series chips. I noticed that in this chip the registers that normally configure the standard 8051 timers seem to be written close together, and interrupt handler #1 seems to touch them as well. Could it be? Standard timers? I tried it and ... it worked. Completely standard, apparently exactly as in the original specification. I checked the CLKEN register and found that bit 0 (mask 0x01) must be set for the timers to work. I confirmed that the standard IEN0 register also works as expected, and that interrupt numbers 1 and 3 really are the Timer 0 and Timer 1 interrupts! The timers appear to run at exactly 1/12th of 16 MHz, just as you would expect from a standard 8051 running at 16 MHz. So far I have not found a way to change this frequency. This gives us the registers TL0, TH0, TL1, TH1, TMOD and TCON! We now have working precision timers!
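
With the timer clock known, reload values are simple arithmetic. The sketch below assumes the standard 8051 mode 1 (16-bit timer that counts up and overflows at 0x10000) and the 16 MHz / 12 tick rate measured above, which works out to 4/3 ticks per microsecond. The function name is mine; it only computes the TH / TL pair and does not touch hardware.

```c
#include <assert.h>
#include <stdint.h>

/* 16 MHz / 12 = 4/3 timer ticks per microsecond. */
#define TIMER_TICKS_NUM 4u
#define TIMER_TICKS_DEN 3u
#define TIMER_MAX_US    49152u   /* 65536 ticks * 3 / 4 */

/* Compute the TH / TL reload values for a delay of us microseconds
 * in 16-bit mode. Returns 0 on success, -1 if the delay does not fit. */
int timer_reload_for_us(uint32_t us, uint8_t *th, uint8_t *tl)
{
    uint32_t ticks, reload;

    assert(th && tl);
    if (us == 0 || us > TIMER_MAX_US)
        return -1;
    ticks = us * TIMER_TICKS_NUM / TIMER_TICKS_DEN;   /* rounds down */
    if (ticks == 0)
        ticks = 1;
    reload = 0x10000UL - ticks;                       /* counts up to overflow */
    *th = (uint8_t)((reload >> 8) & 0xFF);
    *tl = (uint8_t)(reload & 0xFF);
    return 0;
}
```

For a 1 ms period this gives 1333 ticks, so TH0 = 0xFA and TL0 = 0xCB.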



I also took the trouble to check whether Timer 2 from the 8052 (the 8051's successor) is implemented. It is not.



Or maybe UART?



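void uartInit(void) {
    // enable the UART clock
    CLKEN |= 0x20;
 
    // hand P0.6 (UTX) and P0.7 (URX) to the UART: TX is an output, RX an input
    P0FUNC |= (1 << 6) | (1 << 7);
    P0DIR &=~ (1 << 6);
    P0DIR |= (1 << 7);
 
    // 115200 baud, enable the block, mark TX as ready
    UARTBRGH = 0x00;
    UARTBRGL = 0x89;
    UARTSTA = 0x12;
}
 
void uartTx(u8 ch) {
    // wait until the TX-ready flag (bit 1) is set, then clear it and send
    while (!UARTSTA_1);
    UARTSTA_1 = 0;
    UARTBUF = ch;
}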

      
      





There were several text strings in the OTA image. They must be used for something, right? Maybe a debug serial port? That would fit nicely with a board that has "UTX" and "URX" test points. The code that used them was a little convoluted, but it looked like it was storing bytes in some kind of buffer. It definitely looked like a standard ring buffer. I looked for where this buffer is read. That turned out to be the handler for interrupt #0. Oooh, interesting. Could it be a UART interrupt handler? The code seemed to check bit #1 of what resembled a status register (register 98), and if it was set, it read a byte from our ring buffer and wrote it to register 99. If another bit (#0) was set in that status register, it read register 99 and put the result into ... another ring buffer. Well, that is pretty damn close to what I would expect from a UART interrupt handler! What do we do next?



Each ring buffer has two pointers, one for reading and one for writing. They have to be initialized before the buffer is used for anything. So if we find where these indices are initialized, we will probably find where the UART is set up, right? It certainly looks that way. In the function that initializes the UART, we see GPIOs P0.6 and P0.7 being switched to function mode, with P0.7 set as an input and P0.6 as an output. Two more registers, 9A and 9B, are written with 0x89 and 0x00 respectively. The register that I take for the status register (register 98) is written with 0x10, and then bits 0 and 1 in it are cleared. Then bit 5 is set in CLKEN, and bit 0 is set in IEN0. That is basically all we need!



So register 99 becomes UARTBUF and register 98 becomes UARTSTA. We know that UARTSTA must be set to 0x10 for this block to work, and we know that bit 1 means the UART has room for a byte in its TX FIFO, and bit 0 means the UART has a byte for us in its RX FIFO. We know that CLKEN bit 5 enables the clock for the UART and that interrupt number 0 is the UART interrupt. That is a treasure trove of information. Knowing this, I was able to write a working UART driver in my own code and send output on the "UTX" pin, which, as we now know, is port 0 pin 6 (P0.6). We also learned that the "URX" test point is connected to P0.7, and that it is the UART's RX line. The UART was sending data at 115,200 bps, 8n1, and was not affected by the CLKSPEED register at all. So what about those two other mysterious registers with their magic values?



I tinkered with the two remaining registers, 9A and 9B. It quickly became clear what they are for: they form a frequency divider. I tried a few values to see how they affect the baud rate. It turned out to be simple. 9A (from here on UARTBRGL) is the low byte, and 9B (from here on UARTBRGH) is the high byte (whose upper 4 bits are apparently ignored). The baud rate is simply 16 MHz / (UARTBRGH:UARTBRGL + 1). This neatly explains the values that seemed magic: they correspond to 115,200 baud.
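
The divider arithmetic is easy to check on a PC. This sketch uses invented function names and treats the divider as 12 bits wide, in line with the observation that the upper 4 bits of UARTBRGH are ignored. The divider is computed with truncating division, which is my guess at how the stock value was derived, because it reproduces the 0x0089 found in the firmware (rounding to the nearest divider would give 0x008A instead).

```c
#include <assert.h>
#include <stdint.h>

#define UART_CLOCK_HZ 16000000UL
#define UART_BRG_MAX  0x0FFFu    /* upper 4 bits of UARTBRGH are ignored */

/* Baud rate produced by a divider: 16 MHz / (UARTBRGH:UARTBRGL + 1). */
uint32_t uart_baud_from_brg(uint8_t brgh, uint8_t brgl)
{
    uint32_t div = (((uint32_t)brgh << 8) | brgl) & UART_BRG_MAX;

    return UART_CLOCK_HZ / (div + 1);
}

/* Divider for a requested baud rate, using truncating division.
 * Returns 0 on success, -1 if the rate cannot be represented. */
int uart_brg_for_baud(uint32_t baud, uint8_t *brgh, uint8_t *brgl)
{
    uint32_t div;

    assert(brgh && brgl);
    if (baud == 0 || baud > UART_CLOCK_HZ)
        return -1;
    div = UART_CLOCK_HZ / baud - 1;
    if (div > UART_BRG_MAX)
        return -1;
    *brgh = (uint8_t)(div >> 8);
    *brgl = (uint8_t)(div & 0xFF);
    return 0;
}
```

The stock divider 0x0089 yields about 115,942 baud, which is within 1% of 115,200 and well inside what a UART receiver tolerates.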



There seems to be a small quirk: the status bits can be cleared by software without affecting the FIFO, so if you accidentally clear the bit that means "there is free space in the TX FIFO" (UARTSTA.1), the interrupt will never fire and the bit will stay low.



Curiously, these locations match the standard 8051 addresses of SCON and SBUF, the 8051 serial port registers. Bits 0, 1, and 2 of UARTSTA do fit the description of SCON in the 8051, but that is where the similarity ends. The 8051 UART needs bits 7 and 6 of SCON set to 0 and 1 to act as a normal UART. This chip needs 0 and 0 instead. Moreover, the 8051 UART normally has no baud rate divider of its own; timer 1 is used instead.



Watchdog timer and "look!"



By this point, the 1-second execution limit imposed by the default watchdog configuration was starting to annoy me. I decided to find out where and how the watchdog is configured. Typically the watchdog is configured in a small function of its own. Not always, of course, but most often. I had several candidates, and I tried copying the register writes from each of them into my test program in turn, but the watchdog would not budge. The chip kept resetting every second like clockwork.



While doing this, I noticed a very strange function. It read the register at FF, wrote something there, then zeroed P1DIR, wrote to some other register, and then restored the original value of register FF. The weird part was that it set ALL pins of port 1 to output. That makes no sense. Elsewhere, port 1 has several pins configured as inputs. Besides, registers like this are usually manipulated bit by bit, using the anl (logical AND) and orl (logical OR) instructions. Such a crude write to the whole register at once looked wrong. And what is it about register FF that needs to be backed up and restored? It looked very strange!



I decided to investigate. When I dumped the value of register FF to the console, it turned out to be zero, which did not tell me much. I searched the entire firmware and noticed that almost everywhere this register is written, it is backed up first and the original value is restored afterwards. I also noticed that the value written is almost always 0x04 and occasionally 0x00. The register was only ever read to back it up for later restoration; nothing else was ever done with the value. What kind of functionality does this suggest? This is how memory banking controls usually work! When you have more stuff than fits in your address space, you have to switch banks. This access pattern (back up, change, then restore) is typical of such situations. But what could they be banking? Could it be? Are these madmen banking the SFR space itself?!



I wrote a program that printed the values of all the SFRs, all 128 of them. Then it set bit 0x04 in SFR FF and dumped the whole SFR space again. Then the program cleared the bit again and printed all the values once more. Good God! That's it! Bit 2 in register FF really does bank the SFR space. I clearly saw the values change when this bit was set. It apparently did not affect ALL SFR addresses, but many of them. I named this register CFGPAGE.



Now that I had CFGPAGE figured out, I returned to my mysterious function that zeroed P1DIR. Knowing now that it does NOT zero P1DIR but its strange cousin on another SFR page, I tried copying this code into my program. Believe it or not, I had accidentally stumbled upon the code that disables the WDT!!!



I examined the code around this function, since related functions usually sit next to each other in binaries. There were indeed several functions nearby that also accessed CFGPAGE and addresses adjacent to P1DIR. After a few hours of trial and error, I fully understood how the watchdog works. On configuration page 4, address BF appears to control watchdog enable and reset; the most significant bit of this register enables or disables the watchdog's ability to reset the chip. I named it WDTCONF. Address BA (which is P1DIR on configuration page 0) is the watchdog enable register. Bit 0 there enables or disables the watchdog timer itself. I named it WDTENA.



At this point I still had to figure out how to pet the watchdog. It took a while, but in the end I got it. Writing zero to register BB (now named WDTPET) pets the watchdog timer. It took me a few more minutes to figure out how to configure the watchdog delay, since there was clearly a hole in the address space between BB and BF. The counter is 24 bits long and is reloaded when the watchdog is petted. It cannot be read. The reload value is stored in WDTRSTVALH:WDTRSTVALM:WDTRSTVALL, located at BE, BD and BC respectively, on configuration page 4. The counter counts UP at about 62 kHz, and the watchdog fires on overflow. Thus, to get a longer delay, a smaller value must be written to these reload registers.
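
The relationship between the desired delay and the reload value can be sketched as follows. This is only an illustration in portable C: the names wdtReloadForMs and wdtReloadSplit are made up here, the 62 kHz figure is the approximate rate quoted above, and the assumption that a reload of N overflows after 2^24 - N ticks (with no off-by-one) has not been verified on hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed figures: ~62 kHz count rate (from the text), 24-bit up-counter. */
#define WDT_TICKS_PER_SEC 62000UL
#define WDT_COUNTER_SPAN  0x1000000UL   /* 2^24 counts from zero to overflow */

/* Reload value for a requested timeout; longer timeouts give SMALLER values. */
uint32_t wdtReloadForMs(uint32_t ms)
{
    uint64_t ticks = ((uint64_t)ms * WDT_TICKS_PER_SEC) / 1000u;

    if (ticks < 1)
        ticks = 1;                      /* shortest representable timeout */
    if (ticks > WDT_COUNTER_SPAN)
        ticks = WDT_COUNTER_SPAN;       /* longest: count all the way from zero */
    return (uint32_t)(WDT_COUNTER_SPAN - ticks);
}

/* Split the 24-bit value into bytes for WDTRSTVALH:WDTRSTVALM:WDTRSTVALL. */
void wdtReloadSplit(uint32_t reload, uint8_t *h, uint8_t *m, uint8_t *l)
{
    assert(reload < WDT_COUNTER_SPAN);
    *h = (uint8_t)(reload >> 16);
    *m = (uint8_t)(reload >> 8);
    *l = (uint8_t)reload;
}
```

With these assumptions the longest delay comes out at roughly 2^24 / 62000, or about 270 seconds, which corresponds to a reload value of zero.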



More subtle possibilities



Flash memory programming



// expects irqs to be off
void flashDo(void) {
    TRIGGER |= 8;
    while (!(TCON2 & 0x08));
    
    TCON2 &=~ 0x48;
    SETTINGS &=~ 0x10;
}
 
// the DMA source must live in XDATA
void flashWrite(u8 pgNo, u16 ofst,
              const void __xdata *src, u16 len) {
    u8 cfgPg, speed;
    
    speed = CLKSPEED;
    CLKSPEED = 0x21;
    cfgPg = CFGPAGE;
    CFGPAGE = 4;
    
    SETTINGS = 0x18;
    FWRTHREE = 3;
    FPGNO = pgNo;
    FWRDSTL = ofst;
    FWRDSTH = ofst >> 8;
    FWRLENL = len - 1;
    FWRLENH = (len - 1) >> 8;
    FWRSRCL = (u8)src;
    FWRSRCH = ((u16)src) >> 8;
    flashDo();
    
    CFGPAGE = cfgPg;
    CLKSPEED = speed;
}
void flashRead(u8 pgNo, u16 ofst,
    void __xdata *dst, u16 len) {
    u8 cfgPg, speed;
    
    speed = CLKSPEED;
    CLKSPEED = 0x21;
    cfgPg = CFGPAGE;
    CFGPAGE = 4;
    
    SETTINGS = 0x8;
    FWRTHREE = 3;
    FPGNO = pgNo;
    FWRDSTL = (u8)dst;
    FWRDSTH = ((u16)dst) >> 8;
    FWRSRCL = ofst;
    FWRSRCH = ofst >> 8;
    FWRLENL = len - 1;
    FWRLENH = (len - 1) >> 8;
    flashDo();
    
    CFGPAGE = cfgPg;
    CLKSPEED = speed;
}
void flashErase(u8 pgNo) {
    u8 __xdata dummy = 0xff;
    u8 cfgPg, speed;
    
    speed = CLKSPEED;
    CLKSPEED = 0x21;
    cfgPg = CFGPAGE;
    CFGPAGE = 4;
    
    SETTINGS |= 0x38;
    FWRTHREE = 3;
    FPGNO = pgNo;
    FWRDSTL = 0;
    FWRDSTH = 0;
    FWRLENL = 0;
    FWRLENH = 0;
    FWRSRCL = (u8)&dummy;
    FWRSRCH = ((u16)&dummy) >> 8;
    flashDo();
    
    CFGPAGE = cfgPg;
    CLKSPEED = speed;
}

      
      





I focused on the OTA image, as it is smaller than the main firmware. One thing the OTA image definitely needs is the ability to write to flash. What would that look like? Presumably there is some function that erases flash, since flash is erased in blocks. There also has to be a write function that can write a page of data or less. And there has to be some verification of the written data. The only detail that varies between implementations is how the data to be written is fed to the flash controller. I did not know what that would look like, but the rest was easy enough to find. Verification would probably boil down to a call to memcmp or a loop. Erasing wears out flash, so a page is usually checked first and erased only if it actually needs it.



Looking for the pre-erase check, I quickly found a function that fills a 0x400-byte area of XRAM with 0xFF bytes. A region of CODE memory is then compared with this buffer, and if they differ, interrupts are disabled and some SFRs on configuration page 4 are touched. The flash page size is clearly 1024 bytes. Checking where else the same SFRs are touched, we find the rest of the flash code. From context it is clear what these registers do and how. The interesting part is how data is fed to the flash controller. The controller clearly contains a DMA unit: it is given an XDATA address and sucks the data in directly from there. How cool!
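
The pre-erase check can be sketched in a few lines. Note that this is a variation, not the factory approach: the factory code builds a 0x400-byte buffer of 0xFF and compares against it, while this sketch simply scans the page. The name flashPageNeedsErase is hypothetical, and on the real chip the pointer would refer to CODE memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_SIZE 0x400u   /* 1024-byte pages, as deduced above */

/* Returns 1 if any byte differs from the erased state (0xFF), else 0. */
int flashPageNeedsErase(const uint8_t *page, size_t len)
{
    size_t i;

    assert(page != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (page[i] != 0xFFu)
            return 1;            /* at least one programmed byte */
    }
    return 0;                    /* page is already blank, skip the erase */
}
```

Skipping the erase for pages that are already blank saves both time and flash wear.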



At that point I still was not sure how to read INFOBLOCK. The OTA code apparently never touches it, but it MUST be read somewhere, since there is data in it. I checked the main image and noticed a piece of code touching the same flash SFRs, but in a different way. After some more analysis, I was able to reproduce a correct read of INFOBLOCK. Curiously, the same method can be used to read any other flash page, but there is no need to, since all it takes to read flash is to read CODE memory. INFOBLOCK is only accessible via the flash controller. For both flash writes and flash reads, the controller uses direct memory access (DMA) to or from XDATA.



One register, DF (FWRTHREE), defied all attempts to explain it. It is always written with the value 0x03; I do not know why. My flash access code does the same. Register D8 (FPGNO) is written with the flash page number. The main flash pages are numbered 0 to 63, and INFOBLOCK is number 128. DA:D9 (FWRSRCH:FWRSRCL) is the source for the DMA unit in the flash controller. For flash writes it holds the XDATA address where the data to write is found. For flash reads it holds the byte offset within the source page at which reading starts. DC:DB (FWRDSTH:FWRDSTL) is the destination for the DMA unit in the flash controller. For flash writes it holds the byte offset within the destination page at which writing starts. For flash reads it holds the XDATA address at which the data read is stored. DE:DD (FWRLENH:FWRLENL) is the length of the data the DMA unit should transfer, minus one.



The flash operation itself is triggered by setting a bit in yet another SFR. Various other bits in it are set by other code that is apparently unrelated to flash, so I concluded that this register probably kicks off various actions. I named this register, D7 on configuration page 4, TRIGGER. Completion status is checked in a register that also appears to be shared with other code. I named this register, CF on configuration page 4, TCON2. Why not? There was also a register at C7, likewise shared with other code, which apparently configures which operation to perform. I named it SETTINGS. 0x30 was ORed into it for erase + write, 0x18 was written for a flash write, and 0x08 for a flash read. My guess is that bit 0x08 means "data transfer wanted", 0x10 means "to flash", and 0x20 means "erase". That is logical given the values we see and the operations they go with.



Reading and writing flash worked wonderfully, but erasing apparently did not. Instead of erasing the requested page, for some reason it always erased the page containing the code that requested the erase. Obviously the factory code on the device did not have this problem, so I was doing something wrong. I checked, checked, and checked again that my code matched the factory code. It did. So what was wrong? I struggled for several days until I realized that the factory code runs at 4MHz, and mine at 16MHz. Could that be it? It turned out to be exactly that! I changed my flash erase code to save the current clock divider and slow the clock down to 4MHz for the duration of the erase. That is fine, since this code already runs with interrupts disabled.



Another subtlety of this flash controller is that it apparently has no plain "erase" operation. Going by my guesses about the bits in the SETTINGS register, it seemed logical that setting it to 0x20 or 0x30 would perform a plain erase. Apparently not: the only way to erase is to perform an erase + write operation, which writes at least one byte (since there is no way to represent a zero length in FWRLENH:FWRLENL). To do a plain erase, I simply ask it to write a single 0xFF byte. It works.
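
To keep the swapped roles of the source and destination registers straight, here is a small portable sketch that only computes the register values for each kind of operation, without touching hardware. The struct and function names (flash_op, flashOpWrite and so on) are invented for this illustration, and the SETTINGS bit meanings are the guesses described above, not documented facts.

```c
#include <assert.h>
#include <stdint.h>

/* SETTINGS bits, named after the guesses in the text. */
#define FLASH_XFER     0x08u  /* "data transfer wanted" */
#define FLASH_TO_FLASH 0x10u  /* direction: towards flash */
#define FLASH_ERASE    0x20u  /* erase the page first */

typedef struct {
    uint8_t  settings;   /* value for SETTINGS */
    uint8_t  pgNo;       /* value for FPGNO (0..63, or 128 for INFOBLOCK) */
    uint16_t src;        /* FWRSRCH:FWRSRCL */
    uint16_t dst;        /* FWRDSTH:FWRDSTL */
    uint16_t lenMinus1;  /* FWRLENH:FWRLENL */
} flash_op;

/* Write: source is an XDATA address, destination is an offset in the page. */
flash_op flashOpWrite(uint8_t pgNo, uint16_t ofst, uint16_t xdataSrc, uint16_t len)
{
    flash_op op;

    assert(len >= 1);    /* a zero length cannot be encoded */
    op.settings = (uint8_t)(FLASH_XFER | FLASH_TO_FLASH);
    op.pgNo = pgNo;
    op.src = xdataSrc;
    op.dst = ofst;
    op.lenMinus1 = (uint16_t)(len - 1u);
    return op;
}

/* Read: source is an offset in the page, destination is an XDATA address. */
flash_op flashOpRead(uint8_t pgNo, uint16_t ofst, uint16_t xdataDst, uint16_t len)
{
    flash_op op;

    assert(len >= 1);
    op.settings = (uint8_t)FLASH_XFER;
    op.pgNo = pgNo;
    op.src = ofst;
    op.dst = xdataDst;
    op.lenMinus1 = (uint16_t)(len - 1u);
    return op;
}

/* Erase: an erase + write of one 0xFF byte located at xdataDummyFF. */
flash_op flashOpErase(uint8_t pgNo, uint16_t xdataDummyFF)
{
    flash_op op = flashOpWrite(pgNo, 0, xdataDummyFF, 1);

    op.settings |= (uint8_t)FLASH_ERASE;
    return op;
}
```

The erase case ends up with SETTINGS bits 0x38 and a length field of zero, which matches what the flashErase routine above programs.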



SPI



Basically, all SPI drivers are alike. A byte goes in, a byte comes out. Sure, some use DMA and some are interrupt driven, but 99% of them on small systems are software driven, and somewhere in there is a simple function: u8 spiByte(u8 byte);



It was logical to look into SPI next. Since we know that the SSD1623L2 communicates over SPI, and we also know the details of that communication, we just need to look at the code and find which part of it does this. Just like in Sudoku, given how much we already know, the search will not be hard. Looking at the SSD1623L2 datasheet, we see that the first byte sent carries the register number in bits 1..6 and the "write" bit in bit 7. All registers are 24 bits long. A logical programmer would write code that takes the register number as a parameter, shifts it left by one, perhaps ORs in 0x80 if a write is requested, and then transfers three bytes. Not all programmers act logically, but assuming that they do helps immeasurably in reverse engineering. Looking at the code, it is easy to spot functions that look like they do just that. Some add 0x80, some don't. They all call the same mysterious function for every byte. So we can assume that some write to the display and some read from it. Let's tackle the mysterious function itself.
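
The command byte that such a "logical programmer" would build can be sketched like this. The function names are hypothetical, and the byte order of the 24-bit value (most significant byte first) is an assumption made for the sake of the example; the text above does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* First byte on the wire: register number in bits 1..6, write flag in bit 7. */
uint8_t ssdCmdByte(uint8_t reg, int isWrite)
{
    assert(reg < 64);    /* only six bits are available for the register number */
    return (uint8_t)((reg << 1) | (isWrite ? 0x80u : 0x00u));
}

/* Command byte plus a 24-bit register value (MSB first: an assumption). */
void ssdWriteFrame(uint8_t reg, uint32_t val24, uint8_t out[4])
{
    out[0] = ssdCmdByte(reg, 1);
    out[1] = (uint8_t)(val24 >> 16);
    out[2] = (uint8_t)(val24 >> 8);
    out[3] = (uint8_t)val24;
}
```

Each byte of such a frame would then be pushed through the per-byte SPI function.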



It is actually very simple. It switches CFGPAGE to 4, writes the value 0x81 to register ED, writes the byte to be sent to EE, writes 0xA0 to EC, delays for 12 microseconds, sets bit 3 in EB, reads the received byte from EF, and stores 0x80 to ED. That's all. How do we make sense of all this? As before, by relying on what is already known.



0x80 and 0x81 differ by only one bit, which we set before starting the SPI operation and clear when done, so it is apparently an "enable" bit of some kind. The value 0xA0, on the other hand, looks like a configuration of some kind. Register EB is still a mystery. But if I reproduce this code without the write to it, everything still works, so I conclude that not much depends on this register. EE is definitely SPITX and EF is SPIRX. I called ED SPIENA and EC SPICFG.



It remains to characterize what the bits in SPICFG do. I did a bit of trial and error, armed with a logic analyzer. Bit 7 must be set, bit 6 must be cleared. Bit 5 starts the transmission of an SPI byte and clears itself when the byte is done. Bits 3 and 4 set the clock frequency; the choices are 500KHz, 1MHz, 2MHz and 4MHz. Bit 2 is the standard SPI CPHA configuration bit, and bit 1 is CPOL. Bit 0 appears to break RX. I assume it configures the unit for half duplex (on the MOSI line). All in all, not that hard.
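
The SPICFG layout can be captured in a small helper. This is a sketch with invented macro and function names; in particular, the text lists the four clock choices but not which 2-bit code selects which frequency, so clkSel is passed through as a raw code rather than a frequency.

```c
#include <assert.h>
#include <stdint.h>

#define SPICFG_MUST_SET 0x80u  /* bit 7: must be 1 (bit 6 must stay 0) */
#define SPICFG_START    0x20u  /* bit 5: starts a byte, clears itself when done */
#define SPICFG_CPHA     0x04u  /* bit 2 */
#define SPICFG_CPOL     0x02u  /* bit 1 */
#define SPICFG_RX_OFF   0x01u  /* bit 0: appears to break RX */

/* Build a SPICFG value; clkSel is the raw 2-bit code placed in bits 4..3. */
uint8_t spiCfgMake(uint8_t clkSel, int cpha, int cpol)
{
    assert(clkSel < 4);
    return (uint8_t)(SPICFG_MUST_SET | ((unsigned)clkSel << 3) |
                     (cpha ? SPICFG_CPHA : 0u) | (cpol ? SPICFG_CPOL : 0u));
}
```

With clkSel, CPHA and CPOL all zero, ORing in SPICFG_START gives 0xA0, the value the factory code writes.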



Going pin by pin, we quickly find the GPIO configuration and see that P0.0 is SCLK, P0.1 is MOSI and P0.2 is MISO. Looking at where these GPIOs are configured, we also see which CLKEN bit SPI needs: it is bit 3. Great, we now have a working SPI!



Determine the temperature 



volatile u8 __xdata mTempRet[2];
 
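void TEMP_ISR(void) __interrupt (10)
{
  u8 i;
  
  // read the raw sample on config page 4, then restore the caller's page
  i = CFGPAGE;
  CFGPAGE = 4;
  mTempRet[0] = TEMPRETH;
  mTempRet[1] = TEMPRETL;
  CFGPAGE = i;
  // disable our own irq; tempGet() polls this bit as its "done" flag
  IEN1 &=~ 0x10;
}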
 
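int16_t tempGet(void)
{
  u16 sum = 0;
  u8 i;
  
  // turn the temperature sensor on
  CLKEN |= 0x80;
  
  // one-time setup on config page 4
  i = CFGPAGE;
  CFGPAGE = 4;
  TEMPCFG = 0x81;
  TEMPCAL2 = 0x22;
  TEMPCAL1 = 0x55;
  TEMPCAL4 = 0;
  TEMPCAL3 = 0;
  TEMPCAL6 = 3;
  TEMPCAL5 = 0xff;
  TEMPCFG &=~ 0x08;
  CFGPAGE = i;
  IEN1 &=~ 0x10;
  
  for (i = 0; i < 9; i++) {
    
    // enable the temperature interrupt (irq 10)
    IEN1 |= 0x10;
  
    // wait for TEMP_ISR to clear the bit again
    while (IEN1 & 0x10);
    
    if (i) {  // discard the first sample
      
      // 10-bit value: bit-reversed high byte, then two bits from the low byte
      sum += u8Bitswap(mTempRet[0]) << 2;
      if (mTempRet[1] & 1)
        sum += 2;
      if (mTempRet[1] & 2)
        sum += 1;
    }
    
    timerDelay(TICKS_PER_S / 1000);
  }
  // turn the temperature sensor off
  CLKEN &=~ 0x80;
  
  // average of the 8 samples that were kept
  return sum / 8;
}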

      
      





E-ink displays update differently depending on the current temperature, so knowing the ambient temperature is critical to updating them correctly. The right waveforms are selected based on the temperature. This is where outside knowledge comes in handy. If we can find where the waveforms are loaded into the display controller, we can find where the choice is made. From there we can walk straight back to the point where the temperature is measured, right? Doing so leads to exactly one function, whose output determines which waveform is used. This must be it! By the way, temperature sensors are usually hooked up to an ADC; almost nobody builds them as a standalone unit. But that does not matter [yet].



It all starts with setting bit 7 in CLKEN and ends with clearing it, so at the very least we know that this is how the temperature sensor (or ADC) is switched on and off. The function switches CFGPAGE to 4 and then writes a series of values to a series of registers. All the values are constants: 0x81 -> reg. F7, 0x22 -> reg. E7, 0x55 -> reg. E6, 0x00 -> reg. FC, 0x00 -> reg. FB, 0x03 -> reg. FE, 0xFF -> reg. FD, and then bit 0x08 is cleared in F7. After that, CFGPAGE is restored and bit 4 in register A1 is cleared. This seems to be the initial setup. Then a certain procedure is performed five times, and the results of all runs except the first are averaged. After that, a lot of math is done on the resulting average, using among other things values from INFOBLOCK, which are probably calibration values. The result is then returned. Let's take a closer look at the details.



The procedure simply set bit 4 in register A1, set a global bit, and then busy-waited until the bit was cleared. The values being averaged are apparently taken from some global. This is weird... I looked for where it is written and found it in the handler for interrupt #10. That handler cleared bit 4 in register A1, switched to configuration page 4, read the values from registers F8 and F9, did some strange things to them, and then wrote this global. But what is done to these values?



The constants 0x55, 0xAA, 0xCC and 0x33 jumped out at me. Is this possible? Could someone be so dense that... well, yes. These are the constants for a clever way to reverse the order of the bits in a byte. Clever, but only on more advanced processors; on the 8051 this approach is very inefficient. But why do it at all? It seems that whatever IP core they licensed to measure temperature produces a result with its bits in reverse order. Why this has to be fixed in software on a custom chip is a big question. After all, reversing bit order in hardware is no harder than reordering a few wires... Why was it done this way? I do not know. I never did figure it out.



Almost nobody designs a dedicated IP block for a temperature sensor; the thing is simply hooked up to an ADC. Once I was able to re-implement this code and confirm that it worked very well, I tried changing all of these registers. Most of them affected the gain of the temperature sensor, and some had no effect. If this were a normal ADC, we would expect some bits to switch it to a different input and produce a completely different value. Unfortunately, that did not happen. It really looked like a plain temperature sensor. This is further supported by the fact that these registers are not touched anywhere else. Weird as hell, but okay...



Since almost all of these registers are written only once, with constant values, and changing them affects the measured value, I decided to simply call them all temperature calibration values. So, meet TEMPCAL1 (reg. E6), TEMPCAL2 (reg. E7), TEMPCAL3 (reg. FB), TEMPCAL4 (reg. FC), TEMPCAL5 (reg. FD) and TEMPCAL6 (reg. FE). I named reg. F7 TEMPCFG, since it is used a number of times and seems to actually manage the loading of the calibration values. The results are produced in TEMPRETH (reg. F8) and TEMPRETL (reg. F9). Results are 10 bits long, aligned to the upper end of a 16-bit result register, with reversed bit order.
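
The bit reversal with the 0x55/0xAA and 0x33/0xCC masks, and the assembly of the 10-bit sample, can be sketched together in portable C. The u8Bitswap below is one common way to write such a reversal and is not claimed to be byte-for-byte what the factory code does; the final nibble swap uses shifts here. tempRawSample is a hypothetical name, and its logic mirrors the tempGet listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of a byte in three swap stages. */
uint8_t u8Bitswap(uint8_t v)
{
    v = (uint8_t)(((v & 0x55u) << 1) | ((v & 0xAAu) >> 1));  /* swap adjacent bits */
    v = (uint8_t)(((v & 0x33u) << 2) | ((v & 0xCCu) >> 2));  /* swap bit pairs */
    v = (uint8_t)((v << 4) | (v >> 4));                      /* swap nibbles */
    return v;
}

/* Combine TEMPRETH and TEMPRETL into a 10-bit sample, as tempGet() does. */
uint16_t tempRawSample(uint8_t reth, uint8_t retl)
{
    uint16_t v = (uint16_t)((uint16_t)u8Bitswap(reth) << 2);  /* bits 9..2 */

    if (retl & 0x01u)
        v |= 2u;                                              /* bit 1 */
    if (retl & 0x02u)
        v |= 1u;                                              /* bit 0 */
    assert(v <= 0x3FFu);                                      /* always 10 bits */
    return v;
}
```

On a real 8051 a table lookup or a rotate-through-carry loop would likely be cheaper than the mask-and-shift version, which is the inefficiency complained about above.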



I also noticed that bit 3 in TEMPCFG is set when a sample has finished. Curiously, the factory code does not check it, relying on the interrupt instead. But it did come in handy for figuring out the purpose of register A1. You see, the classic 8051 is limited to 7 interrupt sources, since the IEN register has 8 bits and bit 7 is reserved for the global interrupt enable. So how do you manage interrupts numbered 7 and up? It is the wild west out there: everyone does whatever they want. But here we have a piece of hardware that triggers interrupt number 10, plus a bit that tells us when it is done. That is ideal for an experiment to find out how interrupts above 7 are enabled and disabled. I just had to tinker with the code until the interrupt went away while the sample still completed. The search did not take long. It had to be A1! I named it IEN1. I am not sure what bit 0 does here, but bits 1 and up enable interrupts number 7 and up. I was able to confirm this later. So that is done: we have documented yet another peripheral, discovering even more oddities along the way...
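
The mapping from interrupt number to IEN1 bit can be written down as a one-liner. This is a sketch with a made-up function name; it assumes the mapping is strictly linear (bit n enables interrupt n + 6), which fits the two data points in this article: interrupt 10 uses mask 0x10, and interrupt 8 uses bit 2.

```c
#include <assert.h>
#include <stdint.h>

/* IEN1 bit mask for interrupts 7..13 (bit 1 = irq 7, ..., bit 7 = irq 13). */
uint8_t ien1MaskForIrq(uint8_t irq)
{
    assert(irq >= 7 && irq <= 13);   /* irqs 0..6 live in the classic IEN */
    return (uint8_t)(1u << (irq - 6u));
}
```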



I2C



At this stage, I opened up a larger e-ink price tag built around the same chip. It was a 2.9-inch model with a graphical e-ink display and NFC! Again, outside knowledge comes in handy here. Most NFC devices will tell you exactly what they are if you ask politely. That is a good thing, as the NFC chip on the board was too small to carry proper markings. After scanning it over NFC and looking up the device ID, we find that it is an NXP NT3H1101 (archived copy here for posterity). The datasheet can be downloaded from that very convenient page, and it immediately becomes clear how communication with this chip should work. Useful information! (All information is useful here.) The only annoyance is that the I2C address of this device is not fixed and can be set to any value; however, there is a default. The ABCs of reverse engineering: in 99.9% of cases, defaults are left unchanged. I bet the default I2C address was not changed either!



Searching the binary for 0x55 is quite easy, as this value is not that common. All the uses, it turns out, occur just before calls to one of two functions. It stands to reason that those are related to I2C. Moreover, in every case bit 4 in CLKEN is set before these calls and cleared afterwards. So now we know that this bit enables I2C. Let's look at what these functions do. Some copy data from a supplied parameter at the very start, some do so at the end. In the middle, they all write some globals, set a global bit, clear bit 4 and set bit 5 in register 95, and wait for the global bit to be cleared. Hmm, this works just like the temperature sensor. Apparently bit 2 in IEN1 enables the interrupt.



Let's see where the interrupt handler that touches these globals lives. Indeed, it is interrupt number 8, as expected. It sets CFGPAGE to 0 and then reads register 91. The three least significant bits are ignored, and the remaining bits are used in a switch-case to decide what to do. This code turned out to be a bit confusing, so I decided to experiment. I hooked the logic analyzer up to the lines going to the NFC chip and quickly found which one is SDA and which is SCL. That was easy, since there is a datasheet for the chip.



It seems that clearing bit 4 in register 95 has no effect, but setting bit 5 causes a START condition on the bus. An interrupt is then triggered. If we do the same with our own handler and look at the 5 most significant bits of register 91 (with the low three bits masked off), we see the value 0x08. The address byte, including the R/W (read/write) bit, is then written to register 94, and bit 3 in register 95 is cleared. It should also be noted that ALL paths through this interrupt handler end with bit 3 in register 95 being cleared. I guess this is the "interrupt pending" bit. I have not figured everything out yet, but we can already name some registers. It seems that all the I2C registers are on configuration page 0.



I am going to call 91 I2CSTATE, because it holds the I2C state and is never read for any other reason. I have never seen its three least significant bits change or be used in any way. 94 I will call I2CBUF, since the data is pumped through it, and 95 will henceforth be named I2CCTL, since something has to be written to it to get things done.



Digging further, we find that once the address byte has been sent, one of four status values can result. If the address byte we sent requested write access, the state will be 0x18 if it was acknowledged (ACK) and 0x20 if not. If the address byte we sent requested read access, the state will be 0x40 if it was acknowledged (ACK) and 0x48 if not. Handling a NAK (unacknowledged byte) is quite straightforward. Setting bit 5 in I2CCTL causes a STOP condition on the bus.



Sending data in write mode is easy. The byte is simply written to I2CBUF. If the byte sent is acknowledged (ACK), the state becomes 0x28, and if not, 0x30. To cause a restart, set bit 4 in I2CCTL. It works. When the RESTART on the bus completes, the state becomes 0x10.



If we want to read, then after sending the restart and the address byte in read mode, as soon as we see state 0x40 we can decide how to respond to the next byte we receive: ACK or NAK. To acknowledge it (ACK), set bit 2 in I2CCTL; to not acknowledge it (NAK), clear that bit. After the handler returns, the byte will be received. When that is done, we will see state 0x50 if the byte was acknowledged and 0x58 if it was not. Either way, the received byte will be in I2CBUF.
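
The state values observed so far fit neatly into an enum, which makes a switch-case handler like the factory one much easier to read. The names below are invented for this sketch; the values are the ones listed in the text, and the address-byte helper uses the usual I2C convention of a 7-bit address followed by the R/W bit.

```c
#include <assert.h>
#include <stdint.h>

/* I2CSTATE values seen on this chip (low three bits masked off). */
enum i2c_state {
    I2C_ST_START      = 0x08,  /* START sent */
    I2C_ST_RESTART    = 0x10,  /* RESTART sent */
    I2C_ST_ADDR_W_ACK = 0x18,  /* address + write, ACKed */
    I2C_ST_ADDR_W_NAK = 0x20,  /* address + write, not ACKed */
    I2C_ST_TX_ACK     = 0x28,  /* data byte sent, ACKed */
    I2C_ST_TX_NAK     = 0x30,  /* data byte sent, not ACKed */
    I2C_ST_ADDR_R_ACK = 0x40,  /* address + read, ACKed */
    I2C_ST_ADDR_R_NAK = 0x48,  /* address + read, not ACKed */
    I2C_ST_RX_ACKED   = 0x50,  /* byte received, we ACKed it */
    I2C_ST_RX_NAKED   = 0x58   /* byte received, we NAKed it */
};

/* Strip the three unused low bits of the raw I2CSTATE register value. */
uint8_t i2cStateDecode(uint8_t raw)
{
    return (uint8_t)(raw & 0xF8u);
}

/* True when the other side refused our address or data byte. */
int i2cPeerNaked(uint8_t raw)
{
    switch (i2cStateDecode(raw)) {
    case I2C_ST_ADDR_W_NAK:
    case I2C_ST_TX_NAK:
    case I2C_ST_ADDR_R_NAK:
        return 1;
    default:
        return 0;
    }
}

/* Address byte for I2CBUF: 7-bit address in bits 7..1, R/W in bit 0. */
uint8_t i2cAddrByte(uint8_t addr7, int isRead)
{
    assert(addr7 < 0x80);
    return (uint8_t)((addr7 << 1) | (isRead ? 1u : 0u));
}
```

Note that 0x58 is not treated as an error: there the NAK was sent by us, typically to mark the last byte of a read.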



After studying the initialization code and tinkering with our copy, we find that bit 7 in I2CCTL controls whether the peripheral triggers interrupts. Apart from that, this register is initialized to 0x43. I assume this is how the unit is configured for master mode. Since I have no sample code for slave mode, I did not investigate it further, but I am sure slave mode is supported. It could be figured out, but I am lazy :).



Register 96 is also written at initialization time and never changes afterwards. This matches nicely the one piece of information we are still missing: how the clock speed is set. After experimenting with this register (now called I2CSPEED), we see that it has a complicated relationship with the clock rate, but after a few dozen attempts I arrived at the following: rate = 16MHz / ((dividerB ? 10 * (1 + dividerB) : 12) << dividerA), where dividerA is the three least significant bits of I2CSPEED and dividerB is the next 4. The most significant bit is apparently unused.
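
The formula is easy to turn into code, and a brute-force search over the 128 meaningful register values is a simple way to pick a setting. Both function names are hypothetical; i2cRateHz is a direct transcription of the formula above (with integer division), and i2cSpeedForRate is just one possible selection strategy.

```c
#include <assert.h>
#include <stdint.h>

/* Bus clock in Hz for a given I2CSPEED value, per the empirical formula. */
uint32_t i2cRateHz(uint8_t i2cspeed)
{
    uint32_t divA = i2cspeed & 0x07u;          /* bits 2..0 */
    uint32_t divB = (i2cspeed >> 3) & 0x0Fu;   /* bits 6..3; bit 7 is ignored */
    uint32_t base = divB ? 10u * (1u + divB) : 12u;

    return 16000000u / (base << divA);
}

/* Fastest setting that does not exceed maxHz, or -1 if there is none. */
int i2cSpeedForRate(uint32_t maxHz)
{
    int best = -1;
    uint32_t bestRate = 0;
    unsigned v;

    assert(maxHz > 0);
    for (v = 0; v < 0x80u; v++) {
        uint32_t r = i2cRateHz((uint8_t)v);

        if (r <= maxHz && r > bestRate) {
            bestRate = r;
            best = (int)v;
        }
    }
    return best;
}
```

For example, dividerB = 3 with dividerA = 2 gives 16MHz / 160, that is 100 kHz, and dividerB = 1 with dividerA = 1 gives 400 kHz.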



The fact that some GPIO setup happens right next to the initialization of the peripheral seems to imply that pins P1.4 and P1.5 are the ones that matter here.



Everything worked, but one mystery remained. When the interrupt for this unit was enabled (in IEN1), bit 2 was also set in register A2. Since IEN1 is at address A1, I suspect it has something to do with interrupts. I still have not figured out exactly what it does, and no code other than the I2C initialization code uses it. I provisionally named it I2CUNKNOWN, although it is more likely to be interrupt related than I2C related. Anyway, my code can now perform I2C transactions as a master!



Pin change detection



The price tag firmware woke up when the tag was scanned with an NFC-enabled device. The onboard NFC chip has a "field detect" pin wired to the main microcontroller. Coincidence? I think not! There must be a way to detect changes on a pin, and it even wakes the chip from (low-power) sleep. In addition, drawing on e-ink takes a while, and the chip should probably sleep during that wait too. The display signals the end of drawing by changing its "BUSY" signal. So we have two cases in which the CPU must detect a change on a pin, and most likely neither is a busy-wait loop. The first case would be hard to find, since I did not yet know exactly where the sleep code was. The second, on the other hand, was very easy to find, in the sense that the code that draws on the screen was easy to find. Again, building on existing knowledge helps. I knew which command means "refresh the screen" on virtually every e-ink display controller in existence. I simply looked for it and saw what happened next. There was a lot of code, and many SFRs were touched. I started experimenting with a few of the ones I saw. I made some educated guesses. All pins should be able to trigger change detection; this is not always the case, but it is a reasonable starting guess. I assumed that whatever configuration registers were involved, they would be consecutive and there would be one set for each of the three ports. I also assumed that a pin change should be able to raise an interrupt, and not just wake the device. The set of configuration registers needed is fairly predictable: for each pin we need ENABLE, STATUS and, most likely, DIRECTION. In addition, registers related to GPIO change detection are likely to sit close to the other GPIO configuration registers.



Based on this, I ran some experiments, since I could easily toggle at least some of the pins (for example, TEST). I also took some time to see how my SFR map was coming along. I had not forgotten to look at registers BC, BD and BE on configuration page 0. A few experiments showed that they control the pull-up of each pin. True, I never saw any configuration that would allow pulling a pin down. I named them PxPULL.



After several experiments, it became clear that there are three registers per port that control pin-change interrupts. PxLVLSEL (A3, A4, A5) selects the desired level (0 = high, 1 = low). PxINTEN (A6, A7, A9) enables pin-change tracking in hardware. PxCHSTA (AA, AB, AC) stores the detection status (bit set = something changed). Further experiments showed that the pin-change interrupt is number 11. It works well, and I even managed to wake the chip from low-power sleep this way (more on this below).



Second DPTR



Registers 84 and 85 mysteriously survive all CFGPAGE switches and retain all 8 bits written to them. In many 8051 variants, this is where the second DPTR register lives. But if so, how do you switch to it? Everyone does it differently. I decided to try. I wrote an assembly program that flips each bit in each register in turn and checks whether a write to the whole DPTR (a special instruction) followed by a read of DPL and DPH (normal SFR accesses) still returns what was written. Predictably, many of these bits cannot be flipped so casually without crashing the program. But after carefully skipping one thing or another, I isolated bit 0 in 92. Well, yes... that is what it does. As on many 8051s, I named this register DPS, for "data pointer select". Registers 84 and 85 I named, of course, DPL1 and DPH1.



Other experiments.



Some experiments showed that the two least significant bits in PCON (standby and sleep) work as expected for an 8051 (although low-power sleep can be configured as well). I also noted that setting bit 4 powers down XRAM. This saves some more energy in sleep mode!



Registers in the range B2..B6 are interesting. They seem to change depending on the instructions being executed at the time. After thinking it through carefully, I realized that B4:B5 is always the current PC!!! Why anyone would need that, I do not know. I named them PCH and PCL. They are read-only. But what about the other registers in this range? B2 and B3 appear to be associated with jumps. On a long jump (such as ljmp, lcall or ret), they seem to hold the jump destination. On short jumps (such as sjmp), B2 seems to hold the offset. Strange things, but useless, so I did not dig any further. I named the remaining registers PERFMONx.



Sleep in energy-saving mode



People are people, and nothing human is alien to them. People love round numbers. People like precision, even when it is not needed. This helps a lot with reverse engineering. For example, how do you react to the constant 0x380000? Not at all? Perhaps. How about 0x36EE80? That one catches the eye. What the hell does it mean? Convert it to decimal and you see: 3,600,000. Well, that is one hour expressed in milliseconds. Such a value is probably only useful for a long low-power sleep. I have lost count of how many things I have reverse engineered by relying on constants like this to show me where sleep is implemented!



Here are the constants passed to the function of interest on this device: 1 5000 2 000 5 000 10 000 3 600 000 1 800 000 0xffffffff. Clearly these are durations in milliseconds. The last one is probably a placeholder for "forever, or nearly so".



There was almost no chance of understanding what most of the registers here do, since they are used almost exclusively by the sleep code. Some were in SFR space and some were in MMIO space. I was able to copy the code and replicate it. I was particularly interested to find that the sleep timer can run at two speeds: 32KHz and 1Hz. It is a 24-bit timer, so the shortest possible sleep is about 30 µs (one tick at 32KHz), and the longest is about 194 days! Read more in the SDK.
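
The choice between the two timer speeds for a requested duration can be sketched as below. The type and function names are invented, "32KHz" is assumed to mean 32768 Hz, and the largest usable tick count is assumed to be 0xFFFFFF; none of this is taken from the factory code. It only shows the arithmetic: about 512 seconds fit in 24 bits at the fast rate, and 2^24 seconds is roughly 194 days at the slow one.

```c
#include <assert.h>
#include <stdint.h>

#define SLEEP_FAST_HZ   32768u      /* assumption: "32KHz" is a 32.768 kHz clock */
#define SLEEP_MAX_TICKS 0xFFFFFFu   /* 24-bit timer */

typedef struct {
    uint8_t  use1Hz;   /* 0: count at the fast rate, 1: count at 1 Hz */
    uint32_t ticks;    /* number of timer ticks to sleep */
} sleep_cfg;

/* Pick the fast timer when the duration fits, otherwise fall back to 1 Hz. */
sleep_cfg sleepCfgForMs(uint32_t ms)
{
    sleep_cfg cfg;
    uint64_t fast = ((uint64_t)ms * SLEEP_FAST_HZ + 999u) / 1000u;  /* round up */

    if (fast <= SLEEP_MAX_TICKS) {
        cfg.use1Hz = 0;
        cfg.ticks = fast ? (uint32_t)fast : 1u;
    } else {
        /* A 32-bit millisecond count is at most ~4.3 million seconds,
           which always fits in 24 bits, so no clamping is needed here. */
        cfg.use1Hz = 1;
        cfg.ticks = (uint32_t)(((uint64_t)ms + 999u) / 1000u);
    }
    assert(cfg.ticks >= 1 && cfg.ticks <= SLEEP_MAX_TICKS);
    return cfg;
}
```

With this scheme the one-hour constant 3 600 000 ends up as 3600 ticks of the 1Hz timer.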



Radio



Radios usually need a lot of configuration, so they would be far too cramped in the tight SFR space. Most 8051s with radios solve this problem with MMIO. Memory-mapped I/O on the 8051 is usually just mapped into the XRAM address space. Skimming through the code, I realized that the radio on this chip lives at MMIO:df00 to MMIO:dfff.



RX path



Again, I decided to start with the OTA image. It is small enough to make analysis easy. It soon became clear that the OTA image never sends any radio packets, it only receives them (acknowledgments are sent automatically by the hardware, which is typical of most ZigBee chips). That is good news! It means we only need to analyze half of the driver, so the job is half as hard!



When I started looking for where the OTA code gets its data, it looked like there was a queue of buffers. Specifically, it is a queue of single bytes, each of which points into a list of buffers. The code that appeared to process received packets took a buffer from one queue, processed it, and then put it on another queue. A very simple scheme: one queue holds buffers full of received data, the other holds empty buffers ready to receive new data. Clear enough.



Looking around a little, we quickly find the other side of the queue traffic: code that takes a buffer from the "empty" queue and enqueues full ones. It is the handler for interrupt #5! The handler itself is quite simple: provided that bit TCON2.2 is set, it writes 0xC6 to MMIO:df48, dequeues a buffer, copies bytes into it and puts it into the other queue. But where does it copy the bytes from? Where does it get the length of the copy? Both come from an XRAM buffer that it never writes to! At this point that was a mystery.



The search did not end there. Interrupt 4 turned out to play a key role. Its handler is even simpler. It tests bit 5 in MMIO:dfad (I'll call it RADIO_IRQ4_pending) and, if it is set, calls a procedure that is not called from anywhere else. This procedure reads SFR FA and checks that the value is less than or equal to 128, then reads MMIO:df98 and checks that this value plus one equals the previous one. If either check fails, it writes 0xC6 to MMIO:df48. Otherwise it selects configuration page 4 and stores the first value read in a global variable, which from then on holds the length. That value minus one is written to D5, and a pointer to the buffer that the data is later copied from is written to D4:D3. Then bit 2 is set in TRIGGER.



Here, again, knowing the context helps. 127 is the maximum length of a valid 802.15.4 packet, and that length includes the 2-byte cyclic redundancy check (CRC) but not the length byte itself. My guess is therefore that FA is the received length (counting the length byte and the CRC). I named it RADIO_GOTLEN. In that case it makes sense that MMIO:df98 (now named RADIO_rxFirstByte) is the first byte received, that is, the length byte. The remaining registers are then clear: D5 is the length for the RX DMA (now called RADIO_RXLEN), and D4:D3 is the RX DMA destination pointer split into two halves (RADIO_RXPTRH and RADIO_RXPTRL respectively).



Then it all fell into place. Interrupt 4 fires as soon as the radio has received a packet into its internal buffer. Bit 5 set in MMIO:dfad (now called RADIO_IRQ4_pending) tells us that this has happened. We do an initial sanity check of the packet (making sure its length is within reasonable limits) and, if all is well, start a DMA from the internal buffer to XRAM. If not, we write 0xC6 to MMIO:df48. Logically this amounts to "flush the RX FIFO", and the register is therefore now called RADIO_command. If the packet was fine and the DMA completes, bit 2 is set in TCON2 and interrupt 5 fires. There we again write "flush RX FIFO" to RADIO_command, which makes sense since the DMA has already pulled the data out. Then the data is copied and the job is done!



In most radios the received CRC is not passed up to the higher layers: it is simply checked, and a single pass/fail status bit is reported. As usual, it pays to assume that things work "as usual", and a check confirmed that they do. Most ZigBee radios report the LQI (Link Quality Indicator) and RSSI (Received Signal Strength Indicator) in those two bytes instead of the CRC. This radio works in much the same way. Almost. The first byte is apparently always 0xD0, but the second does seem to contain the LQI (in the low 7 bits) and the CRC status (in bit 7). In fact, this is functionally very similar to how the Chipcon radios work. Command 0xC6 also means "flush RX FIFO" on Chipcon (now TI) radios! Many other things differ, but the commands DO match, and that helped me find my way around the rest of this radio stack!
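Decoding those two trailing bytes takes only a mask and a shift. The sketch below assumes that bit 7 set means "CRC OK", as on Chipcon parts; the text only says that bit 7 carries the CRC status. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define RX_TRAILER_FIRST_EXPECTED 0xD0u  /* value observed in the first byte */
#define RX_TRAILER_LQI_MASK       0x7Fu  /* low 7 bits of the second byte */
#define RX_TRAILER_CRC_BIT        0x80u  /* bit 7 of the second byte */

/* Hypothetical decoded form of the two bytes that replace the CRC. */
struct rx_trailer {
    uint8_t lqi;
    bool    crc_ok;            /* assumption: 1 means the CRC matched */
    bool    first_byte_usual;  /* true if the first byte was 0xD0 */
};

static struct rx_trailer decode_rx_trailer(uint8_t first, uint8_t second)
{
    struct rx_trailer t;
    t.lqi = second & RX_TRAILER_LQI_MASK;
    t.crc_ok = (second & RX_TRAILER_CRC_BIT) != 0;
    t.first_byte_usual = (first == RX_TRAILER_FIRST_EXPECTED);
    return t;
}
```

A receive path would typically drop the packet when crc_ok is false and keep the LQI for link statistics.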



More about radio



If you look at how the OTA code initializes the radio, you can see that LOTS of registers are touched exactly once, with values that look completely random. Most likely many of them are calibration values. I treat any register that is written only once (or repeatedly, but always with the same value) as a calibration register. I'll skip the boring details of the registers involved; the working initialization code is in the SDK.



Here, again, we see a number of values being written to the RADIO_command register. The values match what we would expect if these were Chipcon commands, although some of them do not exist in Chipcon radios. So either this radio is some odd Chipcon derivative, or the two share a common ancestor. Either way, this helps to make sense of a few more of the commands being issued.



Reproducing the initialization code and writing interrupt handlers like the ones the factory code uses gives us a working binary that can receive and is handy for experiments. Noticing a few more registers that the main firmware writes to, I quickly determined that MMIO:df88 through MMIO:df8f hold "my long MAC address", which the hardware uses to filter incoming packets. Similarly, MMIO:df90 and MMIO:df91 set "my PAN ID" for the RX filter, and MMIO:df92 and MMIO:df93 set "my short address". The hardware will accept and acknowledge (ACK) any packet sent to our addresses or to broadcast. MMIO:dfc0 sets the radio channel in standard 802.15.4 numbering (11..26).
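For reference, the standard numbering maps to carrier frequencies in a fixed way. The formula below comes from the IEEE 802.15.4 2.4 GHz PHY, not from the firmware, and the helper name is made up; the chip itself takes the channel number directly.

```c
#include <stdint.h>

#define CHANNEL_MIN 11u
#define CHANNEL_MAX 26u

/*
 * Center frequency in MHz for a 2.4 GHz 802.15.4 channel:
 * 2405 + 5 * (channel - 11). Returns 0 for an out-of-range channel.
 */
static uint16_t channel_to_mhz(uint8_t channel)
{
    if (channel < CHANNEL_MIN || channel > CHANNEL_MAX)
        return 0;
    return (uint16_t)(2405u + 5u * (channel - CHANNEL_MIN));
}
```

So channel 11 sits at 2405 MHz and channel 26 at 2480 MHz, which is handy when checking a transmission with an SDR or a sniffer.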



Since the radio acknowledges packets, I was also able to observe that the transmit strength changes when MMIO:dfc9 is adjusted. I believe this is the register that sets the TX power. I also noticed that when the main factory firmware sets a channel, it writes per-channel values to two more registers, while the OTA firmware writes only one of them. The RX-related one is MMIO:dfcb, and the TX-related one is MMIO:dffd. Easy enough to reproduce and understand. Now it was time to figure out TX!



Let's send some bytes!



After deciphering the receive path, I carried the function and register names over into my disassembly of the main image. Looking at what is still unlabeled, we can see where the TX path lies. Indeed, there are two more buffer queues here: one holding empty TX buffers ready for use, and the other holding filled TX buffers ready to be sent. I found the transmit function very quickly.



In 802.15.4 it is customary to listen to the radio channel before transmitting. This operation is called CCA (Clear Channel Assessment). Before doing anything with the data to be sent, the function runs a loop that reads MMIO:df98 and checks bit 0. If the bit is set, the function fails and a timer is set up for a retry. I think this is the CCA path. If the bit reads as zero 128 times, the channel is considered free.
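The CCA loop is simple enough to sketch. To keep it runnable on a PC, the register read goes through a function pointer, and a small fake stands in for the hardware; both the accessor type and the fake are inventions for this example, not part of the firmware or the SDK.

```c
#include <stdint.h>
#include <stdbool.h>

#define CCA_BUSY_BIT      0x01u  /* bit 0 of the polled register */
#define CCA_IDLE_SAMPLES  128u   /* consecutive idle reads required */

/* Hypothetical accessor so the loop can run against real or fake hardware. */
typedef uint8_t (*cca_read_fn)(void *ctx);

/* True if the busy bit stayed clear for all 128 samples. */
static bool channel_is_clear(cca_read_fn read_reg, void *ctx)
{
    for (unsigned i = 0; i < CCA_IDLE_SAMPLES; i++) {
        if (read_reg(ctx) & CCA_BUSY_BIT)
            return false;  /* caller is expected to schedule a retry */
    }
    return true;
}

/* Stand-in for the hardware: reports "busy" on one chosen read (0 = never). */
struct fake_cca {
    unsigned reads;
    unsigned busy_on_read;
};

static uint8_t fake_cca_read(void *ctx)
{
    struct fake_cca *f = ctx;
    f->reads++;
    return (f->reads == f->busy_on_read) ? CCA_BUSY_BIT : 0u;
}
```

On the real chip the accessor would simply read the MMIO register; the early return corresponds to the "fail and set a retry timer" branch.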



The transmit function itself turned out to be disappointingly simple: configuration page 4 is selected and the desired length (not including the length byte or the CRC) is written to CD. A pointer to a buffer in XRAM is written to CA:C9. The buffer starts with a length byte. RADIO_command is loaded with the value 0xCB. There is no such command in Chipcon radios, but I guess it means "load TX FIFO". Then bit 1 is set in TRIGGER, which I suppose starts the DMA into the radio's internal TX FIFO. Then MMIO:dfc8 is set to 0xFF, and up to 255 attempts are made to wait for TX to end by checking that bit 7 in MMIO:df9b (now called RADIO_curRfState) is set. Then, after a short delay, MMIO:dfc8 is set to 0x7F. Curiously, I have no idea what the writes to MMIO:dfc8 are for. In my code I tried doing without them and everything worked fine.
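The register sequence that starts a transmission can be modelled the same way as a host-side sketch. The struct and names are hypothetical, and CA is assumed to be the high byte of the pointer and C9 the low byte, by analogy with the RX pointer. The writes to MMIO:dfc8 and the wait on RADIO_curRfState are left out, since the first seems unnecessary and the text does not make the polarity of the second clear.

```c
#include <stdint.h>

#define RADIO_CMD_LOAD_TX  0xCBu      /* meaning guessed: "load TX FIFO" */
#define TX_TRIGGER_DMA     (1u << 1)  /* bit 1 of TRIGGER starts the TX DMA */

/* Stand-in for the real registers, so the sequence can run on a host. */
struct radio_tx_regs {
    uint8_t cfgpage;  /* configuration page select */
    uint8_t txlen;    /* SFR CD */
    uint8_t txptrh;   /* SFR CA (assumed high byte) */
    uint8_t txptrl;   /* SFR C9 (assumed low byte) */
    uint8_t command;  /* MMIO:df48, RADIO_command */
    uint8_t trigger;  /* TRIGGER */
};

/*
 * buf_xram_addr points at a buffer whose first byte is the length byte.
 * payload_len excludes both that length byte and the CRC.
 */
static void radio_tx_start(struct radio_tx_regs *r,
                           uint16_t buf_xram_addr, uint8_t payload_len)
{
    r->cfgpage = 4;
    r->txlen   = payload_len;
    r->txptrh  = (uint8_t)(buf_xram_addr >> 8);
    r->txptrl  = (uint8_t)(buf_xram_addr & 0xFFu);
    r->command = RADIO_CMD_LOAD_TX;
    r->trigger |= TX_TRIGGER_DMA;
}
```

On real hardware a caller would follow this with a bounded wait for the transmission to finish.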



Loose ends



After experimenting a bit, I discovered a few tricks that the factory firmware does not use. Bit 6 in RADIO_IRQ4_pending is set after we have transmitted a packet and the acknowledgment (ACK) wait time has expired. If we actually received an ACK, bit 4 will also be set. So it is easy to tell (1) when the packet has actually been sent and (2) whether an ACK came back. Cool!
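These two bits are enough to classify the outcome of a transmission. A minimal sketch, with an enum and function name invented for the example:

```c
#include <stdint.h>

#define IRQ4_TX_DONE_BIT  (1u << 6)  /* TX finished and ACK wait expired */
#define IRQ4_ACK_RXED_BIT (1u << 4)  /* an ACK actually arrived */

/* Hypothetical classification of a transmission's outcome. */
enum tx_result {
    TX_IN_PROGRESS,
    TX_DONE_NO_ACK,
    TX_DONE_ACKED
};

/* Interpret a value read from RADIO_IRQ4_pending after a transmit. */
static enum tx_result tx_result_from_flags(uint8_t irq4_pending)
{
    if (!(irq4_pending & IRQ4_TX_DONE_BIT))
        return TX_IN_PROGRESS;
    return (irq4_pending & IRQ4_ACK_RXED_BIT) ? TX_DONE_ACKED
                                              : TX_DONE_NO_ACK;
}
```

Bit 4 is only treated as "ACK received" once bit 6 is set, because the description ties the two together.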



Also, if bit 4 in RADIO_IRQ4_pending is set and bit 5 in RADIO_curRfState is not, we are in the process of receiving a packet. The RSSI has to be sampled manually, by reading MMIO:df84 (now RADIO_currentRSSI). It has an offset of about 56 dBm.



I also noticed that bit 1 in TCON2 is set when the TX DMA completes (but not necessarily the transmission itself). Bit 0 in TCON2 is set when radio initialization finishes.



Unsolved mysteries



ADC / Battery Measurement and AES Encryption Engine 



It stands to reason that there should be some way to measure the battery voltage, but I found no trace of any code that does so. Without code that uses the ADC, the chances of figuring out how it works are slim. The AES block is in the same position. I know the chip has an AES acceleration block (ZigBee needs one), but since the actual code never uses it, I see no way to find it.



Miscellanea



Things that we cannot find, but do not really care about since we cannot buy this chip anyway: the IR LED controller, the PWM unit, the DAC. I leave these as an exercise for the reader.



ZBS242 / 3 Pinouts, Features, SFR, Downloads 



Download the ZBS24x SDK.







  • Shaded cells are bit-addressable registers
  • Diagonally hatched cells are registers that are not banked by CFGPAGE
  • Vertically hatched cells are registers that apparently do not appear on any page at all
  • Empty cells are unknown registers
  • Names of RADIO registers begin with the letter "r"






Lessons for a beginner reverse engineer



  • Read the materials for at least a few hours or days before starting work.






