Blackmail Roulette: The Risks of Electronic Shelf Labels for Retail and Critical Infrastructure

Electronic shelf labels (ESL tags) are frequently used in supermarkets as price tags, as they allow for convenient and quick price changes. Thus, their market adoption is predicted to grow exponentially.

Other use cases include hospital storage for medical equipment or drugs, room labels, or organizing court documents in archives. During our research, we analyzed the unknown micro-controller (MCU) of an ESL tag made by SUNY, a common Chinese ESL tag vendor, gained debug access and reverse engineered the proprietary 433 MHz radio-frequency (RF) protocol. As no authentication is used, we were able to update any ESL tag within RF range with arbitrary content. This research was performed in the SEC Consult Vulnerability Lab by our hardware security expert Steffen Robertz, who has presented it at multiple conferences, including Hardwear.io USA 2022, Australian Cybersecurity Conference 2022 and BSides Vienna 2022. Additionally, the talk will be presented at this year’s Thotcon conference in the US.

Introduction

ESL is an abbreviation for "electronic shelf label". An ESL tag usually consists of an E-Ink display, a battery and some form of wireless transceiver (e.g., Bluetooth LE or proprietary RF protocols). These tags are deemed to be the next big thing in retail as they offer convenient price changes and can interact with customers by flashing lights and drawing the customers' attention to specific products. Multiple business studies back up this claim by projecting a market growth of over 600% within the next 10 years. 

 

As the market grows, ESL tags will become a more attractive target for hackers as well. Possible attack scenarios include the obstruction of a retailer, e.g., by deleting all price tags until a ransom is paid. Alternatively, one could also target the customers directly via phishing. Imagine the following scenario: You walk into the grocery store. Underneath your favorite item is a large ESL tag that is blinking and showing you the price as well as a QR code. The text on the tag promises you a 50% discount if you scan the code. Most people would not think twice and just scan it. It is an official price tag of a trusted store after all. They would not know that they just visited a malicious website until it is too late.

We were also able to identify some industrial use cases where a slightly different version of the tag is being used, but the vulnerabilities remain the same. For example, the tags are used in hospital storage units, in electrical power grid spare part warehouses or in court rooms for archiving. This opens up many more interesting attack scenarios, including relabeling of power grid spare parts. If the installing electrician is not paying close attention to other part markings (e.g., max power ratings), wrong parts could be installed and thus destabilize the power grid of the building. Or think about hospital storage units, where a relabeled drug could lead to a potentially lethal dosage of medication. During our research, we even found a picture of a Chinese court room that uses the vulnerable tags to organize the court document storage. By mixing up all the tag contents, an attacker could make it hard to find the right documents for a court hearing, which could prolong cases or even lead to their dismissal.

The full technical security advisory detailing the identified security issues can be found here

Fig. 6: Close-up of an ESL tag produced by Zhuhai Suny Technology

The Setup

For our research, we picked a common ESL tag, produced by Zhuhai Suny Technology, a large Chinese supplier (see Figure 6). The tag contains an E-Ink display and a battery, and it communicates via a proprietary 433 MHz protocol. The store manager can update prices by connecting the base station (Figure 7) via Ethernet to a point-of-sale (POS) system or a Windows machine.

The attacker uses a HackRF Software Defined Radio (SDR) to sniff the communication and to interact with the tags.
As we did not have a POS system, we only used the Windows tool provided with the vendor's software development kit (SDK) (Figure 8). It creates pixel graphics for the different tag sizes and then sends them to the base station via Ethernet. The base station then sends the image to the tag via a proprietary protocol. The tags are identified by a 3-byte hex code, which is written on each tag.

Hardware Analysis

The tag's enclosure was sealed via ultrasonic welding. Thus, we had to use a cutting wheel on a rotary tool to cut open the case and remove the PCB. Most of the PCB was no surprise. It consists of an RF transceiver, an SPI flash memory, an LED and a micro-controller. However, the MCU did not seem to come from a large manufacturer, as it was only labeled with “ETAG-M1”, and no datasheets could be found. The backside of the PCB was empty, apart from four test points that looked like a debug header.

Fig.12: Entropy graph of the flash memory.

We first dumped the contents of the 2 MB flash memory to start the analysis. The entropy graph showed that the contents were not encrypted: the entropy stayed below 1, whereas encrypted data would sit close to 1. Furthermore, we were able to identify the current pixel graphic residing at memory offset 0x1e1000. The beginning of the flash seemed to be consistent across multiple tags and did not change when adjusting the tag’s content to a different graphic. Hence, it looked like firmware code.
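The entropy check can be reproduced with a few lines of Python. The following is a minimal sketch, not the tool used in the research: it computes the Shannon entropy of consecutive blocks and scales it to the range 0 to 1, so that encrypted or compressed regions would score close to 1. The block size of 4096 bytes is an arbitrary assumption.

```python
import math
from collections import Counter


def normalized_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, scaled so that 8 bits per byte equals 1.0."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    bits = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return bits / 8.0


def entropy_profile(data: bytes, block_size: int = 4096) -> list:
    """Entropy of each consecutive block (block_size is an arbitrary choice)."""
    return [
        normalized_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def profile_file(path: str, block_size: int = 4096) -> list:
    """Convenience wrapper: read a flash dump from disk and profile it."""
    with open(path, "rb") as handle:
        return entropy_profile(handle.read(), block_size)
```

Calling `profile_file` on a dump (the file name is up to the reader) returns one value per block. Firmware code and uncompressed bitmaps typically stay clearly below 1, while blocks of random-looking data approach it.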

In order to improve the analysis, we had to determine the MCU’s architecture first. We tried to fingerprint the MCU by reverse engineering its pin-out and comparing it to common MCUs by larger vendors, e.g., NXP, STM and Microchip. Unfortunately, we were not able to find a chip with a matching pin-out for the power and external oscillator pins.

Figure 13: Reverse engineered pin-out of the unknown E-Tag M1 micro controller.

So, we decided to dig one level deeper: We boiled the chip in 95% concentrated sulfuric acid in order to remove the epoxy packaging and then cleaned the die off with isopropanol (Figure 14). This finally revealed that the chip was manufactured by ST Microelectronics (Figure 15) and was most likely designed or manufactured in 2010.

STM mostly deals with ARM-based micro-controllers. However, they also developed their own STM8 architecture. When looking at the four test pads on the back of the PCB, only SWD or SWIM would fit the pin count. SWIM is the proprietary debug interface of the STM8. Looking through the default pin-out of the SWIM debug header, we noted that the test pads were ordered in the same way on the back of the PCB. Thus, the following pin-out can be used to access the MCU:

  • Pin 1: RESET
  • Pin 2: GND
  • Pin 3: DATA
  • Pin 4: VDD

By combining the stm8flash Linux tool with an ST-LinkV2 programmer to connect to the target, we could read out the flash contents starting at 0x8000 (default address of flash memory for STM8 architecture). To our surprise, the flash dump only contained the value 0x71. This indicated that the read-out protection (ROP) was enabled. However, we found a way to derive information about the MCU with a simple trick. As soon as we exceeded the MCU's flash boundary, we no longer received the value 0x71. By simply counting how often we could read that value, we concluded that our unknown MCU had 64kB of flash and 2kB of EEPROM. Furthermore, we were able to read the chip's unique ID from register 0x4926, which indicated an STM8L series device. The chip ID was derived from the lot number, which enabled us to trace the chip and to determine the exact model.
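The counting trick can be sketched as follows. This is an illustration under assumptions, not the script used during the research: it assumes the raw read was saved as a byte string and simply measures how long the run of 0x71 values is before the first different byte. What the chip returns beyond the boundary is not stated above, so the trailing bytes in the example are made up.

```python
ROP_FILL = 0x71  # value returned for every protected byte, as described above


def protected_run_length(dump: bytes, fill: int = ROP_FILL) -> int:
    """Number of consecutive `fill` bytes at the start of `dump`."""
    for offset, value in enumerate(dump):
        if value != fill:
            return offset
    return len(dump)


# Synthetic example: 64 kB of 0x71 followed by made-up bytes past the boundary.
synthetic = bytes([ROP_FILL]) * (64 * 1024) + bytes(16)
flash_size_kb = protected_run_length(synthetic) // 1024
print(f"flash size: {flash_size_kb} kB")
```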

RF Analysis

We started the RF analysis by looking up the tag’s FCC ID. The public FCC database usually contains a lot of useful information, such as the modulation used and the frequency. However, something in the information was off, as the base station and the tag seemed to use different modulations. Therefore, we took all the information with a grain of salt. When sniffing for the signal, we found out that it was not located at 433.92 MHz, but rather at 433.254 MHz. After capturing some samples and analyzing the signal with the inspectrum tool, we observed that GFSK (Gaussian Frequency Shift Keying) modulation is used by the base station as well as the tag.

Proof of concept video: https://www.youtube.com/watch?v=hj_ao25HU1E

RF Replay attack

A replay attack is the simplest RF attack one can perform. A software-defined radio (SDR) is used to capture a valid transmission between the base station and a tag. All recorded samples are then replayed in the same order to check if the same behavior is invoked in the tag. Thus, no protocol information besides the frequency needs to be known.

In the demo video, the store manager applies a price change on the right-hand side of the screen. The attacker (left-hand side) records this transmission. Later, the store manager increases the price. The attacker then replays the captured transmission containing the old price, and the tag shows the lower price again.

 

Source: https://github.com/miek/inspectrum

RF protocol

The tool inspectrum was used to manually extract information from captured RF frames. One key learning was that the base station uses different symbol rates for different frame types. This is part of the reason why automated tools like Universal Radio Hacker were not able to detect the frames properly.
All RF frames start with a preamble and a sync header. These are required in order to sync the receiver's clock to the RF signal. Otherwise, the decoder might drift away from the ideal sampling instant and thus introduce bit shifts. The frame length follows next. However, the length does not account for the CRC16 checksum that follows the packet data. The next field contains the last 3 bytes of the tag ID, which marks the intended target of the packet. The ID is required because RF systems are inherently broadcast systems.
In the beginning, the tag is in sleep mode to conserve power. Thus, the base station will wake the tag up by spamming it with 920 different wake-up frames. Each frame is repeated five times to guarantee its correct reception by the target, even on noisy channels. The base station counts the 920 frames down to 0. This is when the tag fully wakes up and begins to listen for new images. The wake-up frames are transmitted at 175 kBaud.

Table 1: Wake-up frame structure
Preamble       | Sync Header | Frame Len | Tag ID | Fixed | Frame Counter | Fixed | CRC16
AAAAAAAAAAAAAA | D391D391    | 08        | 065302 | 0000  | 0398          | 0A    | 2708
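To make the layout concrete, the wake-up frame from Table 1 can be split into its fields with a short parser. This is a minimal sketch under assumptions: the class and function names are hypothetical, multi-byte values are read as big-endian, and the CRC16 is only extracted, not verified, because the checksum parameters are not given here. Note that the length value 0x08 matches the bytes between the length field and the CRC (3 + 2 + 2 + 1), and that the counter 0x0398 equals 920.

```python
from dataclasses import dataclass

PREAMBLE = bytes.fromhex("AAAAAAAAAAAAAA")
SYNC = bytes.fromhex("D391D391")


@dataclass
class Frame:
    length: int    # value of the Frame Len field
    tag_id: bytes  # 3-byte tag ID
    body: bytes    # remaining bytes covered by the length field
    crc16: bytes   # 2-byte checksum, extracted but not verified


def parse_frame(raw: bytes) -> Frame:
    """Split a raw frame (including preamble and sync header) into fields."""
    header = PREAMBLE + SYNC
    if not raw.startswith(header):
        raise ValueError("preamble or sync header missing")
    start = len(header) + 1          # first byte after the length field
    length = raw[len(header)]
    end = start + length
    if len(raw) < end + 2:
        raise ValueError("frame truncated")
    covered = raw[start:end]
    return Frame(length, covered[:3], covered[3:], raw[end:end + 2])


# Example bytes taken from Table 1.
wake_up = parse_frame(bytes.fromhex(
    "AAAAAAAAAAAAAA" "D391D391" "08" "065302" "0000" "0398" "0A" "2708"
))
# Body layout per Table 1: Fixed (2 bytes), Frame Counter (2 bytes), Fixed (1 byte).
countdown = int.from_bytes(wake_up.body[2:4], "big")
print(wake_up.tag_id.hex(), countdown)
```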

The next frame structure is the regular image carrier frame. In contrast to the wake-up frame, this and all other frames are transmitted at 100 kBaud. Each image carrier frame can hold up to 54 bytes of image data payload. The frame counter field is used to indicate the current frame number and how many frames are being sent in total (e.g., 0x09 0x01 indicates that this is frame number 1 out of 9 total frames). However, this would limit the protocol to a maximum of 256 frames and thus 13824 bytes of total payload (256 × 54). This can be too little for some of the larger tags. Thus, larger tags use a slightly modified version of the protocol. Here, the first image carrier frame uses the full 16-bit field width to indicate the total frame count. All following carrier frames then use the 16 bits to report their own frame number. The counter starts at 0 and counts up towards the total count.

Table 2: Frame structure for image carrier frame
Preamble       | Sync Header | Frame Len | Tag ID | Frame Counter | Fixed | Payload                  | CRC16
AAAAAAAAAAAAAA | D391D391    | 3C        | 065302 | 0901          | 33    | < Image Data Structure > | CRC
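The counter field of the small-tag variant and the resulting payload limit can be illustrated with a few helper functions. This is a sketch with hypothetical names. It only covers the variant in which the first counter byte holds the total and the second byte holds the frame number, as in the 0x09 0x01 example.

```python
MAX_PAYLOAD = 54  # image bytes per carrier frame, as stated in the text


def decode_small_counter(field: bytes) -> tuple:
    """Split the 2-byte counter of a small tag into (total_frames, frame_number)."""
    if len(field) != 2:
        raise ValueError("counter field must be 2 bytes")
    return field[0], field[1]


def split_payload(data: bytes, size: int = MAX_PAYLOAD) -> list:
    """Cut the image data structure into chunks that fit one carrier frame each."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def capacity(max_frames: int, size: int = MAX_PAYLOAD) -> int:
    """Total payload that max_frames carrier frames can transport."""
    return max_frames * size


print(decode_small_counter(bytes.fromhex("0901")))  # frame 1 of 9
print(capacity(256))                                # limit of the small-tag variant
print(len(split_payload(bytes(500))))               # 500 bytes need 10 frames
```

The length field in Table 2 agrees with these numbers: 0x3C is 60, which is the 3-byte tag ID, the 2-byte counter, one fixed byte and 54 bytes of payload.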

As the tags also include some peripherals (e.g., an RGB LED), the image data is not just transmitted as is. Instead, an image data structure is built, which also includes commands for the peripherals. The first 2-byte field contains the color of the LED, e.g., a white LED is indicated with 0x07 0x00. As white is generated by turning on all 3 color components of an LED, one bit is probably used per color channel (0x07 = 0b0111). The next 16-bit field holds a batch code. This code can be any random value. The LED Time is a form of counter that indicates the blink frequency of the LED. The compression header field indicates which E-Ink color channel is to be updated. 0xFC00000000 indicates the black channel, while 0xFC80000000 marks the red channel. The next fields contain the display height and width in pixels. This information is necessary in order to apply the image data correctly. The rest of the structure is filled with the compressed image data.
The image is transmitted as a pixel map. A "1" indicates that a pixel is turned on, while a "0" turns it off. Appending all image rows to each other creates a one-dimensional pixel array. This array is then compressed so that less data has to be sent, which conserves power: low-power RF systems spend most of their energy on RF transmissions, while the sleep mode barely draws any current.

Table 3: Image data structure
LED Color | Batch Code | LED Time / Unknown | Compression Header | Height | Width | Image Data
0700      | BF75       | 00ED 000A          | FC00000000         | 007F   | 0127  | < Compressed Image Data >
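Following the field order given in the text, the example bytes of the image data structure can be decoded as sketched below. Several points are assumptions: the function name is hypothetical, values are read as big-endian, and the four bytes between the batch code and the compression header are kept as one raw blob, because the text names only the LED Time there and does not give field sizes. Which LED bit drives which color is also not stated, so the sketch only reports the raw bit mask.

```python
import struct

# Compression header values named in the text.
CHANNELS = {
    bytes.fromhex("FC00000000"): "black",
    bytes.fromhex("FC80000000"): "red",
}


def parse_image_header(data: bytes) -> dict:
    """Decode the fixed part of the image data structure (assumed layout)."""
    if len(data) < 17:
        raise ValueError("image data structure too short")
    led_color, batch_code = struct.unpack_from(">HH", data, 0)
    compression_header = data[8:13]
    height, width = struct.unpack_from(">HH", data, 13)
    return {
        "led_bits": data[0],             # 0x07 = all three color bits set
        "led_color": led_color,
        "batch_code": batch_code,
        "led_time_raw": data[4:8],       # assumption: LED Time plus an undescribed field
        "channel": CHANNELS.get(compression_header, "unknown"),
        "height": height,
        "width": width,
        "compressed": data[17:],
    }


# Example bytes from the image data structure table, without image data.
header = parse_image_header(bytes.fromhex("0700BF7500ED000AFC00000000007F0127"))
print(header["channel"], header["height"], header["width"])
```

In this example, the height and width fields decode to 127 and 295.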

In order to examine the compression algorithm, we reverse engineered the .NET SDK package (https://www.nuget.org/packages/eTag.SDK/). Using the dnSpy debugger, we located the compression method. It turned out that the tag uses a very basic form of run-length encoding to compress the image data. The algorithm distinguishes four different cases, which are marked within the first byte; a frame then contains either the raw bits or a repetition value if a run longer than 7 pixels is detected.
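The general idea of flattening the pixel map and run-length encoding it can be sketched as follows. This is deliberately a simplified scheme that stores (bit, run length) pairs. It is not the vendor's byte format with its four frame types, and it is only meant to show why long runs of identical pixels compress well.

```python
from itertools import groupby


def flatten(rows):
    """Append all image rows to each other to get a one-dimensional pixel array."""
    return [bit for row in rows for bit in row]


def rle_encode(bits):
    """Collapse runs of identical bits into (bit, run_length) pairs."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def rle_decode(runs):
    """Expand (bit, run_length) pairs back into the pixel array."""
    pixels = []
    for bit, count in runs:
        pixels.extend([bit] * count)
    return pixels


# Tiny made-up 4x3 bitmap.
rows = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
]
runs = rle_encode(flatten(rows))
print(runs)
```

Decoding the runs with `rle_decode` restores the flattened array, which the receiver can then cut back into rows using the transmitted width.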

 

Figure 16: Different reverse engineered frames used for run-length encoding of bits.

This image structure is then split across the image carrier frames and transmitted to the tag.
The tag answers with a frame after correct reception. However, this frame is not required for any interaction and only contains informative values such as the battery voltage, RSSI and temperature.

Table 4: ESL tag response frame
Preamble       | Sync Header | Frame Len | Tag ID | Battery Voltage | RSSI | Temperature
AAAAAAAAAAAAAA | D391De91    | 07        | 065302 | 1D              | 2068 | E9
Fig. 17: The GNURadio sketch used to receive ESL tag messages. Only the RF interface is handled here. The recovered frames are sent via UDP to a python decoding script.

Building a GNURadio Receiver

After analyzing the protocol and determining that there is no encryption, we tried to automatically receive and decode all ESL messages. We used GNURadio together with the HackRF in order to receive the RF samples. On a sidenote: I'm not a GNURadio expert, so there might be a much better way of handling this protocol. However, this worked fairly well.

Here is a step-by-step guide:

  • First, we pipe the complex RF signals via a frequency Xlating FIR filter and a Low Pass filter. This clears up any noise that overlays the signal.
  • Second, the filtered samples are piped through a correlation estimator to tag the beginning of each packet.
  • We then use the quadrature demod block to decode the complex signal to float signals.
  • This float signal is then forwarded into the symbol sync block. This block will perform timing corrections on the preamble and returns a single sample per symbol.
  • Lastly, the float stream is converted into a byte stream and the packet deframer will use the sync word information from the protocol to sync itself to the packet.
  • It will then evaluate the length field of the packet and pack all bytes of a packet into a PDU, which is then sent over UDP to our Python analyzer script.
  • The script will decode all frames and dump all available information. Furthermore, it will use all image frames to reassemble the pixel graphic.
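The last two steps can be sketched in Python as follows. This is not the analyzer script from the research but a minimal illustration under several assumptions: each UDP datagram carries exactly one image carrier frame starting at the Frame Len field, only frames for a single small tag are fed in, and the port number 52001 is hypothetical. Duplicate frames are harmless because payloads are stored by frame number.

```python
import socket


def parse_pdu(pdu: bytes):
    """Return (tag_id, total, number, payload) for an image carrier PDU.

    Assumed layout: Frame Len, 3-byte tag ID, 2-byte counter, 1 fixed byte, payload.
    """
    length = pdu[0]
    covered = pdu[1:1 + length]
    if len(covered) < 6:
        raise ValueError("PDU too short for an image carrier frame")
    return covered[:3], covered[3], covered[4], covered[6:]


class ImageAssembler:
    """Collect payloads by frame number until all frames have been seen."""

    def __init__(self):
        self.parts = {}

    def add(self, pdu: bytes) -> bool:
        _, total, number, payload = parse_pdu(pdu)
        self.parts[number] = payload
        return len(self.parts) == total

    def data(self) -> bytes:
        return b"".join(self.parts[n] for n in sorted(self.parts))


def listen(port: int = 52001, host: str = "127.0.0.1") -> bytes:
    """Receive PDUs over UDP and return the reassembled image data structure."""
    assembler = ImageAssembler()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            pdu, _ = sock.recvfrom(4096)
            if assembler.add(pdu):
                return assembler.data()
```

The returned bytes are the image data structure, which still has to be split into its header fields and decompressed before the pixel graphic can be drawn.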

 

Proof of concept video: https://www.youtube.com/watch?v=x7t0QViu2gU

The video shows how an attacker is able to sniff ESL tag frames from the RF interface and can easily reconstruct the image that is being sent to the tag. This proves that the protocol has been fully reverse engineered.

Fig. 18: GNURadio ESL sender sketch. The img_source block can be configured with a custom image and a tag ID. The image will then be depicted on the selected tag.

Building a GNURadio Sender

After using GNURadio to receive all tag contents, the next logical step was to generate and send our own ESL tag frames with the HackRF. Unfortunately, we ran into an issue after creating all frames in Python. The wake-up frames had to be transmitted at a different sample rate and the timing in between frames had to be consistent. Thus, using a GNURadio UDP Source to receive all frames from a Python script did not work. The network stack seemed to accumulate a lot of packets before they were passed to GNURadio, so all timing information was lost. As a workaround, we drafted an img_source block, which took the tag’s ID and the picture and automatically generated the frames from within GNURadio.


Soon, the next issue was discovered. The wake-up frames and image frames had to be modulated with different symbol rates. It did not seem possible to change the interpolation of a block during runtime. We solved this by having the img_source block interpolate the sample stream itself. For this to work, the sample rate variable of its configuration had to be divisible by both symbol rates. We used a sample rate of 2.1 MS/s, as 14 × 150000 = 21 × 100000 = 2100000. Thus, all frame types could be interpolated without any sample loss.
The generated float sample stream was then passed through a Gaussian FIR filter, which limits the out-of-band emissions of FSK modulation, and was then frequency modulated. The resulting complex samples were forwarded to the HackRF and sent to the tag. The tag decoded the packets and successfully showed the picture that was set in the img_source block.
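The sample-rate choice can be checked with a short calculation. The sketch below uses hypothetical helper names and is not the img_source block itself. It computes the samples per symbol for each symbol rate and repeats every symbol accordingly. Mapping bits to -1.0 and +1.0 is an assumed convention for the float stream that feeds the Gaussian filter and frequency modulator. The protocol section quotes 175 kBaud for the wake-up frames while the calculation above uses 150000, and 2.1 MS/s happens to be an integer multiple of both.

```python
from functools import reduce
from math import gcd

SAMPLE_RATE = 2_100_000  # samples per second, as chosen in the text


def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)


def samples_per_symbol(sample_rate: int, symbol_rate: int) -> int:
    """Integer interpolation factor; fails if the rates do not divide evenly."""
    if sample_rate % symbol_rate:
        raise ValueError("sample rate is not a multiple of the symbol rate")
    return sample_rate // symbol_rate


def interpolate(bits, sps: int):
    """Repeat each symbol sps times, mapping bit 0/1 to -1.0/+1.0 (assumed NRZ)."""
    return [1.0 if bit else -1.0 for bit in bits for _ in range(sps)]


for rate in (100_000, 150_000, 175_000):
    print(rate, samples_per_symbol(SAMPLE_RATE, rate))

# Smallest sample rate that is a multiple of all three symbol rates.
print(reduce(lcm, (100_000, 150_000, 175_000)))
```

Because every factor is an integer, frames of either type can be written into one continuous sample stream without fractional resampling.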

Proof of concept video: https://www.youtube.com/watch?v=028Gn4VC8yE

The video shows how an attacker could use the GNURadio sender sketch to send the SEC Consult logo to a price tag. The image and target tag can be easily modified with the block's parameters.

Conclusion

During this vulnerability research, the team at the SEC Consult Vulnerability Lab was able to identify an unknown MCU and gained debug access to it. We identified flaws in the protocol by testing a simple replay attack. As this attack vector was successful, we fully reverse engineered the proprietary protocol and were able to send and receive arbitrary contents. This proves that using proprietary RF protocols is not a secure solution; it just raises the level of effort and the number of devices required to analyze the tag. However, after gaining some knowledge about the protocol and its physical properties, it turned out that a $2 CC1101 RF transceiver, such as the one in the Flipper Zero, would have been enough to communicate with the tag in the first place.

In conclusion, it would be beneficial if cybersecurity specialists focused more on analyzing proprietary protocols on RF interfaces in the future. More and more devices offer smart capabilities and are connected to each other. There are still plenty of vulnerable "everyday devices" to be discovered!

 

This research has been performed by Steffen Robertz and published on behalf of the SEC Consult Vulnerability Lab.
SEC Consult is always searching for talented security professionals to work in our team. More information can be found at: https://www.sec-consult.com

 

FAQ – Frequently asked questions

All industries that use electronic shelf labels (ESL), e.g. for prices, archives or other purposes, are potentially affected by the problem.

According to German consumer law and price labeling law, the purchase contract is only concluded at the checkout. This means that if a different (higher or lower) price is written on the shelf, the price saved in the cash register still applies. However, the situation is different with prices or price tags that are attached directly to the goods. 

More information (German): https://www.test.de/Verbraucherrecht-Regal-oder-Kasse-Welcher-Preis-gilt-5115345-0/

Electronic tags are commonly used. Unfortunately, we cannot provide a detailed list. 

Some of the attack scenarios are phishing (e.g. with QR codes), extortion (e.g. ransom demands), obstruction of court proceedings (e.g. through chaos in the archive). 

Yes, critical infrastructure is also affected if ESL tags with insecure technology are used. 

Yes. Although it is a product from a Chinese manufacturer, the ESL tags are also known to be in use in Europe.

The specifications can vary greatly. In Australia, for example, the law on price marking differs slightly from that in Europe. Take the voluntary Code of Practice for Computerized Checkout Systems, which applies to incorrect prices at Woolworths stores, for example. This Code establishes certain procedures that a store must follow in the event of mispriced items. For example, if an item scans at the Woolworths checkout at a higher price than the price listed on the shelf, the customer is entitled to receive the item free of charge (provided the search number is correct).

Important: We cannot provide legal advice. Please get in touch with a regional contact person in your area.  

The analyzed 433 MHz RF protocol can be transmitted over long distances. With standard hardware, a distance of up to 300 m is already possible. Larger distances are also feasible thanks to specialized hardware. 

Electronic shelf labels are not insecure by definition. However, the tested ESL tag relies on an insecure protocol that has no security restrictions. The transmitted content can thus be read in plain text with relatively little effort, and it can be manipulated just as easily.

No, not in this case. The analyzed label has no update mechanism that could be used to add the missing security routines (e.g., encryption, public/private key). The vendor did not reply to our communication attempts. If the manufacturer does not react, the only option is to change the product.

Unfortunately, if a proprietary protocol is used, there are no automatic tests. With standard protocols, you know the weak points and you know how to test them. A comprehensive pen test would be necessary. 

It is often difficult to determine in advance whether a product is implemented securely. Therefore, when making agreements before making a purchase, make sure that you agree on a separate right of withdrawal including compensation if the product does not meet the necessary security requirements after a test. 

A proprietary protocol means overhead. In other sectors where control devices and sensors are used (e.g. by car manufacturers), it is common practice to subject them to a corresponding security check in advance. Such an IoT embedded pentest can take a two-digit number of person-days, depending on the use case and scope. If the manufacturer is cooperative and provides data sheets, this can significantly speed up the test (because reverse engineering would not be necessary).

Unless explicitly agreed, you as the buyer must bear the costs. 

Get in touch with the manufacturer in advance. If they are cooperative and provide data sheets, you can assess the risk beforehand. Otherwise, you’d have to rely on trial & error. 

We recommend including mandatory security requirements which must be fulfilled by the manufacturer during negotiations. Don't forget to demand the right of withdrawal and compensation for the security test in case these requirements are not met. More information on the policy can be found here: https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act or reach out to our security experts at SEC Consult.