DataSafe: An Encrypting Pen Drive with the STM32 Primer2
Marco “Kiko” Carnut <kiko at postcogito dot org>
This project implements an encryption layer within the Primer2 USB Mass Storage device driver. The data on the MicroSD card is readable only if the user enters the correct passphrase in the application. If the wrong passphrase is entered, or the user tries to read the card directly in the host system, the card will look like an “unformatted disk”. The primary advantage over a PC-only solution (such as the popular TrueCrypt program) is that the passphrase and cryptography are handled by the Primer2, away from keyloggers and data remanence attacks.
Overall Description and Usage
This section assumes the application is already installed under CircleOS and the MicroSD card is already inserted in its slot.
When the application is started, the passphrase entry form is displayed. A 6-row, 3-column table with a highlighted word as a “cursor” allows the user to select one 4-letter word from an internal dictionary using the joystick. The table scrolls as necessary to follow the cursor's movements.
When the user presses the joystick button, the currently highlighted word is appended to the passphrase – visual feedback is provided by the word counter in the lower right corner of the screen being incremented.
The second function button, shown with a left arrow icon, serves as a “backspace” key: the word counter in the lower right corner of the screen is decremented, allowing the user to go back and reselect a word or even restart from scratch.
A passphrase consists of at least six words (easier to remember, less secure) and at most eight words (harder to remember, more secure). When the user is done entering the passphrase, clicking the function button with the “checkmark” icon starts the USB Mass Storage device. If the USB cable to the PC is already connected at this point, the PC will recognize the device in a few seconds. On Microsoft Windows, it will appear as a new drive under the “My Computer” window. On Linux, the kernel ring buffer (visible using the dmesg command) will show several messages reporting detection of the device and its size.
There is deliberately no “wrong passphrase” visual feedback in the Primer2 application itself. If the passphrase is wrong, the MicroSD card's contents will look like random data to the host operating system. This will cause Windows to say that the drive isn't formatted. Under Linux, a message in the kernel ring buffer will say that the “partition table has not been recognized”.
This “wrong passphrase” behavior turns out to be the first thing a first-time user will encounter – the initial passphrase you choose will necessarily be “wrong” because you haven't set one up yet. Choose your passphrase carefully (preferably by a random process) and ask your operating system to format the drive. You will then be able to use it normally. When you are done, shut the Primer2 down.
When the user restarts the DataSafe application and enters the exact same passphrase used before for that specific card, the card will be detected and recognized correctly by the operating system in the host computer. If the operating system does not recognize the card, it means you entered the wrong passphrase.
The user has the option of hitting the checkmark icon without entering any passphrase at all, in which case the USB Mass Storage device will be started with no encryption, just like an ordinary pen drive.
Also notice that the passphrase is not stored permanently anywhere – neither in the MicroSD card, nor in the STM32's flash memory (although it is kept in the STM32's RAM while the application is running). Instead, it is used directly as the encryption key (the key is “mixed” with the data) and scrubbed when the application is restarted.
The most practical consequence of this fact is that if you lose or forget the passphrase, there is absolutely no way to recover it other than by exhaustive search (trying all possible passphrases); as we will see below, the system was deliberately engineered to make this not viable. (Were it viable, it would constitute a “back door” that could be exploited by an adversary.)
When the Mass Storage device is activated, the DataSafe application installs the interrupt handlers that perform the actual work. If the application is closed (by hitting the “X” icon), the interrupt handlers are not cleared – the USB connection and the mass storage device will continue running “in the background”, even if another application runs. Of course, this might have an adverse effect on the other applications' performance.
When the application is restarted, the USB subsystem is reset. It can thus be used as a way to perform a “disconnect” via software, without actually having to unplug the USB cable.
Technical and Implementation Aspects
Our starting point was the USB Mass Storage Device application version 1.1 available at the STM32Circle web site [8.]. We then added the encryption subsystem and user interface. Our design goal was to make the application as simple as possible to use and to implement, so we deliberately tried to keep our changes to a minimum while at the same time striving to achieve a practical, usable application.
The passphrase system used here is called Diceware-4 and its primary objective is to strike a “good” balance between:
Resistance to guessing or exhaustive search attacks: the passphrase should be sufficiently complex to yield a reasonable amount of unpredictability, so that it is not easy to guess and trying all possible combinations takes far too much time;
As the internal dictionary has 1,296 entries, the choice of each word contributes about 10.34 bits of entropy. With six words, the passphrase offers about 62.04 bits of entropy – comparable to the 64 bits typically considered to be the viability threshold for brute force attacks. With eight words, the entropy goes up to 82.72 bits, which is 2^(82.72 - 62.04) = 2^20.68 ≈ 1.7 million times stronger (either in terms of cost or time).
Memorability and usability: using English words in a highly regular fashion should make them reasonably easy to remember and hopefully not too irritating to input in a device with no keyboard such as the Primer2;
For a longer discussion of the Diceware-4 method and the tough security vs human usability dilemma, see this earlier paper of mine [1.] where, among other things, I argue that most “eight typable character” password systems can't exceed about 53 bits of entropy. On the other hand, 82 bits of entropy isn't considered all that much in many circles, and the underlying encryption algorithm (described below) takes 256-bit keys. I can justify that only as a deliberate bias towards usability – there is no point in making the system extra-secure if it is so annoying that users don't buy it. Additional credits: the Diceware-4 system used in this program was inspired by Arnold Reinhold's original Diceware Passphrase System – see [7.].
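The entropy figures quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes that “Diceware-4” means four dice rolls per word (which would explain the 1,296 = 6^4 dictionary size), and the mapping from dice rolls to a dictionary index is hypothetical. The actual DataSafe word list is not reproduced here, so the sketch works with indices only.

```python
import math
import secrets

DICE_PER_WORD = 4                     # assumption: "Diceware-4" = four dice per word
DICTIONARY_SIZE = 6 ** DICE_PER_WORD  # 1,296 entries, matching the text


def entropy_bits(num_words: int) -> float:
    """Entropy of a passphrase made of independent, uniformly chosen words."""
    return num_words * math.log2(DICTIONARY_SIZE)


def random_word_index() -> int:
    """Pick a dictionary index by simulating four fair dice (hypothetical ordering)."""
    index = 0
    for _ in range(DICE_PER_WORD):
        index = index * 6 + secrets.randbelow(6)
    return index


def random_passphrase_indices(num_words: int) -> list[int]:
    """Choose 6 to 8 word indices with a cryptographically strong generator."""
    if not 6 <= num_words <= 8:
        raise ValueError("DataSafe passphrases have six to eight words")
    return [random_word_index() for _ in range(num_words)]


if __name__ == "__main__":
    for n in (6, 8):
        print(f"{n} words: {entropy_bits(n):.2f} bits")
    # Two extra words multiply the search space by 1296 * 1296.
    print(f"8 vs 6 words: {DICTIONARY_SIZE ** 2:,} times more passphrases")
    print(random_passphrase_indices(6))
```

The ratio between eight-word and six-word passphrases is exactly 1,296 squared, which is where the “about 1.7 million” figure comes from.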
The Encryption Algorithm
DataSafe employs the ChaCha encryption algorithm [2.] introduced by Daniel J. Bernstein as an enhancement of his already successful Salsa20 family of cryptographic primitives. For an in-depth discussion of ChaCha, Salsa20 and their security characteristics, the reader is referred to [3., 4., 5.].
We considered ChaCha particularly well-suited to this application because it seems to have been specifically designed with block device encryption in mind. Its particularly attractive features are:
It is fast – the version we used encrypts at a speed of about 100 cycles/byte even though it is far from speed-optimized. (There are reports of ARM implementations of this algorithm running as fast as 69 cycles/byte.) It can also be made faster by reducing the number of rounds;
It is small – the encryption/decryption code fits in a bit more than 2KiB of Flash, and as it is a loop-unrolled implementation, it is not size-optimized – it could be made a lot smaller at the expense of some speed;
It is lightweight – it operates “in place”, requiring no extra RAM beyond the key/nonce pair and the data being encrypted;
It operates in “counter mode”, allowing random seeks to any point of the keystream with zero extra effort;
Although it is a relatively new design, ChaCha is believed to offer excellent security. Salsa20, its predecessor, was selected for Phase 3 of the eSTREAM international selection process. Both ChaCha and Salsa20 have been receiving considerable scrutiny from the academic cryptography community and no practical attacks have been found.
In the DataSafe application, the passphrase is used directly as the encryption key. Some issues with this arrangement are discussed further below.
ChaCha is a stream cipher – encryption and decryption are both accomplished by XORing the message with a pseudo-random stream “seeded” by the key+nonce+counter. In order to make sure the seed is never reused, the SD card's “Product Serial Number” is used as the “nonce” (one of the keystream generator's inputs, which acts as a “unique message identifier”). This makes it safe to use the same passphrase in two different cards.
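To make the key+nonce+counter idea concrete, here is a minimal Python sketch of the original ChaCha layout (256-bit key, 64-bit nonce, 64-bit block counter). It is not the DataSafe firmware code: the 20-round default, the placeholder key and the way a 32-bit serial number is zero-extended into the 8-byte nonce are all assumptions made for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF
SIGMA = struct.unpack("<4I", b"expand 32-byte k")  # constants for 256-bit keys


def rotl32(x: int, n: int) -> int:
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def chacha_block(key: bytes, nonce: bytes, counter: int, rounds: int = 20) -> bytes:
    """Return one 64-byte keystream block for the given key, nonce and counter."""
    if len(key) != 32 or len(nonce) != 8:
        raise ValueError("expected a 32-byte key and an 8-byte nonce")
    init = list(SIGMA) + list(struct.unpack("<8I", key))
    init += [counter & MASK32, (counter >> 32) & MASK32]
    init += list(struct.unpack("<2I", nonce))
    s = init.copy()
    for _ in range(rounds // 2):
        quarter_round(s, 0, 4, 8, 12)   # column rounds
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        quarter_round(s, 0, 5, 10, 15)  # diagonal rounds
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *((x + y) & MASK32 for x, y in zip(s, init)))


def xor_keystream(data: bytes, key: bytes, nonce: bytes, first_block: int) -> bytes:
    """Encrypt or decrypt data: the same XOR operation does both."""
    out = bytearray()
    for i in range(0, len(data), 64):
        ks = chacha_block(key, nonce, first_block + i // 64)
        out += bytes(p ^ k for p, k in zip(data[i:i + 64], ks))
    return bytes(out)


if __name__ == "__main__":
    key = bytes(range(32))                 # placeholder, not derived from a passphrase
    nonce = struct.pack("<Q", 0x12345678)  # hypothetical card serial, zero-extended
    sector = b"\x00" * 510 + b"\x55\xaa"   # a boot sector ends with the 0x55AA signature
    encrypted = xor_keystream(sector, key, nonce, 0)
    print("round trip ok:", xor_keystream(encrypted, key, nonce, 0) == sector)
    wrong = xor_keystream(encrypted, bytes(32), nonce, 0)
    print("signature survives wrong key:", wrong[-2:] == b"\x55\xaa")
```

Decrypting with a different key yields bytes unrelated to the original sector, which is why a wrong passphrase makes the card look unformatted to the host rather than producing an explicit error.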
The way ChaCha's “counter/offset” is used may be considered slightly odd – the card address is passed directly as ChaCha's “offset”, although the offset counts units of 64 bytes. As sector addresses are usually multiples of 512 bytes, this means we are not using the keystream exactly in order. This is of little consequence, though – even with this “wasteful” use of the keystream space, with 32-bit offsets we don't get anywhere close to the maximum keystream length of 2^70 bytes imposed by ChaCha. However, anyone attempting to create a compatible implementation must keep this peculiar use of offsets in mind.
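The arithmetic behind this offset convention can be sketched as follows. The sketch rests on one reading of the description above, namely that the sector's byte address is used as the first block counter and that the counter advances by one for each 64-byte block within the sector. The function names are hypothetical and the real firmware may differ in detail.

```python
SECTOR_BYTES = 512
BLOCK_BYTES = 64                                 # ChaCha emits keystream in 64-byte blocks
BLOCKS_PER_SECTOR = SECTOR_BYTES // BLOCK_BYTES  # 8 blocks cover one sector


def counters_for_sector(byte_address: int) -> range:
    """Block counters consumed by the sector stored at byte_address (assumed scheme)."""
    if byte_address % SECTOR_BYTES:
        raise ValueError("sector addresses are multiples of 512 bytes")
    return range(byte_address, byte_address + BLOCKS_PER_SECTOR)


def keystream_byte_position(counter: int) -> int:
    """Position in the ChaCha keystream at which a given block starts."""
    return counter * BLOCK_BYTES


if __name__ == "__main__":
    first = counters_for_sector(0)
    second = counters_for_sector(SECTOR_BYTES)
    print("sector 0 uses counters", list(first))
    print("sector 1 uses counters", list(second))
    # Counters between the two ranges are never used: only 8 of every 512.
    print("unused between them:", second[0] - first[-1] - 1)
    # Upper bound on the keystream reached with 32-bit addresses.
    top = keystream_byte_position(2**32 - 1) + BLOCK_BYTES
    print("keystream bound is 2^38:", top == 2**38, "below 2^70:", top < 2**70)
```

Under this reading, consecutive sectors never share a counter value, so no keystream block is reused, and a 32-bit address space reaches at most 2^38 bytes into the keystream.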
Also notice that the in-memory keys are zeroed when the application is started. They are not zeroed when the application is closed because, as described earlier, it keeps running “in background”. So if you want to scrub the keys, close and restart the application (optionally closing it again afterwards).
Issues Regarding Block Device Encryption
The encryption/decryption is performed on a sector-by-sector basis as the host computer's operating system writes/reads data. This means that:
The process is entirely filesystem-agnostic: you may use whatever filesystem you want – FAT16, FAT32, NTFS, ext2, ext3, reiserfs, YAFFS, you name it. You can even have several different filesystems simultaneously in different card partitions. The filesystems retain all their innate characteristics, including performance, crash-resistance, wear leveling (or lack thereof), etc.
Although someone who doesn't know the key cannot understand the encrypted data, the cleared sectors in a fresh card can easily be told apart from the “random looking” encrypted data, and thus an estimate of how much data is contained in the filesystem can be computed despite the encryption.
An easy way to fix this is to “prime” the card: before actually using it to store important sensitive data, write to it to the point of exhausting the free space in the filesystems. This will ensure that all sectors have been written (with what the adversary will see as random numbers) at least once. Then choose another passphrase, format the card again and only then store your important data.
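The leak that priming closes can be illustrated with a short sketch of what an adversary might do with a raw image of the card. It assumes that never-written sectors read back as all 0x00 or all 0xFF bytes, which depends on the card, and the function names are hypothetical.

```python
SECTOR_BYTES = 512


def looks_erased(sector: bytes) -> bool:
    """True if every byte is 0x00 or every byte is 0xFF (assumed fresh-card patterns)."""
    return sector in (b"\x00" * len(sector), b"\xff" * len(sector))


def written_fraction(image: bytes) -> float:
    """Fraction of whole sectors in a raw card image that do not look erased."""
    total = len(image) // SECTOR_BYTES
    if total == 0:
        raise ValueError("image is smaller than one sector")
    written = sum(
        not looks_erased(image[i * SECTOR_BYTES:(i + 1) * SECTOR_BYTES])
        for i in range(total)
    )
    return written / total


if __name__ == "__main__":
    # Stand-in for ciphertext: any non-constant data is enough for the illustration.
    ciphertext_like = bytes(range(256)) * 2
    fresh_card = b"\xff" * (SECTOR_BYTES * 6) + ciphertext_like * 2
    primed_card = ciphertext_like * 8
    print("fresh card, fraction written: ", written_fraction(fresh_card))
    print("primed card, fraction written:", written_fraction(primed_card))
```

On a card that was never primed the estimate tracks how much data was actually stored, while on a primed card every sector looks written and the estimate reveals nothing.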
While testing the original USB Mass Storage application, we ran into some instability when writing large files – sometimes the host operating system reported write errors. We fixed this by following a recommendation we found in the STM32Circle Forum – we disable Timer2 when we start the USB subsystem in the application. This has the side effect of disabling the MEMS accelerometer and the audio subsystem but, in our testing, kept the USB subsystem and the overall application perfectly reliable – we have successfully written and read gigabyte-sized files with no glitch whatsoever.
As we said before, the USB subsystem is kept running even after the application is closed. However, the application's quit function re-enables Timer2 to recover the MEMS and the audio – and this may cause the USB subsystem to become unreliable. So we recommend keeping the application open whenever the USB subsystem is in use, for improved reliability.
Due to time constraints, we did not test the application with MicroSD cards larger than 2GB. As the application uses 32 bit offsets throughout, we conjecture it might be unable to support cards larger than 4GB.
Comparison to Related Work
There are several popular “disk” (or, in Unix-like parlance, “block device”) encryption programs out there – TrueCrypt, FreeOTFE and PGP Whole Disk Encryption being perhaps the most notable examples. In another category are the file-level encryption programs such as PGP and GPG.
Although some of these programs do offer hardware-assisted encryption, they are mostly “software based” encryption systems in the sense that the encryption and decryption are performed by the same processors that handle the general computing tasks. In modern, complex, multi-tasking and multi-user operating systems, this opens up several avenues of attack to steal the keys and/or passphrases – viruses and spyware, to name a few.
The goal of this project is to implement the encryption and security-critical functions in the STM32, taking advantage of the fact that its hardware is almost perfect for the job: it supports a MicroSD card interface and its LCD/joystick/touchpad provide the essential UI requirements for passphrase entry. It thus becomes possible to arrange things so that nothing about the passphrase/crypto keys ever reaches the host computer. However, we must bear in mind that once the encrypted volume is open and made available to a potentially hostile computer, viruses and spyware can still steal sensitive information. Nonetheless, the separation of duties does reduce the vulnerability window.
We know of at least one commercial product that seems to offer similar functionality: the IronKey pen drive. We didn't have one to test/compare and the web site seemed unclear as to how exactly the password/passphrase is to be entered, although it does say that “authentication is handled in hardware and cannot be disabled by malware”.
One disadvantage of our current implementation is performance: the original USB mass storage device driver seems to deliver just a few hundred kilobytes per second of throughput. On the other hand, our testing indicates that the encryption has a negligible effect on the throughput.
References
1. M. Carnut, E. Hora, Improving the Diceware Memorable Passphrase Generation System, Proceedings of the 7th Symposium on Security in Informatics, Instituto Tecnológico da Aeronáutica, São José dos Campos, São Paulo, Brazil, November 2005.
2. D. J. Bernstein, The ChaCha Family of Stream Ciphers.
3. D. J. Bernstein, ChaCha, a variant of Salsa20.
4. D. J. Bernstein, Snuffle 2005: the Salsa20 encryption function.
5. D. J. Bernstein, The Salsa20 family of stream ciphers.
6. J. Aumasson, S. Fischer, S. Khazaei, W. Meier, C. Rechberger, New Features of Latin Dances: Analysis of Salsa, ChaCha, and Rumba.
7. A. Reinhold, The Diceware Passphrase Home Page.
8. STM32Circle, USB Mass Storage 1.1.