Information sources

Linux specific

2004 05 - thread "Support for VIA PadLock crypto engine" on LKML

http://www.ussg.iu.edu/hypermail/linux/kernel/0405.2/0063.html

Fruhwirth Clemens: "You have been campaigning with FUD against cryptoloop/dm-crypt for too long now. There are NO exploitable security holes in neither dm-crypt nor cryptoloop."
Jari: "This (Saarinen's) attack makes it possible to detect presence of specially crafted watermarked files [...] Watermarked files contain special bit patterns that can be detected without decryption." He also provided code that demonstrates this.

2004 07 16 - thread "dm-crypt and gpg" on linux-crypto ML

http://www.spinics.net/lists/crypto/msg02796.html (about gpg-encrypted random password as a way to improve dm-crypt's security level)

Andrew Johnson: a 256-bit random password may be sufficient to make the dictionary needed by a known-plaintext attack infeasibly large
Boyd Waters: it "probably would NOT protect against watermark. Problem there is the treatment of the per-sector password for the block encryption: loop-AES runs through a number of iterations, dm-crypt and cryptoloop do not."
The watermark attack exploits a weakness in IV computation and the fact that the same key is used for all sectors.
Jari: "loop-AES' multi-key mode resistance against watermark attack is result of stronger IV computation. Using different keys for different sectors also helps a little bit."

2004 07 21 - (huge) thread "[PATCH] Delete cryptoloop" on LKML

http://marc.free.net.ph/thread/20040725.172544.b3323513.html

Matthias Urlichs: "AFAIK, the main issue is: If I write some data to the start of block N, I get a bit pattern. If I write the same data anywhere else (the middle of block N, the start of block M != N), a different on-disk bit pattern must result. If there are identical bit patterns, then the system is vulnerable."
H. Peter Anvin: "So does cryptoloop use a different IV for different blocks? The need for the IV to be secret is different for different ciphers, but for block ciphers the rule is that it must not repeat, and at least according to some people must not be trivially predictable. One way to do that is to use a secure hash of (key,block #) as the IV."
Jack Lloyd http://marc.free.net.ph/message/20040722.145843.f5318685.html : long explanation/demonstration
Pascal Brisset http://marc.free.net.ph/message/20040722.194415.f1515ac3.html : "The IV is predictable in cryptoloop and in other implementations. This causes specially crafted watermarks to be detectable through the encryption. Pretty bad, but whether this is really a concern or not depends a lot on what you are encrypting."
Fruhwirth Clemens http://marc.free.net.ph/message/20040724.124146.f18c50e1.html : "Modern ciphers like Twofish || AES are designed to resist known-plaintext attacks"; but with public-IV, "it is likely that some information is revealed", "although personally, I neglect this security threat"
Jari Ruusu http://marc.free.net.ph/message/20040725.114236.8143e157.html : "Ciphers are good, but both cryptoloop and dm-crypt use ciphers in insecure and exploitable way."
Fruhwirth Clemens http://marc.free.net.ph/message/20040725.132430.b62930f3.html : "There is no use in running your code. It does not decipher any block without the proper key. Where is the exploit?"
Andreas Jellinghaus: "If someone can prove that I have a certain file on my hard disk, even if it encrypted, that is less security than I expect from a hard disk encryption. Am I expecting too much?"
Marc Ballarin: "the purpose of this attack is not to break encryption, but to prove the existence of a file known to and prepared by the attacker. The exploit generates a rather simple bit pattern with a size of 1024 bytes. When this pattern - the watermark - is encrypted, dm-crypt's output has some special properties - independent of cipher or key size. For example, encoding nr. 1, always produces a cyphertext block, where bytes 0-15 are equal to bytes 512-523."
Marc Ballarin: "The difference in bit patterns between the first and second half of the watermark block compensates partly for the trivially and predictably changing IV between two successive sectors. As Jari explained, this causes any cipher to produce two identical blocks of ciphertext (after all the input is identical). [...] An improved and unpredictable IV generation should protect against this watermarking as well."
David Wagner http://marc.free.net.ph/message/20040728.202406.36f716f9.html : "M.J. Saarinen's attack seems to be real, if that's what you're asking about. IV generation is important; if you choose IVs poorly, then you can end up with some weaknesses even if the underlying block cipher is perfectly fine. (I noticed that some posts from, e.g., Clemens were confused about this point. If you use a great cipher in a bad mode of operation, you can easily end up with an insecure system. The existence of an attack against such a system is not in contradiction to the security of the underlying block cipher against chosen plaintext attacks.)"
Christophe Saout http://marc.free.net.ph/message/20040729.155039.64dc28aa.html : "IV = initialization vector = sector number (little endian, 32 bits), pad with zeroes. The actual content is then encoded using the selected cipher and key in CBC mode. For those who don't know what exactly that means:
```
  C[0] = E(IV xor P[0])
  C[1] = E(C[0] xor P[1])
  ...
  C[n] = E(C[n-1] xor P[n])
```

C is the encrypted data, P the plaintext data. The block size is given by the cipher (usually 128 bit or something like that). E is the encryption using cipher and key. This is done for every sector. The weakness is that the IV is known. You can write specially crafted blocks on the disk and have a known plaintext for the first block. One simple way to avoid this would be to compute the IV in a different way, something based on key and sector number."
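The CBC formulas above, combined with a public IV, can be sketched in a few lines of Python to show why a crafted plaintext leaves a detectable pattern. This is only an illustration under loud assumptions: `toy_encrypt_block` is a made-up Feistel construction standing in for E (it is not AES and not what dm-crypt uses), the attacker is assumed to know which sector numbers the data lands on, and the pattern is a simplification rather than Saarinen's actual encoding.

```python
import hashlib
import struct

BLOCK = 16    # cipher block size in bytes
SECTOR = 512  # sector size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation standing in for E(); NOT a real cipher."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right


def plain_iv(sector: int) -> bytes:
    """Public IV: 32-bit little-endian sector number, padded with zeroes."""
    return struct.pack("<I", sector) + bytes(BLOCK - 4)


def cbc_encrypt_sector(key: bytes, sector: int, data: bytes) -> bytes:
    """C[0] = E(IV xor P[0]), C[n] = E(C[n-1] xor P[n])."""
    prev = plain_iv(sector)
    out = []
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt_block(key, xor(prev, data[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)


def watermark(first_sector: int, marker: bytes) -> bytes:
    """Two sectors of plaintext whose first blocks cancel the known IVs."""
    sectors = []
    for n in (first_sector, first_sector + 1):
        first_block = xor(marker, plain_iv(n))  # IV xor P[0] == marker
        sectors.append(first_block + bytes(SECTOR - BLOCK))
    return b"".join(sectors)


key = b"unknown to the attacker"
n = 1000
data = watermark(n, b"WATERMARK-BLOCK!")
c0 = cbc_encrypt_sector(key, n, data[:SECTOR])
c1 = cbc_encrypt_sector(key, n + 1, data[SECTOR:])

# The two plaintext sectors differ, yet their ciphertexts start identically.
print(c0[:BLOCK] == c1[:BLOCK])  # prints True
```

The repetition appears whatever key or block cipher is used, because the crafted first block makes the input to E the same in both sectors. Nothing is decrypted; only the presence of the pattern is revealed.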

Andries Brouwer http://marc.free.net.ph/message/20040729.161203.e3266beb.html : "So far, every time I checked the details Jari Ruusu has been right. In the present discussion Fruhwirth Clemens showed an amazing lack of understanding of cryptography. His threat model seems limited to things like "chosen plaintext attack" etc. But there are so many entirely different attacks."
David Wagner http://marc.free.net.ph/message/20040729.211529.d97e9d53.html : full & well-informed explanation of the dm-crypt weaknesses
the end of this thread is in fact the most interesting part of it :)
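Several posters in this thread suggest deriving the IV from the key and the sector number. A minimal sketch of that idea follows, assuming a truncated HMAC-SHA256 as the "secure hash"; the helper `keyed_iv` and the separate `iv_key` are hypothetical names chosen for the illustration, not what any of the discussed implementations do.

```python
import hashlib
import hmac
import struct

BLOCK = 16  # IV length in bytes, matching a 128-bit block cipher


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def plain_iv(sector: int) -> bytes:
    """Public IV: 32-bit little-endian sector number, padded with zeroes."""
    return struct.pack("<I", sector) + bytes(BLOCK - 4)


def keyed_iv(iv_key: bytes, sector: int) -> bytes:
    """Secret IV: truncated HMAC-SHA256 over the sector number (hypothetical helper)."""
    msg = struct.pack("<Q", sector)
    return hmac.new(iv_key, msg, hashlib.sha256).digest()[:BLOCK]


# With public IVs, anyone can compute how the IV changes between sectors.
public_delta = xor(plain_iv(1000), plain_iv(1001))

# With keyed IVs, the same difference depends on a secret the attacker lacks.
secret_delta = xor(keyed_iv(b"iv key", 1000), keyed_iv(b"iv key", 1001))

print(public_delta.hex())
print(secret_delta.hex())
```

Since a watermark has to compensate for the difference between two IVs, making that difference depend on secret material is what removes the attacker's ability to craft matching plaintext.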

"On The Weaknesses of public-IV On-Disk-Formats", by Clemens Fruhwirth

http://clemens.endorphin.org/OnTheProblemsOfCryptoloop

Basically, Fruhwirth claims to prove mathematically that public-IV on-disk formats are secure.
I have not verified his proof, but some apparently well-informed cryptographers consider that Fruhwirth has a very limited understanding of cryptographic issues...
The flaw in his proof is that he considers only one kind of attack and generally oversimplifies the issue. The attack he considers is not the most obvious one: the scenario constructed on this web page assumes that the attacker happened, by luck, to find two sectors beginning with the same ciphertext. The case where an attacker has deliberately prepared two sectors (leading to the watermarking flaw described later) is therefore not covered by his proof, which does seem to be correct, but only under this very restrictive assumption. He also does not take into account the deterministic patterns introduced by the file system on top (I am not sure whether that matters). He neglects the watermarking flaw (described in Saarinen's paper) because he simply does not consider it a security hole (!) (he says so in the LKML thread quoted above), and he also neglects the question of dictionary attacks. So, in this text at least, he clearly lacks an idea of what "security" means in a cryptographic context.

Linked topics

It seems that the flaws found in dm-crypt are somewhat close to the problems occurring in the WEP protocol used to secure Wi-Fi equipment. Some useful links and explanations can be found here: http://en.wikipedia.org/wiki/WEP

Basically, the two flaws (the watermarking issue in dm-crypt and one of the WEP issues) are linked to poor IV selection, which leads to information leaking because of a well-known property of the XOR operator used in these schemes (namely that (a XOR k) XOR (b XOR k) = (a XOR b)). Because of this XOR property, any repetition of an IV (which will obviously occur if the IV is too small) leads to an information leak.
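The XOR property can be checked directly. In the sketch below the values `a`, `b` and `k` are arbitrary examples: `k` stands for whatever secret mask ends up being reused when an IV repeats.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


a = b"attack at dawn!!"   # first plaintext
b = b"retreat at noon!"   # second plaintext
k = bytes(range(16))      # secret mask, reused because the IV repeated

ca = xor(a, k)
cb = xor(b, k)

# The mask cancels out: the observer learns a XOR b without knowing k.
leak = xor(ca, cb)
print(leak == xor(a, b))  # prints True

# If one plaintext is known (or guessable), the other follows immediately.
print(xor(leak, a))       # prints b'retreat at noon!'
```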

Markku-Juhani O. Saarinen "Encrypted Watermarks and Linux Laptop Security"

http://www.tcs.hut.fi/~mjos/doc/wisa2004.pdf

This paper describes the watermarking hole, which allows an attacker to prove that previously crafted files are actually present on the encrypted disk, without decrypting anything. It also considers three other possible problems (all three based on the lack of a proper integrity check in both dm-crypt and cryptoloop):

the possibility to corrupt encrypted data
the possibility to swap blocks of encrypted data under certain conditions
the possibility of reverting the changes that occurred on the disk during a given period

So, seriously worrying, but this paper also proposes an alternative design which is supposed to be immune to these flaws. Let's hope :)

Jerome Etienne, Vulnerability in encrypted loop device for Linux

http://off.net/~jme/loopdev_vul.html

One of the first papers (2002) explaining the weaknesses of CBC mode, especially "cut'n'paste" attacks : "As an file-system isn't designed to appears random, its content may be predictable to some extents (e.g. common directories and files, inode, superblock). The attacker may use such informations to guess the contents and do a knowledgeable cut/past. For example, an attacker knowing the location of a password file may replace a password by another one which is already known."
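The cut-and-paste idea can be illustrated with CBC decryption, where P[n] = D(C[n]) xor C[n-1]: a pair of adjacent ciphertext blocks copied elsewhere still decrypts its second block to the original plaintext. The sketch below is an illustration only; the Feistel functions are a made-up stand-in for a real block cipher, and the block contents and sector numbers are invented for the example.

```python
import hashlib
import struct

BLOCK = 16  # cipher block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; NOT a real cipher."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block (rounds undone in reverse order)."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def plain_iv(sector: int) -> bytes:
    return struct.pack("<I", sector) + bytes(BLOCK - 4)


def cbc_encrypt(key: bytes, sector: int, data: bytes) -> bytes:
    prev, out = plain_iv(sector), []
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt_block(key, xor(prev, data[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, sector: int, data: bytes) -> bytes:
    prev, out = plain_iv(sector), []
    for i in range(0, len(data), BLOCK):
        cur = data[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, cur), prev))
        prev = cur
    return b"".join(out)


def blk(text: bytes) -> bytes:
    """Pad an example string to exactly one cipher block."""
    return text.ljust(BLOCK, b".")


key = b"unknown to the attacker"
src = blk(b"src block 0") + blk(b"pass=attacker-pw")
dst = (blk(b"dst block 0") + blk(b"dst block 1")
       + blk(b"pass=victim-pass") + blk(b"dst block 3"))
c_src = cbc_encrypt(key, 5, src)   # sector holding data the attacker knows
c_dst = cbc_encrypt(key, 9, dst)   # sector the attacker wants to alter

# Paste source ciphertext blocks 0-1 over destination blocks 1-2.
tampered = c_dst[:BLOCK] + c_src[:2 * BLOCK] + c_dst[3 * BLOCK:]
result = cbc_decrypt(key, 9, tampered)
print(result[2 * BLOCK:3 * BLOCK])  # prints b'pass=attacker-pw'
```

The blocks next to the pasted pair decrypt to garbage, which is the price of the attack; without an integrity check, nothing flags the change.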

Clemens Fruhwirth, Linux hard disk encryption settings

http://clemens.endorphin.org/LinuxHDEncSettings

The results of Fruhwirth's latest brain upgrade..., discussing IVs and CBC mode. Even Jari Ruusu hasn't claimed that this text contains false information, so we tend to trust it ;)

A thread on dm-crypt list (http://thread.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/521, and particularly http://thread.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/562), discusses a bit more deeply some aspects not covered by this document, such as moveability, malleability & replayability of different block cipher modes, namely Plain-IV, ESSIV, Plumb IV, LRW, CMC, and EME.

Not Linux specific

"How to defend your Privacy", by Anonymous

http://v4.livegate.net/wipe/

dm-crypt is, like Cryptoloop, vulnerable to optimized dictionary and watermark attacks

Poul-Henning Kamp, "GBDE - GEOM based disk encryption"

http://phk.freebsd.dk/pubs/bsdcon-03.gbde.paper.pdf

make dictionary attacks more expensive: instead of adding iterations to the pass-phrase preprocessing path (such as in loop-aes), we could combine the pass-phrase with a high-entropy token, for instance 1024 random bits stored remotely
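As a rough sketch of the "pass-phrase plus high-entropy token" idea, the key could be derived with an HMAC keyed by the token. This is an assumption made for illustration and is not GBDE's actual key derivation; `derive_key` is a hypothetical helper, and the token is generated locally here although the note above imagines it stored remotely.

```python
import hashlib
import hmac
import os


def derive_key(passphrase: str, token: bytes) -> bytes:
    """Mix a low-entropy pass-phrase with a high-entropy token (hypothetical helper)."""
    return hmac.new(token, passphrase.encode("utf-8"), hashlib.sha256).digest()


token = os.urandom(128)  # 1024 random bits; would be fetched from remote storage
key = derive_key("my pass-phrase", token)
print(len(key))  # prints 32
```

Without the token, a dictionary of candidate pass-phrases cannot be tested at all, which is a different trade-off from slowing each guess down with iterations.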

What next?

Known plain-text attacks

Some known-plaintext attacks need read/write access to the active (mounted) file system (e.g. by sending a mail to an encrypted maildir ;) and/or knowledge of where the data ends up on the disk. Or, in fact, is only a partial information leak needed?

IEEE SISWG (Security in Storage Working Group)

http://www.siswg.org/

They aim to propose a "Standard Architecture for Encrypted Shared Storage Media", bypassing and have already published the following draft documents, which are really good formal descriptions of LRW & EME block cipher modes :

dm-crypt hardening:

The simplest method involves the use of true cryptographically-random strings (from /dev/random, for example) as the first-level passphrase; encrypt them with a hashed, second-level salted passphrase as the key, and store the encrypted first-level passphrase with the salt. To set up the cryptoloop, just decrypt the encrypted first-level passphrase and use it. The lack of entropy in the second-level passphrase is compensated for by salting and hashing it, and using it to encrypt a truly random first-level passphrase. Any dictionary attacks on the second-level passphrase cannot be precomputed, because of the salt. Computing it on the fly will require calculating the hash, which is designed to be very slow.
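The two-level scheme can be sketched as follows. The choices here are assumptions made for the illustration: PBKDF2-HMAC-SHA256 as the slow salted hash, `os.urandom` in place of reading /dev/random, and a plain XOR with the derived key as the "encryption" step (a real setup would use a proper cipher with an integrity check). All function names are hypothetical.

```python
import hashlib
import os

ITERATIONS = 100_000  # deliberately slow; tune to taste


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def derive_kek(passphrase: str, salt: bytes) -> bytes:
    """Second level: slow, salted hash of the human pass-phrase."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


def create_wrapped_key(passphrase: str):
    """Generate a random first-level key and wrap it under the pass-phrase."""
    master = os.urandom(32)  # first level: truly random key
    salt = os.urandom(16)    # stored in the clear next to the wrapped key
    wrapped = xor(master, derive_kek(passphrase, salt))
    return master, salt, wrapped


def unwrap_key(passphrase: str, salt: bytes, wrapped: bytes) -> bytes:
    """Recover the first-level key; a wrong pass-phrase silently yields garbage."""
    return xor(wrapped, derive_kek(passphrase, salt))


master, salt, wrapped = create_wrapped_key("correct horse")
print(unwrap_key("correct horse", salt, wrapped) == master)  # prints True
```

Only `salt` and `wrapped` would be stored on disk; the recovered key is what gets handed to the encryption layer.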

dm-crypt list

http://news.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt

they "Create a crypt_iv_operations structure with a ctr, dtr and generator methode and move the plain iv generator to this structure."
they "also redefine the syntax" "to support chaining modes different from CBC mode, for example CMC. CMC is not implemented in cryptoapi yet, however I would like dm-crypt to be ready for it, so the problems outlined by Adam J. Richter in http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/454 can be fixed easily, by switching to CMC chaining mode. Compatibility code has been added to accommodate the old sytnax."
"This patch adds a new IV mode, I baptise ''encrypted sector|salt IV'', short ESSIV. I describe ESSIV in http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/472"

loop-aes solutions

ssh user@host cat keyfile | mount -p 0 -o gpgkey=/etc/foo

This uses the /etc/foo key file, which is stored locally, but that key file is gpg-encrypted with a password such as P+zl9O2QYxJZcgMO94+IN9ezfjf/BVQsNEOXajbWRnO2ok/FLQDD8zCsDDyT, which you pipe through ssh to mount.