I'm cross-posting the README for my Python Encryption Example, since I think it's pretty interesting.

About

This program was written to demonstrate how to correctly encrypt and decrypt files, using PBKDF2-SHA1, AES, and HMAC-MD5.

The output file is JSON, containing all of the information that isn't meant to be secret (the ciphertext, HMAC, IV, salt, and number of PBKDF2 iterations).
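
To make the file format concrete, here is a rough sketch of how such a JSON container could be built and parsed. The field names and the helper functions (b64, build_container, parse_container) are hypothetical, made up for illustration, so the real program's layout may differ. Binary values are base64 encoded because JSON can't hold raw bytes.

```python
import base64
import json
import os


def b64(data: bytes) -> str:
    """Base64-encode bytes into an ASCII string that JSON can carry."""
    return base64.b64encode(data).decode("ascii")


def build_container(ciphertext: bytes, hmac_tag: bytes, iv: bytes,
                    salt: bytes, iterations: int) -> str:
    """Serialize the non-secret values into a JSON document."""
    # Field names are assumptions chosen for this example.
    return json.dumps({
        "ciphertext": b64(ciphertext),
        "hmac": b64(hmac_tag),
        "iv": b64(iv),
        "salt": b64(salt),
        "iterations": iterations,
    }, indent=2)


def parse_container(text: str) -> dict:
    """Read the JSON document back and decode the binary fields."""
    fields = json.loads(text)
    for name in ("ciphertext", "hmac", "iv", "salt"):
        fields[name] = base64.b64decode(fields[name])
    return fields


if __name__ == "__main__":
    # Placeholder bytes stand in for real ciphertext and HMAC values.
    print(build_container(b"\x00" * 32, b"\x01" * 16,
                          os.urandom(16), os.urandom(16), 10000))
```

None of these values needs to be kept secret. Only the password does.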

PBKDF2

PBKDF2 is a key derivation function. It takes a password and generates a key for use with an encryption function (like AES). It is one of two good ways to securely hash a password, the other one being bcrypt, which I recommend in cases where you need to store and verify a password, like for login information.

When new programmers get security advice, they tend to get advice about how to use a salt correctly, and which hash functions to use. All of this advice is wrong. You should use PBKDF2 or bcrypt, and they will do these things correctly without your input.

I chose PBKDF2 over bcrypt for this program because PBKDF2 generates an arbitrary-length key (in this case, we want a key of the correct length for use with AES). It's possible to do this with bcrypt, but it's not what it's designed for, and the first rule of doing cryptography correctly is to use cryptographic functions in the way they were meant to be used. bcrypt is for storing passwords, PBKDF2 is for generating keys.

PyCrypto uses PBKDF2-SHA1 by default, meaning that internally, it uses SHA-1 as part of the process. I chose to leave this at the default value because it doesn't matter, and I wanted to make sure anyone reading this understood that it doesn't matter. You can set it to SHA-256 if you want, but it won't be any more secure. You can also set it to MD5, and it won't be any less secure. The flaws in SHA-1 and MD5 don't apply to their usage in PBKDF2.
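
Here is a minimal sketch of the key derivation described above. It uses hashlib.pbkdf2_hmac from Python's standard library rather than PyCrypto, so it is not the program's actual code. The function name derive_key, the 16-byte salt, the 10000 iterations, and the 32-byte key length (AES-256) are all assumptions for illustration.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 10000,
               key_length: int = 32) -> bytes:
    """Derive key_length bytes from a password using PBKDF2-HMAC-SHA1."""
    return hashlib.pbkdf2_hmac(
        "sha1",                    # internal hash, matching PBKDF2-SHA1
        password.encode("utf-8"),  # the password as bytes
        salt,                      # random and stored alongside the output
        iterations,                # more iterations make guessing slower
        dklen=key_length,          # PBKDF2 can produce any key length
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # a fresh random salt for every encryption
    key = derive_key("correct horse battery staple", salt)
    print(len(key), "byte key derived")
```

The same password, salt, and iteration count always produce the same key, which is why the salt and iteration count are stored in the output file.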

AES

The Advanced Encryption Standard (AES) is a secure and easy-to-use encryption function. It was standardized in 2001 and has been heavily researched. In cryptography, older is often better, because older functions have had more time to be analyzed and attacked than new ones.

HMAC

Encryption provides confidentiality (someone with access to the encrypted file should not be able to read it), but it doesn't provide integrity (someone could change the data and you wouldn't be able to notice, except that the decrypted data might look like garbage).

A hash-based message authentication code (HMAC) allows you to verify that the encrypted data wasn't changed in transit. HMAC combines a secret key with a hash function to generate an authentication tag. This program includes the HMAC in the JSON output. When it decrypts a file, it generates the HMAC again and verifies that it matches the stored one.

Sometimes, new programmers are taught about authentication, but they're usually taught to do something like sha1(data, password). This is wrong. Use HMAC. This will protect you from things like flaws in your hash function (this program uses HMAC-MD5, and it's perfectly safe, even though MD5 is broken in certain cases).
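
Here is a rough sketch of the generate-then-verify step, using Python's standard hmac module. The function names make_tag and verify_tag are made up for illustration, and I'm assuming the tag is computed over the ciphertext with a key derived from the password. The actual program may do this differently.

```python
import hashlib
import hmac


def make_tag(key: bytes, ciphertext: bytes) -> bytes:
    """Compute an HMAC-MD5 tag over the ciphertext."""
    return hmac.new(key, ciphertext, hashlib.md5).digest()


def verify_tag(key: bytes, ciphertext: bytes, expected_tag: bytes) -> bool:
    """Recompute the tag and compare it to the stored one."""
    # compare_digest runs in constant time, so the comparison doesn't
    # leak how many leading bytes of the tag matched.
    return hmac.compare_digest(make_tag(key, ciphertext), expected_tag)


if __name__ == "__main__":
    key = b"k" * 32                      # placeholder key for the demo
    data = b"pretend this is ciphertext"
    tag = make_tag(key, data)
    print(verify_tag(key, data, tag))          # unmodified data
    print(verify_tag(key, data + b"x", tag))   # tampered data
```

If verification fails, the safest thing to do is refuse to decrypt at all.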

Usage

You can run ./encryption.py -h for standard usage information and help.

Encryption

To encrypt a file, use:

./encryption.py encrypt [-i input_file] [-o output_file] [-p password]

All of the arguments after encrypt are optional. If -i is not provided, input will be read from the standard input (read from the terminal). If -o is not provided, output will be written to the standard output (printed to the screen). If -p is not provided, you will be prompted for the password. In general, providing -p is insecure, because the password can be saved to your shell's history and, on most systems, is visible in the process list while the program runs.

Errors and help messages will be printed to standard error, so you can safely pipe the output (if you don't use -o):

./encryption.py encrypt -i input_file.txt > encrypted.json

Decryption

To decrypt a file, use:

./encryption.py decrypt [-i input_file] [-o output_file] [-p password]

All of the arguments after decrypt are optional. If -i is not provided, input will be read from the standard input (read from the terminal). If -o is not provided, output will be written to the standard output (printed to the screen). If -p is not provided, you will be prompted for the password. In general, providing -p is insecure, because the password can be saved to your shell's history and, on most systems, is visible in the process list while the program runs.
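
For anyone curious how a command line like this can be handled, here is a sketch using argparse and getpass from the standard library. It is an illustration under my own assumptions, not the program's actual argument handling. The names build_parser, get_password, input_file, output_file, and password are hypothetical.

```python
import argparse
import getpass


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with encrypt and decrypt subcommands."""
    parser = argparse.ArgumentParser(prog="encryption.py")
    commands = parser.add_subparsers(dest="command", required=True)
    for name in ("encrypt", "decrypt"):
        sub = commands.add_parser(name)
        # None means "fall back to stdin, stdout, or a password prompt".
        sub.add_argument("-i", dest="input_file", default=None)
        sub.add_argument("-o", dest="output_file", default=None)
        sub.add_argument("-p", dest="password", default=None)
    return parser


def get_password(args: argparse.Namespace) -> str:
    """Use -p if it was given, otherwise prompt without echoing."""
    if args.password is not None:
        return args.password
    return getpass.getpass("Password: ")


# Example: parse an explicit argument list instead of sys.argv.
example = build_parser().parse_args(["encrypt", "-i", "input_file.txt"])
print(example.command, example.input_file, example.output_file)
```

Prompting with getpass keeps the password out of both the shell history and the argument list.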

Advanced Usage

This program accepts input on stdin and writes to stdout to allow you to use UNIX pipelines. If you add a | between two commands in a UNIX terminal, the output from the first command will be "piped" to the input of the second command.

Compression

With pipes, we can do things like compress a file before encrypting it. This example uses xz, which uses LZMA2 compression. You could do similar things with gzip or bzip2.

To compress with xz and encrypt:

# Prompt for the password and store it as $pass
# This prevents it from being saved in the shell's history
read -rsp 'Password: ' pass
# Quote "$pass" so passwords containing spaces are passed as one argument
cat example.txt | xz | ./encryption.py encrypt -p "$pass" -o secret.json
# Clear $pass so it doesn't linger in the shell session
unset pass

To decrypt and decompress:

read -rsp 'Password: ' pass
cat secret.json | ./encryption.py decrypt -p "$pass" | xz -d > output.txt
unset pass

Encrypting a Hard Drive Backup

We can also use dd to make a backup of a hard drive, compress it, and encrypt it:

read -rp 'Drive to back up (ex: /dev/sda1): ' drive
read -rsp 'Password: ' pass
sudo dd if="$drive" bs=4M | xz | ./encryption.py encrypt -p "$pass" -o backup.json
unset pass

To restore the backup (this is only an example, and I don't recommend running it on anything you care about):

read -rp 'Drive to restore (ex: /dev/sda1): ' drive
read -rsp 'Password: ' pass
./encryption.py decrypt -p "$pass" -i backup.json | xz -d | sudo dd of="$drive" bs=4M
unset pass

Note that the output of this program isn't particularly space efficient: because it uses JSON, the encrypted output has to be base64 encoded, which makes it about a third larger. In practice, you would probably want to use a more efficient program for hard drive backups.
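
To put a number on the base64 overhead, here is a small sketch. The helper base64_length is a name I made up for this example. It computes the size of padded base64 output, which uses 4 output characters for every 3 input bytes, rounded up.

```python
import base64
import os


def base64_length(raw_length: int) -> int:
    """Length of padded base64 output for raw_length input bytes."""
    return 4 * ((raw_length + 2) // 3)


if __name__ == "__main__":
    raw = os.urandom(1_000_000)      # stand-in for 1 MB of ciphertext
    encoded = base64.b64encode(raw)
    # The predicted and actual sizes should agree.
    print(base64_length(len(raw)), len(encoded))
    # Ratio of encoded size to raw size, roughly 4/3.
    print(len(encoded) / len(raw))
```

For a multi-gigabyte drive image, that extra third adds up quickly.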