Ämne:
[dcdev] Encrypting ADC - a second approach
Från:
Gustaf Räntilä <[email protected]>
Datum:
2005-03-10 7:15
Till:
Direct Connect developers

Hi all,

The topic has been active previously, but afaik we haven't united upon a
standard for encryption in ADC yet. I'm raising this issue again, with
my ideas on it, and with contribution to benchmarks. Refer to those
early ones cologic attached, they indicate that encryption is feasible.
I'll propose a scheme for an efficient encryption implementation on hubs
and clients, for hub-client and client-client communication.

INTRODUCTION
First of all, encryption on todays computers are often efficient, as
cologic showed. This is an argument for applying encryption to the ADC
protocol as a compulsory element rather than an optional one,
particularly since encryption is more or less useless if not every user
in the hub encrypts the communication. However, such a requirement must
follow some more in-depth analysis of its effects.
The user count limitation for hubs has been bandwidth throughput rather
than CPU usage, and it has been shown that an increase in user count
does not affect bandwidth linearly. In some cases CPU issues have been
taken care of by smart socket buffer pooling to respect the processor
cache. This indicates the importance of doing real-situation
simulations. I have simulated encrypted messages passing through a hub,
being decrypted and then encrypted by a few thousand different keys (to
be sent to the individual clients). I request a more accurate simulation
basis with a quite large recorded session from a hub with lots of users.
User passwords can be removed, and usernames can be changed to fakes if
necessary.
The technique with encryption has negative effects on data re-use in the
processor cache relative the non-encryption case, but I'll show that the
bottleneck for hubs will continue to be throughput limitations on
broadband connections, and not CPU throughput.

IMPLEMENTATION
The algorithms I suggest are RSA and AES(Rijndael i.e.). AES as the
symmetric cipher for encrypting/decrypting all data (between hub and
client, as well as between clients), and RSA being the asymmetric
algorithm for exchanging keys. Rijndael has some quite positive
features, for example it not only follows the AES requirements of
supporting 128, 192 and 256-bit key-sizes, it actually supports and
key-size with bit count being a multiple of 8, hence full bytes. This
can be useful for other implementations apart from the protocol. RSA is
a widely accepted algorithm, often used for symmetric key exchanges.
Both of these algorithms are part of the libssl package. This means that
implementations doesn't require large new pieces of (buggy) code, but
rather that it requires just another library. Clients today are familiar
with both zlib and bzip2. They require some sort of xml library and a
socket library unless this is written by hand. I've heard a few
frightening comments on how protocol complexity might injure hub
development, and I hope such ancient ideas are left out by now, so that
the usefulness and features one could expect from a protocol in the 21st
century aren't underestimated. However, by exporting the work to a
library for this, one can certainly not complain about complexity
issues. In Unix variants and Unix-like systems such as Linux, this is
the easiest of tasks. In debian, apt-get for libssl0.9.7 and libssl-dev.
In windows, it requires you to compile the libssl library correctly.
I've done this for those of you who want to start right ahead. The
library as well as the headers can be downloaded from here:
http://gempond.com/openssl-win32.zip
It's for vs2003, without debug-code i.e. 'release', and statically since
that will probably become most useful.
You probably know how to apply this to visual studio. I haven't actually
tested if the win32-library works, but it I'm sure it does...

SCHEME
My scheme for implementing encryption is as follows:
All parts in an ADC network (hubs and clients i.e.) must initially
generate a unique RSA key-pair (the library does this for you), just as
they also need a unique ClientID.
ADC commands, sequences and structures are untouched as they are defined
in the draft. The encryption occurs rather at socket-level than at
message level. Only the login sequence is slightly changed - key
exchange is injected before any other commands, and server has already
sent CID before it gets HSUP from client, so it shouldn't send it again.

1. Upon connection, the server (in client-hub this means the hub, in
client-client this means the callee not the caller) sends its public RSA
key Tiger-hashed (base32 encoded just as any other tiger hash unless we
can agree upon a better suggestion) followed by a newline (\n) followed
by a key-size (an integer, just as share-sizes etc are encoded) followed
by another newline (\n), followed by its CID and finally terminated with
a final newline (\n). I.e.:
TTH(RSA Public key)
Requested AES key size (the server fully decides this), e.g. 128
CID
2. The client receives this information, looks in a 'database' for the
CID to check if it has a stored copy of this particular server's (note:
hub or client) public RSA key. If it doesn't (or if it does, but the
hashed keys differ, and after the user might've been notified on this)
it sends a newline (\n). This will cause the hub to send its public RSA
key (base32 encoded also?).
Now the client stores the public RSA key bound to this CID. It generates
a random Rijndael key of the key-size sent by the server (the library
can generate this). The key-size should be (even though not required
pure algorithmly as mentioned) 128, 192 or 256 bits (openssl for
instance, requires this). This key (being unique for this connection) is
what will be used for ciphering all data sent between these parts.
3. The client encrypts the AES key by the server's public RSA-key, and
sends it to the server. The server decrypts it (and stores it as two
binary AES-objects, if implemented using openssl as I've done) in its
internal user object structure.
Now both parts have a copy of a unique 128-256 bit AES(Rijndael) key.
No one else have had the chance of stealing this on its way (I happily
ignore the man-in-the-middle attack possibility, or will someone look
into running its own DC CA, and why would we trust him/her?). Nothing is
now encrypted using RSA (which is slow), instead _everything_ is
encrypted by the AES key.
Step 4 to infinity (i.e. all future communication in this connection)
will be just as before, HSUP/CSUP and everything, but AES encrypted.

FEATURES
Further on, one could ask for and require the possibility of private
message to really be private, hence encrypted. This should be an
optional state (to not collide with "chat channels"/"group chats"). In
this case the encryption requester (the one who wants a PM to be
encrypted) requests the other part's (by nick? I'd prefer CID for key
storage as suggested above, but looking at pm in MSG in the draft, this
seems not to be possible without another message type, I hope I'm
mistaken here) public RSA key. This could be skipped if we know the
user's CID and already have his/hers public key. We must be certain
anyway, and check if the key's TTH is the same. Exactly how this could
be done is better being thought through by someone more familiar to ADC.

In the request for the other part's public RSA key, the requester can
send its own public RSA (if it assumes by looking in a
'reverse-database' that the user doesn't already know this). I'm not too
sure how this is supposed to be implemented, but it shouldn't be too
difficult. The point is to RSA-encrypt the private messages when such a
request has been made. This will mean, RSA-encrypting the text,
base32:ing it into a "text-like" string, packaging it into ADC MSG and
AES-encrypting it before sending to the hub. Double encryption i.e. But
I'm confident that this will not stress any part at all, and
RSA-encrypting private messages will never end up consuming "too much"
CPU for anybody. Well, maybe if you can type _very_ fast...

EXTENSIONS
ADC supports zlib as an extension. I see no problem with compressing
before encrypting (the plain text message is most definitely more easily
compressed, and Better compressed than an encrypted message). As a
general rule, encryption is the outer-most part of sending/receiving a
message (except for in the previous case of encrypting private
messages), and please don't forget to pad the messages to whole 16-byte
blocks, it won't entirely be encrypted otherwise!

Lastly, client-client communication is encrypted with the scheme stated
above with no exception.

VIABILITY
Now, is this viable or even feasible? It surely sounds good, but
defining it as a compulsory part of ADC (with all positive effects that
means) requires it to be fast (enough). The fact that protocol omplexity
increases drastically if encryption is optional due to agreeing on
whether to use it etc. is an argument by itself. Anyway, one basic
approach to optionality is the levels [require-not, prefer-not, prefer,
require] where level 1-4 means No communication (!), [1-2]-[1-3] means
unencrypted communication and [2-4]-4 and 3-3 means encrypted
communication. However, connecting to a hub that by itself is claiming
level 1-3, can that be considered safe? What if a few users communicate
unencrypted with the hub. My searches as well as main chats can't be
considered safe. Anyone interested in defining these things more
strictly as well as suggesting a neat implementation (especially
client-wise) on how to inform the user whether communication is safe or
not, is welcome to do so. Add a new description-tag (INF-part i.e.)?
Unless the implementation is intelligent enough to appear very clear to
even not so technically blessed users, I suggest we skip encryption
entirely, and to me that would mean a great protocol loss.

BENCHMARK
My benchmark is written in C++ (and attached to this mail), and is
compiled on Linux and linked with openssl 0.9.7 (when linking with
gcc, don't forget -lcrypto, that's the library name containing all
necessary functions - libcrypto i.e. not libssl, I use this line:
gcc -lcrypto -lstdc++ -o cbench crypto_benchmark.cpp)
If you run the benchmark on your own, change the KEY_SIZE macro in the
top of the source code for different key sizes.
The benchmark can probably be run in win32 by fairly small
justifications in the code, if any. First of all, I simulate messages
passing through a hub. They are received encrypted (with different
encryption with respect to the user who sent the message). Every message
is decrypted, and then individually encrypted and sent back to each
user. (Yes, I currently only simulate broadcast messages, they stress
ciphering the most I suppose, and I never 'send'.)
This is for stressing encryption against a large amount of keys, and
suitably corresponds to a hubs daily job. Many short strings, many
different keys, many users. In the client-client case below, I focus
on quite the opposite - Few but very large (relatively) strings and few
keys.
I once again request; A recorded hub session containing (except password
info) all messages passed through the hub for a substantial number of
seconds, and converted from DC to ADC. I'm optimistic that one of us
writes a py-script in 2 minutes for the conversion. I know of one who
has proved good py skills more than once...
The importance of using session recordings (logs) is to simulate things
like active/passive searches, user list propagation, hub-restart
situation etc...

Today client-client transfers is more or less; read from file - send -
receive - write to file. Clients should try to copy the data as little
as possible. In C/C++ cases this is fairly simple and comes naturally,
by allocating once, using pointers and sending the data from that same
address. In the case of encrypting, this becomes an even more important
issue. The openssl implementation of the symmetric Rijndael cipher can
use the same data as input and output. Therefore the same procedure can
be taken but with adding an encryption step for each block before
sending the data. Remember, strings Must be sized as multiples of 16
bytes. Pad with nulls to accomplish this at the 'end' of a message/file.
I'm unsure client-client needs more 'realistic' benchmarks than the ones
cologic gave, which are openssl's speed test.

RESULTS
I simulate a hub with 2500 users. I generate 10000 message with random
readable ASCII text at random length (between 16 and about 200
characters). A message here is a Text, its Length and a Pointer to a
user object. A user object has some Dummy information and a Pointer to
an AES key (an openssl-internal structure, actually two, one for
encryption and one for decryption).
 From the messages I randomly look at one, decrypt it, and then go
through the list of users and encrypt it against each users unique key.
I loop this behavior for 30 seconds and show how many bytes were
totally encrypted/decrypted, as well as how many messages passed through
the hub (One message equals one Incoming message sent to everybody).

Benchmark run on my server, which is a PII (Deschutes) @ 450 MHz.
Throughput:
Key-size 128 bits: 9.2 MiB/s, 29 messages/s.
Key-size 192 bits: 7.9 MiB/s, 25 messages/s.
Key-size 256 bits: 7.0 MiB/s, 23 messages/s.

Benchmark run on a server at the university, which is a P4 @ 1.7 GHz.
Throughput:
Key-size 128 bits: 29.1 MiB/s, 97 messages/s.
Key-size 192 bits: 25.6 MiB/s, 84 messages/s.
Key-size 256 bits: 23.1 MiB/s, 77 messages/s.

CONCLUSION
As far as I'm aware, most hubs today are being limited to 10 Mbit/s,
and that's the most critical bottleneck. I have shown that a simple (but
more thorough than an openssl speed test) quantitative simulation of a
hub even at 256 bits AES encryption handles more than 50 Mbit/s (7 MiB/s
x 8 bits) on an ancient machine, and on a today more regular computer
the same bit size shows well above 200Mbit/s.
I seriously don't believe one is supposed to run a messaging server for
thousands of users with fully encrypted traffic on a computer like mine,
and on the more modern ones, we see an outstanding performance well
above what's necessary on even massively loaded hubs.

For p2p transfers we have: the faster connection (broadband), the more
pressure on the computer. You need fast hardware to fully use a 100Mbit
connection, and if you use a fast computer (for example something like a
P4 1.7 GHz), then not even at the greatest key size will affect the
speed of the transfers.

In this conclusion, I discard any further slow-down due to file-handling
in clients, etc. Instead I focus on the raw ciphering speeds. I can
hardly imagine an encryption as this one to give such a negative effect
on transfer speeds and/or CPU utilization as to draw conclusions that
it's not viable, and by that I mean compulsory. In other upcoming
bandwidth-heavy and even worse - CPU-heavy - applications such as Skype
do encryption occur, and not being optional.
But if it for some reason needs to be optional, then someone should not
be able to reject secure transfers unless they also claim it in the INF,
and again, it needs its own implementation, and I'm personally not
willing of figuring that one out.

I also have ignored key-propagation for searches in the implementation
scheme, but this should not be a burden to satisfy, if we can just unite
on the very basics of encryption in the first place.

However the bandwidth results are more than convincing, the message
count could be more of an issue, therefor I request that anyone with
experience in this particular entity gives comments on this, as well as
anyone willing of sharing thoughts on this topic in general.

Opera




crypto_benchmark.cpp

/* 
 * Copyright (C) 2005 Gustaf Räntilä, gustaf ta gempond tod com
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 */

// Standard c-library stuff
#include <stdlib.h>
#include <time.h>

// Crypto
#include <openssl/aes.h>
#include <openssl/rand.h>

// STL
#include <string>

// Changable values
/* Key sizes are in bytes, so 128, 192 and 256 bits refer to 16, 24 and 32 bytes. */
#define KEY_SIZE 16
/* User count */
#define MAX_USERS 2500
/* Message pool */
#define MAX_MESSAGES 10000
/* Time of simulation */
#define SECONDS 30

#define MIN(x, y) (((x) < (y)) ? (x) : (y))

using namespace std;

typedef unsigned char u8;
typedef const u8 cu8;

struct AESKey {
	AES_KEY enc;
	AES_KEY dec;
};

class AES {
public:
	static AESKey* makeKey(const char* str, size_t len) {
		AESKey* aes = new AESKey;
		cu8* key = (cu8*)str;
		size_t keysize = len;
		u8 tmpkey[32];
		if (len != 16 && len != 24 && len != 32) {
			// Algorithmly not necessary, but for libcrypto it is
			// Note; key is now NOT of size <len> as requested!
			// Try to stick to using AES specified key-sizes.
			// Pad "too small" keys with blankspace, and truncate too large ones
			len = MIN(len, 32);
			memcpy(&tmpkey[0], key, len);
			if (len < 16)
				keysize = 16;
			else if (len < 24)
				keysize = 24;
			else
				keysize = 32;
			for (size_t i = len; i < keysize; ++i)
				tmpkey[i] = ' ';
			key = &tmpkey[0];
		}
		if (0 != AES_set_encrypt_key(key, keysize * 8, &aes->enc)) {
			delete aes;
			return NULL;
		}
		if (0 != AES_set_decrypt_key(key, keysize * 8, &aes->dec)) {
			delete aes;
			return NULL;
		}
		return aes;
	}
	static void encrypt(const char* in, char* out, size_t len, AESKey* key) {
		// if (len % 16) exit(1); // not block size multiple
		for (size_t i = 0; i < len; i += 16)
			AES_encrypt((cu8*)in + i, (u8*)out + i, &key->enc);
	}
	static void decrypt(const char* in, char* out, size_t len, AESKey* key) {
		// if (len % 16) exit(1); // not block size multiple
		for (size_t i = 0; i < len; i += 16)
			AES_decrypt((cu8*)in + i, (u8*)out + i, &key->dec);
	}
};

struct UserStruct {
	char pad[100]; // In hubsofts, user structures contains a lot of stuff ignored here
	AESKey* aeskey;
};

struct Message {
	UserStruct* user;
	char* message;
	int message_len;
};

static inline int simpleRand(int max) {
	return rand() % max;
}

int main(int argc, void** argv) {
	srand(time(0));

	UserStruct* users = new UserStruct[MAX_USERS];
	Message* messages = new Message[MAX_MESSAGES];

	// Create user objects
	printf("Generating %i users (keys, of size %i bits)...\n", MAX_USERS, KEY_SIZE*8);
	fflush(stdout);
	char randkey[KEY_SIZE];
	for (int i = 0; i < MAX_USERS; ++i) {
		RAND_bytes((u8*)&randkey[0], KEY_SIZE);
		users[i].aeskey = AES::makeKey(&randkey[0], KEY_SIZE);
	}

	// Create messages
	printf("Generating %i messages, and encrypt them (to simulate incoming encrypted message)...\n", MAX_MESSAGES);
	fflush(stdout);
	for (int i = 0; i < MAX_MESSAGES; ++i) {
		messages[i].user = &users[i % MAX_USERS];
		messages[i].message_len = 16;
		messages[i].message_len += 	simpleRand(200); // Message sizes are between 16 and 216 bytes...
		if (messages[i].message_len % 16)
			// ...make message multiple of 16 bytes, AES is 16-size block cipher
			messages[i].message_len += 16 - (messages[i].message_len % 16);
		messages[i].message = new char[messages[i].message_len];
		RAND_bytes((u8*)messages[i].message, messages[i].message_len - 1);
		// Ensure readable ASCII characters... No reason for this really.
		for (string::size_type j = 0; j < messages[i].message_len - 1; ++j)
			if ((u8)messages[i].message[j] < 32 || (u8)messages[i].message[j] > 192)
				static_cast<u8>(messages[i].message[j]) += 128;
		messages[i].message[messages[i].message_len - 1] = 0;
		// Encrypt this message
		AES::encrypt(messages[i].message, messages[i].message, messages[i].message_len, messages[i].user->aeskey);
	}

	// Benchmarking message encryption/decryption
	printf("Decrypting random message, encrypting to all %i users for %i seconds...\n", MAX_USERS, SECONDS);
	fflush(stdout);

	clock_t c_start, c_end;
	c_start = clock();
	c_end = c_start + CLOCKS_PER_SEC * SECONDS;
	size_t bytes = 0, num_messages = 0;
	do {
		Message* msg = &messages[simpleRand(MAX_MESSAGES)];
		// if (msg->message_len % 16) exit(1);
		// Decrypt message
		char* dec_msg = new char[msg->message_len];
		AES::decrypt(msg->message, dec_msg, msg->message_len, msg->user->aeskey);
		bytes += msg->message_len;
		num_messages += 1;
		// Encrypt to each and every user (including the sender, whatever... doesn't really affect ordo.)
		for (int i = 0; i < MAX_USERS; ++i) {
			char* sendbuf = new char[msg->message_len];
			AES::encrypt(dec_msg, sendbuf, msg->message_len, users[i].aeskey);
			delete[] sendbuf;
			bytes += msg->message_len;
		}
		delete[] dec_msg;
	} while (clock() < c_end);
	printf("Done. Throughput: %i kbytes/s, %i and messages passed hub per second.\n", (bytes/SECONDS) >> 10, num_messages / SECONDS);
	fflush(stdout);

}



Bilagor:
crypto_benchmark.cpp5.7 kB