
A Messaging System

Requirements

In no particular order:

  • No central servers.
  • Should work for very short (IM-like) messages as well as for long messages (including attached documents, video/audio streams, etc.)
  • Verifiable identity of the sender
  • Make spamming hard
  • Make eavesdropping hard
  • Control over delivery status.
  • Roaming users
  • Should work through firewalls, NAT, etc.
  • Must allow one-to-many communications, equivalent to mailing-lists or chat-rooms.

Ideas for implementation

Each entity (=mailstore, =user in the normal case) has a public key.

Each entity is represented by an agent. The agent listens on a specific IP address and port. It has access to the entity's private key. (Is this a good idea? It is probably better to give the agent a key of its own and sign that with the key of the entity.)

Each agent accepts only messages from a known set of other agents.

An agent should forward messages received from trusted agents if it doesn't represent the receiving entity.

This builds a "web of trust". Every member of the strong set can send messages to every other member of the strong set.

A newly created entity initially cannot send any messages. It has to be "invited" or whitelisted by at least one entity already in the set.

An entity can be whitelisted explicitly (e.g., after verifying the fingerprint by email, phone, etc.) or implicitly, by sending it a message.

If a trusted agent misbehaves, trust can be revoked.
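The trust rules so far can be sketched in a few lines of Python. This is only an illustration under assumptions: the class `Agent`, its method names and the dictionary message format are all made up here, and keys and signatures are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent that only tracks whom it trusts (no keys, no network)."""
    entity: str                                  # the entity this agent represents
    trusted: set = field(default_factory=set)    # entities we accept messages from
    inbox: list = field(default_factory=list)    # messages addressed to our entity

    def whitelist(self, other):
        """Explicit whitelisting, e.g. after checking a fingerprint out of band."""
        self.trusted.add(other)

    def revoke(self, other):
        """Withdraw trust from a misbehaving neighbour."""
        self.trusted.discard(other)

    def send(self, recipient, body):
        """Sending a message to an entity whitelists it implicitly."""
        self.trusted.add(recipient)
        return {"from": self.entity, "to": recipient, "body": body}

    def receive(self, neighbour, message):
        """Decide what to do with a message handed over by `neighbour`."""
        if neighbour not in self.trusted:
            return "rejected"            # unknown agents are not accepted
        if message["to"] == self.entity:
            self.inbox.append(message)
            return "delivered"
        return "forward"                 # not ours: relay towards the recipient
```

A newly created `Agent("bob")` rejects everything until it whitelists someone, and it rejects a neighbour again as soon as `revoke` has been called for it.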

If an agent wants to send a message to an entity with which it doesn't have an existing trust relationship, it sends a message "who will relay a message to X?" to all its neighbours. If they don't have a trust relationship with X either, they will forward the question to their neighbours. Eventually, a reply containing a path to X will be returned. (Agents should cache the replies and implement some kind of rate limiting. In any case, the number of messages to previously unknown entities is expected to be rather low, so the flooding should be tolerable.)
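The "who will relay a message to X?" flood behaves like a breadth-first search over the trust graph. The sketch below simulates it centrally, which a real deployment could not do, since each agent only knows its own neighbours. The function name `find_path`, the `relays_to` mapping and the `max_hops` limit are assumptions made for the illustration.

```python
from collections import deque


def find_path(relays_to, source, target, max_hops=6):
    """Simulate the relay query as a breadth-first search.

    relays_to maps each entity to the set of neighbours that accept
    messages from it. Returns the list of entities from source to
    target, or None if no path with at most max_hops hops exists.
    """
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}                       # each agent answers a query only once
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:     # hop limit keeps the flood bounded
            continue
        for neighbour in sorted(relays_to.get(path[-1], ())):
            if neighbour in seen:
                continue
            if neighbour == target:
                return path + [neighbour]
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None
```

Caching of replies and rate limiting are left out here. In a real agent the `seen` set would correspond to remembering recently answered queries.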

Each agent signs any forwarded message with its own key. This allows the recipient to verify the path. It also allows complaints to be sent back along the same path. An agent which receives a lot of complaints about one of its trusted neighbours may revoke its trust. This should make it possible to quench spam sources quickly. While a spammer can create new entities at will, he can only whitelist them on his existing agents, which have already been blocked or will be blocked soon, since all spam by the new entities must be relayed through them.
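How complaints travelling back along the path could lead to revocation is sketched below. Signatures are not modelled; the path is simply assumed to be verified already. The names `ComplaintTracker` and `route_complaint` and the fixed complaint threshold are assumptions, since the notes above only say that an agent "may" revoke trust after a lot of complaints.

```python
from collections import Counter


class ComplaintTracker:
    """Per-agent complaint counter with a hypothetical fixed threshold."""

    def __init__(self, trusted, threshold=3):
        self.trusted = set(trusted)
        self.threshold = threshold
        self.complaints = Counter()

    def complain(self, neighbour):
        """Record one complaint; return True if this revoked the neighbour."""
        if neighbour not in self.trusted:
            return False
        self.complaints[neighbour] += 1
        if self.complaints[neighbour] >= self.threshold:
            self.trusted.discard(neighbour)
            return True
        return False


def route_complaint(path, trackers):
    """Send a complaint back along `path` (sender first, recipient last).

    Every agent on the path blames the hop it received the message from.
    Returns the (agent, revoked_neighbour) pairs this complaint caused.
    """
    revoked = []
    for i in range(len(path) - 1, 0, -1):
        agent, upstream = path[i], path[i - 1]
        tracker = trackers.get(agent)
        if tracker is not None and tracker.complain(upstream):
            revoked.append((agent, upstream))
    return revoked
```

With a low threshold on the relay next to the spammer, that relay cuts the spammer off after a few complaints, while agents further down the path keep trusting the relay.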

Each agent also sends a signed progress report back to the sender (through the reverse path). So the sender knows at any time where the message is. Each of the agents which have the message in transit can also try an alternate path if the message appears to be stalled.

Each agent announces its reachability parameters (IP address, port) to its neighbours, whenever they change, and also periodically. An agent caches these parameters for all its neighbours. It should remove them (or at least mark them as invalid) if they prove to be wrong (e.g., no agent is reachable, or the agent doesn't have the correct key) and may remove them after some time of inactivity.
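A cache for these reachability parameters might look roughly like this. The class name `ReachabilityCache`, the one-week default for `max_idle` and the explicit `now` argument (used to make the example testable) are assumptions, not part of the design above.

```python
import time


class ReachabilityCache:
    """Hypothetical cache of (host, port) per neighbour, with idle expiry."""

    def __init__(self, max_idle=7 * 24 * 3600):
        self.max_idle = max_idle      # seconds of inactivity before an entry is dropped
        self.entries = {}             # entity -> (host, port, last_seen)

    def announce(self, entity, host, port, now=None):
        """Store or refresh the parameters a neighbour announced."""
        stamp = time.time() if now is None else now
        self.entries[entity] = (host, port, stamp)

    def mark_invalid(self, entity):
        """Drop an entry that proved wrong (unreachable, or wrong key)."""
        self.entries.pop(entity, None)

    def lookup(self, entity, now=None):
        """Return (host, port), or None if unknown or idle for too long."""
        stamp = time.time() if now is None else now
        entry = self.entries.get(entity)
        if entry is None:
            return None
        host, port, last_seen = entry
        if stamp - last_seen > self.max_idle:
            del self.entries[entity]
            return None
        return host, port
```

For example, after `cache.announce("bob", "192.0.2.7", 4000)` a later `cache.lookup("bob")` returns the address until the entry is marked invalid or has been idle for longer than `max_idle`.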

This allows communities to survive in environments with dynamic IP addresses, as long as each agent is offline only for short timespans, after which at least one of its neighbours still has the same IP address. If one of the neighbours can still be reached, it can be asked for the other IP addresses.

For some communities this will not be sufficient, though. Consider a local ISP which changes the IP addresses of its customers every 8 hours. A community of friends using the same ISP who switch their PC on only in the evening will lose contact in the 18+ hours during which none of them is online. Such small, isolated communities will be common until the system is widely deployed.

Thus there needs to be a public directory service. One possibility is to use domain names instead of IP addresses and use the existing DynDNS providers to map static domain names to changing IP addresses. Another possibility is to build service-specific directory servers. They would answer "where is X?" messages but would not relay. The advantage of the former is that DynDNS already exists. The advantage of the latter is that these servers contain the information specific to the protocol and can use protocol-specific authentication (e.g., each agent has to sign its announcements with its private key).

There doesn't seem to be a good way to be reachable in the presence of firewalls and/or NAT and in the absence of public servers. I will have to look at P2P protocols to see how they handle this.

Currently the best I can think of is to connect to a neighbour with a public IP address and keep the connection open. The neighbour can then announce that it knows how to relay to the shielded agent.

Keeping connections permanently open is expensive in terms of file descriptors. Thus one would normally keep a connection open only for a single message and close it again afterwards. OTOH, if each agent announces only connections that are actually open (instead of connections it thinks it could open), the reachability information changes almost in real time. It also limits the number of neighbours any agent can have, which may be seen as a feature.

Messages are encrypted with the recipient's public key.

It is possible to use "onion routing" by nesting the encryption. A sender S could encrypt a message for A, encrypt that for B and that for C and send it to C. C will decrypt the message, notice that it is really for B and pass the message to B. B will decrypt the message, notice that it is intended for A and pass it to A. A will decrypt the message, and know that it is the final recipient because it can read the message.

In this scenario, C has no way of knowing that B is not the final recipient and B has no way of knowing that A is the final recipient. An attacker compromising B or C cannot know the content of the message, cannot know that the recipient is A, and cannot even be sure that the sender was S (it can be sure that the message passed through S, if the signature of S is intact).
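The nesting can be illustrated with a toy example. Note the assumptions: `toy_seal` is NOT real encryption. It is an insecure XOR keystream with one shared secret per hop, used only as a stand-in for encrypting with the hop's public key, because the Python standard library has no public-key encryption. The explicit `"next": None` marker for the innermost layer and the JSON layer format are simplifications made for this sketch as well.

```python
import hashlib
import json


def _keystream(key, length):
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_seal(key, data):
    """Insecure stand-in for public-key encryption: XOR with a keystream."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


toy_open = toy_seal  # XOR is its own inverse


def wrap(message, route, keys):
    """Nest one layer per hop; `route` runs from the final recipient
    outwards, e.g. ["A", "B", "C"] for a message handed to C first."""
    inner = json.dumps({"next": None, "data": message}).encode()
    blob = toy_seal(keys[route[0]], inner)
    for recipient, relay in zip(route, route[1:]):
        layer = json.dumps({"next": recipient, "data": blob.hex()}).encode()
        blob = toy_seal(keys[relay], layer)
    return blob


def peel(blob, key):
    """Remove one layer. Returns (next_hop, inner_blob) for a relay,
    or (None, message) for the final recipient."""
    layer = json.loads(toy_open(key, blob).decode())
    if layer["next"] is None:
        return None, layer["data"]
    return layer["next"], bytes.fromhex(layer["data"])
```

Each relay learns only the next hop from its own layer; the inner blob stays opaque to it. Signatures, which the scenario above also relies on, are not part of this sketch.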

One-to-many communication can be achieved in the same way as with mailing lists today: a message is addressed to an entity which then forwards it to all subscribers.
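As a minimal sketch of such a list entity (the class name `MailingList`, the dictionary message format and the added `via` field are assumptions for illustration):

```python
class MailingList:
    """Hypothetical list entity that fans a message out to its subscribers."""

    def __init__(self, address):
        self.address = address
        self.subscribers = set()

    def subscribe(self, entity):
        self.subscribers.add(entity)

    def unsubscribe(self, entity):
        self.subscribers.discard(entity)

    def fan_out(self, message):
        """Return one copy of `message` per subscriber, re-addressed to it."""
        if message["to"] != self.address:
            raise ValueError("message is not addressed to this list")
        return [
            dict(message, to=subscriber, via=self.address)
            for subscriber in sorted(self.subscribers)
        ]
```

Each copy would then be sent like any other message, so the list entity needs a trust relationship with (or a path to) every subscriber.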
