This page is also reachable within the Tor network: http://vb75uj2ap3hyyava.onion/research/torchat_protocol/
DISCLAIMER: This is NOT a specification but an external analysis of the TorChat protocol, based on studying its Java and Python source code implementations, which can be found here and here. I am not affiliated with the Tor or TorChat projects. I try to cover the protocol as accurately as possible, but don't be surprised if it contains errors or is incomplete. I am using this as the basis for a C#-based implementation for use in the Smuxi Messenger.
(Assumed) Design Goals
- p2p (decentral)
- encrypted
- anonymous (to everyone except the peers themselves)
- hides who communicates with whom
- hides physical location
- registration free (auto id generation)
Transport
Uses TCP sockets to hidden services running on port 11009.
Peers send and receive messages on that TCP socket.
Connections
Hidden services behave like regular server sockets, except that the server has no idea who the client is (in the sense of its source IP address) because it is a Tor client. As TorChat is p2p, it needs to make out-bound connections to send messages and accept in-bound connections to receive messages from other peers. Both in-bound and out-bound connections always use port 11009.
Out-Bound Connections
Connections to TorChat peers (the hidden service on port 11009) are out-bound and are authenticated by definition as only the owner of the hidden service key is able to respond to the connection attempt.
In-Bound Connections
Connections from other TorChat peers are always unauthenticated unless the peer can prove in some way that it is who it claims to be. TorChat uses a session token for each peer to authenticate its connection; only then can we trust the claimed origin of the messages received on that in-bound connection. For details on how this authentication procedure works, refer to the Authentication section below.
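The token check described above can be sketched roughly as follows. This is only an illustration and not TorChat's actual code: the class PendingAuth and its method names are hypothetical, and how exactly the peer returns the token is not documented on this page.

```python
import hmac


class PendingAuth:
    """Remembers the token sent to each peer and checks what comes back.

    Hypothetical helper for illustration; not taken from the TorChat sources.
    """

    def __init__(self):
        # onion id -> token we sent to that hidden service
        self._sent = {}

    def remember(self, onion_id: str, token: str) -> None:
        self._sent[onion_id] = token

    def verify(self, claimed_onion_id: str, returned_token: str) -> bool:
        """Return True only if the in-bound peer returned the exact token
        that was sent to the hidden service it claims to be."""
        expected = self._sent.get(claimed_onion_id)
        if expected is None:
            return False
        # compare_digest avoids leaking the mismatch position through timing
        return hmac.compare_digest(
            expected.encode("utf-8"), returned_token.encode("utf-8")
        )
```

Until verify() succeeds for an in-bound connection, anything received on it should be treated as coming from an unknown sender.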
Message Format
- type: byte array
- message separator: 0x0a (LF)
- decode as string:
- replace '\r\n' with '\n', then the literal two-character sequence '\n' (backslash followed by 'n') with an actual LF (0x0a)
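The framing part of this format can be sketched as below. It only covers splitting the byte stream on the LF separator; the string decoding step is left out. The function name split_messages is made up for this example.

```python
def split_messages(buffer: bytes):
    """Split received bytes into complete messages and a leftover tail.

    Messages are separated by LF (0x0a). Anything after the last LF is an
    incomplete message and is returned so the caller can prepend it to the
    next chunk read from the socket.
    """
    *complete, rest = buffer.split(b"\n")
    return complete, rest
```

For example, split_messages(b"a\nb") returns the single complete message b"a" and keeps b"b" as the unfinished tail.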
Command Format
- command: a-z or _
- separator: 0x20 (SP)
- payload: byte array
Example:
ping <payload>
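A minimal parser for this command format could look like the following. It assumes the message is split at the first space only, so that the payload may itself contain spaces (as the ping payload does); parse_command is a hypothetical name.

```python
import re

# Command names consist of a-z and underscore only.
_COMMAND_RE = re.compile(rb"[a-z_]+")


def parse_command(message: bytes):
    """Split one message into (command, payload) at the first space (0x20)."""
    command, _, payload = message.partition(b" ")
    if not _COMMAND_RE.fullmatch(command):
        raise ValueError("invalid command name: %r" % command)
    # The payload stays a byte array; its meaning depends on the command.
    return command.decode("ascii"), payload
```

A command without payload, such as a bare "filedata_ok", yields an empty payload.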
Message Commands
ping
- command: ping
- separator: 0x20 (SP)
- payload: <origin_hidden_service_id><separator><authentication_token>
<origin_hidden_service_id> is the hash of the public key used in the onion network (also known as the onion address). This is the address the peer needs to contact to return the authentication_token. This way the origin knows which in-bound connection belongs to the peer, as the authentication_token was only sent to a single hidden service.
<authentication_token> is a string of no specific length, but should be 7-bit-only to avoid charset conversion issues.
WARNING: this authentication token has to be unique, cryptographically random and kept secret! If this token leaks, anyone can impersonate the identity of that TorChat peer for as long as the TorChat application that generated the token keeps running.
Example:
ping mb4bc4jk4cj2fky4 31754944747097474078662100165902771331350515775810664422385852963171834014133
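Building and parsing a ping could be sketched like this. The helper names are hypothetical. Rendering 256 random bits as decimal digits is an assumption: it is consistent with the length of the example token above, but this page does not confirm how the real clients generate it.

```python
import secrets


def new_auth_token() -> str:
    # Assumption: 256 bits from the OS CSPRNG, rendered as decimal digits
    # (7-bit-only, so no charset conversion issues).
    return str(secrets.randbits(256))


def build_ping(own_onion_id: str, token: str) -> bytes:
    """Return the complete ping message including the LF separator."""
    return ("ping %s %s" % (own_onion_id, token)).encode("ascii") + b"\n"


def parse_ping_payload(payload: bytes):
    """Split a ping payload into (origin_hidden_service_id, token)."""
    onion_id, sep, token = payload.partition(b" ")
    if not sep or not onion_id or not token:
        raise ValueError("malformed ping payload")
    return onion_id.decode("ascii"), token.decode("ascii")
```

The secrets module is used instead of random because the token must not be predictable.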
pong
client
version
add_me
message
status
Format:
status <status>
<status> can be one of:
- away
- available
- xa
Example:
status available
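As a small sketch, a status message could be built and validated like this. VALID_STATUSES and build_status are hypothetical names; the three values are the ones listed above.

```python
VALID_STATUSES = ("available", "away", "xa")


def build_status(status: str) -> bytes:
    """Return the complete status message including the LF separator."""
    if status not in VALID_STATUSES:
        raise ValueError("unknown status: %r" % status)
    return b"status " + status.encode("ascii") + b"\n"
```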
filename
filedata
filedata_ok
Authentication / Handshake
TBD
Potential Security Issues / Weaknesses
Hidden Services Keys
Tor's hidden services fully rely on 1024 bit RSA keys. I do not yet know enough about how these keys are used to conclude whether this is a real weakness or not.
Hidden Services Guessing
Hidden services are published in a DHT, which can be iterated to find existing hidden services. Because TorChat uses a static port (11009), this can be used to find TorChat users.
Authentication Tokens
The TorChat client that wants to authenticate a hidden service hashkey (that is, a chat buddy / peer in TorChat) has to generate an authentication token that the peer needs to return. If the client selects a weak token, for example a very short one or one without real randomness, the token could be guessed and an attacker could send spoofed TorChat messages that appear to come from the authenticated peer.
Message Size
The protocol doesn't seem to specify or enforce a maximum message size. A peer could abuse this by sending a very long message without a separator, consuming all memory of the client.
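One possible mitigation is to cap how much data is buffered for a single message. The sketch below is an assumption about how an implementation could protect itself, not something the analysed clients are known to do; the 64 KiB limit and all names are arbitrary.

```python
MAX_MESSAGE_BYTES = 64 * 1024  # arbitrary illustrative limit, not from TorChat


class MessageTooLong(Exception):
    pass


class BoundedLineBuffer:
    """Accumulates socket data and returns LF-terminated messages,
    refusing to buffer more than max_bytes for a single message."""

    def __init__(self, max_bytes: int = MAX_MESSAGE_BYTES):
        self._max = max_bytes
        self._pending = b""

    def feed(self, chunk: bytes):
        """Add newly received bytes and return the complete messages."""
        *complete, rest = (self._pending + chunk).split(b"\n")
        if len(rest) > self._max or any(len(m) > self._max for m in complete):
            # The caller should close the connection; drop the buffered data.
            self._pending = b""
            raise MessageTooLong("message exceeds %d bytes" % self._max)
        self._pending = rest
        return complete
```

Since each chunk read from the socket has a bounded size, the memory held per connection stays at roughly max_bytes plus one chunk.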