Observations from a Trillian play-date
Rasmus Dahlberg, 2020-11-23.
Have you ever heard about Trillian in the context of transparency logging? Perhaps you view it as an integral part of Certificate Transparency, a solution for arbitrary transparency applications, or both. Even if you know Certificate Transparency quite well the Trillian details might be a bit blurry until you sit down and get some hands-on experience: at least that was the case for me. Therefore, Trillian and I had a little play-date. I thought I would share a few observations that in hindsight are obvious but at the same time helpful.
Problem statement and overview
I agree with Daz Wilkin that it is somewhat daunting to get started with Trillian. Putting it all together involves many different components and configurations, especially if you need the high reliability and scale that Trillian supports. It does not have to be that complicated though. Trillian is pretty much a database that includes an append-only Merkle tree, operated through two components:
- Trillian log server: exposes a gRPC API that is used by an application-dependent front-end or so-called Trillian personality. Requests and responses trigger operations on the underlying database, such as queuing new data requests and assembling cryptographic Merkle tree proofs.
- Trillian log signer: checks the database periodically and sequences it into a Merkle tree. The term log signer was confusing for me initially because it is usually the front-end personality that adds externally visible signatures. Therefore, I found it helpful to think of this component as a log sequencer.
I will not talk much about the front-end personality. It is the part of Trillian that you or your ecosystem has to implement. It will include definitions of public endpoints, the data to be logged, who is allowed to log it, etc.
Trillian as a database abstraction
The simplest description of Trillian is probably as a regular database. You can insert any item of your choice after serializing it as zeroes and ones, and come back later on and retrieve it. In reality it is more accurate to say that Trillian is hooked up to a database, such as MariaDB using the schema over here. This means that before getting started, a database must be configured so that there is a record in the Trees table that identifies a particular Trillian instance.
CREATE TABLE IF NOT EXISTS Trees(
  TreeId             BIGINT NOT NULL,
  HashAlgorithm      ENUM('SHA256') NOT NULL,
  SignatureAlgorithm ENUM('ECDSA', 'RSA', 'ED25519') NOT NULL,
  PrivateKey         MEDIUMBLOB NOT NULL,
  PublicKey          MEDIUMBLOB NOT NULL,
  ...
);
Initially I was confused by the public-key cryptography that is part of the database schema: is it not the front-end personality that attaches signatures, for, say, Signed Certificate Timestamps (SCTs) in Certificate Transparency? Well, yes. But the scenario in mind here is that there might be a front-end personality that runs in a different trust domain, such that the Trillian back-end needs to sign some data to prove its origin. The front-end determines what becomes externally visible regardless of whether these signatures are used.
New add-data requests are queued by the Trillian log server in an unordered table of pending leaves. Each such leaf also has an optional appendix, which allows extra data to be stored without merging it into the Merkle tree. For example, it might be reasonable to hold on to an associated signature if the front-end personality requires that the data is signed as an admission criterion.
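To make the split between leaf data and appendix concrete, here is a minimal sketch of my own and not Trillian's storage code. The names PendingLeaf and merkle_leaf_hash are hypothetical, and I assume RFC 6962-style leaf hashing. The point it illustrates is that the extra data travels with the leaf but never influences the hash that ends up in the tree.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingLeaf:
    leaf_value: bytes        # serialized item; this is what gets merged into the tree
    extra_data: bytes = b""  # the appendix: stored next to the leaf, never hashed


def merkle_leaf_hash(leaf: PendingLeaf) -> bytes:
    # Assumed RFC 6962 leaf hash: SHA-256 over a 0x00 prefix and the leaf value only.
    return hashlib.sha256(b"\x00" + leaf.leaf_value).digest()


# Unordered pool of queued leaves; no leaf has an index at this point.
pending = []
pending.append(PendingLeaf(b"certificate bytes", extra_data=b"submitter signature"))
pending.append(PendingLeaf(b"another item"))
```

Two leaves with the same leaf_value but different extra_data hash to the same value here, which is exactly why the appendix is a convenient place for admission-related material such as a submitter's signature.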
Trillian as a Merkle tree abstraction
The log signer sequences the pending leaf data periodically. By sequencing I mean taking the unordered leaves that one or more log servers queued, and then appending them to the current Merkle tree at specific indices. In other words, not even the log servers know the index of an added leaf until it is merged. This is important to keep in mind because several proposals in the past assumed that Trillian logs are timestamp ordered, but strictly speaking there is no such guarantee unless the front-end takes responsibility for sequencing (in which case there is a specific pre-ordered Trillian API that can be used).
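The sequencing step can be sketched as follows. This is a toy model under my own assumptions, not the log signer's actual code: ToyLog is a hypothetical name, the tree hash follows RFC 6962, and the batch order (sorted bytes) is chosen only to make the example deterministic. It says nothing about the order Trillian itself would pick.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves):
    """Merkle tree hash over a list of leaf values, as defined in RFC 6962."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    k = 1
    while k * 2 < n:
        k *= 2  # ends as the largest power of two strictly smaller than n
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


class ToyLog:
    def __init__(self):
        self.pending = set()  # unordered: queued, but without an index
        self.sequenced = []   # position in this list is the leaf index

    def queue(self, data: bytes):
        self.pending.add(data)

    def sequence(self):
        """One sequencing run: assign indices to pending leaves, return the new root."""
        batch = sorted(self.pending)  # arbitrary but deterministic order (assumption)
        self.pending.clear()
        first_index = len(self.sequenced)
        self.sequenced.extend(batch)
        assigned = {data: first_index + i for i, data in enumerate(batch)}
        return assigned, merkle_root(self.sequenced)


log = ToyLog()
log.queue(b"second submission")
log.queue(b"first submission")
assigned, root = log.sequence()  # indices only exist after this call
```

Note how queue returns nothing index-related: the caller learns where a leaf ended up only after a sequencing run, which mirrors the observation that not even the log servers know the index in advance.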
The Merkle tree itself is viewed as many smaller sub-trees in the database, where only the bottom layer of each sub-tree is stored physically. Any interior node can be computed on the fly, which apparently saves up to 50% of space. The log server accesses the database to interact with the sequenced Merkle tree, e.g., to retrieve tree heads and build audit paths (hashes in the tree). As such, there is no explicit communication between the log server and signer.
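To illustrate the idea of storing only a bottom layer and deriving everything else, here is a small sketch that keeps nothing but leaf hashes and recomputes interior nodes on demand while building an audit path. It is my own simplification: it uses a single flat list rather than Trillian's sub-tree layout, it assumes RFC 6962 hashing, and all function names are hypothetical.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def subtree_root(hashes):
    """Recompute an interior node on the fly from a non-empty list of leaf hashes."""
    if len(hashes) == 1:
        return hashes[0]
    k = split_point(len(hashes))
    return node_hash(subtree_root(hashes[:k]), subtree_root(hashes[k:]))


def audit_path(index, hashes):
    """Sibling hashes from leaf `index` up to the root, lowest level first."""
    if len(hashes) == 1:
        return []
    k = split_point(len(hashes))
    if index < k:
        return audit_path(index, hashes[:k]) + [subtree_root(hashes[k:])]
    return audit_path(index - k, hashes[k:]) + [subtree_root(hashes[:k])]


def root_from_path(index, tree_size, leaf, path):
    """Rebuild the root from one leaf hash and its audit path."""
    if tree_size == 1:
        if path:
            raise ValueError("audit path is too long")
        return leaf
    if not path:
        raise ValueError("audit path is too short")
    k = split_point(tree_size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, tree_size - k, leaf, path[:-1]))


stored = [leaf_hash(bytes([i])) for i in range(7)]  # the only persisted layer
root = subtree_root(stored)
proof = audit_path(5, stored)
assert root_from_path(5, len(stored), stored[5], proof) == root
```

A verifier only needs the leaf hash, its index, the tree size, and the path to arrive at the same root, so the interior nodes never have to leave the database, or even exist in it.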
Trillian as an API
The final part of the puzzle is the interface that the front-end personality can use while talking to Trillian. Fortunately, it is relatively straightforward. You will only send requests to and receive responses from the log server that exposes a gRPC API. Possible requests and responses are documented here. This is really the place to look if you want to know what will “just work”.
For example, you will notice that there is a QueueLeafRequest that takes as input some data that goes into the Merkle tree and the leaf’s Appendix, as well as an identity hash that tells Trillian what should be counted as a duplicate. You may also take advantage of the built-in Trillian rate limiting by specifying a charge_to string. You can think of this as saying “Dear Trillian, this IP address requested to add a leaf and it is signed using a certificate that ends in the following trust anchor”. In response, a resource exhaustion error might be returned if too many requests were observed for a given quota.
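The interplay between the identity hash and charge_to can be mimicked with a few lines of plain Python. To be clear, this does not use the Trillian gRPC client, and ToyLogServer, ResourceExhausted, and the fixed per-user counter are all hypothetical stand-ins of mine. Real quotas are configured separately and are more elaborate than a simple counter.

```python
import hashlib
from collections import Counter


class ResourceExhausted(Exception):
    """Stand-in for the resource exhaustion error mentioned above."""


class ToyLogServer:
    def __init__(self, max_per_user: int):
        self.max_per_user = max_per_user
        self.charged = Counter()  # requests counted per charged user
        self.seen = set()         # identity hashes of leaves queued so far

    def queue_leaf(self, leaf_value, extra_data=b"", identity_hash=None, charge_to=()):
        # Reject the whole request if any charged user is already over quota.
        for user in charge_to:
            if self.charged[user] >= self.max_per_user:
                raise ResourceExhausted(user)
        for user in charge_to:
            self.charged[user] += 1
        # Assumption: default the identity hash to SHA-256 of the leaf value.
        identity = identity_hash or hashlib.sha256(leaf_value).digest()
        if identity in self.seen:
            return "duplicate"
        self.seen.add(identity)
        return "queued"


server = ToyLogServer(max_per_user=2)
users = ["ip:192.0.2.1", "anchor:example-root"]  # made-up quota labels
first = server.queue_leaf(b"leaf", b"signature", charge_to=users)
second = server.queue_leaf(b"leaf", b"signature", charge_to=users)
```

In this sketch first is "queued" and second is "duplicate", and a third request charged to the same labels would raise ResourceExhausted. The front-end decides which labels to charge, for example one per client IP address and one per trust anchor.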
Other requests I would suggest you look into include retrieving a leaf, a signed tree head, an inclusion proof, and a consistency proof. Studying these goes a pretty long way if you want to understand which details are (and are not) handled by the front-end personality.
The view that Trillian is a database with an append-only Merkle tree is by no means wrong, but it is also not a complete description. For example, there is also a map mode that associates keys with values without being append-only. If you look further into Trillian you will also realize that there are many details that matter for deployment but not so much if we just want to get the hang of things. For example, there is built-in functionality for running several log server and log signer instances, coordinating them, exporting health metrics, choosing database back-ends, configuring rate limiting, and more. If that sounds interesting you can get an enhanced intuition by reading the manual deployment scenario documentation for Certificate Transparency.
Fredrik Strömberg provided valuable feedback on this story, which is sponsored by my System Transparency employment at Mullvad VPN.