Analysis of Emmy<sup>+</sup>

04/6/2020 update: After a discussion with Michael Neuder (University of Colorado Boulder), who noticed a discrepancy between his analysis of Emmy+ and ours, we identified a bug in the script used to generate the data for the plot in the “Forks starting now” section. We have therefore updated the plot. The number of confirmations has changed from 6, 12, 44 to 7, 16, 67 for attacker stake fractions of 0.2, 0.3, 0.4, respectively. The discrepancy was due to taking into account the rate of the shortest adversarial forks that are faster than the honest chain; instead, we should have taken into account the rate of any fork faster than the honest chain. In addition, we have provided a more precise explanation of the last plot from the “Forks started in the past” section.

19/11/2019 update: Following feedback from Daniel Moroz and Michael Neuder, we edited the blog post in order to nuance our previous statement that “selfish baking is not profitable”. More precisely, selfish baking only gives tiny profits, for instance 0.0016% gains with respect to the honest rewards, for a dishonest baker with 30% stake fraction. A new table gives the details.

Note: This analysis was done with the help of Arthur Breitman and Bruno Blanchet (Inria). The code used for the analysis can be found at this url.

We have recently announced Emmy⁺ an improvement of Emmy, the current consensus algorithm used by Tezos. In this blog post, we perform a brief analysis of Emmy⁺.

Disclaimer: We emphasize that this is not a complete analysis; in particular, we do not present any security proofs. Also, the results of the analysis should not be extrapolated, they only hold inside the model we’re working with. Namely, we assume here that the network is synchronous and therefore all blocks and operations (including endorsements) are effectively broadcast to all participants within a known time bound, namely 40 seconds. Furthermore, we only consider a rather limited attacker, who needs not follow the protocol, but who is otherwise not capable of disturbing the network (by blocking or delaying messages between honest participants).

Let’s first recall the main ideas behind Emmy⁺: the number of endorsements included in a block no longer influences the fitness, which only increases by 1 with each level. Instead, the number of endorsements determines the minimal time at which a block is valid: the fewer endorsements a block carries the longer it takes before it can be considered valid. Furthermore, the reward policy is adjusted so it discourages selfish baking. As we will see, these changes increase the security of Tezos’ consensus layer.

We summarize next the main findings, which are obtained for the following security criterion: the rate at which a malicious fork can occur should be lower than 10^-8, that is, roughly once every two centuries.

Assuming an attacker that has at most 33% of the total stake, in our model we obtain that an operation included 6 blocks ago will only end up on the attacker’s chain (which might exclude that operation) once every two centuries, given that the chain is perfectly healthy during these last 6 levels (that is, no skipped priorities, no missed endorsements). Moreover, our analysis suggests that, for a freshly injected operation, and making no assumption on the health of the chain, 6, 12, and 44 block confirmations are recommended, assuming an attacker with a stake of 20%, 30%, and respectively 40% of the total stake. Here again, the statement is not absolute, but instead relative to our security criterion.

Our analysis also shows that, in our model of Emmy⁺, selfish baking results in insignificant profits, even when the baker attempting it has a very large portion of the stake.

This blog post is structured in four parts:

a presentation of the changes introduced by Emmy⁺
an analysis of malicious forks
an analysis of selfish baking
a brief presentation of the rationale for choosing the values of Emmy⁺-specific constants

The changes introduced in Emmy⁺

The improvements brought by Emmy⁺ are achieved by means of a new “minimum block delay” function and adjustments to the reward policy.

The new delay function is defined as follows:

emmy_plus_delay(p, e) =
  time_between_blocks[0] + p * time_between_blocks[1] +
  delay_per_missing_endorsement * max(0, (initial_endorsers - e))

In what follows we use the following abbreviations:

dp := time_between_blocks[1]
de := delay_per_missing_endorsement
ie := initial_endorsers

where dp stands for “delay per priority” (that is, per missed baking slot) and de for “delay per (missed) endorsement (slot)”. The chosen values are dp = 40, de = 8, and ie = 24.

In Emmy, the baking reward was set to 16 tez and the endorsing reward to 2/(p+1) tez (per endorsement slot), where p is the priority of the endorsed block.

In Emmy⁺, the baking reward is 16/(p+1) * (0.8 + 0.2 * e/32) and the endorsing reward is 2/(p+1), where p is the baked block’s priority or the priority of the block which contains the endorsement, respectively, and e is the number of included endorsements. These changes counter selfish baking, as well as incentivize bakers to include all available endorsements.

Indeed, smaller rewards for smaller priorities make selfish baking attempts much less profitable: a baker with a slot at priority 1 who might attempt to steal the block at priority 0 by withholding his endorsements would be better off having his endorsements included in the block of priority 0. Note that the reward for his baked block does not have much weight, it is worth less than 4 endorsements on the block at priority 0.

Also, the factor e/32 in the baking reward function is used to incentivize bakers to include all available endorsements. On the other hand, this factor only counts for 20% of the main factor of the baking reward (namely 16/(p+1)) so that the risk for losing the baking slot outweighs the benefit of waiting for more endorsements.

Preliminaries

To get an intuition of the event rates that we see in this post, we recall the following rough correspondences.

Rate (in blocks^-1)	Frequency
10^-3	10³ minutes = 16.6 hours
10^-4	6.9 days
10^-5	2.2 months
10^-6	1.9 years
10^-7	1.9 decades
10^-8	1.9 centuries
10^-x	1.9 * 10^(x-8) centuries

Here we assumed that blocks are produced every 60 seconds. Consequently, given a rate of 10^-x in blocks^-1 of an event e, e happens every 10^x blocks, and therefore every 10^x minutes.

Note that we talk about the rate at which an event happens rather than the probability of its occurrence, because, in contrast to a proof-of-work system like Bitcoin, if an attacker continuously attempts to make the event happen, she will for sure succeed (at a certain rate).

We also recall that 1mu = 10^-6, 1n = 10^-9, 1p = 10^-12, and 1f = 10^-15.

Malicious forks

A malicious fork is one where an attacker is building an alternative branch in parallel with the main branch. The attacker can then make her branch become the main branch, if her branch is faster. Here the attacker’s goal is to break consensus. This is in contrast to the selfish baking strategies, where the goal of a dishonest baker is to obtain financial gain through the reward system. Therefore we assume that the attacker is not refraining from double baking and double endorsing.

Forks started in the past

We are mainly interested in the following question: what is the rate at which an operation included in a block n levels ago will end up on an alternative branch? Here we assume that the attacker’s goal is to build an alternative branch in which the operation is not included. We therefore assume that the attacker starts building her branch at the same level as the block which contains the operation.

To answer this question, we consider the delay accumulated by the current chain, relative to the “ideal chain”, where all blocks are produced at priority 0 and have 32 endorsements. Note that the accumulated delay (called below “observed delay”) is the time difference between the timestamp of the current block and the timestamp of the block containing the operation, minus n minutes (corresponding to n observed blocks). We will see that this delay measures the “health” of the chain: the smaller the delay, the healthier the chain is. The operation ends up on the attacker’s branch if this branch becomes at some point in the future faster than the branch including the current block.

We plot the rates of such forks when the attacker’s stake fraction varies and for a few values of n and of the observed delay. Concretely, the different curves are for the cases when the number n of observed blocks is either 3 or 6 and when the observed delay is 0, 128, or 256 seconds. The observed delay can be expressed as sp * dp + me * de, where sp is the number of skipped priorities and me is the number of missed endorsements. Therefore an observed delay value of 128 may represent more than one possibility, like 0 skipped priorities and 16 missed endorsements, or 1 skipped priority and 11 missed endorsements, and so on (assuming dp=40 and de=8).

These rates can be interpreted as follows. In our model, one can consider an operation included n blocks ago as final as long as:

n=6 and there were no skipped priorities nor missing endorsements during the last 5 levels (that is, delay=0), assuming an attacker with at most 33% of the stake;
n=6 and, for instance, there are no skipped priorities, but around 2-3 missing endorsements per level, assuming an attacker with at most 31% of the stake;
n=6 and, for instance, 2 skipped priorities, but around 5 missing endorsements per level, assuming an attacker with at most 28% of the stake;
n=3 and again a perfectly healthy chain, assuming an attacker with at most 22% of the stake.

To get a better understanding of the case when the chain is not very healthy, we also plot the rate of forks as a function of n for 3 different stake fractions (0.2, 0.3, 0.4) and a delay proportional to the number n of observed blocks, namely 8*n*de, representing that one quarter of endorsements are being missed per level (and no skipped priorities).

The plot shows that, in our model:

for an attacker stake fraction of 0.2 (0.3, 0.4),
for a number n of observed blocks at least 5 (9, 25),
given that the observed delay is 8*n*de seconds,

the rate of having a fork of any length starting n blocks in the past is smaller than the one set by the security criterion.

Forks starting now

Next, we look into the case where the malicious fork starts at the current block. Similarly to the previous case, we plot the rate of forks as a function of the fork length. This analysis suggests how many block confirmations are necessary, under our assumptions, no matter how unhealthy the chain will be.

The plot shows that, in our model, for our security criterion to be met, one needs to wait 7, 16, and 67 blocks, when assuming an attacker with a stake fraction of 0.2, 0.3, and respectively 0.4.

Selfish baking

With selfish baking, a dishonest baker tries to “steal” baking slots by building a faster alternative branch in order to gain more rewards. To achieve this, the baker’s only strategy, as far as we see, is to withhold his endorsements in order to slow down the honest branch.

We proceed as follows. We first present three selfish baking scenarios, which are among the most relevant ones. Next, we compute the rates at which these scenarios occur, and also the rate at which selfish baking in general is successful, in relation with the fraction of the stake of the dishonest baker. Finally, we compare the baker’s expected rewards on the honest branch with those on the dishonest branch, thus determining whether selfish baking is profitable.

The scenarios

For the ease of reference, we denote the dishonest baker by D. To illustrate the corresponding scenarios, consider the following tree with root block B₀. The branch B₁, B₂, B₃ represents the chain being constructed by an honest baker, while B’₁, B’₂, B’₃ the one D tries to construct. Let t_i (resp. t’_i) be the time of block B_i (resp. B’_i). The label on top of a box represents the priority at which the block is created.

The three scenarios are as follows:

D bakes his block B’₁ before B₁, that is, t’₁ ≤ t₁.
D bakes his block B’₂ before B₂, that is, t’₂ ≤ t₂ (but t’₁ > t₁).
D bakes his block B’₃ before B₃, that is, t’₃ ≤ t₃ (but t’₁ > t₁ and t’₂ > t₂).

In scenarios (2) and (3) we assume that D also has the slots at priorities 1 to p₂ for the level corresponding to B₂, and similarly, it also has the priorities 1 to p₃ for the level corresponding to B₃. Therefore the lowest priority at which an honest baker can bake is p₂+1 and respectively p₃+1.

Block stealing rate

We analytically compute the rates at which a baker could steal baking slots by withholding endorsements for the 3 different scenarios. We also compute these rates in the general case, that is, for all possible selfish baking scenarios. More precisely, we consider all branches starting with a block at which the dishonest baker does not have priority 0. (If he has the best priority at some level, it is more profitable to simply bake the block at that level on the honest branch, instead of starting a fork.)

We plot the rate of forks for each of the three scenarios, the sum of these rates, and the rates for the general case, when the stake fraction varies. The graphic suggests that generalizing the scenarios to higher fork lengths (that is, higher than 3) is not worthwhile, because the rates become insignificant. The plots also shows that, at least when D does not have a very high stake, the 3 scenarios are the most relevant.

scenarios

We stress that here we do not take into account whether D would make a profit by selfish baking, but only whether D’s branch would be faster than the honest branch. We will see next that, even when the rate at which D’s branch is faster is quite high, D cannot take advantage of this, as D’s (average) reward on the dishonest branch would not surpass the (average) reward on the honest branch.

Profitability of selfish baking

The new reward policy ensures that selfish baking only gives insignificant profits, at least in our model.

Indeed, consider first scenario (1). Assume that D has e endorsement slots. If D plays correctly, sharing his endorsements, and if these are included in B₁, then his reward is 2 * e. If not, and assuming D is able to steal the baking slot, the reward for D is 16/(1+1)*(0.8+0.2*32/32) = 8 (assuming all 32 endorsements are available to D). Therefore, it is profitable for D to steal the block only when 8 > 2*e, that is, e < 4. However, it is easy to see that, with e < 4, D cannot steal the baking slot: we have t₁ < t’₁.

To further show that the new reward policy in Emmy⁺ discourages selfish baking, we also compute the expected reward on the dishonest branch (under the condition that the dishonest branch is faster and is profitable), under both the Emmy⁺ and the Emmy reward policies, assuming that we are in scenario (2).

compare_expected_rewards

We observe that, for instance, for a stake fraction of 0.2, the expected reward with Emmy⁺ rewards is about 6 times less than with Emmy rewards, and about 40 times less for a stake fraction of 0.4.

Finally, we compute the expected rewards in the general case, that is, for all possible selfish baking scenarios. More precisely, we compute the expected reward that D would obtain on the honest branch, and the expected reward D would obtain on the dishonest branch when it is faster than the honest chain. We plot the percentage of the expected reward on the dishonest branch with respect to the honest branch. To put these values in perspective, we also plot again the rate at which the dishonest chain is faster (meaning that block stealing is possible, though not necessarily profitable).

percentage_rewards_rate

We observe that the expected reward on the dishonest chain is always smaller than the expected reward on the honest chain. We also see that for very small stakes (that is, a stake fraction between 0.01 and 0.05), the expected reward on the dishonest chain is higher than 80% of the expected reward on the dishonest chain. However, in our model, the rate at which such cases occur is less than 10p, that is, less frequent as once per century. Intuitively, there is a peak for small stakes because, when the baker’s stake is small, an additional baked block makes a big difference in terms of rewards. Note also that this analysis assumes that the dishonest baker does not know in advance whether he will make profits, therefore he always performs selfish baking.

Finally, the following table considers the profits the dishonest baker gains with forks of length 2. As discussed above, the baker does not obtain profits with forks of length 1, while forks of length higher than 2 have very low rates of feasibility. The table shows the rate (of success per block), the profits (in tez) per year, and the percentage of the gains with respect to the rewards expected to be obtained by honest baking.

stake ratio	rate	profits per year	percentage gain
0.10	0.000000	0.268869	0.000006 %
0.15	0.000007	6.340287	0.000101 %
0.20	0.000061	43.331616	0.000515 %
0.25	0.000235	129.935813	0.001236 %
0.30	0.000499	207.868605	0.001648 %
0.35	0.000650	196.642652	0.001336 %
0.40	0.000586	115.204479	0.000685 %
0.45	0.000414	42.734604	0.000226 %

Choosing the constants

Overview

Two main goals guided us when choosing the value of the constants: minimizing the profitability of selfish baking and minimizing fork rates (equivalently, maximizing finality). As we will see, these two goals are conflicting for some of the constants.

Also, we note that for the kinds of analysis we have seen, it is not the concrete values of the dp and de constants that matter, but instead their ratio, dp/de. Indeed, for such analyses, we are interested in which branch of two given branches is faster, that is whether or not

$$\sum_{i=1}^{k}delay(p_i,e_i)<\sum_{i=1}^{k'}delay(p_i',e_i')$$

holds, where $k$, $k'$ are the lengths of the two branches, and $p_i, e_i, p_i', e_i'$ are the priorities and the number of endorsements of the blocks of the two branches, respectively. By expanding the delay expression, we obtain the following inequality

$$\frac{dp}{de}<\frac{\big(\sum_{i=1}^{k'}max(0,(ie - e_i'))\big) - \big(\sum_{i=1}^{k}max(0,(ie - e_i))\big)}{\big(\sum_{i=1}^{k}p_i\big) - \big(\sum_{i=1}^{k'}p_i'\big)}$$

This inequality shows that which branch is faster depends on the dp/de ratio rather than the values of dp and de taken independently.

We have taken dp to be 40 as it seems a reasonable value: it reduces the waiting time between blocks (with respect to the value in Emmy, 75), but it is still large enough to allow blocks and operations to be timely delivered.

Choosing `init_endorsements`

Intuitively, a lower ie value is better for preventing selfish baking and a higher value is better for preventing malicious forks.

Indeed, with a low ie a baker that wants to steal baking slots has less of a leverage by withholding endorsements. Say that the dishonest baker D has e' endorsements and say 32-e' endorsements are available to the honest bakers. For D to be able to delay the block honest bakers are baking, we need that ie - (32 - e') > 0, that is ie + e' > 32. So the lower ie is, the higher the number e' of endorsements D needs to have.

Regarding forks, the higher the ie, the more weight we give to the number of endorsements on the chain. This results in fewer chances for an attacker to build a faster alternative chain, as the number of endorsements available to him is independent of ie.

Next, we check the above intuition. We first consider the plot for the rates of of block stealing.

The plot suggests 20 as the value minimizing the rates for each given stake fraction.

We now consider fork rates when the number n of observed blocks is 5 and the observed delay is 128. We note that we obtain similar shapes for different n and delay values.

This confirms that higher values for ie correspond to smaller probabilities of forks.

We choose 24 as a trade-off value between lower values to prevent selfish baking and higher values to reduce malicious forks. As we have seen, the new reward policy in Emmy⁺ reduces even further the profitability of selfish baking.

Choosing `dp/de`

For the chosen value of ie, that is 24, we now vary the dp/de ratio. We obtain the following graph for the rates of block stealing.

Here we see that a ratio of roughly 7 minimizes all cases.

Next, we look at fork rates. As above, the plots are for n = 5 and delay = 128.

Again, we choose dp/de = 5 as a trade-off value that is good enough for both our goals. This leads to the value de = 8.

Analysis of Emmy+

Analysis of Emmy⁺