Self Organisation in Coin Selection

3 July 2018 Edsko de Vries 18 mins read


The term "self organisation" refers to the emergence of complex behaviour (typically in biological systems) from simple rules and random fluctuations. In this blog post we will see how we can take advantage of self organisation to design a simple yet effective coin selection algorithm. Coin selection is the process of selecting unspent outputs in a user's wallet to satisfy a particular payment request (for a recap of UTxO style accounting, see section "Background: UTxO-style Accounting" of my previous blog post). As Jameson Lopp points out in his blog post The Challenges of Optimizing Unspent Output Selection, coin selection is a thorny problem. Moreover, there is a surprising lack of academic publications on the topic; indeed, the only scientific study of coin selection appears to be Mark Erhardt's master's thesis An Evaluation of Coin Selection Strategies.

Note: by a slight abuse of nomenclature, throughout this blog post we will refer to a user's set of unspent outputs as that user's UTxO.

Dust

An obvious strategy that many coin selection algorithms use in some form or other is "try to get as close to the requested value as possible". The problem with such an approach is that it tends to create a lot of dust: small unspent outputs that remain unused in the user's wallet because they're not particularly useful. For example, consider the "largest first" algorithm: a simple algorithm which considers all unspent outputs of the wallet in decreasing order of size, adding them to a running total until it has covered the requested amount. Here's an animation of the effect of this algorithm:

Figure 1. Simulation of largest-first coin selection. Main histogram shows UTxO entries; inset graph shows UTxO balance in blue and UTxO size in red, histogram top-right shows number of inputs per transaction, graph bottom right shows the change:payment ratio (more on that below). Graph at the bottom shows the distribution of deposits (blue, left axis) versus payments (red, right axis). In this case, both are normally distributed with a mean of 1000 and 3000 respectively, and we have a deposit:payment ratio of 3:1; modelling a situation where we have frequent smaller deposits, and less frequent but larger payments (withdrawals). The wallet starts with an initial balance of 1M.

There are various things to see in this animation, but for now we want to focus on the UTxO histogram and its size. Note that as time passes, the size of the UTxO increases and increases, up to about 60k entries after about 1M cycles (with 3 deposits and 1 payment per cycle). A wallet with 60k entries is huge, and looking at the UTxO histogram we can see why this happens: virtually all of these entries are dust. We get more and more small outputs, and those small outputs are getting smaller and smaller.
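To make this concrete, here is a minimal Haskell sketch of largest-first selection. The types and names are illustrative only; the actual wallet works with full transaction outputs rather than bare values.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

type Coin = Integer

-- | Largest-first: consider UTxO entries in decreasing order of value,
-- accumulating them until the requested payment is covered. Returns the
-- selected inputs and the change, or Nothing if the balance is insufficient.
largestFirst :: Coin -> [Coin] -> Maybe ([Coin], Coin)
largestFirst payment utxo = go [] 0 (sortOn Down utxo)
  where
    go selected total _
      | total >= payment = Just (selected, total - payment)
    go _ _ [] = Nothing
    go selected total (o : os) = go (o : selected) (total + o) os
```

Since a small output is never the largest one available, it is essentially never selected and simply accumulates in the wallet; that is the dust build-up we see in Figure 1.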

Cleaning up

Erhardt makes the following very astute observation:

If 90% of the UTxO is dust, then if we pick an unspent output randomly, we have a 90% chance of picking a dust output.

He concludes that this means that a coin selection algorithm that simply picks unspent outputs at random might be pretty effective; in particular, effective at collecting dust. Indeed, it is. Consider the following simulation:

Figure 2. Same distribution and ratio as in Figure 1; we run the largest-first algorithm for 1M cycles, and then random coin selection for another 150k cycles.

Notice quite how rapidly the random coin selection reduces the size of the UTxO once it kicks in. If you watch the inputs-per-transaction histogram, you can see that when the random input selection takes over, it first creates a bunch of transactions with 10 inputs (we limited transaction size to 10 for this simulation), rapidly collecting dust. Once the dust is gone, the number of inputs shrinks to about 3 or 4, which makes perfect sense given the 3:1 ratio of deposits and withdrawals.

We will restate Erhardt's observation as our first self organisation principle:

Self organisation principle 1. Random selection has a high probability of picking dust outputs precisely when there is a lot of dust in the UTxO.

It provides a very promising starting point for an effective coin selection algorithm, but there are some improvements we can make.

Active UTxO management

Consider the following simulation of a pure "select randomly until we reach the target value" coin selection algorithm:

Figure 3. Random-until-value-reached, for a 1:1 ratio of deposits and withdrawals, both drawn from a normal distribution with mean 1000.

The first observation is that this algorithm is doing much better than the largest-first policy in terms of the size of the UTxO, which is about 2 orders of magnitude smaller: a dramatic improvement. However, if we look at the UTxO histogram, we can see that there is room for improvement: although this algorithm is good at collecting dust, it's also still generating quite a bit of dust. The UTxO histogram has two peaks. The first one is approximately normally distributed around 1000, which are the deposits that are being made. The second one is near 0, which are all the dust outputs that are being created.
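For reference, here is a minimal sketch of this "random until value reached" baseline, again with illustrative types and a plain uniform pick from the UTxO list (not the wallet's actual code):

```haskell
import System.Random (randomRIO)

type Coin = Integer

-- | Pick UTxO entries uniformly at random until the payment is covered;
-- whatever we overshoot by becomes the change output.
randomUntilCovered :: Coin -> [Coin] -> IO (Maybe ([Coin], Coin))
randomUntilCovered payment = go [] 0
  where
    go selected total utxo
      | total >= payment = pure (Just (selected, total - payment))
      | null utxo        = pure Nothing
      | otherwise        = do
          i <- randomRIO (0, length utxo - 1)
          let (before, o : after) = splitAt i utxo
          go (o : selected) (total + o) (before ++ after)
```

The change output is simply whatever we overshoot the payment by, and that overshoot is frequently tiny; this is exactly the second peak near 0 in the histogram above.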

This brings us to the topic of active UTxO management. In an ideal case, coin selection algorithms should, over time, create a UTxO that has "useful" outputs; that is, outputs that allow us to process future payments with a minimum number of inputs. We can take advantage of self organisation again:

Self organisation principle 2. If for each payment request for value x we create a change output roughly of the same value x, then we will end up with a lot of change outputs in our UTxO of size x precisely when we have a lot of payment requests of size x.

We will consider some details of how to achieve this in the next section. For now see what the effect of this is on the UTxO:

Figure 4. Same deposits and withdrawals as in Figure 3, but now using the "pick randomly until we have a change output roughly equal to the payment" algorithm.

The graph at the bottom right, which we've ignored so far, records the change:payment ratio. A value near zero means a very small change output (i.e., dust); a very high value would be the result of using a huge UTxO entry for a much smaller payment. A value around 1 is perfect, and means that we are generating change outputs of equal value as the payments.

Note that the UTxO now follows precisely the distribution of payment requests, and we're not generating dust anymore. One advantage of this is that because we have no dust, the average number of inputs per transaction can be lower than in the basic algorithm.

Just to illustrate this again, here is the result of the algorithm but now with a 3:1 ratio of deposits and withdrawals:

Figure 5. Same algorithm as in Figure 4, but now with 3:1 deposits:payments (i.e., many small deposits, fewer but larger payments).

We have two bumps now: one normally distributed around 1000, corresponding to the deposits, and one normally distributed around 3000, corresponding to the payment requests that are being made. As before, the median change:payment ratio is a satisfying round 1.0.

The "Random-Improve" algorithm

We are now ready to present the coin selection algorithm we propose. The basic idea is simple: we will randomly pick UTxO entries until we have reached the required value, and then continue randomly picking UTxO entries to try and reach a total value such that the change value is roughly equal to the payment.

This presents a dilemma though. Suppose we have already covered the minimum value required, and we're trying to improve the change output. We pick an output from the UTxO, and it turns out to be huge. What do we do? One option is to discard it and continue searching, but this would result in coin selection frequently traversing the entire UTxO, resulting in poor performance.

Fortunately, self organisation comes to the rescue again. We can set an upper bound on the size of the change output we still consider acceptable (we will set it to twice the payment value). Then we take advantage of the following property.

Self organisation principle 3. Searching the UTxO for additional entries to improve our change output is only useful if the UTxO contains entries that are sufficiently small. But precisely when the UTxO contains many small entries, it is less likely that a randomly chosen UTxO entry will push the total above the upper bound we set.

In other words, our answer to "what do we do when we happen to pick a huge UTxO entry?" is "we stop trying to improve our selection". We can now describe the algorithm:

  1. Randomly select outputs from the UTxO until the payment value is covered. (In the rare case that this fails because the maximum number of transaction inputs has been exceeded, fall back on the largest-first algorithm for this step.)
  2. Randomly select outputs from the UTxO, considering for each output if that output is an improvement. If it is, add it to the transaction, and keep going. An output is considered an improvement when:
    • (a) It doesn't exceed the specified upper limit.
    • (b) Adding the new output gets us closer to the ideal change value.
    • (c) It doesn't exceed the maximum number of transaction inputs.

Figure 6. The Random-Improve algorithm. Side note for point (2a): we use twice the value of the payment as the upper limit. Side note for point (2b): it might be that without the new output we are slightly below the ideal value, and with the new output we are slightly above; that is fine, as long as the absolute distance decreases.
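The following Haskell sketch makes the two steps explicit. To keep it short it is written as a pure function over a pre-shuffled UTxO (so taking entries from the front corresponds to picking them at random), the largest-first fallback of step 1 is omitted, and the constants and names are illustrative rather than the wallet's actual implementation.

```haskell
type Coin = Integer

-- Illustrative transaction size limit, as in the simulations above.
maxInputs :: Int
maxInputs = 10

-- | Random-Improve over a pre-shuffled UTxO: cover the payment, then keep
-- adding inputs for as long as doing so improves the change output.
randomImprove :: Coin -> [Coin] -> Maybe ([Coin], Coin)
randomImprove payment shuffledUtxo = do
    (selected, rest) <- cover [] 0 shuffledUtxo
    let final = improve selected rest
    pure (final, sum final - payment)
  where
    -- Step 1: select outputs until the payment value is covered.
    -- (The fall-back to largest-first when we exceed maxInputs is omitted.)
    cover selected total utxo
      | total >= payment = Just (selected, utxo)
    cover _ _ [] = Nothing
    cover selected total (o : os)
      | length selected >= maxInputs = Nothing
      | otherwise                    = cover (o : selected) (total + o) os

    -- Step 2: an output is an improvement if the change stays within the
    -- upper limit (2a), moves closer to the ideal change value (2b), and the
    -- transaction size limit is respected (2c). We stop at the first output
    -- that is not an improvement.
    improve selected (o : os)
      | isImprovement o selected = improve (o : selected) os
    improve selected _ = selected

    ideal      = payment        -- ideal change: roughly the payment itself
    upperLimit = 2 * payment    -- (2a): at most twice the payment

    isImprovement o selected =
           change' <= upperLimit
        && abs (change' - ideal) < abs (change - ideal)
        && length selected + 1 <= maxInputs
      where
        change  = sum selected       - payment
        change' = sum (o : selected) - payment
```

Note how principle 3 shows up as the termination condition of step 2: the first randomly picked output that is not an improvement (for example, a huge entry that pushes the change past the upper limit) ends the improvement phase, so the algorithm never has to search the UTxO.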

Evaluation

The algorithm from Figure 6 is deceptively simple. Do the self organisation principles we isolated really mean that order will emerge from chaos? Simulations suggest, yes, it does. We already mentioned how random input selection does a great job at cleaning up dust in Figure 2; what we didn't emphasize in that section is that the algorithm we simulated there is actually our Random-Improve algorithm. Notice how the median change:payment ratio is initially very low (indicative of a coin selection algorithm that is generating a lot of dust outputs), but climbs rapidly back to 1 as soon as Random-Improve kicks in. We already observed that it does indeed do an excellent job at cleaning up the dust, quickly reducing the size of the UTxO. The simulations in Figures 4 and 5 are also the result of the Random-Improve algorithm.

That said, of course the long term effects of a coin selection algorithm can depend strongly on the nature of the distribution of deposits and payments. It is therefore important that we evaluate the algorithm against a number of different distributions.

Normal distribution, 10:1 deposit:payment ratio

We already evaluated "Random-Improve" against normally distributed payments and deposits with a 1:1 ratio and a 3:1 ratio; perhaps more typical for exchange nodes might be even higher ratios. Here is a 10:1 ratio:

Figure 7. Random-Improve, 10:1 deposit:payment ratio, deposits and payments both normally distributed.

We see a very similar picture to the one in Figure 5. Since the deposits and payments are randomly drawn (from normal distributions), the UTxO balance naturally fluctuates up and down. What is satisfying to see however is that the size of the UTxO tracks the balance rather precisely; this is about as good as we can hope for. Notice also that the change:payment ratio is a nice round 1, and the average number of transaction inputs hovers around 10 or 11, which is what we'd expect for a 10:1 ratio of deposits:payments.

Exponential distribution, 1:1 deposit:payment ratio

What if the payments and deposits are not normally distributed? Here is Random-Improve on exponentially distributed inputs:

Figure 8. Random-Improve, 1:1 deposit:payment ratio, deposits and payments both drawn from an exponential distribution with scale 1000.

In an exponential distribution we have a lot of values near 0; for such values it will be hard to achieve a "good" change output, as we are likely to overshoot the range. Partly for this reason the algorithm isn't quite achieving a 1.0 change:payment ratio, but at 1.5 it is still generating useful change outputs. Furthermore, we can see that the size of the UTxO tracks the UTxO balance nicely, and the average number of transaction inputs is low, with roughly 53% of transactions having just one input.

Moreover, when we increase the deposit:payment ratio to 3:1 and then 10:1, the change:payment ratio drops to about 1.1 and then to 1.0 (graphs omitted).

Erlang

The exponential distribution results in many very small deposits and payments. The algorithm does better on slightly more realistic distributions such as the Erlang-k distributions (for k > 1). Here we show the animation for the 3:1 deposit:payment ratio using the Erlang-3 distribution; the results for other ratios (including 1:1) and other values of k (we additionally simulated for k = 2 and k = 10) are similar.

Figure 9. Random-Improve, 3:1 deposit:payment ratio, deposits drawn from an Erlang-3 distribution with scale 1000 and payments drawn from an Erlang-3 distribution with scale 3000.

More payments than deposits

We have been focusing on the case where we have more deposits and fewer (but larger) payments. What happens if the ratio is reversed?

Figure 10. Random-Improve, 1:10 deposit:payment ratio, deposits and payments drawn from a normal distribution with mean 10k and 1k, respectively. 1M cycles.

In this case we are unable to achieve that perfect 1.0 change:payment ratio, but this is expected: when we have large deposits, then we frequently have no choice but to use those, leading to large change outputs. We can see this more clearly when we slow things right down, and remove any source of randomness; here is the same 1:10 ratio again, but now only the first 100 cycles, and all deposits exactly 10k and all payments exactly 1k:

Figure 11. Random-Improve, 1:10 deposit:payment ratio, all deposits exactly 10k, all payments exactly 1k (no randomness). First 100 cycles only.

We can see the large value being deposited, and then shifting to the left in the histogram as it is getting used for payments, each time decreasing that large output by 1k. Indeed, this takes 10 slots on average, which makes sense given that there are ten payments for every deposit; moreover, the average value of the "large output" in such a 10-slot cycle is 5k, explaining why we are getting a 5.0 change:payment ratio.

The algorithm however is not creating dust outputs; the 1k change outputs it is generating are getting used, and the size of the UTxO is perfectly stable. Indeed, back in Figure 10 we can see that the size of the UTxO tracks the balance perfectly; moreover, the vast majority of transactions only use a single input, which is what we'd expect for a 1:10 deposit:payment ratio.

Real data

MoneyPot.com

Ideally, of course, we would run the simulation against real event streams from existing wallets. Unfortunately, such data is hard to come by. Erhardt was able to find one such dataset, provided by MoneyPot.com. When we run our algorithm on this dataset we get:

Figure 12. Random-Improve, using the MoneyPot data set. There is a roughly 2:1 deposit:payment ratio. Values have been scaled. NOTE: log scale on the x-axis.

A few observations are in order here. First, there are quite a few deposits and payments close to 0, just like in an exponential distribution. Moreover, although we have many values close to 0, we also have some huge outliers; the payments are closely clustered together, but there is a factor of 10^9 between the smallest and largest deposit, and moreover a factor of 10^5 between the largest deposit and the largest payment. It is therefore not surprising that we end up with a relatively large change:payment ratio. Nonetheless, the algorithm is behaving well, with the size of the UTxO tracking the balance nicely, with an average UTxO size of 130 entries. The average number of inputs is 3.03, with 50% of transactions using just one input, and 90% using 6 or fewer.

Cardano Exchange

One of the large Cardano exchange nodes has also helped us with some anonymised data (deposits and payments), similar in nature to the MoneyPot dataset albeit significantly larger. Coming from an exchange node, however, this dataset is very much skewed towards deposits, with a deposit:payment ratio of roughly 30:1. Under these circumstances, of course, coin selection alone cannot keep the UTxO size small.

Figure 13. Random-Improve, using the data set from a large Cardano exchange. There is a roughly 30:1 deposit:payment ratio. Values have been scaled. NOTE: log scale on the x-axis.

Conclusions

The choice of coin selection algorithm has far reaching consequences on the long term behaviour of a cryptocurrency wallet. To a large extent the coin selection algorithm determines, over time, the shape of the UTxO. Moreover, the performance of the algorithm can be of crucial importance to high-traffic wallets such as exchange nodes.

In his thesis, Erhardt proposes "Branch and Bound" as his preferred coin selection algorithm. Branch and Bound in essence is a limited backtracking algorithm that tries to find an exact match, so that no change output needs to be generated at all. When the backtracking cannot find an exact match within its bounds, the algorithm then falls back on random selection. It does not, however, attempt our "improvement" step, and instead just attempts to reach a minimum but fixed change size, to avoid generating dust. It is hard to compare the two algorithms directly, but on the MoneyPot dataset at least the results are comparable; Erhardt ends up with a slightly smaller average UTxO (109 versus our 130), and a slightly smaller average number of inputs (2.7 versus our 3.0). In principle we could modify our Random-Improve algorithm to start with bounded backtracking to find an exact match, just like Erhardt does; we have not done this however because it adds complexity to the algorithm and reduces performance. Erhardt reports that his algorithm is able to find exact matches 30% of the time. This is very high, and at least partly explains why his UTxO and average number of change outputs are lower; in the Cardano blockchain, we would not expect exact matches to exist anywhere near that often (never mind finding them).

Our proposed Random-Improve, by contrast, does no search at all, relying purely on self organisation principles; the first of these was stated by Erhardt, and the other two we identified as part of this research. Although in the absence of more real data it is hard to evaluate any coin selection algorithm, we have shown that the algorithm performs well across a large variety of different distributions and deposit:payment ratios. Moreover it is straightforward to implement and has high performance.

One improvement we may wish to consider is that when there are very large deposits, we could occasionally issue a "reorganisation transaction" that splits those large deposits into smaller chunks. This would bring the change:payment ratio down, which would improve the evolution of the UTxO over time and is beneficial also for other, more technical reasons (it reduces the need for what we call "dependent transactions" in the wallet specification). Such reorganisation is largely orthogonal to this algorithm, however, and can be implemented independently.

IOHK make a visit to Google’s London offices

Googlers keen to talk Cardano and the future of cryptocurrency

28 June 2018 Jane Wild 57 mins read


Cryptocurrency is one of the most discussed topics of the moment, and whatever people think about it, they all want to know what’s next for the technology. An audience that was no different was at Google, where Charles Hoskinson was invited to talk about Cardano and the future of cryptocurrencies. At the meeting, held at Google’s London headquarters last month, Googlers around the world dialled in to hear the presentation and put questions to IOHK’s chief executive, and to Duncan Coutts, IOHK Director of Engineering. As you might expect from a company that has laid much of the ground for today’s technological landscape, Googlers asked some of the most incisive and informed questions Charles has had on his speaking tour this year. After a brief introduction from Charles on IOHK and Cardano, the floor was opened to questions. Cardano development raised much interest, and Charles explained how its consensus protocol, Ouroboros, uses staking as a means to encourage people to join and help run the network. Development milestones were in focus too, such as one expected in July, when a test network will be opened to developers who want to play around with smart contracts on the IELE virtual machine. Later this year, full decentralization of the network is expected, as part of the Shelley phase of development, and Charles explained the background to all these topics.

There followed questions about how developers could get involved with Cardano, and about the K framework, which is part of IOHK’s test smart contract capability; how cryptocurrencies will cater for privacy, and of course, about where cryptocurrencies are headed. After the session, Googlers were kind enough to take the IOHK team up in the glass lifts to the top of the building and on to the roof, to enjoy the spectacular view across London.

Read the conversation below.

Q. I have a question about Ouroboros and staking: is the number of tokens on offer sufficient to convince people to join the protocol and help run the network?

Charles: We published a preliminary monetary policy. The ceiling is 45 billion tokens and the current number in circulation is 26 billion so we have a little bit of room to work with inflation there, plus there's transaction fees as well to subsidise transaction validation.

First, Proof of Stake (PoS) is an extremely cheap protocol to run, especially Ouroboros, if you compare it to mining. The odds are that the operational costs will be so much lower that you really don't need to pay as much. But it's a broader and more abstract question: how do you handle fees and incentives and stake pools and delegation and then get all those incentives engineered in a way that you have reasonable game theoretic reasons to believe that the system is going to behave as intended? That's a really hard question. So we have two individual work streams. One is led by Elias Koutsoupias, an algorithmic game theorist at Oxford and a Gödel prize winner. He's working on the incentives question, trying to create models and understand inaudible first example – if you want to delegate and run a collection of stake pools, how many ought there to be and what are the economics of that?

Outside Google HQ, London

Then, the other side is, if I'm going to try to convince people to delegate, they ought to get a reward, so how much should that be? And then you have to do some empirical calculations – what is the operational cost of the network? You don't want to pay too much more inaudible but you also want to pay enough to incentivise people to run 24/7 nodes to maintain the system. It's an interesting question, but with the inflation that we've proposed we have more than enough wiggle room to work with. Not only will people participate, they'll probably make windfall profits relative to operational costs, given the way these markets work.

We opened up registration for stake pools last month and were looking for 100 applicants for a beta test, but got 1,500 applications – 15 times more people expressed interest than we expected.

As with all monetary parameters in a beta system, these things can be adjusted depending on facts and circumstances, but the reality is that the driver here is the price of the underlying asset – the token – and markets tend to converge on that. The short answer is, it's probably going to work out; the long answer is that we're probably not going to have the right model to begin with. We're either going to underpay or overpay and it's qualitatively going to be pretty obvious, based on participation of the network. The odds are that we’re probably going to overpay in terms of rewards.

Q. On all the projects that you are driving, are there specific milestones that will for sure be completed this year?

Look at cardanoroadmap.com for Cardano-specific projects. Month by month, it gives an update on where we're at. We also do weekly reports and we try to be as transparent as possible about where we're at. Our goal is to release the next major version of Cardano some time this year, called Shelley. We are working really hard towards that. It might slip, but the odds are that it won't. It's a difficult project. Shelley is true decentralization of the network. At the moment we're running our proof of stake protocol in a forced delegation model. So all the PoS mechanics are there and the stake rights have been delegated to nodes that IOHK and two other entities control, so it's a federated network. We did this because we're not insane. You don't go and invent a protocol in the academy, turn it on and say 'Good luck everybody'. Instead, you have training wheels on your bicycle. You say, 'Let's launch this system in a federated format and gradually decentralise the system once we have momentum and assurance that what we've done is correct'. And also when we've trained up hundreds of people to run stake pools and participate in the staking process so there's a bit of redundancy and a much more natural unlocking of the system. So, over six to nine months, that process will continue and hopefully all the Shelley mechanics will roll out.

In parallel, we are releasing testnets for smart contracts. The first one will be released at the end of the month, and this is done with something called the KEVM. We worked with Runtime Verification at the University of Illinois Urbana-Champaign, and they took the operational semantics of the Ethereum Virtual Machine and wrote them in a meta language called K. What the K framework allows you to do is implement a correct-by-construction version, just from the semantics of the virtual machine. So it's really cool. What we were able to do is actually take the K semantics, build a VM, connect that to a fork of Ethereum and we're now turning that on at the end of the month to test that this framework is working and you can run smart contracts on it. We also have another virtual machine that we built specially for Cardano, called IELE. And those semantics are publicly available in GitHub. We have a paper that we are submitting for peer review. That testnet will launch some time in June or July – that gives people who live in the Ethereum world and write smart contracts the chance to play around and deploy code on our system and look at our gas model and get a better understanding of how our system works. And then, over time, testnet iterations will occur and eventually we'll pull these two systems together.

IOHK arriving at Google HQ, London

One of the architectural features of Cardano is the separation of accounting and computation. With Ethereum they are bundled together, your peanut butter is in your jelly. And that's fun from an implementation standpoint; it's simpler to maintain, but it creates a lot of problems. If you screw up parts of your computational model you'll also inadvertently block your ability to spend money. Also, computation carries a much higher liability than accounting does. For example, Bitcoin versus Ethereum in terms of transactions. In Bitcoin, if I send Jane a transaction and Philipp a transaction, buying a laptop from Jane and weapons-grade plutonium from Philipp – the miner in the system would have no way of differentiating between those two transactions, they're fungible. We don't know the actors, they are just transactions. But if we're running code, you might be able to differentiate between Crypto Kitties and Silk Road. There is some precedent if you look at Tor exit nodes having legal liability and being arrested for trafficking, child pornography or copyright violations. Computation, if you can discover the difference between what Jane and Philipp are doing, has higher liability. In our view, architecturally, it's a good idea to separate them and it also gives you a lot of flexibility because you can have multiple computational models, like we're backwards compatible with Ethereum and we had a different model and we have a functional model and you can do a lot of cool stuff with that. The downside is that you have to maintain the state of many ledgers at the same time and you also have to figure out how to move value between the ledgers, which we're going to do because of our interoperability mandate. We decided to take this on but it adds complexity to the system, a lot more work to do. That will be gradually rolled out in stages through testnets and it's quite a bit of work.

Duncan: There's the compartmentalization aspect of it. Ethereum is monolithic, it bundles together all of the features, so if one thing breaks the whole thing breaks. If there's some fundamental flaw you haven't found, not only have all your ERC20 tokens gone but so has ether itself. There's no compartmentalization between those things. But if you have, in essence, a Bitcoin-style simple settlement layer and then you do your EVM stuff and equivalent to EVM on different blockchains that are linked you can move money between them but they're otherwise compartmentalized. If for some fundamental reason there's a flaw found in the EVM that destroys that, well, that's very sad but it doesn't destroy the settlement layer. That's a big advantage. And it means, as Charles says, you can add new ones of these things that can be somewhat experimental because that lets you evolve the system a bit.

Charles: We wrote, I think, the first formalization of sidechains. There was a sidechains paper written in 2017. So what a sidechains transaction is for those who don't know it, I like to call it interledger transactions. So you have a source ledger, a destination ledger and an asset. So basically what you're trying to do is initiate a transaction, where the destination ledger can answer two questions about the transaction. One, is that the asset does exist from the source ledger, and two, that the asset from the source ledger has not been double spent. The foundational question you're asking is how much information does the destination ledger need to possess to be able to validate that transaction and verify those two questions? We wrote a model, first for proof of work, called ‘Non-interactive Proofs of Proof of Work’ that explains how to do this, and now we've extended that model to Ouroboros and the proof of stake world and we have a paper that we've just submitted that contains details on how to construct these types of proofs and what these transactions look like. There are still questions about how large the proofs are going to be, relative to the size of the ledger, and there are questions about validation time, and also generality. The proofs we scoped work with our particular consensus algorithm, but we'd like to make these things work for all ledgers, not just a particular type of ledger, so there's a lot of work to be done there. But it's the first time it's been formalized. The concept has been enumerated in a paper that was written in 2014, by a competitor of ours called Blockstream, but they didn't write a proper academic paper and submit it for peer review. That's considerably harder, there's a lot more to think about when you're rigorous about these things. In the long term, it's a great thing to think about for scalability and interoperability, and also testing because you can now deploy versions of your chain with different parameters and it's easy to move value between them and you can let people vote with their feet.

Q. How will Cardano overcome the first-mover advantage of Ethereum? Do you see multiple smart contract platforms co-existing in the space or will there be one prominent winner?

So how many Java, C++ or Go developers are writing code on Ethereum? You can't, Ethereum doesn't support any of these languages. They can't even run a single viral app on the platform. If you look at the top 10 languages, none of them works on the system, so, by definition, all those developers aren't developing for the system, they have to go and learn new tools and new stuff. With Cardano, first off, we're backward-compatible, 100%, we're running an EVM. So you can take your Solidity code and your Web 3 stuff and all the things you've come to know and love about Ethereum, and you can run it on my system, and it's faster, cheaper and safer to run it on my system because we have a better consensus model. Second, through our work with the University of Illinois, through Runtime Verification – Grigore Rosu and his team – we're working on something called semantics-based compilation. Should this be successful, we can take any K-defined system and translate it to run on our machine. All you have to do for a new language is just write the semantics in K, one time, and then K framework takes care of the rest of it. It's a hard, high-risk, high-return project, but at the end of the day, we will end up one way or another supporting mainstream languages. Part of it is backwards compatibility, part of it is supporting mainstream languages, part of it is recognising that the vast majority of real applications aren't running on Ethereum at the moment. The other thing is that smart contracts are not monolithic and you write and run your entire application on a blockchain and the reality is you have to add a server-client component to it. Let's think of it like a poker game, maybe you trust random number generation to the server, but other things like player matching and account managing, these things are almost certainly not going to run on your blockchain, you're dumb if you're going to do these types of things. They're probably going to run on some sort of server back-end. It's more of, I treat a smart contract as a computational service. So it's silly to say, 'Oh well, only one platform and one token's won', it's akin to saying Internet Explorer's won and we all have to be Active X developers, god help us. I'm not loyal to IE, or Amazon Web Services. Rather, I have to ask, what's the cheapest, best, most secure environment for me to run my computation in for my users? Our strategy is be backwards compatible, support more languages, especially mainstream languages in a better way, have a better user and developer experience, and be smarter about the ecosystem in which these contracts live. So we make it easier for the server to come into play, to use multiple ledgers and have a good app platform to deploy these types of things on, and we'll definitely get a lot of growth there.

The other thing is that very few people today write smart contracts. They play with these things, but very few people are smart contract developers. If 99% of developers aren't in the ecosystem, how can you say a person has first-mover advantage? It's nuts.

Q: In 2014 I played around with the K framework for a university project, but I found it to be extremely slow.

Charles: Yes, because there is a K to OCaml back-end, but we're building a K to LLVM back-end, which should be about 100 times faster.

Q: But is that enough? Because it was outright impossible to run a reasonably large project with thousands of lines of code.

Duncan: This is one of the problems that Grigore is trying to solve. As you say, executing the operational semantics directly is very slow. Runtime Verification are basically trying to do a compiler approach, which they believe will be fast.

Charles: It's still a big unknown, exactly how much performance is necessary – and can we get within an order of magnitude of handwritten code? One proof of concept is the testnet that we're launching at the end of this month running a version of the Ethereum Virtual Machine built in K. You can run smart contracts on the KEVM and compare them to an Ethereum testnet and see the performance delta between the two. But it's also important to understand that the open source K framework components that you use and the version that Grigore uses are different. Grigore built a second set of software for his private company, Runtime Verification, that he uses for contracts he's had with Boeing and Nasa, and I think that's about 100 times faster than the one that you used. But even so, there's still a big performance delta that needs to be ameliorated. We have quite a large team, there's 19 people involved in that contract. Some of those people are allocated specifically for performance. Now let's say that we can't quite bridge that performance, there's probably a lot of things we could do to cheat and improve things a bit, including handwriting certain things, or abandoning semantics-based compilation for more traditional techniques. But it's still a big unknown. This is also why we have a multi-computation model, so in addition to the IELE framework and the K framework, we also have an alternative approach called Plutus.

Duncan: We think that most of the computation time in most of the smart contracts goes into crypto primitives. So you don't have to have the world's fastest interpreter to interpret those smart contracts. Plutus – people like me with a programming language background look at the EVM and Solidity, and say ‘It looks like the people that wrote this didn't have much experience with programming language design’. There’s an academic discipline to the design of programming languages, and that didn't seem to inform the design of Solidity at all. That shows up, in things like if you miss one error code then your smart contract loses everybody's money. So we have two smart contract platforms that Charles has mentioned, the backward compatibility story, byte-for-byte compatibility with the K version of the EVM and then IELE, which is EVM style, but fixing a lot of the obvious problems. That gives us the story of how you compile Solidity programs to IELE and KEVM. In addition, we have a smart contract platform that is based on programming language research, in particular functional programming. We have an approach based on a functional core language, which is actually executed, based on system inaudible and then two languages which are compiled into that core language. The core language is what's executed on the consensus nodes. That's the inaudible equivalent of the EVM. We don't call it a VM, it's just an intermediate code, that's the equivalence, and then two languages initially that compile into that. One is called Plutus, which is a functional, Turing-complete language very similar in many ways to Haskell but simplified and cut down. Then there is a non-Turing-complete DSL (domain specific language) that is aimed specifically at financial applications. It's based on a paper from around 2001, ‘Financial Smart Contracts’, that lets you express all the normal and even exotic financial contracts that people tend to write, but it does it in a much simpler way so they can be easily analysed and understood. The point is, if your application fits into the domain of that DSL then you would get much shorter and simpler, easier-to-analyse programs, but alternatively, you can go to the general purpose, Turing-complete functional language and in both cases, it's a two-layer language approach. If you look at existing Ethereum applications they don't just run on the blockchain, as Charles said, it's a blob of Javascript that runs on the client, and some Solidity code that runs on the back-end. Your programming model is this two-level thing anyway, but there are two different languages, in many ways like our web stack. Our web stack has grown over time, one language runs on the back-end and another on the front-end, and these multi-language things are not that easy to do, especially when they have grown accidentally. Because we can see that, we are taking a more deliberate approach, and saying, let's design an explicitly two-level language so this bit will run on the blockchain, this bit will run on the client, but they are very similar languages. So what we're actually doing is Haskell on the client and Plutus on the blockchain. And Plutus is very similar to Haskell, so what you will see is one program with one file... one program with embedded snippets that run on the chain and an out-of-context inaudible layer that happens off-chain. 
That should give a better, more integrated development experience and we aim to be able to do things like analyzing the stuff that runs on-chain, so you can demonstrate the safety properties of the on-chain code.

Charles: And performance should be equivalent to Haskell, because they share a common core.

Duncan: Right, and the Haskell code will be run through the Haskell compiler.

Q: How can a software developer that is excited about your project get best involved with it? Do you have any plans for educating developers or creating developer-relation type of roles at IOHK?

Charles: I really admire what you guys did with Dart. I love the developer experience effort that Google put into it and there are things to take from that.

It's difficult with a cryptocurrency that is very rigorous in its approach. You're starting with white papers written by cryptographers and the notion of formal specifications and you're trying to implement these and prove correctness, to figure out: when and how do you open that project up to successfully collaborate with open-source developers? We are hiring people specifically to work with the external community and try to communicate how we are running a project and how we are writing things and how we welcome third-party contributions. There is a lot of technology in our stack, we're making material enhancements to K, so anyone who wants to contribute there definitely should. We have Electron in our stack, so we are using Chromium and Node.js for our wallet front-end so there's a lot of things going on there with the Daedalus wallet, and we'd love contributions there. And, of course, there's the Haskell back-end. We are reimplementing some of that back-end in Rust, and experimenting with Rust and web assembly in the browser, so a Chrome-based wallet. So there's a lot of tech there and it depends on the core competency of the person and what exactly they'd like to contribute. We have yet to build an easy-to-use, formal external process, to make it friction-free for external developers to come and assist us and it's going to be a high priority in the second half of this year to figure out how we do that.

Another thing is that if an open-source project is to be successful, especially with these types of protocols, we do need competition. When I was running Ethereum, multi-client models were very important for us, so we ended up implementing Geth and the C++ client. And then, later on, Gavin Wood split off and created the Parity client for Ethereum. Now, this was great because it really forced us to specify the protocol properly, and we could no longer say, 'Well, the code is this specification,' because you're inaudible of ambiguity there. So we worked hard at proper documentation, and we'd like a similar environment to materialize, and it would be great to see some alternative projects grow. But at the moment, the best you can do is go to our forums, go to our GitHub repository, and you can, of course, open issues and email our developers. And if you're really interested in making open source contributions, we'll try to find a way to integrate you. And long term, we'll have a formal process that's really easy for people to connect with. And also, again, it depends on the particular level you want to contribute to. For example, we do formal specification verification work, so if you're an Isabelle or Coq developer and you want to work with us, that would be great. If there's like five people there, we'd probably be able to do that, right? But, levity aside, it would be fun to find people there. And other things, like if you want to build applications in our system. We are launching testnets soon, so it would be much appreciated for people to write software to deploy in our system and try to break it because that helps our system get better. So that's the best non-answer to your question that I can give!

Duncan: At the moment, our code is all open, it's on GitHub. This is one of those projects that's open source but not yet open, collaborative. It's not easy at the moment for us to collaborate with people because we're not yet using GitHub with its issue tracker; we have a private issue tracker at the moment, for example. So this is one of the things where we aim to get there, for there to be documentation that's easy for other people to look at and understand and make contributions and for us to accept. So the goal is to get there, but we are not there yet. You can go and read all the code, but you can't see what the open tickets are, the documentation's a bit patchy. So that is where we'd like to get to, to be able to direct people and accept contributions from anyone really.

IOHK videoconferencing with Google HQ

Charles: We are going to try to annotate a lot of the design decisions in the system. For example, we recently released a formal specification for our wallet back-end. I think it's the first time it's ever been done for a UTxO wallet, and we're going to create YouTube lectures, going through section by section on that and putting it up specifically with the aim of educating developers how our design decisions work and what the system's all about. So as we specify each component of our system, we're going to try to do that. We have an internal department called IOHK Education, led by a mathematician named Lars Brünjes who specializes in that, so over time, you should see more accessible documentation materialising. Hopefully, that will encourage people who have the capacity to contribute to come in. We are also discussing how we open up our issue tracker. We made the decision to have a private issue tracker in the beginning because there's a lot of, usually, security concerns or open discussion about what direction to go in because the protocol is still very young. So we just figured, I'll just leave that all private and not worry too much about it. But we do have a moral obligation, as an open source project, to try to get project management and issue tracking into a more open domain. So there is a lot of open discussion about that. And once those things get into the open domain, it will be considerably easier for open source contributions to occur.

Q: We have another question on privacy. Are there any plans to implement private transactions in the style of Monero or Zcash?

Charles: Yes, so Monero uses a primitive called ring signatures, and Zcash uses a snark primitive. So privacy is a complicated topic because you're actually talking about three things. You're talking about the privacy in terms of linkability, so, if I look at a transaction or an address, what's the probability that I can relate it to a known identity? So basically, go from anonymous or pseudonymous to known, the linkability dimension of it. Then there's the obfuscation of amounts. So you might not be able to easily connect the addresses or transactions to people, but if it's a public ledger, you could certainly see what's the most expensive transaction, and it creates kind of a priority queue for deanonymization. Say, 'Ah, well, there's a $10 million transaction that happened today, let's go and find who has that with a wrench to rob him'. And then there's the network side of things, so can you obfuscate the use of the protocol or try to prevent people from being able to understand what you're doing with the protocol? So there are existing solutions in all three of these. Like ring signatures and Zcash cover the first category. Confidential transactions, for example, covers the second category. And the third is covered by technologies like Dandelion. So first, you have to understand that privacy is a spectrum, and also, it is one that carries considerable regulatory discussion. For example, Japan just announced that they're probably going to de-list all the privacy coins. So if we wish to be in the Japanese markets and we were to embrace Monero-style privacy, there's a very low probability that Japanese exchanges will list Ada, which is a high priority for us.

On the other hand, privacy is a moral right. If you don't have privacy in your system, you're basically creating a system that your entire financial history is publicly known back to the beginning of time, since the system's inception, which is dystopian to the max. So the best way of resolving this is to develop out some really good privacy options and implement them as improvement proposals, and then take advantage of the governance part of the system. So when voting is available, we can actually have alternative proposals and say, 'Well, if you wanted to have ring signature style privacy with the whole banana, here's how we would do that.' And how much public support do we have for that among Ada holders? And, basically, have a referendum and see which one wins out, and then you can see where in the spectrum you fall. But the important thing is, people need to be informed. If you maximise privacy, you will inadvertently make the protocol illegal within certain jurisdictions and limit market access within certain jurisdictions. If you make the protocol more open, you are inviting more dystopian people to track what you're doing and use it against you. So we'll let the community decide that, but we do have active research. For example, on Dandelion, we've been funding that team at UIUC, and they're creating a version of Dandelion that will come out this year. We also have had discussions with people who have been formalising the Monero cryptography primitives and trying to make them more efficient and better bound to security and better bound to privacy. There's one project out of UCL that was done by Sarah Meiklejohn, and there was one out of China that was done, some professors in Beijing and a few other places – I think it's called RingCT 2.0. So there's certainly a lot of good tech, and we know how to implement that technology, but it's now mostly in the hands of a social phenomenon rather than someone like me making that decision on behalf of the ecosystem.

There's another thing that's seldom mentioned, which is the idea of backdoors. We used to live in the inaudible debate, or, either you give an unlimited-use private backdoor to a trusted actor like the FBI, or you don't. But there can be a spectrum with these things. For example, we can put a backdoor in the system, but it has to be constructed with mono-cryptographic primitives, and the use of it requires majority consent of the system, and it's publicly known if it's used. So would you be willing to invest in a currency that says only a certain group of people, if they come together and they have near-universal consensus inaudible and you know that they've used it, and it can only be used on a particular person instead of the entire system, is that a reasonable compromise? So I think that more-nuanced discussion has to be had, and there is certainly a lot of tech that's being developed to accommodate these types of things. In fact, the inventor of the Snark, Eli Ben-Sasson, has recently started a for-profit company, and he's developed technology like this to augment Zcash to be able to provide these types of backdoors which are auditable and publicly verifiably when used and, in some cases, one-time used, depending on how they're deployed. So we'll certainly be a member of that conversation, and eventually, we'll settle on something, and it's inaudible of how much privacy's required. Closely related to it is also the place of metadata and attribution. So under what contexts should you attach mandated transactions, and how do you do that, and how do you share it? This is really important for compliance. If you look at exchanges, they have KYC (know your customer) and AML (anti-money-laundering) reporting requirements, and because of that, they've inadvertently become data custodians where they hold tons of personally identifiable information about their customers. They don't want to hold it, but they have to because of the law. It would be much better having a minimum viable escalation model where you are allowed to have a challenge-response type of a query where you ask questions about your customers, like, are you a US citizen or not? And you can get some verification of that. But you don't have to have that data necessarily.

The example I like to use is that, in the US, you have to be 21 or older to drink. How we usually verify that is we look at your driver's licence. And because I see that document, I know your address, I know your driver's licence number, I know exactly how old you are and how fat you are, how tall you are, the colour of your eyes, the colour of your hair. That's a bunch of information you don't need to know, but you inadvertently know because of the way that you've done verification. It would be much simpler to be able to just ask a query: are you over the age of 21? And have a response of 'yes' and know that that response is correct. Then I've learnt nothing about you other than that particular question, and we leave it at that. And the proof itself is the sufficient burden for the merchant. So we're certainly involved in that conversation. We have a lab at University of Edinburgh that studies these types of things, a privacy and verified computation lab. It's led by a former Microsoft person, Markulf Kohlweiss, who worked on the Everest project and other things. And privacy is in the remit of the lab, so we'll come up with some options, and then we'll democratically validate those options. And whichever one the Ada holders have decided, we'll implement into the system. And by the way, this takes time, several years.

Q: We actually have a couple of submissions on future outlook. I'm going to read off this. What do you think is the next big thing in the crypto space, outside of your own roadmap? For example, let's say Hashgraph? And related to that, after scalability and sustainability, which do you think will be the next challenge? And what do you think will be your next big ticket, either in the crypto space outside of your current projects, such as Hashgraph?

Charles: I think the magic is that there are a lot of concepts that have been sitting on the academic shelf for 25, 30 years, or emerging trends that we're seeing in hardware or software that can now be dragged into the cryptocurrency space. I'll give three examples: one is multi-party computation; two is trusted hardware; and three is a lot of PL programming language concepts. So in terms of multi-party computation, these protocols or the concept of multi-party computation have been known about since the 1980s, but only recently have the protocols become efficient enough to run on consumer hardware. So, for example, let's look at poker, that's kind of the canonical example. This is a perfect example of something that you probably don't want to run in an Ethereum-style virtual machine. You're not talking about poker games with 100,000 players; you're talking about poker games with five players, 10 players, 15 players, so you don't have a very large set of people. And, really, what do you care about in poker? That people can cheat, and that at the end of the game I'm going to get my winnings and losses. That's basically it. And you don't particularly care if that data gets committed to the blockchain. Why should it be there? It's a game. We're using blockchain as a broker and a payment system, so we're using it to find each other and to make sure that we get paid, but outside of that, what I really care about is I played the game. I don't particularly care about the metadata, or if I do, there's a way of capturing it that's far more efficient. So if you look at the Ethereum-style solutions, they say, 'Oh well, who cares? We're still going to write it as a smart contract and all the same, your poker game, even though you're only playing with five people, is the business of every single person on the entire network and competes for resources with every single person.' So I think there are much more intelligent ways of doing things off-chain, and multi-party computation provides an avenue for that, so we're exploring it. But then, there are also projects like Enigma, out of MIT Media Lab that are exploring these ideas as well. And so I'm really excited to see the intersection of MPC multiparty computation and blockchain, and I think you can do a lot of cool things, not just poker but things like decentralised exchange or even general-purpose computation.

Charles and Duncan at Google HQ

Trusted hardware is really cool too because it's the closest thing we have to magic. And I think it's the only way we have for offline, off-chain payments. So let's say you live in Africa. I was just in Rwanda, and just before that in Ethiopia, and it would be really crazy for me to go around Addis Ababa and say, 'I'm going to solve all your problems with my magic internet money which requires you to be online all the time to use, when you don't even have a stable internet connection.' They'll just look at me like, 'Okay, crazy white person. Go away.' So the reality is that it would be nice to have a trusted enclave that you can store the private key within, so that you get certain guarantees – like the balance associated with that key is correct and you haven't double spent it – and when you transact it, you can move the private key from one enclave to the other and then get a proof that it was destroyed on the enclave that you had. Now, if you can do this – and trusted hardware can potentially allow you to do this – then you can transact offline. Because what I can do is just walk over to someone who has a cellphone or a hardware device and tap it with my hardware device, and, basically, we just move the private key from one device to the other device and transmit over a proof of destruction, and suddenly, they've just got paid. Now, from the blockchain's perspective, no transaction has happened. The state basically stays the same; nobody knows that you've swapped the private key. But from the perspective of the person who's received it, they've just been paid as if I'd handed them $20. And none of this had to sync; it's just all offline. So that's really exciting to me because that extends cryptocurrencies into an environment that they're not meant for and allows us to do things with maximal privacy, because the network doesn't even know that your transaction's happened – that's the most private way of doing a transaction – and there are all kinds of things you can think about. You can even extend that to an offline ATM, for example. So if you can pay people offline, why can't you have an ATM that's offline and it can dispense cash and so forth? So trusted hardware, I think, is pretty magical, and there are a lot of other things you can do with that. But we have a lab that is looking at these things, and there are companies like Rivetz, for example, and Intel's involved with a protocol called Sawtooth, which is associated with the Hyperledger project, and there's a range of solutions from IoT (internet of things) all the way down to attestation of data and so forth. So that's exciting to me.

And then in terms of PL concepts, and Duncan can probably elaborate better than I can on this, but what excites me the most is this idea of making formal specification sexy. There are some projects historically that have had huge impacts, but they've been extremely time consuming and incredibly expensive. Like seL4, for example, which was a microkernel that was verified. I think it took like 10 years or something like that?

Duncan: Not quite that much, but it was a lot larger.

Charles: It was like five years, or a big chunk of time, and then millions of dollars were spent on it. So when you talk about formal verification of software, you tend to have this vision that you're Nasa or you're SpaceX or you're Boeing, and you have massive budgets and thousands of engineers, and you have horizons of time that almost no rational person would invest in normal consumer software – like five years, 10 years, 15 years and so forth. If you look at the smart contract, though, the smart contract is a small piece of code. All the canonical examples you can pull up in Solidity are not very large – they're like several hundred lines of code – and they usually have pretty well-defined business requirements backing the smart contract. So you can have a reasonable discussion about the semantic gap between what you intended on doing and what you actually ended up doing. So this is probably the perfect example of where you'd want to use some form of formal specification and verification approach to verify correctness. The other thing is that there is a high-assurance need, in that if you get them wrong, hundreds of millions to billions of dollars can be lost; and also, you kind of have a very well-defined environment that these things happen to be executing in, so it's like the perfect storm for formal specification and verification. So I'm very excited about where we can take older techniques, modernise them and drag them into our space in a lightweight way.

Duncan: And if we're successful, then instead of just having everybody write Solidity programs, we can start with smart contract languages which are amenable to formal analysis because they're designed with it in mind, based on proper PL research, rather than just Solidity, which was accidentally designed.

Charles: Right. And also, what's nice is that competition is really brewing in this space. Tezos, for example, with Michelson – and I think Liquidity is the name of the language there – and they're definitely pushing hard. Simplicity is another language that's being developed by Blockstream, by a proper PL expert, Russell O'Connor. And there are other examples in the space – one protocol is using concepts from F*. There was an attempt to port Idris to run on Ethereum, and there's been some formal verification work occurring around Ethereum, particularly by a guy named Yoichi Hirai using Lem and a few other things. So there's definitely some effort that's been put into this, and that's really exciting. If you study PL, these techniques tend to be on the fringe, and it tends to be isolated, small groups that work on these things, and nobody pays much attention to them, and when they do, they usually roll their own solution and it's proprietary in-house or it's for a particular project, and then you move on after it's done. The fact that we're now trying to take specification and verification concepts and bring them into the mainstream and make them accessible to everyday programmers is a really new concept, and that's really exciting, and I think it bodes incredibly well for software correctness. And I'm really excited about where that can go. So those are the three things: multi-party computation, trusted hardware, and PL stuff.

In terms of things like Hashgraph, you asked about that, there's a lot of hype, guys, about DAGs (directed acyclic graphs) and hype about 10,000 transactions or 100,000 transactions per second, and my opinion is, most of this stuff is junk, either because it's been deployed in a very unrealistic environment, or because it doesn't have a proper trade-off profile, or because the protocol was designed without any Byzantine resistance – you know, you have Byzantine actors and the protocol fails. And I'm very sceptical. There are some real protocols that do shard and have great performance – for example, OmniLedger is one and Thunderella is another – and these are protocols built within the academic world by academics, based on protocols that are already proven, or at least have some peer review behind them. So without a doubt, such things can exist, but usually what happens with things like IOTA, which is based on a technology called the Tangle, or with Hashgraph, is that you see a wave of hype come along. The ICO comes, pots of money get raised, and then people psychologically buy into it, and no one really bothers to check if what they've implemented is right or wrong. The other thing is, you don't really need to build a system where one chain runs 100,000 or 200,000 transactions per second; to be frank, if you do, you're doing it wrong. Because even if you have that raw processing power, how do you move all that data? You need a network layer to do that. And second, where do you store all that? If your security model is everybody has a copy of the blockchain and you're running 100,000 or 200,000 transactions per second, how big do you think your blockchain is going to be in a year? And so, yeah, Google can store it, or the NSA (US National Security Agency) can store it; pick which one you want. It's either California or [inaudible], you know – pick which facility you want your blockchain in. But you necessarily federate the network when you demand this type of performance.

So to me, it makes a lot more sense to have a multi-layered model where you have lots of sidechains. You have overlay protocols, things like payment channels or state channels, and you try to keep your transaction rate within a reasonable envelope. What I am excited about, though, is the notion of data sharding. I'd like to see ways of breaking up a blockchain and storing it like you would a torrent, or storing it like you would a decentralised database. And there are some projects like Siacoin and Filecoin and others that are incredibly well funded, and they have real computer scientists behind them. In fact, Sia, for example, their core paper was given the best paper award at the Eurocrypt conference. They beat our paper; I was pretty upset about that. But that's a high bar compared to where we were just two or three years ago with the first wave of those technologies, like Storj and MaidSafe. That's my long-winded answer. Next question.

Q: I think it's a shorter question. You guys are focusing on the development of the platform, obviously, and making it available for developers as soon as possible, but do you think it might be beneficial to team up maybe with some sort of a real world project to be implemented as a new test case, as a proof of concept, let's say, of scalability of Cardano, and implementing that application as just basically a showcase?

Charles: Yeah. It's kind of like we're building the Xbox. So the question is, what's our Halo?

Q: Yeah, exactly.

Charles: What are our games? What is the catalogue? So we do have a business development arm, and we also have a partner called Emurgo, which is a venture fund in Japan that seeks out projects to deploy on Cardano that will act as these proofs of concept. And we've had some degree of success. We haven't been looking, but we've seen that there's an overwhelming demand for pilots. I was just in Ethiopia talking with the Ministry of Science and Technology, with whom we signed an MoU, and they have a very strong desire to do supply chain management on blockchain for coffee production. Now, this is a really big market. There are a million and a half farmers, there are massive amounts of stuff going on in those supply chains, so that would be a great stress test for the platform. But then, there's an open question: what ought to run on the main network and what ought to run on a private permissioned ledger? Hyperledger exists for a reason, and IBM is not dumb. They created a really good product to service the needs of the enterprise. So part of Cardano's remit is not just to look at what should run in a single universal environment that's maximally decentralised, but to ask, what can you deploy in a private setting? The canonical example I like to look at is the exchanges. So if you look right now at the way that exchanges handle tokens, basically, they have an address, you send your Bitcoin to that address or your Ether to that address, and they have some sort of storage policy between hot wallets and cold wallets. They try to arrange it in just the right configuration, so even if the hot wallet's compromised, the majority of funds stored there aren't lost, which is fine but it's not optimal. It would be much better if you used a sidechain transaction and sent your token to a private ledger. Because you're already trusting the exchange with your money, you're not really getting any extra security by being on the main network; you're actually losing security, because what you're doing is saying to the exchange, 'You can't put in proprietary business logic to better protect my tokens' – like, for example, the ability to reverse transactions and so forth in the event that the system gets compromised.

So part of Cardano is to study the relationships between permissioned ledgers and permissionless ledgers and then find linkage points between these two things. So for high-stress applications – let's say supply chain management in the coffee industry, where you have a million farmers and hundreds of thousands of transactions every day, loads and tons of data, and IoT components putting sensor data on – it makes no sense in hell to run that thing on a big network and have it compete for resources with everybody else. It's just not economically viable. It makes a lot more sense to just have the washing stations and the government and other people run consensus nodes and have a permissioned ledger there, and then create some sort of linkage between the two systems. If anything, you hash the ledger regularly and store that hash on the main ledger; that can be an example of that. So you kind of have your cake and eat it too. You have a blockchain-esque environment, you get the timestamping, you get the auditability, you get the immutability of the record, within reason, but then you also get very low-cost operation, predictable performance, and a much easier time custom-tuning the ledger towards the business logic in your system. So we are looking for pilots like that, but we're also looking for smart contracts to come on to our system, and when Plutus is fully available, we'll be pushing like hell to get lots of people to come and do things. The advantage that we have is that, because we chose Haskell as our codebase, we kind of work with everybody in the Haskell space. I think there are, what, maybe four or five major Haskell shops that we don't work with at this point. We work with Tweag and FP Complete, and Well-Typed and others. So we kind of know everybody there, and we have a very good brand footprint in the Haskell space. So once Plutus is available, there's going to be probably a large group of Haskell developers who are curious to be able to write smart contracts in a language like Plutus. So that will create a wave of innovation.

We can also spark innovation by creating cohorts of people that are trained developers in our system. So we have an education arm, and we do classes. We did one in Athens, we did one in Barbados, and we're soon doing one in Ethiopia. By the way, that class will be all-female developers; the government requested that. And we teach them Haskell and smart contracts. So once they learn these things and the target platform, they'll write software for us on our system, and they'll write lots of prototypes and so forth. So it's a collection of these types of things. Some of these things you need to do to brand and showcase the system, but they don't necessarily need to run on the main network; they should run on permissioned ledgers. Some things you do need to run on the main network, and you also have to have channels with which to attract developers to your system. We have been pursuing that as a company, and our partners have also been pursuing it. But it's important to point out there's overwhelming demand; there are lots of people who want to be doing things with us, and we have to tell them ‘no’ because we don't have the capacity at the moment.

Q: I'm curious ... this is kind of a fuzzy area, but what do you see as the role of cryptocurrency in society? Right now, there's so much talk of 'Lambo' and 'to the moon', and a lot of what's going on is purely speculative. Are we going to have our grandmothers using this on a day-to-day basis? Is this going to be behind the scenes? What do you think?

Charles: Well, remember, not too long ago the internet revolution was treated the same way. People were making fabulous amounts of money from vapourware products, and the market collapsed. But on the ashes of that, you got the Googles and the Facebooks and the YouTubes and so forth. So cryptocurrency technology, when you decompose it into its fundamental components, is a discussion about trust and coordination, identity, reputation, the representation of value and the movement of value. These are the core components of cryptocurrency technology. So if you look at society and markets, all of these things require an opinion on how these things ought to fit together. Now, what ends up happening is, if you look at the default configurations we have, almost always there is some sort of middleman of necessity, not of desire. You didn't put Bob in charge and make him the gatekeeper because you like Bob; he happened to have built the business around it, and because his business is operating well, it becomes a core component of the way the market structure works. A great example would be eBay. Nobody really likes eBay, but eBay is eBay; you kind of have to deal with them. So you need a marketplace, you need a reputation system, you need a way of organising your buyers and sellers, you need a payment system and so forth, and eBay happened to be the winner of that fight; now they have a monopoly and they just kind of run that. Similarly with YouTube: there's more than one person, sorry guys, that has some issues with YouTube and getting demonetised and so forth, and maybe you guys feel you're doing a good job – some people don't – but no matter what, YouTube's YouTube; it's just a matter of scale. Uber's another example of that.

So there are hundreds of these things that, when you start really looking carefully, you notice in the flow of money, in the flow of commerce, in the flow of insurance, in the flow of commodities – in every facet of life. So what cryptocurrency tech is all about is saying, can we reinvent this marketplace where we can get rid of that central broker and directly connect the buyer and seller, allow them to somehow coordinate, somehow trust each other and have a successful commercial transaction? Now a lot goes into that. You need to have things like reputation, you need to have things like insurance, you need to have a payment system, you need to have a value-stable currency, which we still don't have, you need to have credit, you need to have the ability to do settlements that you don't even have the full amount of money for – it's called eventual settlement, where I'm going to pay you but I don't have the money on hand to pay you quite yet. It happens quite a lot in the world. You need to have an invoicing system. You need to have lots of stuff. But just because we don't have all the basic components doesn't mean there's no merit to the system. It's just like with the internet: you needed to have the cookie, you needed to have the certificate, you needed to have the browser, you needed to have good web languages, you needed to have high-speed internet to be able to get things quickly to people. Eventually, you needed to have mobile internet and so forth. But once these infrastructural components got put in, the internet got great and changed our lives. And so, I view cryptocurrencies much the same way.

Now, there is the natural question to ask: where does the token fit in all of this? Because everything that I've mentioned doesn't really require Bitcoin; it just requires protocols and rails and so forth. So what the token acts as is some sort of incentive mechanism for people to maintain these particular types of systems. It's not necessarily what the instrument of value's going to be. The token just decides who gets to be in control at the end of the day, especially in proof-of-stake systems; it's very explicit in this respect. I think the proof-of-work systems are eventually going to die out because they're not sustainable or competitive in the long haul, but in proof of stake the token's about who gets the vote, who gets to decide who's in control. And I think what's going to end up happening is you're going to have the liquification of value and the tokenization of value, and we're going to converge to a universal wallet notion. So in life, you have lots of stores of value: you have commodities and you have stocks and you have bonds, you have real estate. I have a lot of airline miles; I'm saving up enough to buy a jet. I travel 200 days a year; I've been to 63 countries, so I'm a pretty tired guy. You have currencies, you have whatever – you have hundreds of these stores of value. Now, we tend not to look at them as the same thing. We tend to look at them as silos, right? My gold is not the same as my airline miles. Why is that? They're just value, they're just wealth, and you put them all together, that's the portfolio of your wealth. So the capstone, I think, of where the cryptocurrency movement is going is to remove the walls between these tokens of value and give you ways of representing them that are interoperable with each other. So you can turn your gold into silver. You can turn your silver into currency, turn your currency into airline miles. And because all the payment systems are now programmable, thanks to you guys and many others, the merchant gets paid whatever the hell the merchant wants to get paid, now. So I can walk over to Starbucks, and I can have my house tokenized, and I can sell it. There's a market maker that lives in between that, and I sell one-millionth of my home and I can buy that cup of coffee; somebody bought that from me and the merchant gets paid in dollars or pounds.

So that's where I think this is all going: you're going to have this ecosystem of tokens that live. The tokens that are connected to the command and control of the protocols I think will survive, because they become basically like a prediction market in the sense of the utility and value of that protocol to society. It's almost as if you could tokenize BitTorrent or something like that and say, 'Okay, here you go. This is the value of BitTorrent to society.' And that's one thing, but then you're going to have tokenization of concepts, you're going to have tokenization of stores of value. And that's what you're going to end up spending on the system. So those can be government-issued, or they can be synthetic, in that you somehow create a value-stable cryptocurrency – they're called stable coins and they're traded as if they're dollars, even though they're not issued by a government. They can represent loyalty, or you can sell yourself – like, you can organise your labour as an engineer and say, 'I can take 40 hours of pre-paid labour, and people can buy it on the open market.' You can do all these types of things. That's where we're going, I think, with this entire system. And it's really cool to have a single unified global market, to have a much finer-grained way of handling trust and coordination, to have a universal identity and reputation space so people can get banked and get into the system. You can know how to do business with your counterparty in a safe way, and then have value become liquified and fluid. Now, who's going to win that fight? I have no idea. I'm betting on myself, you know, with my protocol, but I'm not insane enough to believe that we'll universally win, nor should we. It should be diverse [inaudible].

Q: Great, Charles. Well, this has been very informative. Thank you so much for joining us at Google today and taking time out of your busy schedule.

Charles: Oh, one thing I forgot to mention, real quickly, we're passing out these flyers. We don't have them in your offices. Unfortunately, Richard hasn't mastered teleportation yet.

Richard: We have a little meetup going on tomorrow – The Symphony of Blockchains that we mentioned to you. If any of you guys on the screens are coming to London tomorrow, we'd love you to come and join in if you've got free time tomorrow afternoon. Charles will be talking, Lars Brünjes as well on the delegation scheme, and we'll also be showing some art. Free bar, so come and join us if you can.

Charles: And I want to explain real quickly that IOHK has a secret third division. So we're an engineering company and a science company; we're also a design company, and the head of our design office is right there. His name is Richard Wild. So it was really important to me that in addition to getting the science right and getting the engineering right, we actually find ways of representing and visualising what we're doing. So it's not just good enough to write code, the code has to be alive. You have to see it and be able to touch it. So our art department tries to visualise things. And so, we started with Bitcoin and we said, 'How do we visualise a block explorer?' A block explorer just tells you the history – what's in this block and how many transactions were in it, how many fees were paid, etc. Wouldn't it be so cool to put that into a three-dimensional, navigable art piece that's on our website and eventually VR'd, and you can walk through it and so forth? It makes it far more accessible to the general public because these concepts, as much as I happen to love them, are really boring. But if you have a piece of 3D art, then you can put that in a gallery and people can see this. And also, it gives you a visual way of representing things. Like, we asked a question of, what is a healthy network? Before Bitcoin Cash came, we had a big crisis in Bitcoin where every block was full, you had to wait hours for transactions to be able to confirm, sometimes days, fees were very high, it cost you $25 to buy a cup of coffee in some cases, so it was not a very healthy state of affairs. So it would be nice to visualise that. You could just look at a blockchain and say, 'Boy, it's a red day. It's not a good day. Normally we have green days, but, no, this is bad. Bitcoin's not healthy right now.' So it would be really cool to do that and allow people to be inspired and understand that.

Also, when we talk about concepts like IOTA or Hashgraph, which are directed acyclic graph-based structures versus a blockchain-based structure, how are they different, conceptually speaking? It's one thing again to talk about the graph theory, it's another thing to actually show people a picture or show people a model. And so, all the art we do is interactive. We use Three.js and WebGL and these types of things. So if you go to our website, you can see some of the art we've done, including the Symphony of Blockchains. And the event we're having today is the first of its kind, and certainly not the last. And there are goals to build up a large portfolio of art. And eventually, we're going to try and create incentive systems for the general public to be able to create these types of things and be able to make money from it and so forth. So you're all welcome to come. Free beer and interesting people to meet.

Q: Great. Thanks a lot Charles.

Charles: Cheers.

The above is an edited transcript of the discussion at Google’s London offices on May 14, 2018.

A major brand refresh for Cardano *

How IOHK Creative redesigned the brand with form following function

7 June 2018 Richard Wild 9 mins read

A major brand refresh for Cardano *

In January, Charles Hoskinson messaged to say he’d come across an online video that he wanted us to check out. I took a look at the link he sent, and found that it was indeed eye-catching. A cryptocurrency fan had studied Cardano, and had made an animation of its development history. The data visualisation showed each piece of Cardano code appear on the screen as a brightly coloured node at the moment it had been written, over the preceding two or so years. From only a handful at first, the nodes mushroomed across the screen over time, resulting in a complex structure with many interconnecting parts.

This sparked an idea at IOHK. We’ve had our fair share of trolls, making false accusations online that Cardano is "vapourware", i.e. that we have no product. This animation wasn’t just beautiful, it was functional, because it destroyed those claims completely. Using data freely available from the development website GitHub, where Cardano developers had registered each piece of code as it was written, you could not only see how active they had been, but that Cardano was for a long period the most intensely developed cryptocurrency of all. We wanted to see where we could take this idea.

The visualisation made by a user named Crypto Gource

The design team had recently started a project to refresh our brand across the portfolio of Cardano websites. We would overhaul the design of each website and the information it displayed. We would make the websites more intuitive to use and convey data more effectively and in an information-rich way. This would give users the big picture but also allow them to drill down to understand details in a way that had not been done before with any cryptocurrency project.

Scott Darby and Juli Sudi from IOHK Creative

Because the animation had been so captivating but also served a useful purpose, we decided to make it an intrinsic part of our design refresh and Scott Darby, IOHK creative coder, got to work. Meanwhile, visual designer Juli Sudi would lead the rest of the work, and create a recognisable look and feel for the brand.

Evolution of the initial designs

Groundwork

Juli says: "The first steps for me were finding a style direction and creating core visual elements that could be shared across the pages and serve as a backbone of further design efforts."

She chose iconography and experimented with colour palettes and layout styles. For each potential direction she created a test design using the base theme in different ways, such as varying the icon style, typography, spacing, colour swatches and gradient options. The test versions were shared with the team to gather feedback and to choose the elements to be taken further.

After many experiments and ideas, we landed on the principle that the design should be very functional. Colours would represent something rather than be chosen just because they were pleasing. The form should follow the function. We approved a direction that was still a rough concept, but was an agreed starting point for building up the new design.

Felix Koutchinski

Community Involvement

Next, Juli began a discussion with a community member and graphic designer from Stuttgart, Germany, named Felix Koutchinski. He had come to our attention after he wrote a blog about the Cardano roadmap design. His feedback became important and we worked with him closely. The community should be involved in everything we do because ultimately this technology is theirs.


Chatting to Felix in an external Skype group

Discussion with Felix highlighted many points that we had overlooked about the roadmap. Without a filtering function, it was hard to find the information you were looking for. Sorting the tasks by progress or recent update was a high priority for visitors to the site. The development milestones did not follow each other chronologically but progressed in parallel, which called for a different style of navigation, saving users time moving down the page. We should show completed development more clearly. There should be a summary explaining each part of the project. Users should be able to filter content by category, such as development phase (Shelley, Goguen etc), by whether the work was complete or not, or by who carried out the work (IOHK, Cardano Foundation, Emurgo).

Below, you can see three layouts. The first is the roadmap as it was, the second is Felix’s design, and the third is the new layout we settled upon.

As these ideas were developing, Juli was also working on a colour palette and style guide, below.

Medusa

Meanwhile, Scott had been working on a proof of concept for Medusa. He studied the main source repository. Crypto Gource's original video was offline; we decided, however, that Medusa would show updates as they were made, in real time. Scott says: "This is a dynamic, living artwork, created from a Git repository directory structure. The visualisation is continually evolving in real time based on developer commits. Each time a commit is made, the edited files light up."

Files that have been recently edited are coloured red and dormant files are blue. We chose to make our visualisation 3D in contrast to the original 2D animation. We also found great ideas that we borrowed from cardanoupdates.com, a website produced by a community member.

Medusa will feature on each Cardano website. For the best experience view Medusa on desktop with the latest version of Google Chrome, Mozilla Firefox, Opera, or Safari. Every development project has a different GitHub file structure, and Medusa will represent this and act as a unique fingerprint for each one. The full-screen view with rich interactive functionality is available by clicking on the glowing button in the header of the Cardano roadmap. In future we will include more features. These will offer viewers a means to explore the code, and understand more about the development in a way that is not only useful, but beautiful too.

Bringing it all together

At this point, senior full-stack developer Tomas Vrana began to work with Juli on bringing the designs to life in production. Tom helped refine the designs so they would work well in practice. One example was that if a user filtered content by the partner responsible for the work, it would not make sense to have a column displaying developer commits because Emurgo and the Cardano Foundation do not produce code. Medusa is still very much in its infancy, and at best a working proof of concept. Over time, we will be rolling it out across all devices, and bringing more functionality and intuitive widgets to the UX.

Tomas Vrana and Alexander Rukin

The Cardano roadmap wasn’t the first time we deployed the new designs – they were used on the Symphony of Blockchains and Cardano Testnets websites that recently launched.

New Daedalus website design with Medusa, coming soon

Alexander Rukin took Juli’s designs and very quickly built the new Cardano testnet website that launched at the end of May. The wonderful thing about these new designs is that you can take and translate them for various uses. The content of the testnet website is very different from the roadmap and Alexander did a great job adapting the designs. Alexander is also responsible for the design of Daedalus.

George Clark and Jonathan Smillie, who are full-stack web developers and designers, teamed up to give extra firepower to the testnet website launch. For the first time, Cardano is offering smart contract functionality for testing, and George and Jonny pulled long hours to ensure the website was successfully launched. An added challenge was that the new designs were delivered under a tight deadline, and project manager Robert Moore helped enormously to get us to the finishing line.

New Cardano roadmap with Medusa

Today, we launch the refreshed Cardano roadmap, where you can see the result of all the work described above. You can view the new features, from exploring the list of developer commits, to receiving notifications when the roadmap is released each month.

We’d love to hear your feedback, please let us know what you think. We may even end up working with you as we have done with other community members.

Now for the first time Cardano has a real brand identity that exactly represents the project in a visual way. We will be rolling out this restyle across the rest of our assets, including Why Cardano and Cardano docs. Follow IOHK’s Twitter and YouTube to stay updated. The redesign will completely redefine how you understand IOHK projects, and evolves how we communicate as a company.

At IOHK we believe that Cardano is not just a development project: it is distributed systems, it is social science, it is a political movement, and among other things, it is also a design challenge.

New intuitive navigation
New layout gives a better perspective to the information presented

Join in the discussion about the redesign project on the Cardano Forum.

*This blog has been edited to reflect the change of name for the project from Gource to Medusa.

Semi-Formal Development: The Cardano Wallet

4 June 2018 Edsko de Vries 17 mins read

Semi-Formal Development: The Cardano Wallet

Please note: this post originally appeared on the Well-Typed blog.

TL;DR: A combination of formal modelling and testing using QuickCheck is a powerful tool in the design and implementation of high assurance software. Consistency of the model can be checked by testing invariants, and the real implementation can be tested by comparing it against the model.

As part of our consulting work for IOHK, Well-Typed have been working with IOHK on the design and implementation of the new version of the Cardano cryptocurrency wallet. As a crucial component of this process, we have written a semi-formal specification of the wallet: a mathematical model of the wallet along with invariants and lemmas about how it behaves.

We refer to this specification as “semi-formal” because while it states many of the wallet’s properties, and proves some of them, it by no means proves all of them. As we will see, however, we can use QuickCheck to test such properties, producing counter-examples where they fail to hold. Not only is this an invaluable tool during the development of the specification itself, it also gives us a very principled way of testing the real implementation, even if later we do prove all the remaining properties as well.

In this blog post we will take a look at the specification and see how it drives the design and testing of the new wallet. We will show parts of the formal development, but only to give some idea about what it looks like; we will not really discuss any of its details. The goal of this blog post is not to describe the mathematics but rather the approach and its advantages.

Note: all figure, invariant and section numbers used in this blog post refer to version 1.1 of the specification.

Background: UTxO-style Accounting

Regular bank transactions transfer from and to bank accounts. For example, a transaction might transfer $100 from Alice to Bob. Transactions in cryptocurrencies such as Cardano or Bitcoin are a little different. The to part is similar, except that there might be multiple “outputs”; for example, a transaction might transfer $70 to Bob and $30 to Carol. The from part however works in quite a distinct way; transaction inputs are not accounts but other transactions. For example, let’s call the transaction that transfers $70 to Bob and $30 to Carol “t1”.

t1 	inputs:	…
	outputs:	$70 to Bob, $30 to Carol

Now suppose Bob wants to transfer $50 to Dave. He can create a new transaction that says “take the first output of transaction t1, transfer $50 to Dave and transfer $20 back to me” [1].

t2	inputs:	first output of t1
	outputs:	$50 to Dave, $20 to Bob

It is important that Bob transfers the $20 “change” back to himself, because a transaction output can be “spent” (that is, used as an input) only once. This style of transactions is called “UTxO” style; UTxO stands for “Unspent Transaction (Tx) Outputs”.

The blockchain is basically a long list of such transactions. The corresponding formal definition looks something like this.

Figure 1
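For readers who prefer code to figures, here is a minimal Haskell sketch of what such simplified types could look like. The names and representations are purely illustrative (strings standing in for hashes, integers for values) and match neither the specification nor the wallet code:

import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Illustrative stand-ins; a real implementation uses proper hash and
-- address types rather than strings and plain integers.
type TxId  = String
type Addr  = String
type Value = Int

-- An input refers to a particular output of an earlier transaction.
data Input = Input { inpTx :: TxId, inpIndex :: Int }
  deriving (Eq, Ord, Show)

data Output = Output { outAddr :: Addr, outVal :: Value }
  deriving (Eq, Show)

data Tx = Tx { txId :: TxId, txIns :: [Input], txOuts :: [Output] }
  deriving (Eq, Show)

-- The UTxO maps each unspent output reference to the output itself.
type Utxo = Map Input Output

-- Applying a transaction to a UTxO: remove the outputs it spends and add
-- the outputs it creates.
applyTx :: Tx -> Utxo -> Utxo
applyTx tx utxo =
    newOutputs `Map.union` foldr Map.delete utxo (txIns tx)
  where
    newOutputs = Map.fromList
      [ (Input (txId tx) ix, out) | (ix, out) <- zip [0 ..] (txOuts tx) ]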

Wallet

The cryptocurrency’s wallet is a piece of software that monitors the state of the blockchain, keeps track of the user’s funds (more precisely, their UTxO) and enables them to create new transactions to be included into the blockchain. The wallet is the primary means by which users interact with the blockchain. Note that the verification of these new transactions and the decision whether or not to include them into the global blockchain is not up to the wallet; how this happens is beyond the scope of this blog post.

A formal specification of the wallet is a mathematical abstraction that strips away all irrelevant detail and just focuses on the core functionality of the wallet. In the basic model of the Cardano wallet, we stripped down the wallet state to just the wallet’s UTxO and its set of pending transactions; the specification is small enough to fit on a single page.

Figure 3
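Continuing the illustrative sketch above (and reusing its Input, Output, Tx, Utxo and applyTx definitions), the basic model's wallet state really is just the wallet's UTxO plus its pending transactions, with a block update along the following lines. Again, this is only a sketch; the actual Haskell translation of the model appears later in this post:

-- Sketch of the basic model's wallet state, with illustrative names.
type Block = [Tx]

data Wallet = Wallet
  { walletUtxo    :: Utxo         -- outputs we own and have not spent
  , walletPending :: Map TxId Tx  -- submitted, but not yet confirmed
  }

-- Applying a block: fold the block's transactions into the UTxO, keep only
-- the outputs that belong to us, and drop any pending transaction whose
-- inputs the block has now spent.
applyBlock :: (Output -> Bool) -> Block -> Wallet -> Wallet
applyBlock ours block (Wallet utxo pending) = Wallet utxo' pending'
  where
    utxo'    = Map.filter ours (foldl (flip applyTx) utxo block)
    spent    = concatMap txIns block
    pending' = Map.filter (\t -> all (`notElem` spent) (txIns t)) pending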

Such a model is simple enough to be studied in-depth and support mathematical proofs of its properties. Yet it is accurate enough to be an invaluable guide in the design of the wallet and be the base for unit tests that drive the real implementation. It can also be used to study where we most need to worry about performance and how we can address those concerns.

Balance

As an example of one of the trickier aspects of the wallet’s design, we will discuss reporting balance in a bit more detail. It may be surprising that reporting balance is non-trivial, but even for regular bank accounts we already have two notions of balance. After Alice transfers $100 to Bob, her online bank account might tell her

  • Your current balance is $1000.
  • There is a pending transaction of $100, so your available balance is $900.

Unlike regular bank transactions, UTxO-style transactions involve change; this leads naturally to three notions of balance. If we take as example transaction t2 above, Bob’s balance might be reported as

  • Your UTxO balance is $1070.
  • There is a pending transaction t2, so your available balance is $1000.
  • However, transaction t2 returns $20 in change, so your total balance is $1020.

Note that the change from t2 becomes available only when transaction t2 has been included into the blockchain.
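To make the three notions concrete, here is how they could be computed over the illustrative types sketched earlier; the real wallet's definitions are more refined than this. On the example above, if Bob's UTxO holds the $70 output of t1 plus $1000 in other outputs and t2 is pending, these yield $1070, $1000 and $1020 respectively:

-- Sketches of the three notions of balance, assuming the illustrative
-- Wallet type above; 'ours' decides which outputs belong to this wallet.
utxoBalance :: Utxo -> Value
utxoBalance = sum . map outVal . Map.elems

-- Available balance: the UTxO balance minus the entries that pending
-- transactions are already spending.
availableBalance :: Wallet -> Value
availableBalance (Wallet utxo pending) =
    utxoBalance (foldr Map.delete utxo pendingIns)
  where
    pendingIns = concatMap txIns (Map.elems pending)

-- Total balance: the available balance plus the change that pending
-- transactions will return to us once they are confirmed.
totalBalance :: (Output -> Bool) -> Wallet -> Value
totalBalance ours w@(Wallet _ pending) =
    availableBalance w
      + sum [ outVal o | t <- Map.elems pending, o <- txOuts t, ours o ]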

Although there are user interface questions here (how should we report these different kinds of balance to the user?), from the wallet design perspective this is not yet particularly difficult. The real complexity comes from the fact that there may be temporary disagreement about which transactions are included in the blockchain (such disagreements are known as “forks”; how they arise and are resolved is again well beyond the scope of this blog post).

Let’s stick to the above example, and suppose that t2 is pending, but there was disagreement about t1. After the disagreement got resolved the wallet discovers that t1 is not actually present in the blockchain after all (perhaps it will be later). The wallet’s available balance is now still $1000, but would it really make sense to report its total balance as $1020? It would be strange to have the total balance be larger than the UTxO balance; not wrong per se, but confusing nonetheless.

In the specification we therefore define the concept of minimum balance: the minimum (UTxO) balance of the user “across all possible futures”; this is the only balance the user can be certain of. In the example, this is the future in which neither t1 nor t2 ever make it into the blockchain, and hence we report $1000 as the minimum balance (note that it can never happen that t2 is included but t1 is not, since t2 depends on t1). While this concept makes intuitive sense, in order to make it precise and be able to compute it we needed to introduce the concept of “expected UTxO”: unspent transaction outputs that the wallet expects will become available in the future but haven’t yet.

Figure 8
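As a rough illustration only – emphatically not the specification's actual definition – the minimum balance can be approximated very conservatively: count only those entries of the actual UTxO that no pending transaction spends, ignoring the expected UTxO and any pending change. On the example above this also gives $1000. The FullWallet type and its field names below are hypothetical, building on the earlier sketches:

-- A deliberately naive sketch of minimum balance, over an illustrative
-- wallet state extended with the expected UTxO.
data FullWallet = FullWallet
  { fwUtxo     :: Utxo         -- outputs we currently hold
  , fwPending  :: Map TxId Tx  -- submitted, but not yet confirmed
  , fwExpected :: Utxo         -- outputs we expect to (re)appear
  }

-- Only the actual UTxO entries that no pending transaction spends are
-- certainly ours in every possible future; expected outputs and pending
-- change are ignored altogether.
minimumBalance :: FullWallet -> Value
minimumBalance (FullWallet utxo pending _expected) =
    utxoBalance (foldr Map.delete utxo pendingIns)
  where
    pendingIns = concatMap txIns (Map.elems pending)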

Of course, even without a formal specification it is possible that we could have come up with this concept of minimum balance and a way to compute it. But having the formal model allows us to think much more clearly about the problems, remove all the clutter that surrounds a “real” implementation of the wallet and focus only on the core ideas.

Internal consistency: invariants

When we introduce a novel concept such as “expected UTxO” into the specification, there is a tricky question: how do we know that what we specified is right? Actually, since we introduced the concept ourselves, the question of whether it was “right” or not is not particularly meaningful. Instead we ask: is what we specified useful?

One way to answer this question is by stating invariants. Invariants are properties that we expect to be true at all times. For example, for the basic model shown above (Figure 3) we have the following invariant:

Invariant 3.4. txins pending ⊆ dom utxo

What this invariant says is that the pending transactions can only spend from the wallet’s unspent outputs. Intuitively, this makes a lot of sense: the wallet should not allow the user to spend funds they don’t have. As we have seen however in the previous section, this invariant does not always hold when we take disagreements about the blockchain into account! When Bob submits transaction t2, spending an output of transaction t1, and only later discovers that actually t1 should not have been included in the blockchain after all, he will have spent an output that is not in his UTxO.

The concept of expected UTxO comes to the rescue, again. The invariant for the full model is instead:

Invariant 7.8. txins pending ⊆ dom (utxo ∪ expected)

In other words, pending transactions can only spend available outputs or outputs that we expect to become available (because they were available previously).

Another useful invariant that helps further solidify our intuition about what we mean by expected UTxO says that an output can never be in both the actual UTxO and the expected UTxO:

Invariant 7.6. dom utxo ∩ dom expected = ∅

After all, it would be strange to say that an output is expected if we already have it in our wallet. Stating invariants like this allows us to make our intuitions about the concepts we introduce precise, and proving them gives us strong guarantees that our specification makes sense and is internally consistent.

Semi-formal development

Formally proving invariants such as the ones discussed above can be time consuming and requires mathematical training. Naturally, doing the proofs would be ideal, but the main point of this blog post is that we can push this approach a long way even if we don't. After all, we program in Haskell precisely because we can easily translate back and forth between Haskell and mathematics.

To translate the various wallet models from the specification, we use the approach we described in a previous blog on Object Oriented Programming in Haskell (indeed, we developed that approach specifically for this purpose). For instance, here is the Haskell translation of the basic model:

mkWallet :: (Hash h a, Ord a, Buildable st)
         => Ours a -> Lens' st (State h a) -> WalletConstr h a st
mkWallet ours l self st =
  (mkDefaultWallet (l . statePending) self st) {
      utxo       = st ^. l . stateUtxo
    , ours       = ours
    , applyBlock = \b -> self (st & l %~ applyBlock' ours b)
    }

applyBlock' :: Hash h a
            => Ours a -> Block h a -> State h a -> State h a
applyBlock' ours b State{..} = State {
    _stateUtxo    = updateUtxo ours b _stateUtxo
  , _statePending = updatePending   b _statePending
  }

updateUtxo :: forall h a. Hash h a
           => Ours a -> Block h a -> Utxo h a -> Utxo h a
updateUtxo p b = remSpent . addNew
  where
    addNew, remSpent :: Utxo h a -> Utxo h a
    addNew   = utxoUnion (utxoRestrictToOurs p (txOuts b))
    remSpent = utxoRemoveInputs (txIns b)

updatePending :: Hash h a
              => Block h a -> Pending h a -> Pending h a
updatePending b = Map.filter $ \t -> disjoint (trIns t) (txIns b)

It deals with more details than the specification; for instance, it explicitly abstracts away from a specific choice of hash h, as well the types of addresses a. It is therefore a bit more complicated than the spec, but it nonetheless follows the specification very closely. In particular, it is still a model: it does not deal with any networking issues or persistent storage, may not be particularly performant, etc. In other words, this is not intended to be the design of the real wallet. Having this model is nonetheless incredibly useful, for two reasons. The first is that we can use it to test the real wallet; we will discuss that in the next section.

The second reason is that we can use the model to test the invariants. For example, here is the translation of Invariants 7.8 and 7.6 from the previous section to Haskell:

pendingInUtxoOrExpected :: WalletInv h a
pendingInUtxoOrExpected l e =
  invariant (l <> "/pendingInUtxoOrExpected") e $ \w ->
   checkSubsetOf
    ("txins pending",
      txIns (pending w))
    ("utxo ∪ expected",
      utxoDomain (utxo w) `Set.union` utxoDomain (expectedUtxo w))

utxoExpectedDisjoint :: WalletInv h a
utxoExpectedDisjoint l e =
  invariant (l <> "/utxoExpectedDisjoint") e $ \w ->
   checkDisjoint
    ("dom utxo",
      utxoDomain (utxo w))
    ("dom expected",
      utxoDomain (expectedUtxo w))

As for the wallet implementation, the Haskell translation of the invariants deals with more details than the spec; in this case, one of the main differences is that the infrastructure for these invariants is designed to give detailed error reports when the invariants do not hold. Nonetheless, the main part of the invariant is again a direct translation of the specification.

The big gain is that we can now use QuickCheck to test these invariants. We generate random (but valid) events for the wallet (“apply this block”, “submit this new transaction”, “switch to a different fork”) and then check that the invariants hold at each point. For example, in the first release of the wallet specification there was a silly mistake: when the wallet was notified of a new block, it removed the inputs of that block from the expected UTxO, rather than the outputs. A silly mistake, but easy to miss when just reviewing the specification by hand. A proof would have found the mistake, of course, but so can QuickCheck:

Wallet unit tests
  Test pure wallets
    Using Cardano model FAILED [1]

Failures:

  test/unit/Test/Spec/Models.hs:36:
  1) Wallet unit tests, Test pure wallets, Using Cardano model
       predicate failed on: Invalid [] InvariantViolation {
           name:     full/utxoExpectedDisjoint
         , evidence: NotSubsetOf {
                 dom utxo: ..
               , dom expected: ..
               , dom utxo `intersection` dom expected: ..
             }
         , events:   {
                 state: ..
               , action: ApplyBlock ..
               , state: ..
               , action: NewPending Transaction{ .. }
               , state: ..
               ..
               , action: Rollback
               ..
             }
         }

Not only does this tell us that the invariant did not hold; it actually gives us the specific sequence of events that lead to the wallet state in which the invariant does not hold (as well as all the intermediate wallet states), and it tells what the domain of the UTxO and the expected UTxOs are in that state as well as their intersection (which should have been empty but wasn’t).
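For a rough idea of the shape of such a test – with hypothetical names; the real test suite has considerably more infrastructure, including the labelled events and the detailed error reports shown above – the property being checked is essentially that an invariant holds in every state the model reaches:

-- Generic shape of the invariant tests (hypothetical names): given a
-- transition function for the model and an invariant on its state, the
-- invariant must hold in every state reached by a generated sequence of
-- wallet events.
holdsThroughout
  :: (state -> event -> state)  -- model transition ("apply block", ...)
  -> (state -> Bool)            -- invariant to check
  -> state                      -- initial wallet state
  -> [event]                    -- QuickCheck-generated wallet events
  -> Bool
holdsThroughout step inv s0 events = all inv (scanl step s0 events)

-- Given an Arbitrary instance for the event type, this becomes a
-- QuickCheck property, e.g. (again, hypothetical names):
--
--   quickCheck $ \events ->
--     holdsThroughout applyEvent utxoExpectedDisjointHolds emptyWallet events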

Testing the real implementation

As mentioned, the Haskell translation of the wallet specification is still a model which ignores a lot of the real world complexity that the full wallet implementation must deal with. Even the datatypes that the model works with are simplified versions of the real thing: transactions don’t include signatures, blocks are not real blocks but just lists of transactions, etc.

Nonetheless, we can use the model to test the real implementation also. We can translate the simplified types of the model to their real counterparts. Since we have QuickCheck generators for the simplified types and we can test the model because we have an implementation of it, we can test the real wallet as shown in the following commuting diagram:

Commute

In words, what this means is that we use the QuickCheck generator to generate wallet events using the simplified model types and then do two things:

  • We execute the model on the simplified types and translate after.
  • We translate first and then execute the real wallet on the translated input.

Either way, we end up with two final answers (both in terms of the real Cardano types), one executed in the model and one executed in the real wallet. We can compare these two, and if they match we can conclude that the real wallet implements the model correctly. Since we check this property at each step and we know that the invariants hold in the model at each step, we can then also conclude that the invariants hold in the real implementation at each step.
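Schematically, and again with hypothetical names rather than the actual test-suite API, the commuting-diagram property looks something like this: translate the model's state after every step, and check that it agrees with what the real wallet computes on the translated events:

-- Schematic form of the commuting-diagram test (hypothetical names): run
-- the model on the simplified events and translate every intermediate
-- state; independently, translate the events and run the real wallet.
-- The two sequences of real-wallet states must agree at every step.
prop_commutes
  :: Eq realState
  => (modelState -> modelEvent -> modelState)  -- model transition
  -> (realState  -> realEvent  -> realState)   -- real wallet transition
  -> (modelState -> realState)                 -- translate wallet states
  -> (modelEvent -> realEvent)                 -- translate wallet events
  -> modelState                                -- initial model state
  -> [modelEvent]                              -- generated events
  -> Bool
prop_commutes stepModel stepReal trState trEvent m0 events =
    map trState (scanl stepModel m0 events)
      == scanl stepReal (trState m0) (map trEvent events)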

For example, when a bug in the full wallet meant that change from pending transactions was sometimes double-counted (in the case where pending transactions use the change of other pending transactions as inputs, a case that can only happen in the presence of forks), the generator was able to find a counter-example to the above commuting diagram, and gave us the exact sequence of wallet events that leads to a wallet state in which the real wallet and the model disagree, as well as the precise value that they disagree on.

Conclusions

Software specifications, where they exist at all, are often informal documents that describe, in natural language, the features that the software is intended to have. Such specifications do not lend themselves easily to verification or even testing. At the other end of the spectrum, fully formal specifications where every property is verified mathematically are costly and time consuming to produce and require specialized expertise. They are of course the gold standard, but there is also a useful middle ground: by specifying the model and its properties formally, we can test its properties using QuickCheck. Moreover, we get a model in which we can reason about the core functionality of the application and which we can then compare against the real implementation.

IOHK’s development of the new wallet is open source and can be found on GitHub. Excitingly, IOHK have recently hired someone to start work on the Coq formalization of the wallet specification, which will put the whole specification on an even stronger footing. Of course, that does not make any of the existing work useless: although it will become less important to test the invariants in the model, having the QuickCheck generators available to check the real implementation is still very valuable. Moreover, having the QuickCheck tests validate the invariants before we attempt to prove them formally can also save valuable time if we discover early that an invariant is not, in fact, true.

References

"Formal specification for a Cardano wallet”, Duncan Coutts and Edsko de Vries
GitHub repository for the new wallet


[1] We ignore transaction fees for the sake of simplicity.

Artwork: Creative Commons, Mike Beeple

Vision for blockchain in Africa is becoming a reality

Ethiopia and Rwanda keen to realise promise of the technology

29 May 2018 John O'Connor 7 mins read

Vision for blockchain in Africa is becoming a reality - Input Output

The hope of bringing the benefits of blockchain to Africa has been around even longer than IOHK itself. Founder Charles Hoskinson talked about the promise of the technology driving financial inclusion on the continent in a TEDx speech in 2014. The vision encouraged me to join the Cardano project. This month, IOHK signed an MoU with the Ethiopian government to train and hire junior software developers and use Cardano in its agriculture industry, a step towards this promise being realised. Now the real work begins, as we look to use Cardano to solve real problems in Ethiopia and beyond.

The IOHK team flew into Addis Ababa, the capital, on May 3 for Ethiopia’s first blockchain forum, which we jointly organised with the Ethiopian Ministry of Science and Technology. As Director of African Operations, my last two months in Ethiopia have been a whirlwind of interaction with government, universities and the local technology scene. It was a pleasure to have those that I’ve met join us on stage to discuss what blockchain can do for Ethiopia.

We did not arrive with solutions, but with a commitment to find them. IOHK’s Alan McSherry, who leads the Cardano Enterprise team, came out to gather business requirements for permissioned applications of Cardano, as did 13 other members of the IOHK team. Seeing the ideas that emerged from the dialogue between IOHK and the ministry validated our launch in Ethiopia, and our belief in a productive and mutually beneficial relationship. This belief is enshrined in the MoU Charles signed with the Minister of Science and Technology, Dr Getahun.

MoUs are non-binding but are taken seriously in Ethiopia, because they represent a required first step before entering into contractual obligations. In the MoU, we stated our intent to train and hire Haskell developers from a new office in Addis Ababa. The Ethiopian government stated its intent to help us to deploy Cardano in Ethiopian agriculture. This training course is the first of its kind in Africa. We expect to start in September 2018 with an inaugural class of 30 women developers. All of our trainees will leave with the ability to create blockchain applications to help drive tech-enabled growth in Ethiopia. Those who excel will be hired by IOHK and not only contribute to Cardano code, but help build the blockchain agriculture applications we are exploring in partnership with the government. With 80 million Ethiopians working in this industry, the opportunity for blockchain to make an impact is huge. Coffee is Ethiopia’s largest export, and there are many sophisticated companies along the supply chain who have a deep understanding of the product.

We are in active discussions with many such companies to develop, refine, and implement the technology. Proving the origin of coffee is one such application. I’m also excited about using smart contracts to incentivise smallholder coffee farmers to adopt more productive farming practices (I’ll be writing more about some of our projects like this in the coming weeks). Some 80% of Ethiopia’s population is under the age of 30, and GDP growth is 10% a year. With the right support from government, Ethiopia could transform into an incredibly powerful, technology-enabled economy. Ethiopia’s national animal is the lion, and if that lion has been sleeping, it is now waking up.

Community day

The day after the conference, IOHK spent time with many of Ethiopia’s most promising entrepreneurs and its tech community. IOHK worked with the Ethiopian accelerators ICE Addis and Bluemoon to have local startups pitch their businesses and explain how they planned to leverage blockchain technology. The judge was, of course, Charles.

Charles meeting entrepreneurs in Addis

This was a highlight of the trip for me. After a pitch Charles would mischievously state that he had a few questions, before delivering a grilling that would have a Harvard MBA looking for the nearest exit. The startups held up fantastically.

Since I began talking to local entrepreneurs in Addis two months ago I’ve received many messages from local companies interested in learning more about the technology. The first Cardano Addis meetup was the beginning of a process of fostering understanding and adoption of the technology. The pitches showed me that these startups have not only begun looking at blockchain, but have puzzled out how it can add value to their businesses. Our colleagues in Cardano’s commercial entity Emurgo shared my view, and are in discussions about investing in a cohort of blockchain-enabled Ethiopian technology startups.

Rwanda

The next leg of the trip for the IOHK team took us to Kigali, Rwanda, where Charles was talking at the Transform Africa Summit.

Charles speaking at the Transform Africa Summit in Kigali

The summit aimed to bring together global and regional leaders from government, business and international organizations to collaborate on new ways of shaping, accelerating and sustaining Africa’s digital revolution. Blockchain was a key focus for the conference organisers, and we were honoured to have been invited and recognised for the efforts we are beginning to make in the region. Rwanda is truly a miraculous example of what can be accomplished when a country is united in its purpose and its government has embraced the principles of accountability, efficiency and openness.

The opportunity for IOHK to help enable this transformation is incredibly exciting. There are many countries where the promise of a technology that can bring transparency to government processes would be met with trepidation. It is a huge credit to Rwanda that it seeks not only to participate in the technology’s adoption, but to promote it through the Transform Africa Summit.

And so on to strategy. Whilst we expect to have operations across many African countries in the future, focusing our initial efforts on Ethiopia and Rwanda will be a powerful start. Ethiopia has a government that is eager to engage in digital transformation and, though early in that process, has the ability to deploy the technology across the country. The applications we are discussing could benefit tens of millions of people. Rwanda, on the other hand, has a population a tenth the size of Ethiopia’s. However, it is leading Africa in its commitment to supporting innovative technologies. Ideas can be tried and tested. If they fail, lessons will be learnt and the next attempt will be better. This principle is the central driver of innovation, and Rwanda has embraced it.

Charles and John with Dr Getahun, Minister of Science and Technology

After a whirlwind few weeks, I’d like to thank everyone who has allowed us to reach this point. I thank ICE Addis and Bluemoon Capital, as well as Impact Hub in Kigali, for hosting events with us and connecting me with local entrepreneurs. I thank the IOHK and Emurgo teams who flew out to be on the ground and help us establish our work. But most importantly, I’d like to thank Yodahe Zemichael from the Science Technology and Information Centre and Dr Getahun Mekuria, Minister of Science and Technology. You have been more agile, proactive, and enthusiastic partners than we could have hoped for. We look forward to working with you.