Ever wondered how to verify smart contracts written in Solidity? Thanks to our deep embedding of Solidity into Isabelle/HOL, you can now start verifying smart contracts in Isabelle.
Our formalization is available in the Archive of Formal Proofs [1], which can easily be added to Isabelle/HOL. If you are looking for a more high-level description of the underlying work, see our conference papers on the topic [3].
In our work “A Denotational Semantics of Solidity in Isabelle/HOL” [1], we presented a formal semantics for Solidity, the most common language for implementing smart contracts. Such a formal semantics is one of the cornerstones of a formal verification approach that can mathematically prove the absence of certain types of bugs (e.g., the Parity Wallet bug that made USD 280 million worth of Ether inaccessible).
But, of course, any verification can only be as good as the underlying semantics. So how do we ensure that our formal semantics faithfully captures the behavior of Solidity? This is the question we answer in our latest paper [2], which will be presented at the International Conference on Tests and Proofs (TAP).
In our approach, we use grammar-based fuzzing, a technique that generates example programs from a formal grammar. Our goal is to ensure that our post-hoc developed formal semantics of Solidity complies with the real-world system, i.e., our formalization should behave identically to the implementation. To ensure this, we generate a test oracle from our formal specification, which allows us to check that a test case (generated by the grammar-based fuzzer) executed on the formal semantics yields the same result as when executed on the Ethereum blockchain. Our main contributions are:
Update: The formalization is now also available in the Archive of Formal Proofs [3].
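To make the fuzzing step above concrete, here is a minimal Python sketch of grammar-based fuzzing: a toy grammar (far simpler than Solidity's, and invented here for illustration) is expanded randomly into example programs that could then be fed to both the formal semantics and the reference implementation.

```python
import random

# A toy grammar for an arithmetic-expression fragment. The grammar rules
# and names are illustrative only, not those of our actual tool.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<term>"], ["<term>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<term>": [["<digit>"], ["(", "<expr>", ")"]],
    "<digit>": [["0"], ["1"], ["2"], ["7"]],
}

def generate(symbol="<expr>", depth=0, max_depth=5):
    """Expand a nonterminal by picking a random production."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbol: emit as-is
    productions = GRAMMAR[symbol]
    # Past the depth budget, take the shortest production to force termination.
    if depth >= max_depth:
        production = min(productions, key=len)
    else:
        production = random.choice(productions)
    return "".join(generate(s, depth + 1, max_depth) for s in production)

random.seed(0)  # deterministic for demonstration
samples = [generate() for _ in range(5)]
```

In the actual approach, each generated program is executed both on the formal semantics (via the oracle generated from the specification) and on the blockchain, and the results are compared.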
Smart contracts are programs, usually automating legal agreements such as financial transactions. Thus, bugs in smart contracts can lead to large financial losses. For example, an incorrectly initialized contract was the root cause of the Parity Wallet bug that made USD 280 million worth of Ether inaccessible. Ether is the cryptocurrency of the Ethereum blockchain, which uses Solidity for expressing smart contracts.
In our SEFM paper [1], we address this problem by presenting an executable denotational semantics for Solidity in the interactive theorem prover Isabelle/HOL. This formal semantics forms the foundation of an interactive program verification environment for Solidity programs and allows for inspecting Solidity programs by (symbolic) execution. We combine the latter with grammar-based fuzzing to ensure that our formal semantics complies with the Solidity implementation on the Ethereum blockchain. Finally, we demonstrate the formal verification of Solidity programs with two examples: constant folding and memory optimization.
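To illustrate the kind of transformation meant by constant folding, here is a minimal Python sketch on a toy expression AST. The actual proof is carried out on the Isabelle/HOL semantics of Solidity; all names below are ours and purely illustrative.

```python
from dataclasses import dataclass

# Toy expression AST: a stand-in for the Solidity expressions handled
# in the formalization (the constructors are invented for this sketch).
@dataclass
class Lit:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class Add:
    left: object
    right: object

def fold(e):
    """Recursively replace constant subexpressions by their value."""
    if isinstance(e, Add):
        l, r = fold(e.left), fold(e.right)
        if isinstance(l, Lit) and isinstance(r, Lit):
            return Lit(l.value + r.value)  # both sides constant: fold
        return Add(l, r)
    return e  # literals and variables are left unchanged

# (1 + 2) + x  folds to  3 + x
optimized = fold(Add(Add(Lit(1), Lit(2)), Var("x")))
```

Verifying such an optimization then amounts to proving, in the formal semantics, that the folded program evaluates to the same result as the original one in every state.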
The formalization and the presented tools are available on Zenodo.
Why formalize an existing standard, i.e., develop a formal semantics for it, as we did with our formalization of the DOM standard? Of course, there are obvious benefits, such as identifying glitches or areas of (unwanted) under-specification, but what are the wider benefits?
The most important question that needs to be answered is: what is the relation between the formalization of a standard, the official standard, and the implementations that claim to conform to the standard? If there is no strong link among these three artifacts, formal proofs based on the formal specification are of limited value. This holds regardless of whether the formalization was developed in a post-hoc reverse-engineering fashion or is part of the official standard.
In practice, most popular technologies are specified only by standards using a semi-formal or, worse, an informal notation. Moreover, the tools used for writing standards support, if at all, only trivial consistency checks. Thus, it is no surprise that such standards usually contain inconsistencies (e.g., different sections of the same standard contradicting each other) or unwanted under-specification (e.g., where the authors of the standard omit important properties that the defined API should fulfill).
Even if a standard is developed formally, or contains a (often non-normative) formalization, two important questions arise:
If the formal model was used for verifying properties, one also needs to validate that the real system fulfills the assumptions made during the verification.
Luckily, for many industrial standards it is common that the standard includes a compliance test suite, which gives a first hint at how to improve the situation: if the formalization is executable, we can run the test cases on the formal specification to check that the specification complies with the test suite. But can we do more?
Yes, we can. The following figure shows how we can use test and proof (verification) techniques for establishing strong links between formal standards, compliance test suites, and implementations.
In more detail, we can
This approach shows not only that testing and verification go hand in hand, but also that a formalization of a standard can help improve the informal parts of the standard, such as its compliance test suite.
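The most basic of these links, executing a compliance test suite on an executable specification, can be sketched as follows. Here `spec_eval` and the test cases are placeholders invented for illustration, standing in for an executable specification exported from the theorem prover and an official test suite.

```python
def spec_eval(program):
    # Placeholder for the executable formal specification: here we simply
    # evaluate an arithmetic expression string (no builtins available).
    return eval(program, {"__builtins__": {}})

# A stand-in compliance test suite: (input program, expected result) pairs.
test_suite = [
    ("1+2", 3),
    ("2*3+1", 7),
    ("(1+2)*4", 12),
]

def check_compliance(suite, run):
    """Execute every test case on the specification and collect mismatches."""
    return [(prog, expected, run(prog))
            for prog, expected in suite
            if run(prog) != expected]

failures = check_compliance(test_suite, spec_eval)
```

A non-empty failure list points either at a bug in the formalization or at an inconsistency between the standard and its test suite, and both findings are valuable.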
If you are interested in more detail, please have a look at our TAP paper [1], where we report on applying some of these ideas to our formalization of the DOM standard [2].
At its core, the Document Object Model (DOM) defines a tree-like data structure for representing documents in general and HTML documents in particular. It is the heart of any modern web browser. Formalizing the key concepts of the DOM is a prerequisite for the formal reasoning over client-side JavaScript programs and for the analysis of security concepts in modern web browsers.
As a first step towards a verified client-side web application stack, we model and formally verify the Document Object Model (DOM) in Isabelle/HOL. The Document Object Model (DOM) is the central data structure of all modern web browsers. At its core, the DOM defines a tree-like data structure for representing documents in general and HTML documents in particular. Thus, the correctness of a DOM implementation is crucial for ensuring that a web browser displays web pages correctly. Moreover, the DOM is the core data structure underlying client-side JavaScript programs, i.e., client-side JavaScript programs are mostly programs that read, write, and update the DOM.
We formalize the DOM as a shallow embedding in Isabelle/HOL using a typed data model for the node-tree. Furthermore, we formalize a typed heap for storing (partial) node-trees together with the necessary consistency constraints. Finally, we formalize the operations on this heap that allow manipulating node-trees.
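As a rough illustration of a heap of node-trees with consistency constraints, consider the following Python sketch. It is drastically simplified (pointers are plain integers, and the well-formedness predicate checks only a few of the constraints); all names are ours, not the Isabelle identifiers.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    children: list = field(default_factory=list)  # child pointers

@dataclass
class Heap:
    nodes: dict = field(default_factory=dict)  # pointer -> Node

    def wellformed(self):
        """A few consistency constraints on the heap: every child pointer
        is allocated, each children list is distinct, and no node is the
        child of two different parents."""
        seen = set()
        for node in self.nodes.values():
            if len(node.children) != len(set(node.children)):
                return False  # duplicate entry in a children list
            for c in node.children:
                if c not in self.nodes or c in seen:
                    return False  # dangling pointer or second parent
                seen.add(c)
        return True

# A small well-formed heap: node 0 is the parent of nodes 1 and 2.
h = Heap({0: Node([1, 2]), 1: Node(), 2: Node()})
```

In the formalization, such constraints are captured by a well-formedness predicate over the typed heap, and every heap-manipulating operation is proven to preserve it.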
For example, the HOL definition of \(\text{adopt_node}\), i.e., the method that removes a node from its previous parent (if it has one) and assigns it to the new \(\text{ownerDocument}\), looks as follows: \[ \begin{array}{l} \color{blue}{\textbf{definition}}~\text{adopt_node}~::~\\ ~~~~\_~\text{document_ptr}_\text{Core_DOM}~\Rightarrow~\_~\text{node_ptr}_\text{Core_DOM}~\Rightarrow~\_~\text{dom_prog}\\ ~~\textbf{where}\\ ~~~~\text{adopt_node}~document~node~=~\text{do}~\{~\\ ~~~~~~parent\_opt~\leftarrow~\text{get_parent}~node;~\\ ~~~~~~(\text{case}~parent\_opt~\text{of}~\\ ~~~~~~~~\text{Some}~parent~\Rightarrow~\text{remove_child}~parent~node~\\ ~~~~~~|~\text{None}~\Rightarrow~\text{do}~\{~\\ ~~~~~~~~~~old\_document~\leftarrow~\text{get_owner_document}~(\text{cast}~node);~\\ ~~~~~~~~~~\text{remove_from_disconnected_nodes}~old\_document~node\\ ~~~~~~~~\});~\\ ~~~~~~\text{add_to_disconnected_nodes}~document~node~\\ ~~~~~\}~ \end{array} \] First, \(\text{adopt_node}\) tries to retrieve the parent of the node to be adopted. If the node has a parent, it is removed from that parent's children list; otherwise, it is removed from the list of disconnected nodes of its previous owner document. Finally, the node is added to the disconnected nodes of the new document.
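The control flow just described can be mirrored in a few lines of Python. The classes and helper names below are our simplifications for illustration, not the Isabelle identifiers, and the model omits the typed pointers of the formalization.

```python
class Document:
    def __init__(self):
        self.disconnected_nodes = []  # nodes owned by this document but not in its tree

class Node:
    def __init__(self, owner):
        self.parent = None
        self.owner_document = owner
        self.children = []

def remove_child(parent, node):
    parent.children.remove(node)
    node.parent = None

def adopt_node(document, node):
    # First, try to retrieve the node's parent.
    if node.parent is not None:
        remove_child(node.parent, node)  # detach from the parent's children list
    else:
        # No parent: the node sits in its old owner's disconnected-nodes list.
        node.owner_document.disconnected_nodes.remove(node)
    # Finally, register the node as disconnected in the new document.
    document.disconnected_nodes.append(node)
    node.owner_document = document

# A disconnected node is moved from its old owner document to a new one.
old, new = Document(), Document()
n = Node(old)
old.disconnected_nodes.append(n)
adopt_node(new, n)
```

The Isabelle version operates monadically on the typed heap, so the same steps additionally carry proof obligations about heap well-formedness.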
We can now formally prove important properties of \(\text{adopt_node}\) such as \[ \begin{array}{l} \color{blue}{\textbf{lemma}}~\text{adopt_node_children_remain_distinct}:\\ ~~\textbf{assumes}~\text{wellformed}:~\text{heap_is_wellformed}~h\\ ~~~~\textbf{and}~\text{parent_known}:~\And~parent.\\ ~~~~~~h~\vdash~\text{get_parent}~node\_ptr~\rightarrow_r~\text{Some}~parent\\ ~~~~~~\Longrightarrow~\text{is_known_ptr}_\text{Core_DOM}~parent\\ ~~~~\textbf{and}~\text{adopt_node}:~h~\vdash~\text{adopt_node}~\text{owner_document}~node_ptr~\rightarrow_h~h2\\ ~~~~\textbf{and}~\text{ptr_known}:~\text{is_known_ptr}_\text{Core_DOM}~ptr\\ ~~~~\textbf{and}~\text{children}:~h2~\vdash~\text{get_child_nodes}~ptr~\rightarrow_r~children\\ ~~\color{blue}{\textbf{shows}}~\text{distinct}~children\\ \end{array} \] This lemma states that after using \(\text{adopt_node}\), all children lists remain distinct.
Our machine-checked formalization of the DOM node tree has the following properties:
For more details, see our WWW paper [1]. The formalization is available in the Archive of Formal Proofs (AFP) [2].