Posted on by Achim D. Brucker, licensed under CC BY-ND 4.0.

Formalizing the Core DOM in Isabelle/HOL

At its core, the Document Object Model (DOM) defines a tree-like data structure for representing documents in general and HTML documents in particular. It is the heart of any modern web browser. Formalizing the key concepts of the DOM is a prerequisite for the formal reasoning over client-side JavaScript programs and for the analysis of security concepts in modern web browsers.

As a first step towards a verified client-side web application stack, we model and formally verify the Document Object Model (DOM) in Isabelle/HOL. The Document Object Model (DOM) is the central data structure of all modern web browsers. At its core, the DOM defines a tree-like data structure for representing documents in general and HTML documents in particular. Thus, the correctness of a DOM implementation is crucial for ensuring that a web browser displays web pages correctly. Moreover, the DOM is the core data structure underlying client-side JavaScript programs, i.e, client-side JavaScript programs are mostly programs that read, write, and update the DOM.

We formalize the DOM as a shallow embedding in Isabelle/HOL using a typed data model for the node-tree. Furthermore, we formalize a typed heap for storing (partial) node-trees together with the necessary consistency constraints. Finally, we formalize the operations on this heap that allow manipulating node-trees.

For example, the HOL definitions of \(\text{adopt_node}\), i.e. the method that removes a node from its previous parent, if it had any, and assigns it to the new \(\text{ownerDocument}\), looks as follows: \[ \begin{array}{l} \color{blue}{\textbf{definition}}~\text{adopt_node}~::~\\ ~~~~\_~\text{document_ptr}_\text{Core_DOM}~\Rightarrow~\_~\text{node_ptr}_\text{Core_DOM}~\Rightarrow~\_~\text{dom_prog}\\ ~~\textbf{where}\\ ~~~~\text{adopt_node}~document~node~=~\text{do}~\{~\\ ~~~~~~parent\_opt~\leftarrow~\text{get_parent}~node;~\\ ~~~~~~(\text{case}~parent\_opt~\text{of}~\\ ~~~~~~~~\text{Some}~parent~\Rightarrow~\text{remove_child}~parent~node~\\ ~~~~~~|~\text{None}~\Rightarrow~\text{do}~\{~\\ ~~~~~~~~~~old\_document~\leftarrow~\text{get_owner_document}~(\text{cast}~node);~\\ ~~~~~~~~~~\text{remove_from_disconnected_nodes}~old\_document~node\\ ~~~~~~~~\});~\\ ~~~~~~\text{add_to_disconnected_nodes}~document~node~\\ ~~~~~\}~ \end{array} \] First, \(\text{adopt_node}\) tries to retrieve the parent of the node to be adopted. If the node has a parent node, it removes the node from the children list, otherwise it removes it from the list of disconnected nodes of the previous owner document. Finally, the node is now added to the disconnected nodes of the new document.

We can now formally prove important properties of \(\text{adopt_node}\) such as \[ \begin{array}{l} \color{blue}{\textbf{lemma}}~\text{adopt_node_children_remain_distinct}:\\ ~~\textbf{assumes}~\text{wellformed}:~\text{heap_is_wellformed}~h\\ ~~~~\textbf{and}~\text{parent_known}:~\And~parent.\\ ~~~~~~h~\vdash~\text{get_parent}~node\_ptr~\rightarrow_r~\text{Some}~parent\\ ~~~~~~\Longrightarrow~\text{is_known_ptr}_\text{Core_DOM}~parent\\ ~~~~\textbf{and}~\text{adopt_node}:~h~\vdash~\text{adopt_node}~\text{owner_document}~node_ptr~\rightarrow_h~h2\\ ~~~~\textbf{and}~\text{ptr_known}:~\text{is_known_ptr}_\text{Core_DOM}~ptr\\ ~~~~\textbf{and}~\text{children}:~h2~\vdash~\text{get_child_nodes}~ptr~\rightarrow_r~children\\ ~~\color{blue}{\textbf{shows}}~\text{distinct}~children\\ \end{array} \] This lemma states that after using \(\text{adopt_node}\), all children lists remain distinct.

Our machine-checked formalization of the DOM node tree has the following properties:

  • It provides a consistency guarantee. Since all definitions in our formal semantics are conservative and all rules are derived, the logical consistency of the DOM node-tree is reduced to the consistency of HOL.
  • It serves as a *technical basis for a proof system. Based on the derived rules and specific setup of proof tactics over node-trees, our formalization provides a generic proof environment for the verification of programs manipulating node-trees.
  • It is executable*, which allows to validate its compliance to the standard by evaluating the compliance test suite on the formal model, and
  • It is extensible, i.e, properties proven over the core DOM do not need to be re-proven for object-oriented extensions such as the HTML document model.

For more details, see our WWW paper [1]. The formalization is available in the Archive of Formal Proofs (AFP) [2].

References

[1]
A. D. Brucker and M. Herzberg, “A formal semantics of the Core DOM in Isabelle/HOL,” in The 2018 web conference companion (WWW), 2018, pp. 741–749. doi: 10.1145/3184558.3185980.
[2]
A. D. Brucker and M. Herzberg, The Core DOM,” Archive of Formal Proofs, Dec. 2018,

Welcome to the blog of the Software Assurance & Security Research Team at the University of Exeter. We blog regularly news, tips & tricks, as well as fun facts about software assurance, reliability, security, testing, verification, hacking, and logic.

You can also follow us on Twitter: @logicalhacking.

Categories

Archive

Tags

academia ai android apidesign appsec bitcoin blockchain bpmn browser browserextensions browsersecurity bug certification chrome composition cordova dast devops devsecops dom dsbd efsm epsrc event extensions fixeffort floss formaldocument formalmethods funding hol-ocl hol-testgen humanfactor hybridapps iast industry internetofthings iot isabelle/hol isabelledof isadof latex logic maintance malicous mbst mobile mobile apps modelinference modeling monads monitoring msc ocl ontology opensource owasp patches pet phd phdlife phishing policy protocols publishing reliability research safelinks safety sap sast sdlc secdevops secureprogramming security securityengineering securitytesting semantics servicecomposition skills smartcontract smartthings softwareeinginering softwaresecurity softwaresupplychain solidity staff&positions statemachine studentproject tcb test&proof testing tips&tricks tools transport tuos uk uoe upgrade usability verification vulnerabilities vulnerableapplication webinar websecurity

Search


blog whole site