Posted on by Achim D. Brucker, licensed under CC BY-ND 4.0.

Formalising EFSM Transition Merging

Quite often, we need to develop a detailed understanding how a system works, without having access to its internal implementation. For example, we might need to understand and old legacy system, which we want to connect to our brand-new infrastructure, or we want to do a security assessment of a third party system.

One approach to develop a better understanding of a “black-box” system is to infer a model (e.g., in form of a Finite State Machine) from system traces.

Consider, for example, the following traces: \[ \begin{align*} &\mathit{select}(\mathit{coke}) \rightarrow \mathit{coin}(50)/[50] \rightarrow \mathit{coin}(50)/[100] \rightarrow \mathit{vend}()/[coke] \\ &\mathit{select}(\mathit{coke}) \rightarrow \mathit{coin}(100)/[100] \rightarrow \mathit{vend}()/[\mathit{coke}] \\ &\mathit{select}(\mathit{pepsi}) \rightarrow \mathit{coin}(50)/[50] \rightarrow \mathit{vend}() \rightarrow \mathit{coin}(50)/[100] \rightarrow \mathit{vend}()/[\mathit{pepsi}] \end{align*} \] These traces describe certain executions of a vending machine (where \(\mathit{label}(\mathit{arguments})/[\mathit{outputs}]\) represents an event).

Using Finite State Machine (FSM) inference, we can construct a FSM that accepts the traces from out example:

An FSM accepting the traces from our example.

In traditional FSM inference, labels are atomic. For example, \(\mathit{select}(\mathit{coke})\) does not represent an event with label \(\mathit{select}\) and input \(\mathit{coke}\), rather the transition is labelled by the literal string “select(coke)” making it a completely separate entity from the transition “select(pepsi).” This is a major problem as it means that information such as the selected drink and accrued funds must be encoded as part of the control state. Increasing product choice or the coins accepted quickly causes an explosion in model size disproportionate to the change in observable behaviour.

EFSM inference can make use of a compact representation of the internal state of the inferred system:

An EFSM accepting the traces from our example.

In this EFSM, which also accepts the traces from our example, the selected drink is stored in a register \(r_1\) for later use in the output of the \(vend\) transition. A second register \(r_2\) (initialised with \(0\) by the select transaction) keeps track of the money inserted so far. Drinks are only dispensed once this value reaches \(100\). This enables customers to pay for their drink with any coin in any order. This is a much more concise and faithful model of the real system.

Current EFSM inference approaches tend to focus on guard expressions, overlooking how individual transitions mutate the data state. The inference of data update functions is a key technical challenge in EFSM inference but significantly complicates the merging process.

In our ICFEM paper [1], we present a novel approach for inferring EFSMs from system traces that also infers the updates of the internal state. Our approach supports the merging of transitions that update the internal data state. Finally, our model is formalised in Isabelle/HOL, allowing for the machine-checked verification of system properties.

References

1. Foster, M., Taylor, R. G., Brucker, A. D., and Derrick, J. “Formalising Extended Finite State Machine Transition MergingICFEM (2018): 373–387. doi:10.1007/978-3-030-02450-5, URL: http://www.brucker.ch/bibliography/abstract/foster.ea-efsm-2018