Modeling Imperative String Operations with Transducers
Abstract
We present a domain-specific imperative language, Bek, that directly models low-level string manipulation code featuring boolean state, search operations, and substring substitutions. We show constructively that Bek is reversible through a semantics-preserving translation to symbolic finite state transducers, a novel representation for transducers that annotates transitions with logical formulae. Symbolic finite state transducers give us a new way to marry the classic theory of finite state transducers with the recent progress in satisfiability modulo theories (SMT) solvers. We exhibit an efficient well-founded encoding from symbolic finite state transducers into the higher-order theory of algebraic datatypes. We evaluate the practical utility of Bek as a constraint language in the domain of web application sanitization code. We demonstrate that our approach can address real-world queries regarding, for example, the idempotence and relative strictness of popular sanitization functions.
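To make the idea of a transducer whose transitions carry predicates more concrete, here is a minimal Python sketch. It is not Bek syntax and it does not use an SMT solver: guards are plain Python predicates, and idempotence is only tested by brute force over short strings from a small alphabet, which is a much weaker check than the symbolic analysis the abstract describes. All names (`Transition`, `SymbolicTransducer`, `find_idempotence_counterexample`, the two example sanitizers) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Optional

QUOTES = "'\""
BACKSLASH = "\\"


@dataclass(frozen=True)
class Transition:
    guard: Callable[[str], bool]   # predicate over the current input character
    output: Callable[[str], str]   # output string computed from that character
    target: str                    # state reached after taking the transition


class SymbolicTransducer:
    """Deterministic transducer: the first transition whose guard holds is taken."""

    def __init__(self, initial: str, transitions: Dict[str, List[Transition]]):
        self.initial = initial
        self.transitions = transitions

    def run(self, text: str) -> str:
        state, out = self.initial, []
        for ch in text:
            for t in self.transitions[state]:
                if t.guard(ch):
                    out.append(t.output(ch))
                    state = t.target
                    break
            else:
                raise ValueError(f"no transition from {state!r} on {ch!r}")
        return "".join(out)


# Escapes quotes unless the previous character was an unescaped backslash.
# The two states play the role of a single boolean variable.
escaper = SymbolicTransducer("plain", {
    "plain": [
        Transition(lambda c: c == BACKSLASH, lambda c: c, "escaped"),
        Transition(lambda c: c in QUOTES, lambda c: BACKSLASH + c, "plain"),
        Transition(lambda c: True, lambda c: c, "plain"),
    ],
    "escaped": [
        Transition(lambda c: True, lambda c: c, "plain"),
    ],
})

# Stateless variant that escapes every quote, even one already escaped.
naive = SymbolicTransducer("plain", {
    "plain": [
        Transition(lambda c: c in QUOTES, lambda c: BACKSLASH + c, "plain"),
        Transition(lambda c: True, lambda c: c, "plain"),
    ],
})


def find_idempotence_counterexample(
    f: Callable[[str], str], alphabet: str, max_len: int
) -> Optional[str]:
    """Return the first string s (shortest first) with f(f(s)) != f(s), else None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            once = f(s)
            if f(once) != once:
                return s
    return None


if __name__ == "__main__":
    alphabet = "a'" + BACKSLASH
    print(repr(escaper.run("it's")))
    print(find_idempotence_counterexample(escaper.run, alphabet, 4))
    print(find_idempotence_counterexample(naive.run, alphabet, 4))
```

A bounded search like this can only refute idempotence or fail to find a counterexample up to the chosen length; establishing the property for all inputs is the kind of query that a symbolic encoding is meant to answer.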
Related resources
BEK: Re-Envisioning In-Browser Privacy
Web applications must use special string-manipulating sanitization functions on untrusted user data, but writing these functions correctly is error prone and time consuming. We present a domain-specific imperative language, BEK, that is expressive enough to capture real web sanitizers used in the Internet Explorer XSS Filter and the Google AutoEscape framework. We exhibit a translation from the...
Algorithmic Verification of Single-Pass List Processing Programs
We introduce streaming data string transducers that map input data strings to output data strings in a single left-to-right pass in linear time. Data strings are (unbounded) sequences of data values, tagged with symbols from a finite set, over a potentially infinite data domain that supports only the operations of equality and ordering. The transducer uses a finite set of states, a finite set o...
Encoding second order string ACG with deterministic tree walking transducers
In this paper we study the class of string languages represented by second order Abstract Categorial Grammar. We prove that this class is the same as the class of output languages of deterministic tree walking transducers. Together with the result of de Groote and Pogodalla (2004) this shows that the higher-order operations involved in the definition of second order ACGs can always be represented by...
Belief Propagation with Strings
Strings and string operations are very widely used, particularly in applications that involve text, speech or sequences. Yet the vast majority of probabilistic models contain only numerical random variables, not strings. In this paper, we show how belief propagation can be applied to do inference in models with string random variables which use common string operations like concatenation, find/...
On Precise Modeling of Regular Replacement
This paper studies the precise modeling of various semantics of regular substitution, such as the declarative, finite, greedy, and reluctant replacement, using finite state transducers (FST) as filters. By projecting an FST of regular replacement to its input/output tapes, we are able to solve atomic string constraints, which can be applied to both the forward and backward image computation in ...
Publication date: 2010