0.1 Introduction
A formula in some theory is monadically decomposable if it is equivalent to a Boolean combination of monadic predicates in , i.e., to a monadic decomposition of . Monadic decomposability measures how tightly the free variables in are coupled. For example, is not monadically decomposable in any (finitary) logic over an infinite domain, but can be decomposed, in Presburger arithmetic over natural numbers, since it can be written as .
Veanes et al. [27] initiated the study of monadic decomposability in the setting of Satisfiability Modulo Theories, wherein formulas are required to be quantifierfree. Monadic decomposability has many applications, including symbolic transducers [12] and string analysis [27]. Although the problem was shown to be in general undecidable, a generic semialgorithm for outputting monadic decompositions (if decomposable) was provided. A termination check could in fact be added if the input formula belongs to a theory for which monadic decomposability is decidable, e.g., linear arithmetic, Tarski’s RealClosed Field, and the theory of uninterpreted functions. Hitherto, not much is known about the computational complexity of monadic decomposability problems for many firstorder theories (in particular, quantifierfree theories), and about practical algorithms. This was an open problem raised by Veanes et al. in [27].
Monadic decomposability is intimately connected to the variable partition problem, first studied by Libkin [21] nearly 20 years ago. In particular, a monadic decomposition gives rise to a partition of the free variables of a formula , wherein each part consists of a single variable. More precisely, take a partition of into sets of variables, with linearizations . The formula is decomposable (in some theory ) if it is equivalent to a boolean combination of formulas of the form . As suggested in [21], such variadic decompositions of have potential applications in optimization of database query processing and quantifier elimination. The author gave a general condition for the decidability of variable independence in firstorder theories. This result is unfortunately not easily applicable in the SMT setting for at least two reasons: (i) the full firstorder theory might be undecidable (e.g. theory of uninterpreted functions), and (ii) even for a firstorder theory that admits decidable monadic decompositions, the complexity of the algorithm obtained from [21] could be too prohibitive for the quantifierfree fragment. One example that epitomizes (ii) is the problem of determining whether a given relation over strings represented by a regular transducer could be expressed as a boolean combination of monadic predicates. The result of [21] would give a double exponentialtime algorithm for monadic decomposability, whereas it was recently shown in [5] to be solvable in polynomialtime (resp. polynomialspace) when the transducer is given as a deterministic (resp. nondeterministic) machine.
Contributions.
First, we determine the complexity of deciding monadic decomposability and outputting monadic decompositions (if they exist) for the theory of integer linear arithmetic in the setting of SMT. Our result is summarized in Theorem 0.1.1.
Theorem 0.1.1 (Monadic Decomposability).
Given a quantiferfree formula of Presburger Arithmetic, it is coNPcomplete to decide if is monadically decomposable. This is efficiently reducible to unsatisfiability of quantifierfree Presburger formulas. Moreover, if a decomposition exists, it can be constructed in exponential time.
We show a new application of monadic decomposability in integer linear arithmetic for SMT over strings, which is currently a very active research area, e.g., see [1, 10, 25, 20, 26, 2, 6, 3, 17, 22, 13]. One problem that makes string constraint solving difficult is the presence of additional length constraints, which forces the lengths of the strings in the solutions to satisfy certain linear arithmetic constraints. Whereas satisfiability of string equations with regular constraints is PSPACEcomplete (e.g. see [18, 14]), it is a longstanding open problem [15, 8] whether word equations with length constraints are decidable. Length constraints are omnipresent in Kaluza [25], arguably the first serious string constraint benchmarks obtained from realworld JavaScript applications. Using our monadic decomposability solver, we show that 90% of the Kaluza benchmarks are in fact in a decidable fragment of string constraints, since occurring length constraints can be completely removed by means of decomposition.
Next we extend our result to variadic decomposability (cf. [21]).
Theorem 0.1.2 (Variadic Decomposability).
It is coNPcomplete to decide if is decomposable, given a quantiferfree formula of Presburger Arithmetic and a partition of . This is efficiently reducible to unsatisfiability of quantifierfree Presburger formulas. Moreover, if a decomposition exists, it can be constructed in exponential time.
We show how this could be applied to quantifier elimination. In particular, we show that if a formula , where is quantifierfree, is decomposable—where and are linearizations of the variables in and —then we can compute in exponential time a formula such that , i.e., avoiding the standard doubleexponential blowup (cf. [28]).
Organization.
0.2 Preliminaries
0.2.1 Presburger Syntax
In this paper we study the problem of monadic decomposition for formulas in linear integer arithmetic. All of our results are presented for Presburger arithmetic over natural numbers, but they can be adapted easily to all integers.
Definition 0.2.1 (Fragments of Presburger Arithmetic).
A formula of Presburger arithmetic is a formula of the form where and is a quantifierfree Presburger formula:
where , with , variables range over , and . The operator denotes equality modulo , i.e., whenever is a multiple of . Formulas of the shape , , or are called atoms.
Existential Presburger formulas are formulas of the form for some quantifierfree Presburger formula . We let (resp. ) denote the set of all quantifierfree (resp., existential) Presburger formulas.
Let be a tuple of integer variables. We write for a linear sum over . Let . By slight abuse of notation, we may also write to denote a formula over the variables .
0.2.2 Monadic Decomposability
A quantifierfree formula is called monadic if every atom in contains at most one variable, and it is called monadically decomposable if is equivalent to a monadic formula . In this case, is also called a decomposition of . For our main results we use a slightly refined notion of a formula being decomposable:
Definition 0.2.2 (Monadically Decomposable on ).
Fix a logic (e.g. or ). We say a formula in is monadically decomposable on whenever
for some formulas and in .
It can be observed that a formula is monadically decomposable if and only if it is monadically decomposable on all variables occurring in the formula (cf. Lemma 0.2.5). We expand on this for the variadic case below.
We recall the following characterization of monadic decomposability for formulas with two free variables (cf. [9, 27, 5, 21]), which holds regardless of the theory under consideration. This can be extended easily to formulas with variables, but is not needed in this paper. Given a formula , define the formula as follows:
Proposition 0.2.3.
The relation is an equivalence relation. Furthermore, is monadically decomposable iff has a finite index (i.e. the number of equivalence classes is finite).
Using this proposition, it is easy to show that over a structure with an infinite domain (e.g. integer linear arithmetic) the formula is not monadically decomposable. As was noted already in [21], to check monadic decomposability of a formula in Presburger Arithmetic in general, we may simply check if there is an upper bound on the smallest representation of every equivalence class, i.e.,
However, to derive tight complexity bounds for checking monadic decomposability, this approach is problematic, since the above characterisation has multiple quantifier alternations. Using known results (e.g. [16]), one would only obtain an upper bound in the weak exponential hierarchy [16], which only admits doubleexponential time algorithms.
0.2.3 Variadic Decomposability
The notion of a variadic decomposition generalises monadic decomposition by considering partitions of the occurring variables.
Definition 0.2.4 (Decomposable).
Fix a logic (e.g. or ). Take a formula in and a partition of . We say is decomposable whenever
for some formulas in and linearizations of .
Observe that a formula is monadically decomposable on iff it is decomposable with . Moreover, we say a formula over the set of variables is variadic decomposable on whenever it is decomposable with .
General decompositions can be computed by decomposing on binary partitions , which is why we focus on this binary case in the rest of the paper. We argue why this is the case below.
Let a formula and be given. We can first decompose separately on each where . Using the algorithm in Section 0.4 we obtain for each a decomposition of a specific form:
Note, these decompositions can be performed independently using the algorithm in Section 0.4 and the second conjunct of each disjunct is with replaced by fixed constants . Additionally, each is polynomial in size and each can be represented with polynomially many bits. We note also that our algorithm ensures that each is satisfiable.
Given such decompositions, we can recursively decompose on . We first use the above decomposition for and obtain
Next, we use the decomposition for to decompose the copies of in the decomposition above. We obtain
This process repeats until all have been considered. If is decomposable, we find a decomposition. If is not decomposable, then it would not be possible to do the independent decompositions for each . Thus, for , we can use variadic decompositions on to compute decompositions.
The above algorithm runs in exponential time due both to the exponential size of the decompositions and the branching caused by the disjuncts. If we are only interested in whether a formula is decomposable, it is enough to ask whether it is decomposable on for each . In particular, a formula is monadically decomposable iff is decomposable for each variable . Since the complexity class coNP is closed under intersection, we obtain the following:
Lemma 0.2.5.
A coNP upper bound for monadic decomposability on a given variable implies a coNP upper bound for monadic decomposability. Likewise, a coNP upper bound for variadic decomposability on a given subset of variables implies a coNP upper bound for decomposability.
0.2.4 Example
Consider the formula given by . This formula is monadically decomposable, which means, it is decomposable for .
Our algorithm will first take a decomposition on and might obtain where and . Next, we use a decomposition on . For each we substitute , and as the final decomposition we get
0.3 Monadic Decomposability
0.3.1 Lower Bounds
We first show that unsatisfiability of Boolean formulas can be reduced to monadic decomposability of formulas with only two variables, directly implying coNPhardness:
Lemma 0.3.1 (coNPHardness).
Deciding whether a formula in is monadic decomposable is coNPhard.
Proof.
We reduce from unsatisfiability of propositional formulas to monadic decomposability of . Take a propositional formula . Let be the first primes. Let be the formula obtained from by replacing each occurrence of by . Given an assignment , we let
Thanks to the Chinese Remainder Theorem, is nonempty and periodic with period , which implies that is infinite for every . We also have that iff, for each , is true.
Now define . If is unsatisfiable, then is unsatisfiable and so it is decomposable. Conversely, if can be satisfied by some assignment , then is true for all (infinitely many) . Since all solutions to imply that , by Proposition 0.2.3 we have that is not monadically decomposable. ∎∎
We next provide exponential lower bounds for decompositions in either disjunctive normal form (DNF) or conjunctive normal form (CNF). DNF has been frequently used to represent monadic decompositions by previous papers (e.g. [21, 5, 9]), and it is most suitable for applications in quantifier elimination.
Lemma 0.3.2 (Size of Decomposition).
There exists a family of formulas in such that grows linearly in , while the smallest decomposition on in DNF/CNF is exponential in .
Proof.
Consider the formulas . Using a binary encoding of constants, the size of the formulas is linear in . We show that decompositions in DNF/CNF must be exponential in size.
Disjunctive: Suppose is a monadic decomposition in DNF. Each disjunct , if it is satisfiable at all, has an upper right corner such that holds, but . This immediately implies that exponentially many disjuncts are needed to cover the exponentially many points on the line .
Conjunctive: Suppose is a succinct monadic decomposition of in CNF. Since , it follows that . Therefore, is a succinct decomposition of in DNF, contradicting the lower bound for DNFs.∎∎
0.3.2 Upper Bound
We prove Theorem 0.1.1. Following Lemma 0.2.5, it suffices to show that testing decomposability on a variable is in coNP and that a decomposition can be computed in exponential time. Assume without loss of generality that we have where , and that we are decomposing on the first variable .
We claim that is monadically decomposable on iff
where is a bound exponential in the size of and is a formula asserting that and satisfy the same divisibility constraints. This bound is computable in polynomial time and is described in Section 0.3.4. To define , let be the set of all divisibility constraints or appearing (syntactically) in . Assume without loss of generality that always appears on the lefthand side of a divisibility constraint (i.e., in the position of ). We then define
We prove the claim in the following sections and simultaneously show how to construct the decomposition. Once we have established the above, we can test nondecomposability on by checking
which is decidable in NP. Thus we obtain a coNP decision procedure because the above formula is polynomial in the size of .
Example
We consider some examples. First consider the formula that cannot be decomposed on . Since there are no divisibility constraints, is simply . It is straightforward to see that, , for example by setting , , and .
Now consider the monadically decomposable formula
In this case . We can verify
holds, as it will be the case that and for all the formula will hold whenever holds and . The precondition ensures that the if and only if holds. We will construct the decomposition in the next section.
Expanded Divisibility Constraints
Observe that divisibility constraints are always decomposable. In particular, is equivalent to a finite disjunction of clauses where and are bounded by a multiple of and . The expansion is exponential in size, since the values up to have to be enumerated explicitly.
We define be the set of all constraints of the form where and appears directly in or in the expansion of the divisibility constraints of . This set will be used in the next sections.
0.3.3 Soundness
We show that if
then is decomposable on . We do this by constructing the decomposition.
Although there are doubly exponentially many subsets , there are only exponentially many maximal consistent subsets. We implicitly restrict to such subsets. This is because, for any , there is no value of such that and both hold with but . For any maximal consistent set , let be the smallest integer greater than or equal to satisfying all constraints in . Note, since is maximal, a value that satisfies all constraints in also does not satisfy an constraints not in . The number can be represented using polynomially many bits.
We can now decompose into
This formula is exponential in the size of if only ranges over the maximal consistent subsets of . For values of less than , equivalence with the original formula is immediate. For larger values, we use the fact that, from our original assumption, for any values and that satisfy the same divisibility constraints, we have iff . Hence, we can substitute the values in these cases.
Example
We return to and compute the decomposition on . Assuming
is odd, the decomposition will be as follows. In our presentation we slightly simplify the formula. Strictly speaking
should be expanded to . We simplify these to and , respectively, when instantiated with concrete values of .0.3.4 Completeness
We now show that every formula decomposable on satisfies
We first show that some must exist. Once the existence has been established, we can argue that it must be at most exponential in .
Existence of the Bound
If is decomposable on , then there is an equivalent formula . It is known that every formula is satisfied by a finite union of arithmetic progressions . Let be larger than the largest value of in the arithmetic progressions satisfying the .
We show when then iff for all values and . Assume towards a contradiction that we have values and a tuple of values such that and , but not .
Let be the product of all appearing in some divisibility constraint in . We know that there is some disjunct of the monadic decomposition such that holds. Moreover, let belong to the arithmetic progression . Since we know that also holds for any . That is, we can pump by adding a multiple of , while staying in the same arithmetic progression and satisfying the same divisibility constraints.
Similarly, let be the product of all appearing in the (finite number of) arithmetic progressions that define the monadic decomposition of , limited to disjuncts such that holds. Since does not hold, then also does not hold for any . This means that we can pump staying outside of the arithmetic progressions defining permissible values of for the given values , whilst additionally satisfying the same divisibility constraints.
Now, for each value of satisfying we can consider the disjunctive normal form of . By expanding the divisibility constraints, a disjunct becomes a conjunction of terms of the form, where represents some linear function on ,

or , or

or .
Since there are infinitely many , we can choose one disjunct satisfied by infinitely many . This means that for constraints of the form or with a nonzero , then must be negative or positive respectively (or zero). Otherwise, only a finite number of values of would be permitted.
We know that and do not satisfy the disjunct. We argue that this is a contradiction by considering each term in turn. Since there are infinitely many we can assume without loss of generality that .

If (resp. ) appears and is satisfied by , then must be negative or zero (resp. positive or zero) and will also satisfy the atom.

Atoms of the form do not distinguish values of and thus are satisfied for both and . We cannot have but not since and satisfy the same divisibility constraints.
Thus, it cannot be the case that satisfies the disjunct, while does not. This is our required contradiction. Hence, for all and such that it must be the case that iff . We have thus established the existence of a bound .
Size of the Bound
We now argue that this bound is exponential in the size of , and can thus be encoded in a polynomial number of bits.
Consider the formula that is essentially the negation of our property.
There is some computable bound exponential in the size of (and thus ) such that, if there exists and some such that holds, then there are infinitely many and such that for some we have that holds. An argument for the existence of this bound is given in Appendix .7. In short, we first convert the formula above into a disjunction of conjunctions of linear equalities, using a linear number of slack variables to encode inequalities and divisibility constraints. Then, using a result of Chistikov and Haase [11], we set where is the number of bits needed to encode the largest constant in the converted formula (polynomially related to the size of the formula above), is the maximum number of linear equalities in any disjunct, and is the number of variables (including slack variables).
Now, assume that the smallest is larger than . That is
holds, but it does not hold that
This implies there exists some and such that holds. Thus, there are infinitely many such and , contradicting the fact that all do not satisfy the property. Thus, we take as the value of . It is computable in polynomial time, exponential in size, and representable in a polynomial number of bits.
0.4 Variadic Decomposability
We consider decomposition along several variables instead of just one. In this section, we assume without loss of generality that is given in positive normal form and all (in)equalities rearranged into the form . We may use negation as a shorthand. We require this form because later we use the set of all linear equations in the DNF of a formula. Since negation alters the linear equations, it is more convenient to assume that negation has already been eliminated.
0.4.1 Decomposability
As described in Section 0.2.3, we refine the notion of decomposability to separate only a single set in . Without loss of generality, we assume we are given a formula and we separate the variables in from .
In particular, given a formula we aim to decompose the formula into for some formulas and .
0.4.2 Decomposition
We show that testing whether a given formula is variadic decomposable on is in coNP. This proves Theorem 0.1.2 as the coNP lower bound follows from the monadic case.
Lemma 0.4.1 (Decomposing on ).
Given a formula there is a coNP algorithm to decide if is variadic decomposable on . Moreover, if a decomposition exists, it can be constructed in exponentialtime and is exponential in size.
Let be the set of all such that is a linear inequality appearing in . Our approach will divide the points of into regions where all points within a region can be paired with the same values of to satisfy the formula. These regions are given by a bound . If is within the bound, then two points and are in the same region if . If two points are outside the bound, then by a pumping argument we can show that we have iff .
Let be a partition of into unbounded and bounded functions (where refers to equality being asserted over bounded functions as shown below). Define for each
Note, this formula intentionally does not say anything about the unbounded functions. This is important when we need to derive a bound—such a derivation cannot use a preexisting bound.
We also need to extend to account for and
being vectors. This is a straightforward extension asserting that each variable in
satisfies the same divisibility constraints as its counterpart in . Again, let be the set of all divisibility constraints appearing (syntactically) in . Let , and denote the th variable of , , and respectively. Assume without loss of generality that variables in always either appear on the lefthand side of a divisibility constraint (i.e. in the position) or on both sides. DefineNext, we introduce an operator for comparing a vector of variables with a bound. For let denote the absolute value of . Given a bound and some let
We claim there is an exponential bound such that is variadic decomposable iff for all we have
(DC) 
Note, unsatisfiability can be tested in NP. First guess , then guess .
We prove soundness of the claim in the next section. Completeness is an extension of the argument for the monadic case and is given in Appendix .8. In the monadic case, we were able to take some values of such that both satisfied the same divisibility constraints, but one value satisfied the formula while the other did not. Since these values were large, we derived an infinite number of such value pairs with increasing values. We then used these growing solutions to show that it was impossible for the value of to satisfy the formula, while the value of does not, as they were both beyond the distinguishing power of the linear inequalities. The argument for the variadic case is similar, with the values of and being replaced by the values of and .
0.4.3 Soundness
Assume there is an exponential bound such that for each , Equation DC holds. We show how to produce a decomposition.
As in the monadic case (Section 0.3.3), let be the set of all constraints of the form in the expansion of the divisibility constraints of . Observe again that there are only exponentially many maximal consistent subsets . For each fix a vector of values that satisfies all constraints in and is encodable in a polynomial number of bits. Furthermore, we define
For each and we can define an equivalence relation over values of such that and .
Observe each equivalence relation has an exponential number of equivalence classes depending on the values of the bounded . Let be a set of minimal representatives from each equivalence class such that each representative is representable in a polynomial number of bits. These can be computed by solving an existential Presburger constraint for each set of values of the bounded . In particular, for each and assignments for each , we select a solution to the equation
Comments
There are no comments yet.