217 lines
13 KiB
Markdown
```
defmodule Tdd do
  @moduledoc """
  Ternary decision diagram, used for representing set-theoretic types, akin to CDuce.
  There are 2 types of nodes:
  - terminal nodes (true, false)
  - variable nodes

  variable nodes consist of:
  - the variable being tested
  - yes: id of the node if the result of the test is true
  - no: id of the node if the result of the test is false
  - dc: id of the node if the result of the test is irrelevant for the current operation

  the TDD needs to be ordered and reduced (ROBDD)
  - 'ordered' if different variables appear in the same order on all paths from the root.
  - 'reduced' if the following two rules have been applied to its graph:
    - Merge any isomorphic subgraphs.
    - Eliminate any node whose two children are isomorphic.

  Working notes:
  - structure of the ordered variables:
    I'm thinking of structuring all possible types inside 1 TDD, in contrast to CDuce, which uses a `descr` structure that contains several TDDs (one for each domain, like ints, atoms, functions, etc.), and descr is a union between them.
    For this, I need to come up with a variable structure that'll be ordered.
    My set types will need to represent types like: atoms, strings, ints, maps, tuples, functions, kinds, free polymorphic variables
    Moreover, those types themselves consist of smaller subsets of types like:
    - int < 10
    - int in [1, 2, 3]
    - string > "prefix_"
    - atom == false
    - atom == false or atom == true or atom == nil
    - map == %{"id" => string} and %{string => any | nil}
    - polymorphic types ???
    - etc.
    Don't know how to represent them and make them ordered.
  - node cache:
    I suspect the cache should hold ids of nodes (TDDs) after reduction. This way a comparison between 2 types is just a pointer (id) check in the node cache. But not yet sure.
  - reduction rules: not sure how to approach them

  """

  def node(elem, yes, no, dc = _dont_care) do
  end

  def sum(one, two) do
  end

  def intersect(one, two) do
  end

  def negate(one, two) do
  end
end
```
Please help me refine the theory for implementing set-theoretic types with this.
Let's focus on the ordering of variables in the TDD.

========================================================================================================================

Okay, this is a fascinating and ambitious goal! Building a single, ordered, reduced TDD for all Elixir/Erlang types is a significant challenge, especially with the rich structure of types like maps and functions. Let's break down the ordering of variables.

The core idea for ordering in a BDD/TDD is to have a **total global order** on all possible "questions" (predicates) you can ask about a value.

**High-Level Strategy for Ordering Variables**

1. **Primary Type Discriminators First:** The most fundamental question is "What basic kind of thing is this?"
2. **General Properties Next:** After knowing the primary type, ask about general properties (e.g., size, length).
3. **Specific Value/Content Properties:** Then, delve into specific values or structural content.
4. **Recursive Structure for Nested Types:** For types like tuples and maps, the predicates about their elements/values will effectively be recursive, but the ordering system must flatten this into a global order.
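
**Proposed Variable Ordering Scheme**

Let's define "variables" as unique identifiers for predicates. We need a way to sort these identifiers. A convenient way is to use tuples and derive the order from Elixir's term ordering. One caveat: Erlang/Elixir compares tuples by size first and only then element by element, so identifiers of different arity (such as `{6, :is_empty}` and `{1, :value, false}`) do not sort by category out of the box. To get the category-first order described below, compare the identifiers as lists (via `Tuple.to_list/1`), pad them to a fixed arity, or use an explicit comparator.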
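
As a rough illustration of how term ordering treats these identifiers, the sketch below sorts a few of them twice: once with the default comparison, and once through a list-based sort key. `VarOrderSketch`, `sort_key/1`, and `sort/1` are hypothetical names introduced only for this example; they are one possible way to get a category-first order, not the only one.

```elixir
defmodule VarOrderSketch do
  # Hypothetical helper: turns a tuple identifier (and any nested tuple
  # identifiers) into nested lists. Lists are compared element by element,
  # whereas tuples are compared by size first.
  def sort_key(id) when is_tuple(id) do
    id |> Tuple.to_list() |> Enum.map(&sort_key/1)
  end

  def sort_key(other), do: other

  # Sorts identifiers so that the category number is compared first.
  def sort(ids), do: Enum.sort_by(ids, &sort_key/1)
end

ids = [
  {6, :is_empty},
  {1, :value, false},
  {0, :is_atom},
  {4, :element, 0, {0, :is_atom}}
]

# Default comparison: arity wins, so {6, :is_empty} precedes {1, :value, false}.
IO.inspect(Enum.sort(ids))
# → [{0, :is_atom}, {6, :is_empty}, {1, :value, false}, {4, :element, 0, {0, :is_atom}}]

# List-based key: the category number wins.
IO.inspect(VarOrderSketch.sort(ids))
# → [{0, :is_atom}, {1, :value, false}, {4, :element, 0, {0, :is_atom}}, {6, :is_empty}]
```

Any total order keeps the diagram well-formed; the sort key only matters if the category-first layout is actually wanted. Note that with the list-based key the predicate atoms inside a category still sort by name.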

**Category 0: Primary Type Discriminators**
These are the most fundamental. They will have the lowest sort order.
Order them alphabetically by the type name.
* `v_is_atom = {0, :is_atom}`
* `v_is_binary = {0, :is_binary}`
* `v_is_float = {0, :is_float}`
* `v_is_function = {0, :is_function}`
* `v_is_integer = {0, :is_integer}`
* `v_is_list = {0, :is_list}`
* `v_is_map = {0, :is_map}`
* `v_is_pid = {0, :is_pid}`
* `v_is_port = {0, :is_port}`
* `v_is_reference = {0, :is_reference}`
* `v_is_string = {0, :is_string}` (*Note: Elixir strings are UTF-8 binaries. You might treat them as a subtype of binary or a distinct primary type in your model. For simplicity here, let's assume distinct for now, or you'd have predicates like `{0, :is_binary_utf8}` after `{0, :is_binary}`*)
* `v_is_tuple = {0, :is_tuple}`

**Category 1: Atom-Specific Predicates**
If `is_atom` is true, what specific atom is it?
Order by the atom itself (term ordering compares atoms by name).
* `v_atom_eq_specific_A = {1, :value, :an_atom}` (e.g., `:an_atom` sorts before `false`, because `"an_atom"` precedes `"false"`)
* `v_atom_eq_false = {1, :value, false}`
* `v_atom_eq_nil = {1, :value, nil}`
* `v_atom_eq_true = {1, :value, true}`
* ... (all known/relevant atoms in your system, ordered canonically)
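
A small sketch of the atom ordering, assuming the `{1, :value, atom}` identifier shape from the list above. The variable name `atom_vars` is only for illustration.

```elixir
# Build one Category 1 identifier per atom, in arbitrary input order.
atom_vars = for a <- [true, nil, :an_atom, false], do: {1, :value, a}

# All identifiers have the same arity, so the default comparison falls
# through to the atoms, which are compared by name.
IO.inspect(Enum.sort(atom_vars))
# → [{1, :value, :an_atom}, {1, :value, false}, {1, :value, nil}, {1, :value, true}]
```

Because `true`, `false`, and `nil` are ordinary atoms, they get no special position in this order.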

**Category 2: Integer-Specific Predicates**
If `is_integer` is true:
You need a canonical way to represent integer conditions.
* Equality: `v_int_eq_N = {2, :eq, N}` (e.g., `{2, :eq, 0}`, `{2, :eq, 10}`)
    * Order by N.
* Less than: `v_int_lt_N = {2, :lt, N}` (e.g., `{2, :lt, 0}`, `{2, :lt, 10}`)
    * Order by N.
* Greater than: `v_int_gt_N = {2, :gt, N}` (e.g., `{2, :gt, 0}`, `{2, :gt, 10}`)
    * Order by N.
* Set membership for finite sets: `v_int_in_SET = {2, :in, Enum.sort(SET)}` (e.g. `{2, :in, [1,2,3]}`)
    * Order by the canonical (sorted list) representation of SET.
* *This gets complex. Often, BDDs for integers use bit-level tests, but for set-theoretic types, range/specific value tests are more natural.* You might limit this to a predefined, finite set of "interesting" integer predicates.
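
To make the integer predicates concrete, here is a minimal sketch of how each identifier could be interpreted against a value. The module `IntPredSketch` and the function `holds?/2` are hypothetical names, and treating non-integers as failing every integer predicate is an assumption of this example.

```elixir
defmodule IntPredSketch do
  # Each clause interprets one Category 2 identifier for an integer value.
  def holds?({2, :eq, n}, v) when is_integer(v), do: v == n
  def holds?({2, :lt, n}, v) when is_integer(v), do: v < n
  def holds?({2, :gt, n}, v) when is_integer(v), do: v > n
  def holds?({2, :in, set}, v) when is_integer(v), do: v in set

  # Assumption: a non-integer value satisfies no integer predicate.
  def holds?({2, _kind, _arg}, _v), do: false
end

IO.inspect(IntPredSketch.holds?({2, :lt, 10}, 3))
IO.inspect(IntPredSketch.holds?({2, :in, [1, 2, 3]}, 5))
```

The predicates are not independent of each other (for example, `{2, :eq, 3}` implies `{2, :lt, 10}`), which is one reason to keep the set of integer predicates small.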

**Category 3: String-Specific Predicates**
If `is_string` is true:
* Equality: `v_string_eq_S = {3, :eq, S}` (e.g., `{3, :eq, "foo"}`)
    * Order by S lexicographically.
* Length: `v_string_len_eq_L = {3, :len_eq, L}`
    * Order by L.
* Prefix: `v_string_prefix_P = {3, :prefix, P}`
    * Order by P lexicographically.
* (Suffix, regex match, etc., can be added with consistent ordering rules)

**Category 4: Tuple-Specific Predicates**
If `is_tuple` is true:
1. **Size first:**
    * `v_tuple_size_eq_N = {4, :size, N}` (e.g., `{4, :size, 0}`, `{4, :size, 2}`)
        * Order by N.
2. **Element types (recursive structure in variable identifier):**
    For a tuple of a *given size*, we then check its elements. The predicate for an element will re-use the *entire variable ordering scheme* but scoped to that element.
    * `v_tuple_elem_I_PRED = {4, :element, index_I, NESTED_PREDICATE_ID}`
        * Order by `index_I` first.
        * Then order by `NESTED_PREDICATE_ID` (which itself is one of these `{category, type, value}` tuples).
    * Example: Is element 0 an atom? `v_el0_is_atom = {4, :element, 0, {0, :is_atom}}`
    * Example: Is element 0 the atom `:foo`? `v_el0_is_foo = {4, :element, 0, {1, :value, :foo}}`
    * Example: Is element 1 an integer? `v_el1_is_int = {4, :element, 1, {0, :is_integer}}`
    This ensures that all questions about element 0 come before element 1, and for each element, the standard hierarchy of questions is asked.
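
The nesting of predicate identifiers can be sketched as a recursive evaluator: an element predicate unwraps one level and delegates to the nested identifier. `TuplePredSketch` and `holds?/2` are hypothetical names, and only a handful of Category 0 and Category 1 predicates are covered here.

```elixir
defmodule TuplePredSketch do
  # A few primary discriminators and the atom-value predicate.
  def holds?({0, :is_atom}, v), do: is_atom(v)
  def holds?({0, :is_integer}, v), do: is_integer(v)
  def holds?({0, :is_tuple}, v), do: is_tuple(v)
  def holds?({1, :value, a}, v), do: v === a

  # Tuple size test.
  def holds?({4, :size, n}, v), do: is_tuple(v) and tuple_size(v) == n

  # Element test: false when the value is not a tuple or the index is out
  # of range; otherwise the nested predicate is applied to the element.
  def holds?({4, :element, i, nested}, v) do
    is_tuple(v) and tuple_size(v) > i and holds?(nested, elem(v, i))
  end
end

value = {:foo, 1}
IO.inspect(TuplePredSketch.holds?({4, :element, 0, {1, :value, :foo}}, value))
IO.inspect(TuplePredSketch.holds?({4, :element, 1, {0, :is_atom}}, value))
```

Nested tuples work the same way: `{4, :element, 0, {4, :size, 2}}` asks whether element 0 is itself a two-element tuple.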

**Category 5: Map-Specific Predicates**
If `is_map` is true: This is the most complex.
1. **Size (optional, but can be useful):**
    * `v_map_size_eq_N = {5, :size, N}`
        * Order by N.
2. **Key Presence:**
    * `v_map_has_key_K = {5, :has_key, K}` (e.g., `{5, :has_key, "id"}`, `{5, :has_key, :name}`)
        * Order by K (canonically, e.g., atoms before strings, which is what Erlang term ordering gives, then by atom name or lexicographically).
3. **Key Value Types (recursive structure):**
    For a *given key K* that is present:
    * `v_map_key_K_value_PRED = {5, :key_value, K, NESTED_PREDICATE_ID}`
        * Order by K (canonically).
        * Then order by `NESTED_PREDICATE_ID`.
    * Example: Does map have key `:id` and is its value a string?
        * First variable: `v_map_has_id = {5, :has_key, :id}`
        * If yes, next variable: `v_map_id_val_is_str = {5, :key_value, :id, {0, :is_string}}`
4. **Predicates for "all other keys" / "pattern keys":**
    This is needed for types like `%{String.t() => integer()}`.
    * `v_map_pattern_key_PRED_value_PRED = {5, :pattern_key, KEY_TYPE_PREDICATE_ID, VALUE_TYPE_PREDICATE_ID}`
    * Example: For `%{String.t() => integer()}`:
        * Key type predicate: `{0, :is_string}`
        * Value type predicate for such keys: `{0, :is_integer}`
        * Variable ID: `{5, :pattern_key, {0, :is_string}, {0, :is_integer}}`
    * These pattern key predicates should likely be ordered *after* specific key predicates. The exact sorting of `KEY_TYPE_PREDICATE_ID` needs careful thought (e.g. `{0, :is_atom}` before `{0, :is_string}`).
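
One possible reading of the map predicates, written as an evaluator over concrete maps. `MapPredSketch` and `holds?/2` are hypothetical names. The interpretation of `:pattern_key` as "every key matching the key predicate has a value matching the value predicate" is an assumption of this sketch, as is treating `:is_string` as "valid UTF-8 binary".

```elixir
defmodule MapPredSketch do
  # Assumption: a string is a binary that is valid UTF-8.
  def holds?({0, :is_string}, v), do: is_binary(v) and String.valid?(v)
  def holds?({0, :is_integer}, v), do: is_integer(v)

  def holds?({5, :size, n}, v), do: is_map(v) and map_size(v) == n
  def holds?({5, :has_key, k}, v), do: is_map(v) and Map.has_key?(v, k)

  # Key-value test: false when the key is absent or the value is not a map.
  def holds?({5, :key_value, k, nested}, v) do
    case v do
      %{^k => value} -> holds?(nested, value)
      _ -> false
    end
  end

  # Assumed semantics: every key satisfying key_pred must carry a value
  # satisfying value_pred; keys that do not match key_pred are ignored.
  def holds?({5, :pattern_key, key_pred, value_pred}, v) do
    is_map(v) and
      Enum.all?(Map.to_list(v), fn {key, value} ->
        not holds?(key_pred, key) or holds?(value_pred, value)
      end)
  end
end

pattern = {5, :pattern_key, {0, :is_string}, {0, :is_integer}}
IO.inspect(MapPredSketch.holds?(pattern, %{"a" => 1, "b" => 2}))
IO.inspect(MapPredSketch.holds?({5, :key_value, :id, {0, :is_string}}, %{id: "abc"}))
```

Under this reading a pattern-key predicate quantifies over the whole map, which is a different kind of test from the single-key predicates and part of why its position in the order needs care.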

**Category 6: List-Specific Predicates**
If `is_list` is true:
1. **Is Empty:**
    * `v_list_is_empty = {6, :is_empty}`
2. **Head/Tail Structure (if not empty, recursive):**
    This mirrors how types like `nonempty_list(H, T)` are defined.
    * `v_list_head_PRED = {6, :head, NESTED_PREDICATE_ID}`
    * `v_list_tail_PRED = {6, :tail, NESTED_PREDICATE_ID}` (Note: `NESTED_PREDICATE_ID` for tail would again be list predicates like `{6, :is_empty}` or `{6, :head, ...}`)
    * Example: Head is an atom: `{6, :head, {0, :is_atom}}`
    * Example: Tail is an empty list: `{6, :tail, {6, :is_empty}}`
    * All head predicates come before all tail predicates.

**Category 7: Function-Specific Predicates**
If `is_function` is true:
1. **Arity:**
    * `v_fun_arity_eq_A = {7, :arity, A}`
        * Order by A.
2. **Argument Types (very complex, may need simplification for TDDs):**
    * `v_fun_arg_I_PRED = {7, :arg, index_I, NESTED_PREDICATE_ID}`
3. **Return Type (also complex):**
    * `v_fun_return_PRED = {7, :return, NESTED_PREDICATE_ID}`
* Function types are often represented by separate structures or simplified in TDDs due to their higher-order nature. Full function type checking within this TDD variable scheme would be extremely elaborate.

**Binary, Float, Pid, Port, Reference Predicates:**
These would get their own categories (e.g., 8, 9, 10...).
* **Floats:** `{X, :is_float}` -> `{X, :eq, F}`, `{X, :lt, F}`, etc.
* **Binaries:** `{Y, :is_binary}` -> `{Y, :size, S}`, `{Y, :matches_pattern, Pat}` (e.g. `<<h :: size(8), _ :: binary>>`)

**Polymorphic Variables (`alpha`, `beta`, etc.)**
Polymorphic variables are part of the *type language*, not properties of concrete values. A TDD represents a set of *concrete values*.
* When you construct a TDD for a type like `list(alpha)`, where `alpha` is free, `alpha` essentially means `any`. So, for predicates concerning list elements, they would all go to their `dc` (don't care) branches, ultimately leading to `true`.
* If `alpha` is bound (e.g., in `(alpha -> alpha) where alpha = integer`), you first resolve `alpha` to `integer` and then build the TDD for `(integer -> integer)`.
* So, "free polymorphic variables" don't become TDD variables themselves. They influence which branches are taken during TDD construction for types containing them, often mapping to `any` or `dc` paths.

**"Kinds"**
The "kind" of a type (e.g., `Type.Atom`, `Type.Integer`) is meta-information. The TDD *represents* a type. The variables within the TDD are about properties of *values* that belong to that type. You wouldn't have a TDD variable `is_kind_atom?`. Instead, the TDD for the type `atom` would start with the `v_is_atom = {0, :is_atom}` variable.

**The `dc` (Don't Care) Branch**
The `dc` branch for a variable `v` in a node means that for the type being represented, the outcome of the test `v` is irrelevant to membership *given the path taken to reach this node*.
* For type `any`, all variables would effectively go to `dc`, all leading to the `true_terminal`.
* For type `integer`, the test `v_is_atom`:
    * `yes` branch -> `false_terminal` (an atom is not an integer)
    * `no` branch -> continues to test `v_is_integer`, etc.
    * `dc` branch: This is the tricky part in TDDs for types. In some formulations, `dc(N)` is `yes(N) U no(N)`. If a type truly doesn't care about `v_is_atom` (e.g., `any`), then `yes`, `no`, and `dc` might all point to the same child representing `any` for subsequent tests.
* For `ROBDD` reduction with `dc`, if `yes_child == no_child == dc_child`, the node is redundant. If `yes_child == no_child`, the test `v` is locally irrelevant, and the node might be simplified (parent points to `yes_child`, and the `dc` path needs careful handling depending on the TDD algebra). Standard BDD reduction (if `yes_child == no_child`, eliminate node) assumes only two children. You'll need TDD-specific reduction rules.
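
The one reduction rule stated unambiguously above (all three children equal) combines naturally with a node cache. The sketch below is a minimal version of that idea. The store layout, the reserved terminal ids `0` and `1`, and the names `TddStoreSketch`, `new/0`, and `mk_node/5` are all assumptions made for illustration; the case where only `yes` and `no` coincide is deliberately left alone because it depends on the chosen TDD algebra.

```elixir
defmodule TddStoreSketch do
  # Hypothetical store: terminals use the reserved ids 0 (false) and 1 (true).
  # `nodes` maps id => {var, yes, no, dc}; `ids` is the reverse index that
  # makes structurally equal nodes share a single id.
  def new, do: %{nodes: %{}, ids: %{}, next_id: 2}

  def false_id, do: 0
  def true_id, do: 1

  # Redundant test: all three children are the same id, so no node is
  # created and the child id is returned unchanged.
  def mk_node(store, _var, child, child, child), do: {child, store}

  # Otherwise look the node up in the cache, or allocate a fresh id.
  def mk_node(store, var, yes, no, dc) do
    key = {var, yes, no, dc}

    case store.ids do
      %{^key => id} ->
        {id, store}

      _ ->
        id = store.next_id

        store = %{
          store
          | nodes: Map.put(store.nodes, id, key),
            ids: Map.put(store.ids, key, id),
            next_id: id + 1
        }

        {id, store}
    end
  end
end

store = TddStoreSketch.new()
{a, store} = TddStoreSketch.mk_node(store, {0, :is_atom}, 1, 0, 0)
{b, _store} = TddStoreSketch.mk_node(store, {0, :is_atom}, 1, 0, 0)
IO.inspect(a == b)
```

With every node built through `mk_node/5`, two diagrams are structurally equal exactly when their root ids are equal, which is the pointer-style comparison mentioned in the working notes.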

**Summary of Ordering Logic:**
1. Use tuples as variable identifiers: `{category_num, predicate_type_atom, value_or_nested_id}`.
2. Sort these identifiers with one consistent comparison to establish the global order. Elixir's default tuple comparison looks at tuple size first, so identifiers of different arity need a list-based or fixed-arity sort key for the category to dominate.
3. `category_num` provides the coarsest level of ordering.
4. Within a category, `predicate_type_atom` provides the next level. Atoms compare by name, so use numeric ranks instead if a specific order (such as size before element) is required.
5. Finally, `value_or_nested_id` provides the finest level, potentially recursively applying the same ordering logic for nested structures.

This structured approach to variable identifiers is crucial for the "ordered" property of your ROBDD. When implementing `apply` operations (union, intersect, negate), you'll always compare the variable IDs of the current nodes from the two TDDs and operate on the one that comes first in the global order.
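
The variable-selection step of such an `apply` can be sketched on its own, without committing to any particular semantics for the `dc` branch. The node shape `{:node, var, yes, no, dc}`, boolean terminals, and the names `TopVarSketch`, `var_of/1`, and `top_var/2` are assumptions for this example, as is the list-based sort key used to compare identifiers category-first.

```elixir
defmodule TopVarSketch do
  # Hypothetical node shape: {:node, var, yes, no, dc}; terminals are booleans.
  def var_of({:node, var, _yes, _no, _dc}), do: var
  def var_of(terminal) when is_boolean(terminal), do: nil

  # Returns the variable that an apply step would branch on: the smaller of
  # the two root variables in the global order, or nil for two terminals.
  def top_var(a, b) do
    case Enum.reject([var_of(a), var_of(b)], &is_nil/1) do
      [] -> nil
      vars -> Enum.min_by(vars, &sort_key/1)
    end
  end

  # Assumed global order: compare identifiers as nested lists so that the
  # category number is looked at before the tuple size.
  defp sort_key(id) when is_tuple(id) do
    id |> Tuple.to_list() |> Enum.map(&sort_key/1)
  end

  defp sort_key(other), do: other
end

a = {:node, {6, :is_empty}, true, false, false}
b = {:node, {1, :value, false}, true, false, false}
IO.inspect(TopVarSketch.top_var(a, b))
# → {1, :value, false}
```

What happens after the variable is chosen (how each operand is split into its `yes`, `no`, and `dc` parts, and how the results are recombined) depends on the TDD algebra and is not covered here.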