```
defmodule Tdd do
  @moduledoc """
  Ternary decision diagram, used for representing set-theoretic types, akin to CDuce.

  There are 2 types of nodes:
  - terminal nodes (true, false)
  - variable nodes

  Variable nodes consist of:
  - the variable being tested
  - yes: id of the node if the result of the test is true
  - no: id of the node if the result of the test is false
  - dc: id of the node if the result of the test is irrelevant for the current operation

  The TDD needs to be ordered and reduced (ROBDD):
  - 'ordered' if different variables appear in the same order on all paths from the root.
  - 'reduced' if the following two rules have been applied to its graph:
    - Merge any isomorphic subgraphs.
    - Eliminate any node whose two children are isomorphic.

  Working notes:
  - structure of the ordered variables:
    I'm thinking of structuring all possible types inside 1 TDD, in contrast to CDuce,
    which uses a `descr` structure that contains several TDDs (one for each domain,
    like ints, atoms, functions, etc.), and descr is a union between them.
    For this, I need to come up with a variable structure that'll be ordered.
    My set types will need to represent types like:
    atoms, strings, ints, maps, tuples, functions, kinds, free polymorphic variables
    Moreover, those types themselves consist of smaller subsets of types like:
    - int < 10
    - int in [1, 2, 3]
    - string > "prefix_"
    - atom == false
    - atom == false or atom == true or atom == nil
    - map == %{"id" => string} and %{string => any | nil}
    - polymorphic types ???
    - etc.
    Don't know how to represent them and make them ordered.
  - node cache:
    I suspect I should cache the ids of nodes (TDDs) after reduction. This way a
    comparison between 2 types is just a pointer (id) check in the node cache.
    But not yet sure.
  - reduction rules: not sure how to approach them
  """

  # Builds a variable node; `dc` is the don't-care branch.
  def node(elem, yes, no, dc = _dont_care) do
  end

  def sum(one, two) do
  end

  def intersect(one, two) do
  end

  # Negation is unary, so it takes a single TDD.
  def negate(one) do
  end
end
```

Please help me refine the theory for implementing set-theoretic types with this.
Let's focus on the ordering of variables in the TDD.

========================================================================================================================

Okay, this is a fascinating and ambitious goal! Building a single, ordered, reduced TDD for all Elixir/Erlang types is a significant challenge, especially with the rich structure of types like maps and functions.

Let's break down the ordering of variables. The core idea for ordering in a BDD/TDD is to have a **total global order** on all possible "questions" (predicates) you can ask about a value.

**High-Level Strategy for Ordering Variables**

1. **Primary Type Discriminators First:** The most fundamental question is "What basic kind of thing is this?"
2. **General Properties Next:** After knowing the primary type, ask about general properties (e.g., size, length).
3. **Specific Value/Content Properties:** Then, delve into specific values or structural content.
4. **Recursive Structure for Nested Types:** For types like tuples and maps, the predicates about their elements/values will effectively be recursive, but the ordering system must flatten this into a global order.

**Proposed Variable Ordering Scheme**

Let's define "variables" as unique identifiers for predicates. We need a way to sort these identifiers. A good way is to use tuples, compared with Elixir's built-in term ordering. One caveat: that ordering compares tuples by size first and element by element only between tuples of the same size, so the category number dominates only if all identifiers share one size (for example by padding) or you sort with a custom comparator.

**Category 0: Primary Type Discriminators**
These are the most fundamental. They will have the lowest sort order. Order them alphabetically by the type name.

* `v_is_atom = {0, :is_atom}`
* `v_is_binary = {0, :is_binary}`
* `v_is_float = {0, :is_float}`
* `v_is_function = {0, :is_function}`
* `v_is_integer = {0, :is_integer}`
* `v_is_list = {0, :is_list}`
* `v_is_map = {0, :is_map}`
* `v_is_pid = {0, :is_pid}`
* `v_is_port = {0, :is_port}`
* `v_is_reference = {0, :is_reference}`
* `v_is_string = {0, :is_string}` (*Note: Elixir strings are UTF-8 binaries. You might treat them as a subtype of binary or a distinct primary type in your model. For simplicity here, let's assume distinct for now, or you'd have predicates like `{0, :is_binary_utf8}` after `{0, :is_binary}`*)
* `v_is_tuple = {0, :is_tuple}`

**Category 1: Atom-Specific Predicates**
If `is_atom` is true, what specific atom is it? Order by the atom itself.

* `v_atom_eq_specific_A = {1, :value, :an_atom}` (Elixir compares atoms by name, so `:an_atom` comes before `false`)
* `v_atom_eq_false = {1, :value, false}`
* `v_atom_eq_nil = {1, :value, nil}`
* `v_atom_eq_true = {1, :value, true}`
* ... (all known/relevant atoms in your system, ordered canonically)

**Category 2: Integer-Specific Predicates**
If `is_integer` is true, you need a canonical way to represent integer conditions.

* Equality: `v_int_eq_N = {2, :eq, N}` (e.g., `{2, :eq, 0}`, `{2, :eq, 10}`)
    * Order by N.
* Less than: `v_int_lt_N = {2, :lt, N}` (e.g., `{2, :lt, 0}`, `{2, :lt, 10}`)
    * Order by N.
* Greater than: `v_int_gt_N = {2, :gt, N}` (e.g., `{2, :gt, 0}`, `{2, :gt, 10}`)
    * Order by N.
* Set membership for finite sets: `v_int_in_SET = {2, :in, Enum.sort(SET)}` (e.g. `{2, :in, [1,2,3]}`)
    * Order by the canonical (sorted list) representation of SET.
    * *This gets complex. Often, BDDs for integers use bit-level tests, but for set-theoretic types, range/specific value tests are more natural.* You might limit this to a predefined, finite set of "interesting" integer predicates.
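
To make the meaning of the Category 0 to 2 identifiers concrete, here is a minimal sketch of how such a variable could be evaluated against a value. The module name `PredicateSketch` and the function `holds?/2` are hypothetical and are not part of the `Tdd` module above; only the identifier shapes come from the scheme.

```elixir
defmodule PredicateSketch do
  # Hypothetical evaluator: does `value` satisfy the predicate named by `id`?
  # Only a few identifiers from Categories 0-2 are covered.
  def holds?({0, :is_atom}, v), do: is_atom(v)
  def holds?({0, :is_integer}, v), do: is_integer(v)

  # Category 1: the value is exactly this atom.
  def holds?({1, :value, atom}, v), do: v === atom

  # Category 2: integer conditions. Non-integers never satisfy them.
  def holds?({2, :eq, n}, v), do: is_integer(v) and v == n
  def holds?({2, :lt, n}, v), do: is_integer(v) and v < n
  def holds?({2, :gt, n}, v), do: is_integer(v) and v > n
  def holds?({2, :in, set}, v), do: is_integer(v) and (v in set)
end

IO.inspect(PredicateSketch.holds?({2, :lt, 10}, 5))        # true
IO.inspect(PredicateSketch.holds?({2, :in, [1, 2, 3]}, 7)) # false
IO.inspect(PredicateSketch.holds?({1, :value, nil}, nil))  # true
```

Keeping the identifier as plain data and the evaluation in one place means the same tuple can serve both as a sort key and as the test a TDD node performs.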

**Category 3: String-Specific Predicates**
If `is_string` is true:

* Equality: `v_string_eq_S = {3, :eq, S}` (e.g., `{3, :eq, "foo"}`)
    * Order by S lexicographically.
* Length: `v_string_len_eq_L = {3, :len_eq, L}`
    * Order by L.
* Prefix: `v_string_prefix_P = {3, :prefix, P}`
    * Order by P lexicographically.
* (Suffix, regex match, etc., can be added with consistent ordering rules)

**Category 4: Tuple-Specific Predicates**
If `is_tuple` is true:

1. **Size first:**
    * `v_tuple_size_eq_N = {4, :size, N}` (e.g., `{4, :size, 0}`, `{4, :size, 2}`)
    * Order by N.
2. **Element types (recursive structure in variable identifier):**
   For a tuple of a *given size*, we then check its elements. The predicate for an element will re-use the *entire variable ordering scheme* but scoped to that element.
    * `v_tuple_elem_I_PRED = {4, :element, index_I, NESTED_PREDICATE_ID}`
    * Order by `index_I` first.
    * Then order by `NESTED_PREDICATE_ID` (which itself is one of these `{category, type, value}` tuples).
    * Example: Is element 0 an atom? `v_el0_is_atom = {4, :element, 0, {0, :is_atom}}`
    * Example: Is element 0 the atom `:foo`? `v_el0_is_foo = {4, :element, 0, {1, :value, :foo}}`
    * Example: Is element 1 an integer? `v_el1_is_int = {4, :element, 1, {0, :is_integer}}`

   This ensures that all questions about element 0 come before element 1, and for each element, the standard hierarchy of questions is asked.

**Category 5: Map-Specific Predicates**
If `is_map` is true, things get more involved; this is the most complex category.

1. **Size (optional, but can be useful):**
    * `v_map_size_eq_N = {5, :size, N}`
    * Order by N.
2. **Key Presence:**
    * `v_map_has_key_K = {5, :has_key, K}` (e.g., `{5, :has_key, "id"}`, `{5, :has_key, :name}`)
    * Order by K (canonically, e.g., atoms before strings as in Elixir's default term ordering, then by atom name or lexicographically).
3. **Key Value Types (recursive structure):**
   For a *given key K* that is present:
    * `v_map_key_K_value_PRED = {5, :key_value, K, NESTED_PREDICATE_ID}`
    * Order by K (canonically).
    * Then order by `NESTED_PREDICATE_ID`.
    * Example: Does map have key `:id` and is its value a string?
        * First variable: `v_map_has_id = {5, :has_key, :id}`
        * If yes, next variable: `v_map_id_val_is_str = {5, :key_value, :id, {0, :is_string}}`
4. **Predicates for "all other keys" / "pattern keys":**
   This is needed for types like `%{String.t() => integer()}`.
    * `v_map_pattern_key_PRED_value_PRED = {5, :pattern_key, KEY_TYPE_PREDICATE_ID, VALUE_TYPE_PREDICATE_ID}`
    * Example: For `%{String.t() => integer()}`:
        * Key type predicate: `{0, :is_string}`
        * Value type predicate for such keys: `{0, :is_integer}`
        * Variable ID: `{5, :pattern_key, {0, :is_string}, {0, :is_integer}}`
    * These pattern key predicates should likely be ordered *after* specific key predicates. The exact sorting of `KEY_TYPE_PREDICATE_ID` needs careful thought (e.g. `{0, :is_atom}` before `{0, :is_string}`).

**Category 6: List-Specific Predicates**
If `is_list` is true:

1. **Is Empty:**
    * `v_list_is_empty = {6, :is_empty}`
2. **Head/Tail Structure (if not empty, recursive):**
   This mirrors how types like `nonempty_list(H, T)` are defined.
    * `v_list_head_PRED = {6, :head, NESTED_PREDICATE_ID}`
    * `v_list_tail_PRED = {6, :tail, NESTED_PREDICATE_ID}` (Note: `NESTED_PREDICATE_ID` for tail would again be list predicates like `{6, :is_empty}` or `{6, :head, ...}`)
    * Example: Head is an atom: `{6, :head, {0, :is_atom}}`
    * Example: Tail is an empty list: `{6, :tail, {6, :is_empty}}`
    * All head predicates come before all tail predicates.

**Category 7: Function-Specific Predicates**
If `is_function` is true:

1. **Arity:**
    * `v_fun_arity_eq_A = {7, :arity, A}`
    * Order by A.
2. **Argument Types (very complex, may need simplification for TDDs):**
    * `v_fun_arg_I_PRED = {7, :arg, index_I, NESTED_PREDICATE_ID}`
3. **Return Type (also complex):**
    * `v_fun_return_PRED = {7, :return, NESTED_PREDICATE_ID}`
    * Function types are often represented by separate structures or simplified in TDDs due to their higher-order nature.
      Full function type checking within this TDD variable scheme would be extremely elaborate.

**Binary, Float, Pid, Port, Reference Predicates:**
These would get their own categories (e.g., 8, 9, 10...).

* **Floats:** `{X, :is_float}` -> `{X, :eq, F}`, `{X, :lt, F}`, etc.
* **Binaries:** `{Y, :is_binary}` -> `{Y, :size, S}`, `{Y, :matches_pattern, Pat}` (e.g. `<>`)

**Polymorphic Variables (`alpha`, `beta`, etc.)**
Polymorphic variables are part of the *type language*, not properties of concrete values. A TDD represents a set of *concrete values*.

* When you construct a TDD for a type like `list(alpha)`, where `alpha` is free, `alpha` essentially means `any`. So, for predicates concerning list elements, they would all go to their `dc` (don't care) branches, ultimately leading to `true`.
* If `alpha` is bound (e.g., in `(alpha -> alpha) where alpha = integer`), you first resolve `alpha` to `integer` and then build the TDD for `(integer -> integer)`.
* So, "free polymorphic variables" don't become TDD variables themselves. They influence which branches are taken during TDD construction for types containing them, often mapping to `any` or `dc` paths.

**"Kinds"**
The "kind" of a type (e.g., `Type.Atom`, `Type.Integer`) is meta-information. The TDD *represents* a type. The variables within the TDD are about properties of *values* that belong to that type. You wouldn't have a TDD variable `is_kind_atom?`. Instead, the TDD for the type `atom` would start with the `v_is_atom = {0, :is_atom}` variable.

**The `dc` (Don't Care) Branch**
The `dc` branch for a variable `v` in a node means that for the type being represented, the outcome of the test `v` is irrelevant to membership *given the path taken to reach this node*.

* For type `any`, all variables would effectively go to `dc`, all leading to the `true_terminal`.
* For type `integer`, the test `v_is_atom`:
    * `yes` branch -> `false_terminal` (an atom is not an integer)
    * `no` branch -> continues to test `v_is_integer`, etc.
    * `dc` branch: This is the tricky part in TDDs for types. In some formulations, `dc(N)` is `yes(N) U no(N)`. If a type truly doesn't care about `v_is_atom` (e.g., `any`), then `yes`, `no`, and `dc` might all point to the same child representing `any` for subsequent tests.
* For `ROBDD` reduction with `dc`, if `yes_child == no_child == dc_child`, the node is redundant. If `yes_child == no_child`, the test `v` is locally irrelevant, and the node might be simplified (parent points to `yes_child`, and the `dc` path needs careful handling depending on the TDD algebra). Standard BDD reduction (if `yes_child == no_child`, eliminate node) assumes only two children. You'll need TDD-specific reduction rules.

**Summary of Ordering Logic:**

1. Use tuples as variable identifiers: `{category_num, predicate_type_atom, value_or_nested_id}`.
2. Sort these tuples with one fixed comparison to establish the global order. Note that Elixir's default term ordering compares tuples by size before comparing their elements, so `{6, :is_empty}` sorts before `{1, :value, false}`; pad identifiers to a common size or use a custom comparator if the category must always dominate.
3. `category_num` provides the coarsest level of ordering among identifiers that are compared element by element.
4. Within a category, `predicate_type_atom` provides the next level. Atoms compare by name, so use a numeric sub-tag wherever a specific order such as "size first" matters.
5. Finally, `value_or_nested_id` provides the finest level, potentially recursively applying the same ordering logic for nested structures.

This structured approach to variable identifiers is crucial for the "ordered" property of your ROBDD. When implementing `apply` operations (union, intersect, negate), you'll always compare the variable IDs of the current nodes from the two TDDs and operate on the one that comes first in the global order.
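
As a rough illustration of the ordering logic, the sketch below compares identifiers element by element and picks the variable an `apply` step would branch on first. The module `VarOrderSketch` and its functions `key/1`, `before?/2`, and `top_var/2` are hypothetical names introduced for this example; converting tuples to lists is just one assumed way to get element-wise comparison, since Elixir compares tuples by size first and lists element by element.

```elixir
defmodule VarOrderSketch do
  # Hypothetical sort key: turn every (nested) tuple into a list so that
  # comparison runs element by element instead of by tuple size first.
  def key(id) when is_tuple(id), do: id |> Tuple.to_list() |> Enum.map(&key/1)
  def key(other), do: other

  # True when variable `a` comes before (or equals) `b` in the global order.
  def before?(a, b), do: key(a) <= key(b)

  # The variable an apply step would branch on: the smaller of two root variables.
  def top_var(a, b), do: if(before?(a, b), do: a, else: b)
end

list_empty = {6, :is_empty}
atom_false = {1, :value, false}

# Default term ordering: the 2-tuple sorts first despite its larger category.
IO.inspect(list_empty < atom_false)
# => true

# With list keys, the category number decides.
IO.inspect(VarOrderSketch.top_var(list_empty, atom_false))
# => {1, :value, false}

ids = [
  {4, :element, 1, {0, :is_integer}},
  {2, :lt, 10},
  {4, :element, 0, {1, :value, :foo}},
  {0, :is_tuple},
  {4, :size, 2},
  {4, :element, 0, {0, :is_atom}}
]

sorted = Enum.sort_by(ids, &VarOrderSketch.key/1)
IO.inspect(sorted)
```

In the sorted result `{4, :size, 2}` lands after the `{4, :element, ...}` identifiers, because the atom `:element` compares lower than `:size`. If size tests should come first, the predicate tag would need a numeric rank instead of a bare atom.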