defmodule Tdd do
@moduledoc """
Ternary decision diagram, used for representing set-theoretic types, akin to CDuce.
There are 2 types of nodes:
- terminal nodes (true, false)
- variable nodes
variable nodes consist of:
- the variable being tested
- yes: id of the node if the result of the test is true
- no: id of the node if the result of the test is false
- dc: id of the node if the result of the test is irrelevant for the current operation
the TDD needs to be ordered and reduced (ROBDD)
- 'ordered' if different variables appear in the same order on all paths from the root.
- 'reduced' if the following two rules have been applied to its graph:
- Merge any isomorphic subgraphs.
- Eliminate any node whose two children are isomorphic.
Working notes:
- structure of the ordered variables:
I'm thinking of structuring all possible types inside 1 TDD, in contrast to CDuce, which uses a `descr` structure that contains several TDDs (one for each domain, like ints, atoms, functions, etc.), where descr is a union between them.
For this, I need to come up with a variable structure that'll be ordered.
My set types will need to represent types like: atoms, strings, ints, maps, tuples, functions, kinds, free polymorphic variables
Moreover, those types themselves consist of smaller subsets of types like:
- int < 10
- int in [1, 2, 3]
- string > "prefix_"
- atom == false
- atom == false or atom == true or atom == nil
- map == %{"id" => string} and %{string => any | nil}
- polymorphic types ???
- etc.
Don't know how to represent them and make them ordered.
- node cache:
I suspect nodes (TDDs) should be given ids after reduction. This way a comparison between 2 types is just a pointer (id) check in the node cache. But not yet sure.
- reduction rules: not sure how to approach them
"""
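# Builds (or looks up) the variable node testing `elem`. Not implemented yet.
def node(_elem, _yes, _no, _dc) do
end
# Union of two TDDs. Not implemented yet.
def sum(_one, _two) do
end
# Intersection of two TDDs. Not implemented yet.
def intersect(_one, _two) do
end
# Negation is unary, so it takes a single TDD. Not implemented yet.
def negate(_one) do
end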
end
Please help me refine the theory for implementing set-theoretic types with this. Let's focus on the ordering of variables in the TDD
========================================================================================================================
Okay, this is a fascinating and ambitious goal! Building a single, ordered, reduced TDD for all Elixir/Erlang types is a significant challenge, especially with the rich structure of types like maps and functions. Let's break down the ordering of variables.
The core idea for ordering in a BDD/TDD is to have a total global order on all possible "questions" (predicates) you can ask about a value.
High-Level Strategy for Ordering Variables
- Primary Type Discriminators First: The most fundamental question is "What basic kind of thing is this?"
- General Properties Next: After knowing the primary type, ask about general properties (e.g., size, length).
- Specific Value/Content Properties: Then, delve into specific values or structural content.
- Recursive Structure for Nested Types: For types like tuples and maps, the predicates about their elements/values will effectively be recursive, but the ordering system must flatten this into a global order.
Proposed Variable Ordering Scheme
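Let's define "variables" as unique identifiers for predicates. We need a way to sort these identifiers. A good way is to use tuples, where Elixir's natural term ordering provides the order. One caveat: tuples are compared by size before their elements, so identifiers of different arity need padding to a common size (or a custom comparator) for the category number to be the primary sort key.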
Category 0: Primary Type Discriminators
These are the most fundamental. They will have the lowest sort order. Order them alphabetically by the type name.
- `v_is_atom = {0, :is_atom}`
- `v_is_binary = {0, :is_binary}`
- `v_is_float = {0, :is_float}`
- `v_is_function = {0, :is_function}`
- `v_is_integer = {0, :is_integer}`
- `v_is_list = {0, :is_list}`
- `v_is_map = {0, :is_map}`
- `v_is_pid = {0, :is_pid}`
- `v_is_port = {0, :is_port}`
- `v_is_reference = {0, :is_reference}`
- `v_is_string = {0, :is_string}` (Note: Elixir strings are UTF-8 binaries. You might treat them as a subtype of binary or a distinct primary type in your model. For simplicity here, let's assume distinct for now, or you'd have predicates like `{0, :is_binary_utf8}` after `{0, :is_binary}`)
- `v_is_tuple = {0, :is_tuple}`
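To make the Category 0 predicates concrete, here is a minimal sketch of how such identifiers could be evaluated against a runtime value. The module name `PrimaryPred` and the function `holds?/2` are hypothetical and not part of the original design. Elixir has no `is_string` guard, so the string clause assumes "valid UTF-8 binary" as one possible reading.

```elixir
defmodule PrimaryPred do
  # Hypothetical helper: evaluates a Category 0 predicate id against a value.
  def holds?({0, :is_atom}, v), do: is_atom(v)
  def holds?({0, :is_binary}, v), do: is_binary(v)
  def holds?({0, :is_float}, v), do: is_float(v)
  def holds?({0, :is_function}, v), do: is_function(v)
  def holds?({0, :is_integer}, v), do: is_integer(v)
  def holds?({0, :is_list}, v), do: is_list(v)
  def holds?({0, :is_map}, v), do: is_map(v)
  def holds?({0, :is_tuple}, v), do: is_tuple(v)
  # Assumption: "string" means a binary that is valid UTF-8.
  def holds?({0, :is_string}, v), do: is_binary(v) and String.valid?(v)
end

nil_is_atom = PrimaryPred.holds?({0, :is_atom}, nil)          # nil is an atom
bad_utf8 = PrimaryPred.holds?({0, :is_string}, <<255>>)       # not valid UTF-8
string_is_binary = PrimaryPred.holds?({0, :is_binary}, "abc") # strings are binaries
```

Note that `nil`, `true`, and `false` all satisfy `{0, :is_atom}`, which is why the atom-specific predicates in the next category have to distinguish them.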
Category 1: Atom-Specific Predicates
If is_atom is true, what specific atom is it?
Order by the atom itself.
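- `v_atom_eq_false = {1, :value, false}`
- `v_atom_eq_nil = {1, :value, nil}`
- `v_atom_eq_true = {1, :value, true}`
- `v_atom_eq_specific_A = {1, :value, :an_atom}` (note: under Elixir's default term ordering atoms compare by name, so `:an_atom` sorts before `false`, not after `true`; use a custom comparator if the special atoms must come first)
- ... (all known/relevant atoms in your system, ordered canonically)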
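A quick check of how the atom predicates sort under the default term ordering may help here. This is only an illustration using `Enum.sort/1`. Atoms are compared by their names, and `false`, `nil`, and `true` are ordinary atoms.

```elixir
ids = [
  {1, :value, true},
  {1, :value, :an_atom},
  {1, :value, nil},
  {1, :value, false}
]

# Atoms compare by name: "an_atom" < "false" < "nil" < "true".
sorted = Enum.sort(ids)
```

With the default ordering, `sorted` starts with `{1, :value, :an_atom}` and ends with `{1, :value, true}`.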
Category 2: Integer-Specific Predicates
If is_integer is true:
You need a canonical way to represent integer conditions.
- Equality: `v_int_eq_N = {2, :eq, N}` (e.g., `{2, :eq, 0}`, `{2, :eq, 10}`)
  - Order by N.
- Less than: `v_int_lt_N = {2, :lt, N}` (e.g., `{2, :lt, 0}`, `{2, :lt, 10}`)
  - Order by N.
- Greater than: `v_int_gt_N = {2, :gt, N}` (e.g., `{2, :gt, 0}`, `{2, :gt, 10}`)
  - Order by N.
- Set membership for finite sets: `v_int_in_SET = {2, :in, Enum.sort(SET)}` (e.g. `{2, :in, [1,2,3]}`)
  - Order by the canonical (sorted list) representation of SET.
- This gets complex. Often, BDDs for integers use bit-level tests, but for set-theoretic types, range/specific value tests are more natural. You might limit this to a predefined, finite set of "interesting" integer predicates.
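The integer predicates above can be sketched as a small evaluator together with a check of how the identifiers sort. The module `IntPred` and its `holds?/2` function are hypothetical names used only for illustration.

```elixir
defmodule IntPred do
  # Hypothetical evaluator for Category 2 predicate ids.
  def holds?({2, :eq, n}, v) when is_integer(v), do: v == n
  def holds?({2, :lt, n}, v) when is_integer(v), do: v < n
  def holds?({2, :gt, n}, v) when is_integer(v), do: v > n
  def holds?({2, :in, set}, v) when is_integer(v), do: v in set
end

int_ids = [{2, :lt, 10}, {2, :eq, 10}, {2, :in, [1, 2, 3]}, {2, :gt, 0}, {2, :eq, 0}]

# Same-arity tuples compare element by element, so the predicate atom
# (:eq < :gt < :in < :lt) is compared before the value N.
sorted_int_ids = Enum.sort(int_ids)
```

Under the default ordering, all `:eq` tests come first (ordered by N), followed by `:gt`, `:in`, and `:lt`. If a different grouping is wanted, such as all tests about the same N together, the identifier layout would need to change.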
Category 3: String-Specific Predicates
If is_string is true:
- Equality: `v_string_eq_S = {3, :eq, S}` (e.g., `{3, :eq, "foo"}`)
  - Order by S lexicographically.
- Length: `v_string_len_eq_L = {3, :len_eq, L}`
  - Order by L.
- Prefix: `v_string_prefix_P = {3, :prefix, P}`
  - Order by P lexicographically.
- (Suffix, regex match, etc., can be added with consistent ordering rules)
Category 4: Tuple-Specific Predicates
If is_tuple is true:
- Size first: `v_tuple_size_eq_N = {4, :size, N}` (e.g., `{4, :size, 0}`, `{4, :size, 2}`)
  - Order by N.
- Element types (recursive structure in variable identifier):
  For a tuple of a given size, we then check its elements. The predicate for an element will re-use the entire variable ordering scheme but scoped to that element.
  `v_tuple_elem_I_PRED = {4, :element, index_I, NESTED_PREDICATE_ID}`
  - Order by `index_I` first.
  - Then order by `NESTED_PREDICATE_ID` (which itself is one of these `{category, type, value}` tuples).
  - Example: Is element 0 an atom? `v_el0_is_atom = {4, :element, 0, {0, :is_atom}}`
  - Example: Is element 0 the atom `:foo`? `v_el0_is_foo = {4, :element, 0, {1, :value, :foo}}`
  - Example: Is element 1 an integer? `v_el1_is_int = {4, :element, 1, {0, :is_integer}}`
  This ensures that all questions about element 0 come before element 1, and for each element, the standard hierarchy of questions is asked.
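The element ordering described above can be checked directly with `Enum.sort/1`. This sketch uses only the identifiers from the examples. One detail worth noting: the size predicate lands first here because it is a 3-tuple and Elixir compares tuples by size before comparing elements, not because `:size` precedes `:element` (alphabetically it does not).

```elixir
tuple_ids = [
  {4, :element, 1, {0, :is_integer}},
  {4, :element, 0, {1, :value, :foo}},
  {4, :size, 2},
  {4, :element, 0, {0, :is_atom}}
]

sorted_tuple_ids = Enum.sort(tuple_ids)
```

After sorting, `{4, :size, 2}` comes first, then both element-0 questions (type test before value test), then the element-1 question.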
Category 5: Map-Specific Predicates
If is_map is true: This is the most complex.
- Size (optional, but can be useful): `v_map_size_eq_N = {5, :size, N}`
  - Order by N.
- Key Presence: `v_map_has_key_K = {5, :has_key, K}` (e.g., `{5, :has_key, "id"}`, `{5, :has_key, :name}`)
  - Order by K canonically. Elixir's default term ordering puts atoms before strings, then orders each group by name or lexicographically; putting strings before atoms would require a custom comparator.
- Key Value Types (recursive structure):
  For a given key K that is present: `v_map_key_K_value_PRED = {5, :key_value, K, NESTED_PREDICATE_ID}`
  - Order by K (canonically).
  - Then order by `NESTED_PREDICATE_ID`.
  - Example: Does map have key `:id` and is its value a string?
    - First variable: `v_map_has_id = {5, :has_key, :id}`
    - If yes, next variable: `v_map_id_val_is_str = {5, :key_value, :id, {0, :is_string}}`
- Predicates for "all other keys" / "pattern keys":
  This is needed for types like `%{String.t() => integer()}`.
  `v_map_pattern_key_PRED_value_PRED = {5, :pattern_key, KEY_TYPE_PREDICATE_ID, VALUE_TYPE_PREDICATE_ID}`
  - Example: For `%{String.t() => integer()}`:
    - Key type predicate: `{0, :is_string}`
    - Value type predicate for such keys: `{0, :is_integer}`
    - Variable ID: `{5, :pattern_key, {0, :is_string}, {0, :is_integer}}`
  - These pattern key predicates should likely be ordered after specific key predicates. The exact sorting of `KEY_TYPE_PREDICATE_ID` needs careful thought (e.g. `{0, :is_atom}` before `{0, :is_string}`).
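As a sanity check on the map identifiers, the sketch below sorts a few of them with the default term ordering. Nothing here is specific to a TDD library; it only illustrates how Elixir compares the tuples. Atoms sort before binaries, and the 3-tuples (`:has_key`) sort before the 4-tuples (`:key_value`, `:pattern_key`) because tuple size is compared first.

```elixir
map_ids = [
  {5, :pattern_key, {0, :is_string}, {0, :is_integer}},
  {5, :has_key, "id"},
  {5, :key_value, :id, {0, :is_string}},
  {5, :has_key, :name},
  {5, :has_key, :id}
]

sorted_map_ids = Enum.sort(map_ids)
```

The result asks every key-presence question first (`:id`, `:name`, then `"id"`), followed by the value question for `:id`, and finally the pattern-key question, which matches the suggestion that pattern keys come after specific keys.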
Category 6: List-Specific Predicates
If is_list is true:
- Is Empty: `v_list_is_empty = {6, :is_empty}`
- Head/Tail Structure (if not empty, recursive):
  This mirrors how types like `nonempty_list(H, T)` are defined.
  - `v_list_head_PRED = {6, :head, NESTED_PREDICATE_ID}`
  - `v_list_tail_PRED = {6, :tail, NESTED_PREDICATE_ID}` (Note: `NESTED_PREDICATE_ID` for tail would again be list predicates like `{6, :is_empty}` or `{6, :head, ...}`)
  - Example: Head is an atom: `{6, :head, {0, :is_atom}}`
  - Example: Tail is an empty list: `{6, :tail, {6, :is_empty}}`
  - All head predicates come before all tail predicates.
Category 7: Function-Specific Predicates
If is_function is true:
- Arity: `v_fun_arity_eq_A = {7, :arity, A}`
  - Order by A.
- Argument Types (very complex, may need simplification for TDDs): `v_fun_arg_I_PRED = {7, :arg, index_I, NESTED_PREDICATE_ID}`
- Return Type (also complex): `v_fun_return_PRED = {7, :return, NESTED_PREDICATE_ID}`
- Function types are often represented by separate structures or simplified in TDDs due to their higher-order nature. Full function type checking within this TDD variable scheme would be extremely elaborate.
Binary, Float, Pid, Port, Reference Predicates: These would get their own categories (e.g., 8, 9, 10...).
- Floats: `{X, :is_float}` -> `{X, :eq, F}`, `{X, :lt, F}`, etc.
- Binaries: `{Y, :is_binary}` -> `{Y, :size, S}`, `{Y, :matches_pattern, Pat}` (e.g. `<<h :: size(8), _ :: binary>>`)
Polymorphic Variables (alpha, beta, etc.)
Polymorphic variables are part of the type language, not properties of concrete values. A TDD represents a set of concrete values.
- When you construct a TDD for a type like `list(alpha)`, where `alpha` is free, `alpha` essentially means `any`. So, for predicates concerning list elements, they would all go to their `dc` (don't care) branches, ultimately leading to `true`.
- If `alpha` is bound (e.g., in `(alpha -> alpha) where alpha = integer`), you first resolve `alpha` to `integer` and then build the TDD for `(integer -> integer)`.
- So, "free polymorphic variables" don't become TDD variables themselves. They influence which branches are taken during TDD construction for types containing them, often mapping to `any` or `dc` paths.
"Kinds"
The "kind" of a type (e.g., Type.Atom, Type.Integer) is meta-information. The TDD represents a type. The variables within the TDD are about properties of values that belong to that type. You wouldn't have a TDD variable is_kind_atom?. Instead, the TDD for the type atom would start with the v_is_atom = {0, :is_atom} variable.
The dc (Don't Care) Branch
The dc branch for a variable v in a node means that for the type being represented, the outcome of the test v is irrelevant to membership given the path taken to reach this node.
- For type `any`, all variables would effectively go to `dc`, all leading to the `true_terminal`.
- For type `integer`, the test `v_is_atom`:
  - `yes` branch -> `false_terminal` (an atom is not an integer)
  - `no` branch -> continues to test `v_is_integer`, etc.
  - `dc` branch: This is the tricky part in TDDs for types. In some formulations, `dc(N)` is `yes(N) U no(N)`. If a type truly doesn't care about `v_is_atom` (e.g., `any`), then `yes`, `no`, and `dc` might all point to the same child representing `any` for subsequent tests.
- For `ROBDD` reduction with `dc`, if `yes_child == no_child == dc_child`, the node is redundant. If `yes_child == no_child`, the test `v` is locally irrelevant, and the node might be simplified (parent points to `yes_child`, and the `dc` path needs careful handling depending on the TDD algebra). Standard BDD reduction (if `yes_child == no_child`, eliminate node) assumes only two children. You'll need TDD-specific reduction rules.
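The one reduction rule that is unambiguous above (all three children equal) combines naturally with the node cache idea from the working notes. The following is a minimal sketch under stated assumptions, not a complete reduction scheme: the module `TddStore`, the function `mk/5`, the store layout, and the choice of ids `0` and `1` for the false and true terminals are all hypothetical.

```elixir
defmodule TddStore do
  # Assumption: id 0 is the false terminal, id 1 is the true terminal.
  def new, do: %{unique: %{}, next_id: 2}

  # Redundant node: all three children are the same id, so return the child.
  def mk(store, _var, child, child, child), do: {child, store}

  # Otherwise look the node up in the unique table (hash-consing),
  # so structurally equal nodes always share one id.
  def mk(store, var, yes, no, dc) do
    key = {var, yes, no, dc}

    case store.unique do
      %{^key => id} ->
        {id, store}

      _ ->
        id = store.next_id
        {id, %{store | unique: Map.put(store.unique, key, id), next_id: id + 1}}
    end
  end
end

s0 = TddStore.new()
{a, s1} = TddStore.mk(s0, {0, :is_atom}, 1, 0, 0)
{b, s2} = TddStore.mk(s1, {0, :is_atom}, 1, 0, 0)
{c, _s3} = TddStore.mk(s2, {0, :is_integer}, 1, 1, 1)
```

Here `a` and `b` are the same id because the second call hits the cache, and `c` is simply the true terminal because the node was redundant. With this in place, equality of two reduced TDDs is an integer comparison of their ids.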
Summary of Ordering Logic:
- Use tuples as variable identifiers: `{category_num, predicate_type_atom, value_or_nested_id}`.
- Sort these tuples using Elixir's default term comparison. Because tuples are compared by size before their elements, this establishes the intended global order only if all identifiers share one arity (pad the shorter ones) or a custom comparator is used.
- With that in place, `category_num` provides the coarsest level of ordering.
- Within a category, `predicate_type_atom` provides the next level.
- Finally, `value_or_nested_id` provides the finest level, potentially recursively applying the same ordering logic for nested structures.
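One pitfall with relying on default tuple comparison is worth demonstrating: Elixir compares tuples by size first, so a short identifier from a late category sorts before a longer identifier from an early category. The sketch below shows the problem and one possible fix, padding every identifier to a common arity. The `VarId.pad/1` helper is a hypothetical name. Nested identifiers would need the same treatment, and a custom comparator is an alternative.

```elixir
list_empty = {6, :is_empty}
atom_false = {1, :value, false}

# Size is compared first: the 2-tuple sorts before the 3-tuple
# even though category 6 should come after category 1.
raw_order_wrong = list_empty < atom_false

defmodule VarId do
  # Hypothetical helper: pad top-level ids to arity 4 so that the
  # category number becomes the primary sort key.
  def pad({c, p}), do: {c, p, nil, nil}
  def pad({c, p, a}), do: {c, p, a, nil}
  def pad({c, p, a, b}), do: {c, p, a, b}
end

padded_order_ok = VarId.pad(atom_false) < VarId.pad(list_empty)

# Lists compare element by element regardless of length,
# so list-shaped ids avoid the problem without padding.
list_order_ok = [1, :value, false] < [6, :is_empty]
```

All three bindings evaluate to `true`: the raw tuples are mis-ordered, while the padded tuples and the list-shaped identifiers respect the category number.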
This structured approach to variable identifiers is crucial for the "ordered" property of your ROBDD. When implementing apply operations (union, intersect, negate), you'll always compare the variable IDs of the current nodes from the two TDDs and operate on the one that comes first in the global order.
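To illustrate the "compare variable IDs and operate on the smaller one" step, here is a sketch of union over ternary nodes. It rests on an assumption that the text above leaves open: a node `{var, yes, no, dc}` is read as "(var holds and yes) or (var fails and no) or dc", which is the reading used for ternary decision diagrams in CDuce-style implementations. The module `TddUnion` is hypothetical, terminals are plain `true` and `false`, nodes are not hash-consed, and the result is not reduced.

```elixir
defmodule TddUnion do
  # Terminals: true / false. Nodes: {var, yes, no, dc} (assumed layout).
  def union(true, _), do: true
  def union(_, true), do: true
  def union(false, t), do: t
  def union(t, false), do: t

  # Same variable: combine the three branches pairwise.
  def union({v, y1, n1, d1}, {v, y2, n2, d2}),
    do: {v, union(y1, y2), union(n1, n2), union(d1, d2)}

  # Left variable comes first in the global order: push the right
  # operand into the dc branch, since it does not depend on v1.
  def union({v1, y1, n1, d1}, {v2, _, _, _} = t2) when v1 < v2,
    do: {v1, y1, n1, union(d1, t2)}

  # Remaining case: the right variable comes first.
  def union(t1, {v2, y2, n2, d2}),
    do: {v2, y2, n2, union(t1, d2)}

  # Evaluates a diagram under an assignment of variables to booleans.
  def eval(true, _env), do: true
  def eval(false, _env), do: false

  def eval({v, y, n, d}, env) do
    test = Map.fetch!(env, v)
    (test and eval(y, env)) or (not test and eval(n, env)) or eval(d, env)
  end
end

atom_t = {{0, :is_atom}, true, false, false}
int_t = {{0, :is_integer}, true, false, false}
atom_or_int = TddUnion.union(atom_t, int_t)

accepts_int = TddUnion.eval(atom_or_int, %{{0, :is_atom} => false, {0, :is_integer} => true})
accepts_other = TddUnion.eval(atom_or_int, %{{0, :is_atom} => false, {0, :is_integer} => false})
```

In this example the union keeps `{0, :is_atom}` at the root and hangs the integer diagram off its `dc` branch, so `accepts_int` is `true` and `accepts_other` is `false`. Intersection and negation under this reading need their own rules and are not shown.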