document
This commit is contained in:
parent
ef0ffd2f08
commit
385ec666aa
220
test.exs
220
test.exs
@ -1,10 +1,220 @@
|
|||||||
defmodule Tdd do
|
defmodule Tdd do
|
||||||
# --- Existing code from your previous version ---
|
|
||||||
# (init_tdd_system, get_state, update_state, make_node, get_node_details,
|
|
||||||
# variable definitions, basic type constructors, apply, sum, print_tdd)
|
|
||||||
# Ensure your `apply/4` function is the corrected version from the MatchError fix.
|
|
||||||
@moduledoc """
|
@moduledoc """
|
||||||
Ternary decision diagram for set-theoretic types.
|
# Ternary Decision Diagram (TDD) for Set-Theoretic Types in Elixir
|
||||||
|
|
||||||
|
## 1. Introduction
|
||||||
|
|
||||||
|
This document outlines the design and implementation of a Ternary Decision Diagram (TDD) based system for representing and manipulating set-theoretic types, inspired by systems like CDuce. The goal is to create a robust way to perform type checking, type inference, and other type-level computations for a rich set of datatypes, similar to those found in Elixir.
|
||||||
|
|
||||||
|
A TDD is a directed acyclic graph (DAG) used to represent a function `f(v1, v2, ..., vn) -> {true, false, dont_care}`. In our context, it represents a characteristic function for a type: given a value, the TDD determines if the value belongs to the type (`true`), does not belong (`false`), or if the specific predicates tested so far are insufficient or irrelevant for this particular type operation (`dont_care`).
|
||||||
|
|
||||||
|
The TDDs are kept **ordered** and **reduced** to ensure a canonical representation for each type, making type equivalence checks (and other operations) efficient.
|
||||||
|
- **Ordered**: Variables (predicates) appear in the same fixed global order on all paths from the root to a terminal.
|
||||||
|
- **Reduced**: Isomorphic subgraphs are merged (shared), and nodes whose children would make the test redundant under certain TDD algebra rules are eliminated or simplified.
|
||||||
|
|
||||||
|
## 2. Core TDD Structure and Operations
|
||||||
|
|
||||||
|
### 2.1. Nodes
|
||||||
|
|
||||||
|
There are two kinds of nodes:
|
||||||
|
|
||||||
|
1. **Terminal Nodes**:
|
||||||
|
* `TRUE_TERMINAL` (ID: `1`): Represents the universal set (type `any`). A path ending here means the value (or part of it) satisfies the type constraints along that path.
|
||||||
|
* `FALSE_TERMINAL` (ID: `0`): Represents the empty set (type `none`). A path ending here means the value fails the type constraints.
|
||||||
|
|
||||||
|
2. **Variable Nodes**:
|
||||||
|
* Represented as a tuple: `{variable_identifier, yes_child_id, no_child_id, dc_child_id}`.
|
||||||
|
* `variable_identifier`: A unique, globally ordered term identifying the predicate being tested at this node (e.g., "is the value an atom?", "is the integer value < 10?").
|
||||||
|
* `yes_child_id`: The ID of the next TDD node if the predicate is true.
|
||||||
|
* `no_child_id`: The ID of the next TDD node if the predicate is false.
|
||||||
|
* `dc_child_id` (Don't Care): The ID of the next TDD node if the predicate is irrelevant for the current type or operation. The semantic interpretation of `dc` is crucial and aligns with common TDD usage (e.g., for a union operation, `dc(A | B) = dc(A) | dc(B)`).
|
||||||
|
|
||||||
|
### 2.2. Node Management (`Tdd` module state)
|
||||||
|
|
||||||
|
The `Tdd` module maintains global state (currently via `Process.put/get` for simplicity, ideally a `GenServer`):
|
||||||
|
|
||||||
|
* `@nodes`: A map from `node_tuple ({variable, yes_id, no_id, dc_id})` to `node_id`. This ensures that structurally identical nodes are shared (part of the "reduced" property).
|
||||||
|
* `@node_by_id`: A map from `node_id` to its `node_tuple` or a terminal symbol (`:true_terminal`, `:false_terminal`).
|
||||||
|
* `@next_id`: The next available integer ID for a new node.
|
||||||
|
* `@op_cache`: A map for memoizing results of operations like `apply` (binary ops), `negate`, and `simplify_with_constraints`. Keys are typically `{{op_name, id1, id2}, result_id}` or `{{op_name, id1}, result_id}`.
|
||||||
|
|
||||||
|
### 2.3. Variable Ordering
|
||||||
|
|
||||||
|
A strict global total order of all possible `variable_identifier`s is essential. This is achieved by defining variable identifiers as Elixir tuples, which have a natural sort order.
|
||||||
|
The proposed structure for variable identifiers is:
|
||||||
|
`{category_integer, predicate_type_atom, specific_value_or_nested_id}`
|
||||||
|
|
||||||
|
Example categories:
|
||||||
|
* `0`: Primary type discriminators (e.g., `is_atom`, `is_integer`, `is_list`).
|
||||||
|
* `1`: Atom-specific predicates (e.g., `value == :foo`).
|
||||||
|
* `2`: Integer-specific predicates (e.g., `value < 10`).
|
||||||
|
* `4`: Tuple-specific predicates (e.g., `size == 2`, `element 0 has_type X`).
|
||||||
|
* And so on for other types.
|
||||||
|
|
||||||
|
### 2.4. Core Operations
|
||||||
|
|
||||||
|
1. **`make_node_raw(variable, yes_id, no_id, dc_id)`**:
|
||||||
|
* The fundamental private function for creating or retrieving unique structural nodes.
|
||||||
|
* Implements structural sharing via the `@nodes` table.
|
||||||
|
* Implements a basic reduction rule: if `yes_id == no_id == dc_id`, the node is redundant, and that common child ID is returned.
|
||||||
|
|
||||||
|
2. **`check_assumptions_consistency(assumptions_map)`**:
|
||||||
|
* A private helper function crucial for semantic reduction.
|
||||||
|
* Takes a map `%{variable_id => value (true/false/:dc)}` representing current path assumptions.
|
||||||
|
* Returns `:consistent` or `:contradiction` based on predefined semantic rules of the type system (e.g., `is_atom=true` AND `is_tuple=true` is a contradiction).
|
||||||
|
* This function will be expanded as more types and predicates are added.
|
||||||
|
|
||||||
|
3. **`simplify_with_constraints(tdd_id, assumptions_map)`**:
|
||||||
|
* A private, memoized, recursive function that takes a `tdd_id` and an `assumptions_map`.
|
||||||
|
* It produces a new `tdd_id` that is semantically equivalent to the input `tdd_id` under the given assumptions, but potentially simpler.
|
||||||
|
* **Crucial Behavior**: If `check_assumptions_consistency(assumptions_map)` returns `:contradiction` at any point, `simplify_with_constraints` immediately returns `@false_node_id`.
|
||||||
|
* If the TDD's variable is already in `assumptions_map`, it follows the constrained path.
|
||||||
|
* Otherwise, it recursively simplifies children, adding the current node's variable assignment to the assumptions for those deeper calls, and rebuilds the node using `make_node_raw`.
|
||||||
|
|
||||||
|
4. **`apply_raw(op_name, op_lambda, u1_id, u2_id)`**:
|
||||||
|
* The private, memoized, recursive Shannon expansion algorithm for binary set operations (union, intersection).
|
||||||
|
* `op_lambda` defines the operation on terminal nodes.
|
||||||
|
* It selects the `top_var` based on the global variable order.
|
||||||
|
* Recursively calls `apply_raw` on the children.
|
||||||
|
* Uses `make_node_raw` to construct result nodes.
|
||||||
|
* This function computes the *structural* result of the operation.
|
||||||
|
|
||||||
|
5. **Public API Operations (`sum/2`, `intersect/2`, `negate/1`)**:
|
||||||
|
* These functions orchestrate the operation:
|
||||||
|
1. Call the respective `_raw` version (e.g., `apply_raw` for `sum`/`intersect`, `negate_raw` for `negate`).
|
||||||
|
2. Take the `raw_result_id` from step 1.
|
||||||
|
3. Return `simplify_with_constraints(raw_result_id, %{})`. This final step ensures that all TDDs exposed through the public API are not only structurally canonical (via `make_node_raw` and `apply_raw`) but also *semantically canonical* (i.e., known impossible paths or contradictions are resolved to `@false_node_id`).
|
||||||
|
|
||||||
|
6. **Type Constructors (e.g., `type_atom()`, `type_atom_literal(:foo)`)**:
|
||||||
|
* These public functions build the TDD for a specific type.
|
||||||
|
* They use `make_node_raw` to define the basic structure.
|
||||||
|
* They then return `simplify_with_constraints(raw_id, %{})` to ensure the constructed type is in its simplest semantic form.
|
||||||
|
|
||||||
|
7. **`is_subtype(sub_id, super_id)`**:
|
||||||
|
* Defined as `simplify_with_constraints(intersect(sub_id, negate(super_id)), %{}) == @false_node_id`.
|
||||||
|
* Since `intersect` and `negate` now return semantically simplified TDDs, if `A ∩ ¬B` represents an empty set, the result of the intersection will be `@false_node_id`.
|
||||||
|
|
||||||
|
## 3. Datatype Representation Details
|
||||||
|
|
||||||
|
This section outlines how various Elixir-like datatypes are (or will be) represented using TDD variables and constructors. All constructors ensure the final TDD is passed through `simplify_with_constraints(raw_id, %{})`.
|
||||||
|
|
||||||
|
### 3.1. Atoms
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_atom = {0, :is_atom}`: Primary type check.
|
||||||
|
* `v_atom_eq_A = {1, :value, A}`: Checks if the atom's value is `A`. Order by `A`.
|
||||||
|
* **Constructors**:
|
||||||
|
* `type_atom()`: Represents any atom. TDD: `make_node_raw(@v_is_atom, @true_node_id, @false_node_id, @false_node_id)`.
|
||||||
|
* `type_atom_literal(val)`: Represents a specific atom. TDD: `make_node_raw(@v_is_atom, node_for_val_eq, @false_node_id, @false_node_id)` where `node_for_val_eq = make_node_raw(v_atom_eq_A, @true_node_id, @false_node_id, @false_node_id)`.
|
||||||
|
* **Semantic Constraints for `check_assumptions_consistency`**:
|
||||||
|
* If `assumptions_map` contains `{{0, :is_atom}, true}` and `{{0, other_primary_type}, true}` -> contradiction.
|
||||||
|
* If `assumptions_map` contains `{{1, :value, A}, true}` and `{{1, :value, B}, true}` where `A != B` -> contradiction.
|
||||||
|
|
||||||
|
### 3.2. Tuples
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_tuple = {0, :is_tuple}`: Primary type check.
|
||||||
|
* `v_tuple_size_eq_N = {4, :size, N}`: Checks if tuple size is `N`. Order by `N`.
|
||||||
|
* `v_tuple_elem_I_PRED = {4, :element, Index_I, NESTED_PREDICATE_ID}`: Predicate for element at `Index_I`. `NESTED_PREDICATE_ID` is a variable from the global order, applied to the element. (e.g., `{4, :element, 0, {0, :is_atom}}` checks if element 0 is an atom). Order by `Index_I`, then by `NESTED_PREDICATE_ID`.
|
||||||
|
* **Constructors**:
|
||||||
|
* `type_tuple()`: Any tuple.
|
||||||
|
* `type_empty_tuple()`: The tuple `{}`.
|
||||||
|
* `type_tuple_sized_any(size)`: Any tuple of a given size.
|
||||||
|
* `type_tuple_specific(element_type_ids_list)`: e.g., for `{atom(), integer()}`. This will involve creating nodes for size, then for each element, applying the TDD for that element's type.
|
||||||
|
* **Semantic Constraints**:
|
||||||
|
* `is_tuple=true` vs. other primary types.
|
||||||
|
* If `{{4, :size, N}, true}` and `{{4, :size, M}, true}` where `N != M` -> contradiction.
|
||||||
|
* If `{{4, :size, N}, true}` and a predicate `{{4, :element, I, _}, _}` exists where `I >= N` -> potential contradiction or path simplification (element doesn't exist).
|
||||||
|
|
||||||
|
### 3.3. Integers (Next to Implement)
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_integer = {0, :is_integer}` (or a new category, e.g., `2` for integer properties).
|
||||||
|
* `v_int_eq_N = {INT_CAT, :eq, N}`.
|
||||||
|
* `v_int_lt_N = {INT_CAT, :lt, N}` (value < N).
|
||||||
|
* `v_int_gt_N = {INT_CAT, :gt, N}` (value > N).
|
||||||
|
* *(Consider also: `lte` (less than or equal), `gte` (greater than or equal) to simplify some range logic, or derive them).*
|
||||||
|
* **Constructors**:
|
||||||
|
* `type_integer()`: Any integer.
|
||||||
|
* `type_int_eq(n)`: A specific integer value.
|
||||||
|
* `type_int_lt(n)`, `type_int_gt(n)`.
|
||||||
|
* `type_int_range(min, max, min_inclusive, max_inclusive)`: Integers within a specific range.
|
||||||
|
* **Semantic Constraints**:
|
||||||
|
* `is_integer=true` vs. other primary types.
|
||||||
|
* `eq(N)` and `eq(M)` with `N != M` -> contradiction.
|
||||||
|
* `eq(N)` and `lt(M)` if `N >= M` -> contradiction.
|
||||||
|
* `eq(N)` and `gt(M)` if `N <= M` -> contradiction.
|
||||||
|
* `lt(N)` and `gt(M)` if `N <= M+1` (or `N <= M` if `gt` means `>=`) -> contradiction. (e.g., `x < 5` and `x > 4` has no integer solution).
|
||||||
|
* *Strategy for complex integer constraints*: Maintain a "current allowed interval" `[min_assumed, max_assumed]` based on `assumptions_map`. If this interval becomes empty or invalid, it's a contradiction. Each new integer assumption (`lt, gt, eq`) refines this interval.
|
||||||
|
|
||||||
|
### 3.4. Lists (Planned)
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_list = {0, :is_list}`.
|
||||||
|
* `v_list_is_empty = {LIST_CAT, :is_empty}`.
|
||||||
|
* *If not empty*:
|
||||||
|
* `v_list_head_VAR = {LIST_CAT, :head, NESTED_PREDICATE_ID}`: Applies a global predicate to the head.
|
||||||
|
* `v_list_tail_VAR = {LIST_CAT, :tail, NESTED_PREDICATE_ID_FOR_TAIL_LIST}`: Applies a global predicate (usually list predicates) to the tail.
|
||||||
|
* *(Alternative for fixed-length lists/known structure: `{LIST_CAT, :elem, Index_I, NESTED_PREDICATE_ID}` similar to tuples).*
|
||||||
|
* **Constructors**:
|
||||||
|
* `type_list_any()`.
|
||||||
|
* `type_empty_list()`.
|
||||||
|
* `type_cons(head_type_id, tail_type_id)`.
|
||||||
|
* `type_list_of(element_type_id)`: e.g., `list(integer())`.
|
||||||
|
* **Semantic Constraints**:
|
||||||
|
* `is_list=true` vs. other primary types.
|
||||||
|
* If `is_empty=true`, then any predicate about `head` or non-empty `tail` structure is contradictory if it implies existence.
|
||||||
|
|
||||||
|
### 3.5. Strings & Binaries (Planned)
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_binary = {0, :is_binary}`.
|
||||||
|
* `@v_is_string = {0, :is_string}` (can be a check after `is_binary` or a distinct primary type if model demands).
|
||||||
|
* Size/length predicates: `v_binary_size_eq_N`, `v_string_length_eq_N`.
|
||||||
|
* Content predicates: `v_string_eq_S`, `v_string_prefix_P`, `v_string_suffix_S`, `v_string_matches_regex_R`.
|
||||||
|
* **Semantic Constraints**: Size vs content (e.g., `size=1` and `prefix="foo"` is a contradiction). `eq(S1)` and `eq(S2)` if `S1 != S2`.
|
||||||
|
|
||||||
|
### 3.6. Maps (Planned - Complex)
|
||||||
|
|
||||||
|
* **Variables**:
|
||||||
|
* `@v_is_map = {0, :is_map}`.
|
||||||
|
* `v_map_size_eq_N`.
|
||||||
|
* `v_map_has_key_K`: (K is a canonical representation of an Elixir term).
|
||||||
|
* *If `has_key_K` is true*:
|
||||||
|
* `v_map_key_K_value_VAR = {MAP_CAT, :key_value, K, NESTED_PREDICATE_ID}`: Applies a global predicate to the value associated with key K.
|
||||||
|
* For `%{pattern_key => pattern_value}` types:
|
||||||
|
* This requires careful thought. Might involve predicates like `v_map_all_keys_matching_TYPE_X_have_values_matching_TYPE_Y`.
|
||||||
|
* **Semantic Constraints**: `is_map` vs. others. Size vs. `has_key` interactions. Contradictory type requirements for the same key's value.
|
||||||
|
|
||||||
|
### 3.7. Functions (Planned - Very Complex)
|
||||||
|
|
||||||
|
* Representation of function types (`fun((Arg1Type, Arg2Type, ...) -> ReturnType)`) is a significant challenge for TDDs.
|
||||||
|
* **Variables (Tentative)**:
|
||||||
|
* `@v_is_function = {0, :is_function}`.
|
||||||
|
* `v_fun_arity_eq_A`.
|
||||||
|
* Predicates for argument types at specific positions (e.g., `v_fun_arg_I_type_VAR`).
|
||||||
|
* Predicates for return type (e.g., `v_fun_return_type_VAR`).
|
||||||
|
* Intersection and union of function types involve concepts like contravariance of arguments and covariance of return types. This may require specialized logic beyond simple TDD operations or a very elaborate variable scheme. Often, function types are handled with auxiliary structures in type systems.
|
||||||
|
|
||||||
|
## 4. Current Status & Next Steps
|
||||||
|
|
||||||
|
* **Implemented**: Atoms, basic Tuples (any, empty, sized_any). Core TDD operations (`sum`, `intersect`, `negate`, `is_subtype`) with semantic simplification framework (`simplify_with_constraints` and `check_assumptions_consistency`).
|
||||||
|
* **Passing Tests**: A suite of tests for atom/tuple interactions, unions, intersections, negations, and subtyping, including resolution of contradictions like `atom & tuple == none`.
|
||||||
|
* **Next Immediate Step**: Implement **Integer types** as outlined in section 3.3. This will involve:
|
||||||
|
1. Defining integer-specific predicates and their global order.
|
||||||
|
2. Creating integer type constructors.
|
||||||
|
3. Significantly expanding `check_assumptions_consistency` to handle integer comparisons (`eq`, `lt`, `gt`) and their interactions.
|
||||||
|
4. Adding comprehensive tests for integers.
|
||||||
|
|
||||||
|
## 5. Future Considerations
|
||||||
|
|
||||||
|
* **Performance**: For very large TDDs or complex types, the number of nodes and cache sizes can grow. Investigate optimizations if needed.
|
||||||
|
* **Generality of `check_assumptions_consistency`**: Designing this to be easily extensible and correct for many interacting predicates is challenging. A rule-based system or a more abstract way to define predicate interactions might be beneficial.
|
||||||
|
* **"Don't Care" (`dc`) branch semantics**: Ensure the `dc` branch is consistently and correctly handled in all operations, especially `simplify_with_constraints` if assumptions can make a variable "don't care". Currently, `simplify_with_constraints` assumes `true/false/:dc` values in the `assumptions_map` if a variable is already constrained.
|
||||||
|
* **Type Inference**: Using the TDD operations to infer types or solve type constraints.
|
||||||
|
* **Polymorphism**: Representing and operating on types with free type variables. Typically, free variables are treated as `any` or involve substitution before TDD construction.
|
||||||
|
|
||||||
|
This document provides a snapshot of the current TDD system and a roadmap for its extension. The core principle is the combination of structurally canonical ROBDDs (via `make_node_raw` and `apply_raw`) with a semantic simplification layer (`simplify_with_constraints`) that embeds knowledge of the type system's rules.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
# --- Terminal Node IDs ---
|
# --- Terminal Node IDs ---
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user