diff --git a/.aider.chat.history.md b/.aider.chat.history.md index 1cce221..f48d73a 100644 --- a/.aider.chat.history.md +++ b/.aider.chat.history.md @@ -57014,3 +57014,221 @@ mix test #### /map-refresh > The repo map has been refreshed, use /map to view it. + +# aider chat started at 2025-06-15 11:53:21 + +> You can skip this check with --no-gitignore +> Add .aider* to .gitignore (recommended)? (Y)es/(N)o [Yes]: n +> /home/omnissiah/.local/bin/aider --watch-files --model openrouter/google/gemini-2.5-pro-preview --api-key openrouter=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 --editor n +> Error: Read-only file /home/omnissiah/git/pl/CONVENTIONS.md does not exist. Skipping. +> Aider v0.83.2 +> Model: openrouter/google/gemini-2.5-pro-preview with whole edit format +> Git repo: .git with 62 files +> Repo-map: using 4096 tokens, auto refresh + +#### /map-refresh +> The repo map has been refreshed, use /map to view it. + +#### /add lib/til/tdd.ex +> Added lib/til/tdd.ex to the chat + +#### /ask Please help me refine the theory for impementing set-theoritic types with this + +#### Please help me refine the theory for impementing set-theoritic types with this + + + +**Deconstructing the Core Task** + +I'm breaking down the user's request. I'm honing in on "refining the theory" and identifying set-theoretic types as the key area. My focus is to clearly delineate the core task, which is to refine the theoretical underpinnings for set-theoretic types, likely with the aim of improving a TDD/BDD implementation. + + +**Identifying Key Challenges** + +I've moved on from the initial dissection and am now zeroing in on the core difficulties. The user's notes highlight two significant hurdles: integrating various types within a single TDD structure, and determining the appropriate content for a node cache. The initial exploration is revealing potential difficulties in managing the complexity of diverse set-theoretic types within a unified framework. 
The approach contrasts with existing methods that use multiple BDDs. + + +**Evaluating Current Implementations** + +I'm now scrutinizing the user's existing codebase. The files indicate a TDD implementation, not a BDD as initially stated, which changes the scope. The user has, perhaps inadvertently, already adopted a CDuce-style approach within `tilly`. Key areas like "ignore_child" are revealing and warrant a deeper dive into the nuances of the current structure. + + +**Re-evaluating the Architecture** + +I've refined my thinking about the apparent disconnect in the user's approach. Their `tilly` implementation, which they believe is a BDD implementation, is actually a TDD structure. This suggests they may be overcomplicating the process unnecessarily by creating a new TDD module in the root (`lib/til/tdd.ex`). I believe the existing codebase, which follows a CDuce-like style using separate domain-specific structures, already addresses many of their issues. I'm beginning to formulate a recommendation for an alternative approach, potentially using and extending the 'tilly' code. + + +**Refining the Core Strategy** + +I'm solidifying the plan. My focus now is on articulating a concise, actionable recommendation. The core is to leverage and extend the user's existing `tilly` code. Specifically, I'll advise them to rename `lib/til/tdd.ex` to something like `lib/til/type.ex` and to refactor it, rather than trying to reinvent the wheel. The emphasis will be on utilizing and connecting their CDuce-inspired modules to create type descriptors that aggregate BDDs. + + +**Synthesizing a Recommendation** + +I'm formulating the core advice for the user. It's become clear that they possess a capable, if slightly misidentified, TDD implementation within the `tilly` module, following a CDuce-style approach. 
The key suggestion is to embrace and extend this existing structure, advising a shift from the potentially complex idea of a single, overarching TDD, to the modular design already in place. + + +**Synthesizing Key Insights** + +I'm organizing my thoughts on the user's work. My current focus is formulating a clear, concise set of recommendations, based on my analysis. I'll advise the user on a refactoring strategy. I'm also incorporating suggestions regarding the core architectural elements, highlighting the benefits of the approach already implemented in the user's `tilly` code. + + +**Outlining the Solution** + +I'm now building the response structure. I'm preparing to inform the user about refactoring the provided code to leverage the CDuce-like approach within their `tilly` directory. My goal is to guide them towards a more efficient implementation based on their work, reframing the design as an evolution rather than a reinvention. + + +**Summarizing the Process** + +I'm now compiling all the thoughts I've had to provide a clear and actionable response. I've clarified the user's key conceptual hurdles and their existing work, pinpointing a path to integrate their work and address their concerns. The current focus is a synthesis of the core insights into a concise and practical strategy. + + + + + + +Of course. This is a fascinating and powerful area of computer science. Your notes in `lib/til/tdd.ex` show you're already thinking about the right problems (ordering, reduction, caching). + +Based on the files you've provided, especially the `tilly/bdd/*` and `tilly/type/store.ex` modules, it seems you have already built a solid foundation that is very close to the CDuce model you mentioned. My main recommendation is to **embrace and build upon your existing `tilly` implementation** rather than starting fresh in `lib/til/tdd.ex`. + +Here is a refined theory and a proposed path forward. + +### 1. 
Theory Refinement: The Composite Type Model + +Your central question is about a "single TDD" vs. the CDuce model of multiple BDDs. The latter is far more practical and is what your `tilly` code is already structured to do. + +**Let's refine the theory:** + +A **Type** is not a single BDD/TDD. A **Type** is a *composite descriptor* that maps base types (like `:integer`, `:string`, `:atom`) to the root ID of a BDD that represents the set of values for that base type. + +Your `Tilly.Type.Store` is the perfect place to manage these descriptors. A type descriptor would look like this: + +```elixir +# Represents the type: (all integers) OR (the string "hello") OR (the atom :ok) +%Tilly.Type.Descriptor{ + id: 123, # A unique ID for this specific type combination + structure: %{ + integer: 45, # ID of a BDD representing all integers + string: 82, # ID of a BDD representing the set {"hello"} + atom: 99 # ID of a BDD representing the set {:ok} + } +} +``` + +* **BDDs handle the details:** Each BDD (e.g., for integers) only needs to order integer values. The string BDD only orders strings. This solves your variable ordering problem. +* **The Type Descriptor handles the union:** The top-level type is the union of the sets represented by each BDD in its `structure` map. +* **The `Type.Store` handles identity:** By interning these descriptors (which `get_or_intern_descr` does), you achieve your goal: type equality becomes a simple ID comparison. + +### 2. Your Existing Code is a TDD, not a BDD + +Your `Tilly.BDD.Node.mk_split/4` function takes `positive_child_id`, `ignore_child_id`, and `negative_child_id`. This is the definition of a **Ternary Decision Diagram (TDD)** node, which is more powerful than a standard Binary Decision Diagram (BDD). This is great! TDDs are excellent for representing sequences and patterns, which might be useful later. For now, you can think of them as BDDs where the `ignore` branch is often unused for simple value sets. + +### 3. 
Answering Your Key Questions + +With this composite model in mind, let's re-answer your questions from `tdd.ex`: + +1. **Structure of Ordered Variables:** You don't need a single, globally ordered list of all possible variables. Instead, each domain has its own BDD with its own ordering. + * `IntegerBoolOps` defines ordering for integers. + * `StringBoolOps` defines ordering for strings. + * The top-level `Type.Descriptor` combines them. + +2. **Node Cache:** Your intuition is correct, and you've already implemented it! You have two levels of caching: + * **Type Descriptor Interning:** `Tilly.Type.Store` is your cache for the high-level type descriptors. This is the most important one for making `type1 == type2` fast. + * **BDD Node Interning (Hash-Consing):** Within each BDD, you need to ensure that any node `(variable, yes_child, no_child)` is unique. This is how you achieve reduction. You would typically add a `unique_table` (a map) to your BDD context and use a smart constructor. + +3. **Reduction Rules:** Reduction is not a separate step. It's enforced *during construction* by a "smart constructor" for nodes. + + Let's imagine a `mk_split` function in `Tilly.BDD.Node`: + + ```elixir + # In Tilly.BDD.Node, or a new BDD context module + + # ctx would contain the unique_table and element store for this BDD + def mk_split(ctx, element, positive_child_id, negative_child_id) do + # Rule: Eliminate redundant nodes + if positive_child_id == negative_child_id do + return {ctx, positive_child_id} + end + + # Rule: Merge isomorphic subgraphs (via hash-consing) + node_key = {:split, element, positive_child_id, negative_child_id} + case Map.get(ctx.unique_table, node_key) do + nil -> + # Not found, create a new node + new_node_id = # ... generate new ID + new_node = # ... 
create the node tuple + # Add to caches + ctx = put_in(ctx.unique_table[node_key], new_node_id) + ctx = put_in(ctx.nodes[new_node_id], new_node) + {ctx, new_node_id} + existing_id -> + # Found, return existing node ID + {ctx, existing_id} + end + end + ``` + +### Recommended Plan + +1. **Delete `lib/til/tdd.ex`**. Its purpose is better served by a new, higher-level module. + +2. **Create `lib/til/type.ex`**. This will be the main public API for working with types. It will use `Tilly.Type.Store` and the `Tilly.BDD` modules internally. + + ```elixir + # lib/til/type.ex + defmodule Til.Type do + @moduledoc "Public API for creating and operating on set-theoretic types." + + # Example: Create a type representing the set of integers < 10 + def int_less_than(typing_ctx, value) do + # 1. Use Tilly.BDD with IntegerBoolOps to build the BDD for this set. + # 2. Get the root ID of the resulting BDD. + # 3. Create a type descriptor: %{integer: bdd_root_id} + # 4. Intern this descriptor using Tilly.Type.Store.get_or_intern_descr/2 + # 5. Return the new typing_ctx and the final type descriptor ID. + end + + # Union of two types + def union(typing_ctx, type_id_1, type_id_2) do + # 1. Get descriptors for both type IDs from the Type.Store. + # 2. For each base type (e.g., :integer), get the BDD IDs from both descriptors. + # 3. Use the appropriate BDD union function (e.g., from IntegerBoolOps) on the two BDDs. + # 4. Collect the resulting BDD IDs into a new descriptor structure. + # 5. Intern the new descriptor and return its ID. + end + + # intersection, negation, etc. would follow the same pattern. + end + ``` + +3. **Enhance `Tilly.BDD`** with a context/manager that holds the `unique_table` for node reduction, as sketched in the `mk_split` example above. + +This approach leverages the excellent foundation you've already built in the `tilly` namespace and provides a clear, scalable path to implementing a powerful type system. + +> Tokens: 4.4k sent, 2.5k received. 
Cost: $0.000000030 message, $0.000000030 session. +> lib/til/type.ex +> Add file to the chat? (Y)es/(N)o/(A)ll/(S)kip all/(D)on't ask again [Yes]: n +> notes +> +> +> ^C again to exit +> +> +> ^C KeyboardInterrupt + +# aider chat started at 2025-06-15 11:55:56 + +> You can skip this check with --no-gitignore +> Add .aider* to .gitignore (recommended)? (Y)es/(N)o [Yes]: n +> /home/omnissiah/.local/bin/aider --watch-files --model openrouter/google/gemini-2.5-pro-preview --api-key openrouter=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 --editor n +> Error: Read-only file /home/omnissiah/git/pl/CONVENTIONS.md does not exist. Skipping. +> Aider v0.83.2 +> Model: openrouter/google/gemini-2.5-pro-preview with whole edit format +> Git repo: .git with 62 files +> Repo-map: using 4096 tokens, auto refresh + +#### /context +> Aider v0.83.2 +> Model: openrouter/google/gemini-2.5-pro-preview with context edit format +> Git repo: .git with 62 files +> Repo-map: using 8192 tokens, always refresh diff --git a/.aider.input.history b/.aider.input.history index 1b67915..fef2f90 100644 --- a/.aider.input.history +++ b/.aider.input.history @@ -4164,3 +4164,21 @@ # 2025-06-15 02:23:11.675263 +/map-refresh + +# 2025-06-15 11:53:28.432969 ++n + +# 2025-06-15 11:54:06.645621 ++/map-refresh + +# 2025-06-15 11:54:11.240897 ++/add lib/til/tdd.ex + +# 2025-06-15 11:54:52.625145 ++/ask Please help me refine the theory for impementing set-theoritic types with this + +# 2025-06-15 11:55:46.654825 ++n + +# 2025-06-15 11:56:09.668461 ++/context diff --git a/.gitignore b/.gitignore index 28efb92..b15d763 100644 --- a/.gitignore +++ b/.gitignore @@ -24,3 +24,5 @@ pl-*.tar # Temporary files, for example, from tests. 
/tmp/ + +.aider.tags.cache.v4/ diff --git a/g1 b/g1 new file mode 100644 index 0000000..80a8250 --- /dev/null +++ b/g1 @@ -0,0 +1,318 @@ +``` +defmodule Tdd do + @moduledoc """ + Ternary decision diagram, used for representing set-theoritic types, akin to cduce. + There are 2 types of nodes: + - terminal nodes (true, false) + - variable nodes + + variable nodes consist of: + - the variable being tested + - yes: id of the node if the result of the test is true + - no: id of the node if the result of the test is false + - dc: id of the node if the result of the test is irrelevant for the current operation + + the TDD needs to be ordered and reduced (ROBDD) + - 'ordered' if different variables appear in the same order on all paths from the root. + - 'reduced' if the following two rules have been applied to its graph: + - Merge any isomorphic subgraphs. + - Eliminate any node whose two children are isomorphic. + + Working notes: + - structure of the ordered variables: + Im thinking of structuring all possible types inside 1 TDD, in contrast to cduce, which uses a `desrc` structure that contains several TDDs (one for each domain, like ints, atoms, functions, etc.), and descr is a union between them. + For this, I need to come up with a variable structure that'll be ordered. + My set types will need to represent types like: atoms, strings, ints, maps, tuples, functions, kinds? + Moreso, those types themselves consist of smaller subsets of types like: + - int < 10 + - int in [1, 2, 3] + - string > "prefix_" + - atom == false + - atom == false or atom == true or atom == nil + - map == %{"id" => string} and %{string => any | nil} + - etc. + Dont know how to represent them and make them ordered. + - node cache: + I don't yet know what it should contain, I suspect ids of nodes (TDDs) after reduction. This way a comparison between 2 types is just a pointer (id) check in the node cache. But not yet sure. 
+ - reduction rules: not sure how to approach them + + """ + + def node(elem, yes, no, dc = _dont_care) do + end + + def sum(one, two) do + end + + def intersect(one, two) do + end + + def negate(one, two) do + end +end +``` +Please help me refine the theory for impementing set-theoritic types with this + + +======================================================================================================================== + + +Okay, this is a fascinating and ambitious project! Building a TDD-based type system like CDuce is a significant undertaking. Let's refine the theory. + +You've got the core ideas right: terminals, variable nodes, and the need for ordering and reduction. The `dc` (don't care) child is the main differentiator from BDDs and key to TDDs' power in certain applications. + +Let's break down your working notes and expand on them. + +**1. Structure of Ordered Variables (The Core Challenge)** + +Your intuition to have a single TDD for all types is interesting. CDuce uses one TDD *per domain* (atoms, integers, etc.) and then a `descr` (descriptor) which is a union of these. A single TDD means your variable ordering must be global and able to discriminate across all these domains first, then within them. + +**Proposed Hierarchical Variable Ordering:** + +The variables in your TDD will represent *predicates* about a value. The order is crucial. + +* **Level 0: Base Type Discriminators:** These variables test the fundamental Erlang/Elixir type of a value. + * `is_atom?` + * `is_integer?` + * `is_float?` + * `is_string?` (Elixir strings are UTF-8 binaries) + * `is_binary?` (General binaries) + * `is_list?` + * `is_tuple?` + * `is_map?` + * `is_function?` + * `is_pid?` + * `is_reference?` + * *Order*: Define a canonical order for these (e.g., alphabetical, or by commonality). Let's say `is_atom` < `is_binary` < `is_float` < ... 
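As a quick sanity check of this hierarchical scheme, each predicate can be encoded as a fixed-arity tuple `{level, tag, param}` (level 0 for the base-type discriminators, higher levels for type-specific predicates), so that Erlang's built-in term ordering doubles as the global variable order. This is only an illustrative sketch; the tags and levels are hypothetical, not taken from the codebase. Note the fixed arity: Erlang compares tuples by size before contents, so mixing 2- and 3-tuples would not sort by level first.

```elixir
# Illustrative predicate variables, encoded as {level, tag, param}.
# All tuples share one arity because Erlang orders tuples by size
# first, then element by element; with a fixed arity, the level
# component always dominates the comparison.
vars = [
  {1, :atom_value, :foo},  # Level 1: atom == :foo
  {0, :is_integer, nil},   # Level 0: base type discriminator
  {2, :int_lt, 10},        # Level 2: integer < 10
  {0, :is_atom, nil},      # Level 0: base type discriminator
  {2, :int_lt, 0}          # Level 2: integer < 0
]

# Erlang term ordering gives the desired global order for free:
# all Level 0 tests first (alphabetical by tag), then Level 1, ...
sorted = Enum.sort(vars)
IO.inspect(sorted)
```

Sorting yields the Level 0 discriminators first (alphabetically by tag), then the atom test, then the integer tests ordered by their bound, which is exactly the path order the TDD needs.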
+ +* **Level 1+: Type-Specific Predicates:** Once a base type is determined (e.g., the `yes` branch of `is_integer?` is taken), subsequent variables test properties specific to that type. + + * **For Atoms:** + * `value_is(specific_atom)`: e.g., `value_is(true)`, `value_is(false)`, `value_is(nil)`, `value_is(:foo)`. + * *Order*: Canonical order of the atoms themselves (e.g., `false` < `nil` < `true` < other atoms alphabetically). + * This implies a potentially large number of variables if you want to distinguish many specific atoms. Often, you care about a few (booleans, nil) and "any other atom". + + * **For Integers:** + * `value < N`, `value == N` (less common as a direct variable, usually derived), `value_in_range(A,B)`. + * CDuce uses a finite partitioning of the integer line. For example, variables could be `value < 0`, `value < 10`, `value < MAX_INT_SMALL_ENOUGH_FOR_OPTIMIZATION`. + * *Order*: `value < N1` < `value < N2` if `N1 < N2`. + + * **For Strings/Binaries:** + * `length_is(L)` or `length_in_range(L1,L2)` + * `prefix_is(prefix_string)` + * `matches_regex(regex_pattern)` (Regexes can be complex to integrate directly as simple TDD variables due to their expressive power. CDuce has specific handling for regular expression types on sequences). + * *Order*: By length, then by prefix lexicographically, etc. + + * **For Tuples:** + * `arity_is(N)` + * `element_at(Index)_is_type(TypeRef)`: This is recursive. `TypeRef` would be the ID of another TDD representing the type of the element. + * *Order*: `arity_is(0)` < `arity_is(1)` < ... then `element_at(0)_is_type(T)` < `element_at(1)_is_type(T')`. The `TypeRef` itself would need to be part of the variable ordering if you allow different type tests for the same element index. + + * **For Maps:** This is the most complex. 
+ * `has_key(KeyAtomOrString)` + * `value_for_key(Key)_is_type(TypeRef)` + * `all_keys_are_type(KeyTypeRef)` + * `all_values_are_type(ValueTypeRef)` + * CDuce has specialized structures for map patterns. You might need variables like `matches_map_pattern(PatternID)`. + * *Order*: By key name, then by tests on those keys. + + * **For Functions:** + * `arity_is(N)` + * `accepts_args_types(ArgTypesTupleRef)` + * `returns_type(ReturnTypeRef)` + * *Order*: By arity, then by argument types, then return type. + +**The "Variable" Itself:** +A "variable" in your TDD node won't be just a name; it'll be an identifier that maps to a specific predicate function and its parameters. E.g., `var_id = 101` might mean `fn val -> is_integer(val) and val < 10 end`. You'll need a global, ordered registry of these predicate variables. + +**2. Node Cache (`unique_table`)** + +Yes, this is crucial for the "reduced" property. +* It's a hash table mapping `(variable_id, yes_node_id, no_node_id, dc_node_id)` to a unique `node_id`. +* **Terminals:** + * `true_node_id` (e.g., 1): Represents the "any" type or universal set. + * `false_node_id` (e.g., 0): Represents the "empty" type or bottom type. + These are the leaves of your TDD. +* When you create a node, you first check this table. If an identical node (same variable and children) exists, you reuse its ID. Otherwise, you create a new one, assign it a new ID, and store it. +* This ensures that any isomorphic subgraphs are merged (Reduction Rule 1). + +**3. Reduction Rules** + +You've listed the ROBDD rules. Let's adapt them for TDDs: + +* **Merge any isomorphic subgraphs:** Handled by the `unique_table` as described above. Two subgraphs are isomorphic if they are structurally identical (same variables in same places leading to same terminals or further isomorphic subgraphs). 
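A minimal hash-consing sketch of that merge rule (module and field names here are hypothetical, not from the existing code): the unique table maps a node's `{variable, yes, no, dc}` key to its id, so requesting a structurally identical node twice returns the same id instead of allocating a new one.

```elixir
# Hypothetical sketch of a unique table for hash-consing TDD nodes.
# Structurally identical nodes are merged at construction time.
defmodule UniqueTable do
  # State: table maps node keys to ids, nodes maps ids back to keys.
  # Ids 0 and 1 are reserved for the false/true terminals.
  def new do
    %{table: %{}, nodes: %{}, next_id: 2}
  end

  def mk_node(state, var, yes, no, dc) do
    key = {var, yes, no, dc}

    case Map.fetch(state.table, key) do
      {:ok, id} ->
        # Merge: an identical node already exists, reuse its id.
        {state, id}

      :error ->
        id = state.next_id

        state = %{state |
          table: Map.put(state.table, key, id),
          nodes: Map.put(state.nodes, id, key),
          next_id: id + 1
        }

        {state, id}
    end
  end
end

state = UniqueTable.new()
{state, a} = UniqueTable.mk_node(state, :is_atom, 1, 0, 0)
{state, b} = UniqueTable.mk_node(state, :is_atom, 1, 0, 0)
# a == b: the second request hit the unique table, so only one
# node was ever allocated.
```

Routing every node construction through a constructor like this makes rule 1 hold by construction, and comparing two reduced TDDs for equality collapses to an integer comparison on their root ids.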
+* **Eliminate any node whose three children are isomorphic:** If `yes_id == no_id == dc_id` for a variable node `v`, then the test `v` is irrelevant for distinguishing the type further down this path. The node `(v, child_id, child_id, child_id)` should be replaced by `child_id`. This rule is applied *before* looking up/inserting into the `unique_table`. + +**The Role of `dc` (Don't Care):** + +The `dc` child is what makes it a TDD. It typically means the variable being tested is irrelevant or its value doesn't affect membership for a particular path or operation. + +* **In Type Representation:** + * If you're representing the type "integer", and your current variable is `is_atom?`: + * `yes` child would be `false_node_id` (an integer is not an atom). + * `no` child would lead to further tests for integers (or `true_node_id` if "integer" is all you're specifying). + * `dc` child: This is tricky for pure representation. If the variable `is_atom?` *must* be evaluated, then `dc` might also be `false_node_id`. + * However, `dc` becomes powerful when combining types or dealing with partially specified types. If a type expression doesn't constrain variable `v`, its `dc` branch relative to `v` might be taken. + * Consider the type `any`. For any variable `v`, the node would be `(v, true_node_id, true_node_id, true_node_id)`. By reduction rule 2, this simplifies to just `true_node_id`. + * Consider the type `none` (empty set). For any variable `v`, node is `(v, false_node_id, false_node_id, false_node_id)`, reduces to `false_node_id`. + +* **In Operations (`sum`, `intersect`, `negate`):** The `dc` child is heavily used in the `apply` algorithm. + When `apply(op, tdd1, tdd2)`: + * If `var(tdd1) == var(tdd2)`: Recurse on `(op, tdd1.yes, tdd2.yes)`, `(op, tdd1.no, tdd2.no)`, `(op, tdd1.dc, tdd2.dc)`. + * If `var(tdd1) < var(tdd2)` (in global order): `tdd2` doesn't explicitly test `var(tdd1)`. So, `var(tdd1)` is effectively "don't care" for `tdd2`. 
+ The operation proceeds by testing `var(tdd1)`: + * `yes_child = apply(op, tdd1.yes, tdd2)` + * `no_child = apply(op, tdd1.no, tdd2)` + * `dc_child = apply(op, tdd1.dc, tdd2)` (This is where TDD differs significantly from BDD's apply). + The new node is `mknode(var(tdd1), yes_child, no_child, dc_child)`. + * Symmetric case for `var(tdd2) < var(tdd1)`. + +**4. `mknode(variable_id, yes_id, no_id, dc_id)` function:** + +This is your core constructor. + +```elixir +defmodule Tdd do + # Store nodes in an array/map, IDs are indices/keys + # @nodes %{0 => :false_terminal, 1 => :true_terminal, ...} + # @unique_table %{{var_id, yes_id, no_id, dc_id} => node_id} + # @variables %{var_id => {predicate_fun, predicate_params, global_order_index}} + + @true_id 1 + @false_id 0 + + # Internal state for unique table, nodes, variable definitions + # This would typically be in a GenServer or an ETS table in a real system + # For simplicity, let's imagine it's passed around or module attributes are used carefully. + + # Example variable IDs (you'd have a system for this) + # @var_is_atom 10 + # @var_is_integer 11 + # @var_int_lt_10 20 + + def get_true(), do: @true_id + def get_false(), do: @false_id + + # Pseudo-code for mknode + def node(variable_id, yes_id, no_id, dc_id) do + # Reduction Rule 2: Eliminate node whose children are isomorphic + if yes_id == no_id and no_id == dc_id do + # If all children are the same, this variable provides no discrimination + # for this path. The node is redundant. 
+ return yes_id + end + + # Check unique_table (conceptually) + # entry = {variable_id, yes_id, no_id, dc_id} + # if Map.has_key?(@unique_table, entry) do + # return @unique_table[entry] + # else + # new_node_id = generate_new_id() + # @nodes = Map.put(@nodes, new_node_id, {variable_id, yes_id, no_id, dc_id}) + # @unique_table = Map.put(@unique_table, entry, new_node_id) + # return new_node_id + # end + # Actual implementation would use a GenServer or ETS for state + :not_implemented_yet # Placeholder + end +end +``` + +**5. Operations (`sum`, `intersect`, `negate`)** + +These are built using a generic `apply` function, often memoized (using a `computed_table`). + +`apply(op_lambda, tdd1_id, tdd2_id)`: +1. **Handle terminal cases:** + * `op_lambda.(@false_id, @false_id)`, `op_lambda.(@false_id, @true_id)`, etc. + * Some ops have identities/annihilators: + * Union: `T | true = true`, `T | false = T`. + * Intersection: `T & false = false`, `T & true = T`. +2. **Check `computed_table`:** `cache_key = {op_lambda_id, tdd1_id, tdd2_id}`. If result exists, return it. +3. **Recursive step:** + * Fetch node data for `tdd1_id` and `tdd2_id`. Let them be `N1`, `N2`. + * Let `v1 = N1.variable_id`, `v2 = N2.variable_id`. + * Compare `v1` and `v2` based on global variable order. + * **If `v1 == v2`**: + `res_yes = apply(op_lambda, N1.yes, N2.yes)` + `res_no = apply(op_lambda, N1.no, N2.no)` + `res_dc = apply(op_lambda, N1.dc, N2.dc)` + `result = mknode(v1, res_yes, res_no, res_dc)` + * **If `v1 < v2` (in order)**: + `res_yes = apply(op_lambda, N1.yes, tdd2_id)` + `res_no = apply(op_lambda, N1.no, tdd2_id)` + `res_dc = apply(op_lambda, N1.dc, tdd2_id)` + `result = mknode(v1, res_yes, res_no, res_dc)` + * **If `v1 > v2` (in order)**: (symmetric to above) + `res_yes = apply(op_lambda, tdd1_id, N2.yes)` + `res_no = apply(op_lambda, tdd1_id, N2.no)` + `res_dc = apply(op_lambda, tdd1_id, N2.dc)` + `result = mknode(v2, res_yes, res_no, res_dc)` +4. 
Store `result` in `computed_table` for `cache_key`. +5. Return `result`. + +**`negate(tdd_id)`:** +This is simpler, an `apply_unary` operation. +1. If `tdd_id == @true_id`, return `@false_id`. +2. If `tdd_id == @false_id`, return `@true_id`. +3. Check `computed_table_for_negation`. +4. Fetch node `N`. + `res_yes = negate(N.yes)` + `res_no = negate(N.no)` + `res_dc = negate(N.dc)` + `result = mknode(N.variable_id, res_yes, res_no, res_dc)` +5. Store and return `result`. + +**Concrete Example: Type `atom(:foo) | atom(:bar)`** + +Variables (simplified order): `is_atom`, `atom_is_foo`, `atom_is_bar`. +1. TDD for `atom(:foo)`: + * `v_is_atom = mknode(:is_atom, v_atom_is_foo, @false_id, @false_id)` + * `v_atom_is_foo = mknode(:atom_is_foo, @true_id, @false_id, @false_id)` + (Assuming if not atom, or atom but not foo, it's false. `dc` is false as these tests are always relevant here). +2. TDD for `atom(:bar)`: + * `v_is_atom_2 = mknode(:is_atom, v_atom_is_bar, @false_id, @false_id)` + * `v_atom_is_bar = mknode(:atom_is_bar, @true_id, @false_id, @false_id)` + +Now, `sum(tdd_foo, tdd_bar)` using logical OR (`||`): +Top variable is `:is_atom` for both. +`res_yes = apply(||, v_atom_is_foo, v_atom_is_bar)` +`res_no = apply(||, @false_id, @false_id)` -> `@false_id` +`res_dc = apply(||, @false_id, @false_id)` -> `@false_id` + +Consider `apply(||, v_atom_is_foo, v_atom_is_bar)`: +Variables are `:atom_is_foo` and `:atom_is_bar`. Assume `:atom_is_foo < :atom_is_bar`. +Current var is `:atom_is_foo` (from `v_atom_is_foo`). +`ry = apply(||, v_atom_is_foo.yes, v_atom_is_bar)` + `= apply(||, @true_id, v_atom_is_bar)` -> `@true_id` (since `true || X = true`) +`rn = apply(||, v_atom_is_foo.no, v_atom_is_bar)` + `= apply(||, @false_id, v_atom_is_bar)` -> `v_atom_is_bar` (since `false || X = X`) +`rd = apply(||, v_atom_is_foo.dc, v_atom_is_bar)` + `= apply(||, @false_id, v_atom_is_bar)` -> `v_atom_is_bar` + +So, `res_yes = mknode(:atom_is_foo, @true_id, v_atom_is_bar, v_atom_is_bar)`. 
+This node effectively means: "If it's :foo, then true. Else (if not :foo), then it depends on whether it's :bar (from `v_atom_is_bar`). If :atom_is_foo is don't care, it also depends on :bar." +The reduction rule for `no_id == dc_id` isn't standard for BDDs but might make sense for TDDs if "not X" and "X is irrelevant" lead to the same sub-condition. CDuce's BDDs don't have this `dc` child directly in nodes, it's more implicit in their algorithm. +If we stick to standard ROBDD `apply` ideas slightly modified for three children, `dc` handling when variables differ is key. + +**Key Data Structures to Manage:** + +1. **Node Storage:** A list, array, or map to store the actual nodes. An ID is an index/key into this. + * `nodes = %{0 => {:terminal, :false}, 1 => {:terminal, :true}, 2 => {:node, var_id1, yes2, no2, dc2}, ...}` +2. **Unique Table:** `(variable_id, yes_id, no_id, dc_id) -> node_id`. This is for structural sharing and reduction rule 1. + * `unique_table = %{{var_id1, yes2, no2, dc2} => 2, ...}` +3. **Variable Definitions:** A way to map `variable_id` to its actual predicate logic and its position in the global order. + * `variables = %{var_id1 => %{name: :is_atom, order_idx: 0, predicate: &is_atom/1}, ...}` +4. **Computed Tables (Memoization for Operations):** + * For binary ops: `{(op_name, id1, id2) -> result_id}` + * For unary ops: `{(op_name, id1) -> result_id}` + +**Summary of Refinements:** + +1. **Global Hierarchical Variable Ordering:** Discriminate base types first, then type-specific properties. All variables (predicates) need a unique ID and a place in this global order. +2. **`mknode` is Central:** It implements reduction rule 2 (all children same -> return child) and uses the `unique_table` (reduction rule 1). +3. **`apply` Algorithm:** This is the workhorse for `sum`, `intersect`, etc. Its logic for handling cases where variables differ across operands is where the TDD structure (and `dc` child) truly shows its behavior. 
The `dc` child of a node `N1=(v,y,n,d)` would be used in `apply(op, N1.dc, N2)` if `v < var(N2)` and the `dc` path for `v` is taken. +4. **`dc` Child Interpretation:** In the `apply` algorithm, if `var(T1) < var(T2)`, `T2` doesn't test `var(T1)`. So, `T2` is passed "as is" when considering `T1.yes`, `T1.no`, and `T1.dc`. The resulting node for `var(T1)` will have its children determined by these recursive calls. The `dc` child in a node structure `(v, Y, N, D)` represents the type if the predicate `v` is "skipped" or considered irrelevant for that path. +5. **State Management:** You'll need a robust way to manage the shared state of nodes, unique table, and computed tables (e.g., a GenServer or ETS tables). + +This is a complex system. Start simple: implement the `mknode` with unique table for just atoms and booleans. Then add `negate`. Then `sum` or `intersect`. Gradually add more complex variable types. Good luck! diff --git a/g2.md b/g2.md new file mode 100644 index 0000000..8f3b1aa --- /dev/null +++ b/g2.md @@ -0,0 +1,216 @@ +``` +defmodule Tdd do + @moduledoc """ + Ternary decision diagram, used for representing set-theoritic types, akin to cduce. + There are 2 types of nodes: + - terminal nodes (true, false) + - variable nodes + + variable nodes consist of: + - the variable being tested + - yes: id of the node if the result of the test is true + - no: id of the node if the result of the test is false + - dc: id of the node if the result of the test is irrelevant for the current operation + + the TDD needs to be ordered and reduced (ROBDD) + - 'ordered' if different variables appear in the same order on all paths from the root. + - 'reduced' if the following two rules have been applied to its graph: + - Merge any isomorphic subgraphs. + - Eliminate any node whose two children are isomorphic. 
+ + Working notes: + - structure of the ordered variables: + Im thinking of structuring all possible types inside 1 TDD, in contrast to cduce, which uses a `desrc` structure that contains several TDDs (one for each domain, like ints, atoms, functions, etc.), and descr is a union between them. + For this, I need to come up with a variable structure that'll be ordered. + My set types will need to represent types like: atoms, strings, ints, maps, tuples, functions, kinds, free polymorphic variables + Moreso, those types themselves consist of smaller subsets of types like: + - int < 10 + - int in [1, 2, 3] + - string > "prefix_" + - atom == false + - atom == false or atom == true or atom == nil + - map == %{"id" => string} and %{string => any | nil} + - polymorphic types ??? + - etc. + Dont know how to represent them and make them ordered. + - node cache: + I suspect ids of nodes (TDDs) after reduction. This way a comparison between 2 types is just a pointer (id) check in the node cache. But not yet sure. + - reduction rules: not sure how to approach them + + """ + + def node(elem, yes, no, dc = _dont_care) do + end + + def sum(one, two) do + end + + def intersect(one, two) do + end + + def negate(one, two) do + end +end +``` +Please help me refine the theory for impementing set-theoritic types with this. +Lets focus on ordering of variables in the TDD + +======================================================================================================================== + +Okay, this is a fascinating and ambitious goal! Building a single, ordered, reduced TDD for all Elixir/Erlang types is a significant challenge, especially with the rich structure of types like maps and functions. Let's break down the ordering of variables. + +The core idea for ordering in a BDD/TDD is to have a **total global order** on all possible "questions" (predicates) you can ask about a value. + +**High-Level Strategy for Ordering Variables** + +1. 
**Primary Type Discriminators First:** The most fundamental question is "What basic kind of thing is this?" +2. **General Properties Next:** After knowing the primary type, ask about general properties (e.g., size, length). +3. **Specific Value/Content Properties:** Then, delve into specific values or structural content. +4. **Recursive Structure for Nested Types:** For types like tuples and maps, the predicates about their elements/values will effectively be recursive, but the ordering system must flatten this into a global order. + +**Proposed Variable Ordering Scheme** + +Let's define "variables" as unique identifiers for predicates. We need a way to sort these identifiers. A good way is to use tuples, where Elixir's natural tuple sorting provides the order. + +**Category 0: Primary Type Discriminators** +These are the most fundamental. They will have the lowest sort order. +Order them alphabetically by the type name. +* `v_is_atom = {0, :is_atom}` +* `v_is_binary = {0, :is_binary}` +* `v_is_float = {0, :is_float}` +* `v_is_function = {0, :is_function}` +* `v_is_integer = {0, :is_integer}` +* `v_is_list = {0, :is_list}` +* `v_is_map = {0, :is_map}` +* `v_is_pid = {0, :is_pid}` +* `v_is_port = {0, :is_port}` +* `v_is_reference = {0, :is_reference}` +* `v_is_string = {0, :is_string}` (*Note: Elixir strings are UTF-8 binaries. You might treat them as a subtype of binary or a distinct primary type in your model. For simplicity here, let's assume distinct for now, or you'd have predicates like `{0, :is_binary_utf8}` after `{0, :is_binary}`*) +* `v_is_tuple = {0, :is_tuple}` + +**Category 1: Atom-Specific Predicates** +If `is_atom` is true, what specific atom is it? +Order by the atom itself. +* `v_atom_eq_false = {1, :value, false}` +* `v_atom_eq_nil = {1, :value, nil}` +* `v_atom_eq_true = {1, :value, true}` +* `v_atom_eq_specific_A = {1, :value, :an_atom}` (e.g., `:an_atom` comes after `true`) +* ... 
(all known/relevant atoms in your system, ordered canonically) + +**Category 2: Integer-Specific Predicates** +If `is_integer` is true: +You need a canonical way to represent integer conditions. +* Equality: `v_int_eq_N = {2, :eq, N}` (e.g., `{2, :eq, 0}`, `{2, :eq, 10}`) + * Order by N. +* Less than: `v_int_lt_N = {2, :lt, N}` (e.g., `{2, :lt, 0}`, `{2, :lt, 10}`) + * Order by N. +* Greater than: `v_int_gt_N = {2, :gt, N}` (e.g., `{2, :gt, 0}`, `{2, :gt, 10}`) + * Order by N. +* Set membership for finite sets: `v_int_in_SET = {2, :in, Enum.sort(SET)}` (e.g. `{2, :in, [1,2,3]}`) + * Order by the canonical (sorted list) representation of SET. + * *This gets complex. Often, BDDs for integers use bit-level tests, but for set-theoretic types, range/specific value tests are more natural.* You might limit this to a predefined, finite set of "interesting" integer predicates. + +**Category 3: String-Specific Predicates** +If `is_string` is true: +* Equality: `v_string_eq_S = {3, :eq, S}` (e.g., `{3, :eq, "foo"}`) + * Order by S lexicographically. +* Length: `v_string_len_eq_L = {3, :len_eq, L}` + * Order by L. +* Prefix: `v_string_prefix_P = {3, :prefix, P}` + * Order by P lexicographically. +* (Suffix, regex match, etc., can be added with consistent ordering rules) + +**Category 4: Tuple-Specific Predicates** +If `is_tuple` is true: +1. **Size first:** + * `v_tuple_size_eq_N = {4, :size, N}` (e.g., `{4, :size, 0}`, `{4, :size, 2}`) + * Order by N. +2. **Element types (recursive structure in variable identifier):** + For a tuple of a *given size*, we then check its elements. The predicate for an element will re-use the *entire variable ordering scheme* but scoped to that element. + * `v_tuple_elem_I_PRED = {4, :element, index_I, NESTED_PREDICATE_ID}` + * Order by `index_I` first. + * Then order by `NESTED_PREDICATE_ID` (which itself is one of these `{category, type, value}` tuples). + * Example: Is element 0 an atom? 
`v_el0_is_atom = {4, :element, 0, {0, :is_atom}}` + * Example: Is element 0 the atom `:foo`? `v_el0_is_foo = {4, :element, 0, {1, :value, :foo}}` + * Example: Is element 1 an integer? `v_el1_is_int = {4, :element, 1, {0, :is_integer}}` + This ensures that all questions about element 0 come before element 1, and for each element, the standard hierarchy of questions is asked. + +**Category 5: Map-Specific Predicates** +If `is_map` is true: This is the most complex. +1. **Size (optional, but can be useful):** + * `v_map_size_eq_N = {5, :size, N}` + * Order by N. +2. **Key Presence:** + * `v_map_has_key_K = {5, :has_key, K}` (e.g., `{5, :has_key, "id"}`, `{5, :has_key, :name}`) + * Order by K (canonically, e.g., strings before atoms, then lexicographically/atom-order). +3. **Key Value Types (recursive structure):** + For a *given key K* that is present: + * `v_map_key_K_value_PRED = {5, :key_value, K, NESTED_PREDICATE_ID}` + * Order by K (canonically). + * Then order by `NESTED_PREDICATE_ID`. + * Example: Does map have key `:id` and is its value a string? + * First variable: `v_map_has_id = {5, :has_key, :id}` + * If yes, next variable: `v_map_id_val_is_str = {5, :key_value, :id, {0, :is_string}}` +4. **Predicates for "all other keys" / "pattern keys":** + This is needed for types like `%{String.t() => integer()}`. + * `v_map_pattern_key_PRED_value_PRED = {5, :pattern_key, KEY_TYPE_PREDICATE_ID, VALUE_TYPE_PREDICATE_ID}` + * Example: For `%{String.t() => integer()}`: + * Key type predicate: `{0, :is_string}` + * Value type predicate for such keys: `{0, :is_integer}` + * Variable ID: `{5, :pattern_key, {0, :is_string}, {0, :is_integer}}` + * These pattern key predicates should likely be ordered *after* specific key predicates. The exact sorting of `KEY_TYPE_PREDICATE_ID` needs careful thought (e.g. `(0, :is_atom)` before `(0, :is_string)`). + +**Category 6: List-Specific Predicates** +If `is_list` is true: +1. **Is Empty:** + * `v_list_is_empty = {6, :is_empty}` +2. 
**Head/Tail Structure (if not empty, recursive):** + This mirrors how types like `nonempty_list(H, T)` are defined. + * `v_list_head_PRED = {6, :head, NESTED_PREDICATE_ID}` + * `v_list_tail_PRED = {6, :tail, NESTED_PREDICATE_ID}` (Note: `NESTED_PREDICATE_ID` for tail would again be list predicates like `{6, :is_empty}` or `{6, :head, ...}`) + * Example: Head is an atom: `{6, :head, {0, :is_atom}}` + * Example: Tail is an empty list: `{6, :tail, {6, :is_empty}}` + * All head predicates come before all tail predicates. + +**Category 7: Function-Specific Predicates** +If `is_function` is true: +1. **Arity:** + * `v_fun_arity_eq_A = {7, :arity, A}` + * Order by A. +2. **Argument Types (very complex, may need simplification for TDDs):** + * `v_fun_arg_I_PRED = {7, :arg, index_I, NESTED_PREDICATE_ID}` +3. **Return Type (also complex):** + * `v_fun_return_PRED = {7, :return, NESTED_PREDICATE_ID}` + * Function types are often represented by separate structures or simplified in TDDs due to their higher-order nature. Full function type checking within this TDD variable scheme would be extremely elaborate. + +**Binary, Float, Pid, Port, Reference Predicates:** +These would get their own categories (e.g., 8, 9, 10...). +* **Floats:** `{X, :is_float}` -> `{X, :eq, F}`, `{X, :lt, F}`, etc. +* **Binaries:** `{Y, :is_binary}` -> `{Y, :size, S}`, `{Y, :matches_pattern, Pat}` (e.g. `<>`) + +**Polymorphic Variables (`alpha`, `beta`, etc.)** +Polymorphic variables are part of the *type language*, not properties of concrete values. A TDD represents a set of *concrete values*. +* When you construct a TDD for a type like `list(alpha)`, where `alpha` is free, `alpha` essentially means `any`. So, for predicates concerning list elements, they would all go to their `dc` (don't care) branches, ultimately leading to `true`. +* If `alpha` is bound (e.g., in `(alpha -> alpha) where alpha = integer`), you first resolve `alpha` to `integer` and then build the TDD for `(integer -> integer)`. 
+* So, "free polymorphic variables" don't become TDD variables themselves. They influence which branches are taken during TDD construction for types containing them, often mapping to `any` or `dc` paths. + +**"Kinds"** +The "kind" of a type (e.g., `Type.Atom`, `Type.Integer`) is meta-information. The TDD *represents* a type. The variables within the TDD are about properties of *values* that belong to that type. You wouldn't have a TDD variable `is_kind_atom?`. Instead, the TDD for the type `atom` would start with the `v_is_atom = {0, :is_atom}` variable. + +**The `dc` (Don't Care) Branch** +The `dc` branch for a variable `v` in a node means that for the type being represented, the outcome of the test `v` is irrelevant to membership *given the path taken to reach this node*. +* For type `any`, all variables would effectively go to `dc`, all leading to the `true_terminal`. +* For type `integer`, the test `v_is_atom`: + * `yes` branch -> `false_terminal` (an atom is not an integer) + * `no` branch -> continues to test `v_is_integer`, etc. + * `dc` branch: This is the tricky part in TDDs for types. In some formulations, `dc(N)` is `yes(N) U no(N)`. If a type truly doesn't care about `v_is_atom` (e.g., `any`), then `yes`, `no`, and `dc` might all point to the same child representing `any` for subsequent tests. + * For `ROBDD` reduction with `dc`, if `yes_child == no_child == dc_child`, the node is redundant. If `yes_child == no_child`, the test `v` is locally irrelevant, and the node might be simplified (parent points to `yes_child`, and the `dc` path needs careful handling depending on the TDD algebra). Standard BDD reduction (if `yes_child == no_child`, eliminate node) assumes only two children. You'll need TDD-specific reduction rules. + +**Summary of Ordering Logic:** +1. Use tuples as variable identifiers: ` {category_num, predicate_type_atom, value_or_nested_id}`. +2. Sort these tuples using Elixir's default tuple comparison. This establishes the global order. +3. 
`category_num` provides the coarsest level of ordering. +4. Within a category, `predicate_type_atom` provides the next level. +5. Finally, `value_or_nested_id` provides the finest level, potentially recursively applying the same ordering logic for nested structures. + +This structured approach to variable identifiers is crucial for the "ordered" property of your ROBDD. When implementing `apply` operations (union, intersect, negate), you'll always compare the variable IDs of the current nodes from the two TDDs and operate on the one that comes first in the global order. diff --git a/test.exs b/test.exs new file mode 100644 index 0000000..d8eb966 --- /dev/null +++ b/test.exs @@ -0,0 +1,363 @@ +defmodule Tdd do + @moduledoc """ + Ternary decision diagram for set-theoretic types. + """ + + # --- Terminal Node IDs --- + @false_node_id 0 + @true_node_id 1 + + # --- State: Managed via module attributes (for simplicity in this example) --- + # @nodes: Map from {var, yes_id, no_id, dc_id} -> node_id + # @node_by_id: Map from node_id -> {var, yes_id, no_id, dc_id} + # @next_id: Integer for the next available node ID + # @op_cache: Cache for apply operations: {{op_name, id1, id2} -> result_id} + + defguard is_terminal_id(id) when id == @false_node_id or id == @true_node_id + + # Initialize state (call this once, or ensure it's idempotent) + def init_tdd_system do + # Node 0 represents FALSE, Node 1 represents TRUE. + # These don't have var/children, so we represent them specially or handle them in functions. + # For @node_by_id, we can store a marker. 
+ Process.put(:nodes, %{}) + Process.put(:node_by_id, %{@false_node_id => :false_terminal, @true_node_id => :true_terminal}) + Process.put(:next_id, 2) # Start after true/false + Process.put(:op_cache, %{}) + IO.puts("TDD system initialized.") + end + + # Helper to get current state + defp get_state do + %{ + nodes: Process.get(:nodes, %{}), + node_by_id: Process.get(:node_by_id, %{@false_node_id => :false_terminal, @true_node_id => :true_terminal}), + next_id: Process.get(:next_id, 2), + op_cache: Process.get(:op_cache, %{}) + } + end + + # Helper to update state + defp update_state(changes) do + current_state = get_state() + new_state = Map.merge(current_state, changes) + Process.put(:nodes, new_state.nodes) + Process.put(:node_by_id, new_state.node_by_id) + Process.put(:next_id, new_state.next_id) + Process.put(:op_cache, new_state.op_cache) + end + + # --- Node Creation and Reduction --- + # This is the core function to get or create a node. + # It implements the reduction rules. + def make_node(variable, yes_id, no_id, dc_id) do + # Reduction Rule: If all children are identical, this node is redundant. 
+ # (This is a strong form of reduction for TDDs) + if yes_id == no_id && yes_id == dc_id do + # IO.puts "Reduction (all children same): var #{inspect variable} -> #{yes_id}" + yes_id + else + state = get_state() + node_tuple = {variable, yes_id, no_id, dc_id} + + if Map.has_key?(state.nodes, node_tuple) do + # Node already exists + state.nodes[node_tuple] + else + # Create new node + new_id = state.next_id + # IO.puts "Creating node #{new_id}: #{inspect variable}, y:#{yes_id}, n:#{no_id}, d:#{dc_id}" + update_state(%{ + nodes: Map.put(state.nodes, node_tuple, new_id), + node_by_id: Map.put(state.node_by_id, new_id, node_tuple), + next_id: new_id + 1 + }) + new_id + end + end + end + + # Get details of a node (useful for apply and debugging) + def get_node_details(id) when is_terminal_id(id) do + if id == @true_node_id, do: :true_terminal, else: :false_terminal + end + def get_node_details(id) do + state = get_state() + state.node_by_id[id] # Returns {var, yes, no, dc} or nil + end + + + # --- Variable Definitions (subset for now) --- + # Category 0: Primary Type Discriminators + @v_is_atom {0, :is_atom} + @v_is_tuple {0, :is_tuple} + # Add other primary types here in order, e.g., {0, :is_integer}, {0, :is_list} + + # Category 1: Atom-Specific Predicates + def v_atom_eq(atom_val), do: {1, :value, atom_val} + + # Category 4: Tuple-Specific Predicates + def v_tuple_size_eq(size), do: {4, :size, size} + def v_tuple_elem_pred(index, nested_pred_id), do: {4, :element, index, nested_pred_id} + + # --- Basic Type Constructors --- + # These construct TDDs representing specific types. + # For simplicity, they assume an "empty" background (all other types are false). + # `dc_id` is often `@false_node_id` in constructors because if a variable is "don't care" + # for this specific type, it means it *could* be something that makes it NOT this type. 
+ + def type_any, do: @true_node_id + def type_none, do: @false_node_id + + # Represents the type `atom()` (any atom) + def type_atom do + # The variable order means we must consider @v_is_tuple *before* specific atom properties + # if we were to build from a full list of variables. + # However, for a simple constructor, we focus on the defining properties. + # The `apply` algorithm will handle interactions with other types correctly due to variable ordering. + + # If it's an atom, it's true for this type. + # If it's not an atom, it's false for this type. + # If we "don't care" if it's an atom, it's not specific enough to be `type_atom`, so false. + make_node(@v_is_atom, @true_node_id, @false_node_id, @false_node_id) + end + + # Represents a specific atom literal, e.g., `:foo` + def type_atom_literal(atom_val) do + var_eq = v_atom_eq(atom_val) # e.g., {1, :value, :foo} + + # Node for specific atom value: if value matches, true; else false. + atom_val_node = make_node(var_eq, @true_node_id, @false_node_id, @false_node_id) + + # Node for primary type: if is_atom, check value; else false. + make_node(@v_is_atom, atom_val_node, @false_node_id, @false_node_id) + end + + # Represents `tuple()` (any tuple) + def type_tuple do + make_node(@v_is_tuple, @true_node_id, @false_node_id, @false_node_id) + end + + # Represents an empty tuple `{}` + def type_empty_tuple do + var_size_0 = v_tuple_size_eq(0) + tuple_size_node = make_node(var_size_0, @true_node_id, @false_node_id, @false_node_id) + make_node(@v_is_tuple, tuple_size_node, @false_node_id, @false_node_id) + end + + # Represents a tuple of a specific size with any elements, e.g., `{any, any}` for size 2 + def type_tuple_sized_any(size) do + var_size = v_tuple_size_eq(size) + # For element checks, 'any' means the element specific predicates would all go to their dc child, + # which leads to true. So, for representing `tuple_of_size_N_with_any_elements`, + # we only need to check the size after confirming it's a tuple. 
+ tuple_size_node = make_node(var_size, @true_node_id, @false_node_id, @false_node_id) + make_node(@v_is_tuple, tuple_size_node, @false_node_id, @false_node_id) + end + + # --- The APPLY Algorithm --- + # apply(op_lambda, u1_id, u2_id) + # op_lambda takes (terminal1_val, terminal2_val) and returns resulting terminal_val + # e.g., for OR: fn :true_terminal, _ -> :true_terminal; _, :true_terminal -> :true_terminal; _,_ -> :false_terminal end + # which then maps to @true_node_id or @false_node_id. + + def apply(op_name, op_lambda, u1_id, u2_id) do + state = get_state() + # Ensure cache_key is canonical for commutative operations + cache_key = {op_name, Enum.sort([u1_id, u2_id])} + + cond do + # 1. Check cache first + Map.has_key?(state.op_cache, cache_key) -> + state.op_cache[cache_key] + + # 2. Base case: Both u1 and u2 are terminals + is_terminal_id(u1_id) && is_terminal_id(u2_id) -> + res_terminal_symbol = op_lambda.(get_node_details(u1_id), get_node_details(u2_id)) + # Result is a terminal ID, no need to cache these simple computations directly, + # the calls leading to them will be cached. + if res_terminal_symbol == :true_terminal, do: @true_node_id, else: @false_node_id + + # 3. Recursive case: At least one is non-terminal. + # Handle sub-cases where one is terminal and the other is not. + true -> + # Pre-check for terminal optimizations specific to the operation + # This can make some operations (like sum with @true_node_id) much faster. + # For sum: + # if u1_id == @true_node_id or u2_id == @true_node_id, result is @true_node_id (if op is sum) + # if u1_id == @false_node_id, result is u2_id (if op is sum) + # if u2_id == @false_node_id, result is u1_id (if op is sum) + # These are handled by op_lambda if both are terminals. If one is terminal, + # the recursive calls below will hit that. 
+ + u1_details = get_node_details(u1_id) + u2_details = get_node_details(u2_id) + + result_id = + cond do + # Case 3a: u1 is terminal, u2 is not + u1_details == :true_terminal or u1_details == :false_terminal -> + {var2, y2, n2, d2} = u2_details # u2 must be a node + res_y = apply(op_name, op_lambda, u1_id, y2) + res_n = apply(op_name, op_lambda, u1_id, n2) + res_d = apply(op_name, op_lambda, u1_id, d2) + make_node(var2, res_y, res_n, res_d) + + # Case 3b: u2 is terminal, u1 is not + u2_details == :true_terminal or u2_details == :false_terminal -> + {var1, y1, n1, d1} = u1_details # u1 must be a node + res_y = apply(op_name, op_lambda, y1, u2_id) + res_n = apply(op_name, op_lambda, n1, u2_id) + res_d = apply(op_name, op_lambda, d1, u2_id) + make_node(var1, res_y, res_n, res_d) + + # Case 3c: Both u1 and u2 are non-terminal nodes + true -> + {var1, y1, n1, d1} = u1_details + {var2, y2, n2, d2} = u2_details + + top_var = + cond do + var1 == var2 -> var1 + # Elixir's tuple comparison provides the global variable order + var1 < var2 -> var1 + true -> var2 # var2 < var1 + end + + # Recursive calls: if a variable is not the top_var for a TDD, + # that TDD is passed through unchanged to the next level of recursion. 
res_y = + cond do + top_var == var1 && top_var == var2 -> apply(op_name, op_lambda, y1, y2) + top_var == var1 -> apply(op_name, op_lambda, y1, u2_id) + true -> apply(op_name, op_lambda, u1_id, y2) + end + + res_n = + cond do + top_var == var1 && top_var == var2 -> apply(op_name, op_lambda, n1, n2) + top_var == var1 -> apply(op_name, op_lambda, n1, u2_id) + true -> apply(op_name, op_lambda, u1_id, n2) + end + + res_d = + cond do + top_var == var1 && top_var == var2 -> apply(op_name, op_lambda, d1, d2) + top_var == var1 -> apply(op_name, op_lambda, d1, u2_id) + true -> apply(op_name, op_lambda, u1_id, d2) + end + make_node(top_var, res_y, res_n, res_d) + end + + # Cache the result of this (op, u1, u2) call. Re-read the state here: the + # recursive calls above added their own cache entries, and reusing the stale + # `state.op_cache` captured before recursion would discard them. + update_state(%{op_cache: Map.put(get_state().op_cache, cache_key, result_id)}) + result_id + end + end + + # --- Set Operations --- + def sum(tdd1_id, tdd2_id) do + op_lambda_sum = fn + (:true_terminal, _) -> :true_terminal # true or X = true + (_, :true_terminal) -> :true_terminal # X or true = true + (:false_terminal, t2_val) -> t2_val # false or X = X + (t1_val, :false_terminal) -> t1_val # X or false = X + end + apply(:sum, op_lambda_sum, tdd1_id, tdd2_id) + end + + # Placeholder for intersect and negate + # def intersect(one, two) do end + # def negate(one) do end + + + # --- Helper to inspect the TDD structure (for debugging) --- + def print_tdd(id, indent \\ 0) do + prefix = String.duplicate(" ", indent) + details = get_node_details(id) + IO.puts "#{prefix}ID #{id}: #{inspect details}" + case details do + {_var, y, n, d} -> + IO.puts "#{prefix} Yes ->" + print_tdd(y, indent + 1) + IO.puts "#{prefix} No ->" + print_tdd(n, indent + 1) + IO.puts "#{prefix} DC ->" + print_tdd(d, indent + 1) + :true_terminal -> :ok + :false_terminal -> :ok + nil -> IO.puts "#{prefix} Error: Unknown ID #{id}" + end + end + +end + +# --- Example Usage --- +# Initialize the system (important!)
+Tdd.init_tdd_system() + +IO.puts "\n--- TDD for :foo ---" +tdd_foo = Tdd.type_atom_literal(:foo) +Tdd.print_tdd(tdd_foo) +# Expected: +# ID 4: {{0, :is_atom}, 3, 0, 0} +# Yes -> +# ID 3: {{1, :value, :foo}, 1, 0, 0} +# Yes -> +# ID 1: :true_terminal +# No -> +# ID 0: :false_terminal +# DC -> +# ID 0: :false_terminal +# No -> +# ID 0: :false_terminal +# DC -> +# ID 0: :false_terminal + +IO.puts "\n--- TDD for :bar ---" +tdd_bar = Tdd.type_atom_literal(:bar) +Tdd.print_tdd(tdd_bar) +# Similar structure, but with {1, :value, :bar} and different intermediate IDs like 6, 5. + +IO.puts "\n--- TDD for :foo or :bar ---" +tdd_foo_or_bar = Tdd.sum(tdd_foo, tdd_bar) +Tdd.print_tdd(tdd_foo_or_bar) + +# Expected structure for tdd_foo_or_bar (variable order matters): +# Top var: {0, :is_atom} +# Yes branch: (node for :foo or :bar, given it's an atom) +# Top var: {1, :value, :bar} (assuming :bar < :foo lexicographically for sorting example, or however Elixir sorts them) +# Yes branch (:bar is true): true_terminal +# No branch (:bar is false): (node for checking :foo) +# Top var: {1, :value, :foo} +# Yes branch (:foo is true): true_terminal +# No branch (:foo is false): false_terminal +# DC branch: false_terminal +# DC branch: false_terminal (if value is don't care, it's not :bar) +# No branch: false_terminal (not an atom) +# DC branch: false_terminal (is_atom is don't care) + +IO.puts "\n--- TDD for empty tuple {} ---" +tdd_empty_tuple = Tdd.type_empty_tuple() +Tdd.print_tdd(tdd_empty_tuple) +# Expected: +# ID X: {{0, :is_tuple}, Y, 0, 0} +# Yes -> +# ID Y: {{4, :size, 0}, 1, 0, 0} +# ... leads to true/false ... +# No -> ID 0 +# DC -> ID 0 + +IO.puts "\n--- TDD for :foo or {} ---" +tdd_foo_or_empty_tuple = Tdd.sum(tdd_foo, tdd_empty_tuple) +Tdd.print_tdd(tdd_foo_or_empty_tuple) +# Expected structure (variable order {0, :is_atom} < {0, :is_tuple}): +# Top var: {0, :is_atom} +# Yes branch: (node for :foo, given it's an atom, and not a tuple) +# ... 
(structure of tdd_foo, but false branch of {0, :is_tuple} might point here) +# No branch: (node for {}, given it's not an atom) +# ... (structure of tdd_empty_tuple, but false branch of {0, :is_atom} might point here) +# DC branch: (complicated, depends on how dc propagates for sum) +# For sum, if `is_atom` is dc for foo and for tuple, then result is dc. +# Since our constructors use false for dc, `apply(:sum, false_id, false_id)` for dc branches yields `false_id`. +# So, the DC branch of the root will be false.
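The properties the expected-output comments rely on — hash-consed nodes so semantic equality becomes an id comparison, tuple-ordered variables, and the three-way `apply` recursion — can be checked with a compact sketch. The following is an illustrative Python translation (not the Elixir code above; Python tuples compare lexicographically just like Elixir tuples, and all names here are made up for the sketch):

```python
# Illustrative Python sketch of a hash-consed TDD with tuple-ordered variables.
FALSE, TRUE = 0, 1            # terminal node ids

nodes = {}                    # unique table: (var, yes, no, dc) -> id
node_by_id = {FALSE: None, TRUE: None}
op_cache = {}                 # computed table: (op, u1, u2) -> result id

def make_node(var, yes, no, dc):
    """Hash-consed constructor with the 'all children equal' reduction."""
    if yes == no == dc:       # node is redundant: every branch agrees
        return yes
    key = (var, yes, no, dc)
    if key not in nodes:
        nid = len(node_by_id)
        nodes[key] = nid
        node_by_id[nid] = key
    return nodes[key]

def apply_op(op, u1, u2):
    """Binary apply; always descends on the smaller variable (tuple order)."""
    if u1 <= TRUE and u2 <= TRUE:
        return TRUE if op(u1 == TRUE, u2 == TRUE) else FALSE
    key = (op, u1, u2)
    if key in op_cache:
        return op_cache[key]
    d1 = node_by_id[u1] if u1 > TRUE else None
    d2 = node_by_id[u2] if u2 > TRUE else None
    if d2 is None or (d1 is not None and d1[0] < d2[0]):
        # u1's variable comes first: u2 is passed through unchanged
        res = make_node(d1[0], apply_op(op, d1[1], u2),
                        apply_op(op, d1[2], u2), apply_op(op, d1[3], u2))
    elif d1 is None or d2[0] < d1[0]:
        res = make_node(d2[0], apply_op(op, u1, d2[1]),
                        apply_op(op, u1, d2[2]), apply_op(op, u1, d2[3]))
    else:                     # same top variable: descend both in lockstep
        res = make_node(d1[0], apply_op(op, d1[1], d2[1]),
                        apply_op(op, d1[2], d2[2]), apply_op(op, d1[3], d2[3]))
    op_cache[key] = res
    return res

def or_(p, q):                # terminal rule for union
    return p or q

def tdd_sum(a, b):
    return apply_op(or_, a, b)

V_IS_ATOM = (0, "is_atom")    # category 0: primary type discriminator

def atom_literal(a):
    """TDD for one atom literal, analogous to type_atom_literal/1 above."""
    value_node = make_node((1, "value", a), TRUE, FALSE, FALSE)
    return make_node(V_IS_ATOM, value_node, FALSE, FALSE)

foo = atom_literal("foo")
bar = atom_literal("bar")
# Hash-consing makes semantic equality a plain id comparison:
assert tdd_sum(foo, bar) == tdd_sum(bar, foo)
assert tdd_sum(foo, foo) == foo
assert tdd_sum(foo, TRUE) == TRUE     # X or any = any
assert tdd_sum(FALSE, foo) == foo     # none or X = X
```

Note how the commutativity of union falls out of the algorithm rather than being coded in: both evaluation orders reduce to the same node ids, which is exactly the "comparison is just a pointer check" property the working notes are after.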