Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need compare method on cl:hash-table #70

Open
dtenny opened this issue Feb 24, 2025 · 7 comments
Open

Need compare method on cl:hash-table #70

dtenny opened this issue Feb 24, 2025 · 7 comments

Comments

@dtenny
Copy link

dtenny commented Feb 24, 2025

Obviously subjective, however given the existence of #58
it seems like the right thing to do to me (in part because I just tripped on it).

If you're supporting structural equivalence for cl:vector and other collections, excluding cl:hash-table seems inconsistent. In the example I just tripped over I had cl:hash-table types which were themselves keys in an FSet map. I was then comparing to a structurally equivalent hierarchy but one which had fset maps instead of cl:hash-tables and fset:equal? said nil.

That said, if you ever add a way for FSet users to specify equality and comparison behavior then I can do it myself. And, to the extend I'm not relying on Fset key comparison for sets and maps, I can use my own external comparison, which was punting to FSet:equal? in this case for fset types.

In my library's case, I strive for structural equivalence for all cl:list, cl:vector, cl:hash-table and fset types, as well as my own persistent list and queue types.

@slburson
Copy link
Owner

slburson commented Feb 27, 2025

It's very unclear to me how to write a compare method on hash tables. If they're eq or eql tables, calling compare on pairs of keys makes no sense, because the equivalence relation it induces is coarser than those predicates; for instance, such a table can map two different instances of a list that prints as (foo) to two different values. So while I might be able to tell whether two such tables are equal or unequal, I don't offhand see a way to order them. But even telling whether they're equal is problematic, because it requires knowing the correct equivalence relation to apply to the values as well, and that's not at all clear.

Some implementations, including SBCL and Franz Allegro but not Clozure, will let you specify a nonstandard hash function and test to make-hash-table. So, on those implementations, you could use :test 'fset:equal? :hash-function 'fset:hash-value. (I still consider fset:hash-value experimental, but it does exist.) Then, compare on hash tables could simply refuse to work on any other kind of table. That's the only approach I can see that makes complete sense semantically.

Currently, cl:equal and fset:equal? behave almost identically on most built-in types, so I could almost see pretending that they're the same for this purpose. That is, we would allow compare to work on equal hash tables, using compare to compare their keys and values. The catch with this is, you're pushing for a divergence between equal and equal?. To take an example, suppose table A maps (1 2 3) to 17, and table B maps #(1 2 3) to 17. Are those tables equal? You want to say that they are, and I sympathize, but that's not consistent with equal semantics.

And then there's equalp, which is even worse, because it's coarser than equal?. If table A maps "foo" to "bar" and table B maps "FOO" to "bar", are they equal? The semantics of equalp say that they are. But mercifully, equalp tables are rarely used in practice.

And beyond semantics, there's an implementation difficulty. CL doesn't have hash table iterators; it only has maphash, which doesn't permit two hash tables to be iterated in parallel. This is not a barrier if we only care about equality, but it makes comparison difficult if the key sets of the two tables are not equal. Do you have a plan for this? Maybe I can come up with one, but it's not obvious. I suppose, if we force the tables to use fset:equal? anyway, we could convert them to maps, but that's rather expensive.

In short, this has "can of worms" written all over it. I'm inclined to punt.

@slburson
Copy link
Owner

BTW, it has always been possible for users to specify equality and comparison behavior on types FSet doesn't already support, or even override the behavior on types it does support (just be aware this will invalidate any existing collections in your session that use the type). It's easy to add methods on compare. Look in order.lisp for examples.

@dtenny
Copy link
Author

dtenny commented Feb 28, 2025

Yeah, understood about the problems and difficulty, this was always a wish-list item, and I thought I saw somewhere in FSet docs or discussion that it might be a possibility someday so I figured there was no harm in asking :-) Also noted, I use cl:equal for cl hashtables because equalp ignores string case, and that's not acceptable.

For my clojure APIs, the cl:hash-table, fset:set, and fset:map key comparison semantics are restrictive compared to the usual clojure semantics. I can live with that for now and try to document the issue ad nauseum.

I expect it is likely I'll need to implement custom set/map/(mutable)hashtable classes down the road if I really want to solve it. Unsorted sets are pretty easy since all I need is the equality function (which works great with the CL lists-as-set interfaces too). Whether I really want to solve it for hash tables is another issue, but right now my hands are full with other things.

On the cl:hash-table thing, and not pushing at all, but say you have an fset set and you feed it two identical hash tables with the same equality predicate. The wish is that they would compare equal instead of being treated as separate instances because they are not EQ. Again, totally understand not wanting to gum up FSet with this stuff. Will leave you in peace with thanks for the the stuff which has been working very well.

@dtenny
Copy link
Author

dtenny commented Feb 28, 2025

Oops, missed that second post about order.lisp, will have a look!

@slburson
Copy link
Owner

slburson commented Mar 1, 2025

For my clojure APIs, the cl:hash-table, fset:set, and fset:map key comparison semantics are restrictive compared to the usual clojure semantics.

In what way?

@dtenny
Copy link
Author

dtenny commented Mar 1, 2025

For my clojure APIs, the cl:hash-table, fset:set, and fset:map key comparison semantics are restrictive compared to the usual clojure semantics.

In what way?

In pure clojure comparisons, equivalence is mostly extended to structural map equivalence, lists can be compared with vectors, and seqs will let you compare otherwise uncompariable things, e.g.

(= {:a 1 :b 2} [[:a 1] [:b 2]]) => false
but
(= (seq {:a 1 :b 2}) [[:a 1] [:b 2]]) => true

For my Common Lisp version of the APIs, the equivalence is also extended to CL hashtables, CL vectors, and CL lists, so there's a bit more combinatorics in what compares.

The upshot is that in clojure "=" can hide a lot of data boundaries.

@slburson
Copy link
Owner

slburson commented Mar 1, 2025

FSet has similar functionality:

FSET-USER> (equal? (convert 'seq (map (:a 1) (:b 2))) (seq '(:a . 1) '(:b . 2)))
T
;;; or if you don't like the dotted pairs...
FSET-USER> (equal? (convert 'seq (map (:a 1) (:b 2)) :pair-fn #'list) (seq '(:a 1) '(:b 2)))
T

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants