[scheme-reports-wg2] Re: [Scheme-reports] Seeking review of sets and hash tables proposals

Show/hide message thread
[scheme-reports-wg2] Re: [Scheme-reports] Seeking review of sets and hash tables proposals John Cowan (26 May 2013 16:04 UTC)
Re: [Scheme-reports] Seeking review of sets and hash tables proposals Alexey Radul (27 May 2013 01:08 UTC)
[scheme-reports-wg2] Re: [Scheme-reports] Seeking review of sets and hash tables proposals John Cowan (28 May 2013 01:51 UTC)
[scheme-reports-wg2] Re: [Scheme-reports] Seeking review of sets and hash tables proposals John Cowan 26 May 2013 16:04 UTC
Alexey Radul scripsit:

> I am not sure of the value of specifying names for functional update
> procedures that are permitted to take O(n) time.

Okay, I'm convinced.  The argument I find compelling is that it's normally
the ! procedures that require extra care and thought (e.g. `reverse`
is safe, `reverse!` is only sometimes safe), whereas it will be the
non-! procedures that require it in this case, and that's confusing.
In any case, it's easy to provide them with `hash-table-copy!`, and
not obvious that there is any better way.

> I am fond of the following universal lookup procedure [1]:
>   (hash-table-search table key
>     (lambda (value update remove)
>       ...)
>     (lambda (insert)
>       ...))

Nice one.  Added.

> To the extent this item contains a suggestion, it is to consider
> specifying a `search' procedure for hash tables, sets, and bags, [2]

I'll think about it for sets and bags.

> [3] set-search could be simpler, because `value' is always #t and
> `update' makes no sense, so those could go away.  For bag-search,
> `update' still makes no sense, but `remove' and `insert' could accept
> an exact non-negative integer to mean "how many times".

Indeed.

> If a set is to be implemented as a hash table, how is an appropriate
> hash function to be derived from the equivalence predicate?

This is precisely why we need a way to bind them together.  Currently,
however, going beyond the Five Equivalences for sets/bags is an option
implementations need not support.

> If the programmer's intent is that the hash table's values can be
> accessed by traversing the hash table, then the entry should be kept.
> If the programmer's intent is that the only way to use the hash table
> to get one of the values is by looking up the key, then the entry
> should be removed and the key and value garbage collected.

I don't understand the first kind of intention: can you give a concrete
example?  The way it looks to me, non-ephemeral hash tables are just
(possibly tolerably) buggy versions of ephemeral ones.

> It seems as though the in-place mutation of hash-table-map! would be
> considerably more useful (for efficiency) if the proc were not allowed
> to change the keys to inequivalent ones.

The original idea, which is not well explained, is that the association
being processed by map! is conceptually removed and the new association
inserted in its place.  But you're right that the user really shouldn't
be allowed to mutate the keys a hash table contains while scanning it.

So I've removed `hash-table-map` in favor of `hash-table-update-each!`
which is like `hash-table-for-each` except that the value returned is
used to update the association.  For the same reason I have also removed
`hash-table-remove!` from the draft; what it wants to do should be done
with `hash-table-delete-keys!`, which is allowed to be efficient.

> It also seems that leaving the key alone would be a common use case
> for hash-table-map; is it worth specifying a variant that accepts
> this?  Or specifying that the procedure given to hash-table-map may
> return either one value or two, with the second being assumed to be
> the key if absent?

I don't see that knowing this allows you to optimize anything.

> I am surprised by the proposal to allow the merger argument to the
> various hash table manipulation procedures to return an arbitrary
> new key as a result of a collision.  What purpose does this serve?
> Its cost strikes me as high: what if the merger returns a key that is
> not equivalent to either of its input keys?  What if that key conflicts
> with a different key already present in the result being built?  This is
> also inconsistent with the mergers specified for hash-table-union!
> and hash-table-intersection!.

Those are pure value mergers, but they now take the keys as arguments
as well, since they may provide useful context.

> Why is the failure argument to hash-table-replace! mandatory?  If not
> a typo, that seems like a useless irregularity.

Because the default failure continuation is to signal an error, so
`(hash-table-replace hash-table key)` would be the same as
`(hash-table-ref hash-table key)`, which doesn't mutate anything.

> hash-table-push! should probably say "may be implemented more
> efficiently than" to be consistent

Done.

> Presumably all the hash table traversal operations should forbid any
> procedural arguments from mutating the hash table underfoot (with the
> possible exception of changing the binding of the key being
> inspected).

Added.

> Clarification in "hash tables as functions": presumably
> hash-table-mutator should accept a key and a value, rather than just a
> key; and presumably hash-table-replacer should return the result of
> the underlying hash-table-replace! rather than nothing.

Clarified systematically.

> MIT Scheme supports hash tables with arbitrary equivalence predicates:
> see strong-hash-table/constructor and weak-hash-table/constructor.

Added to the list of possible implementations.

--
Do what you will,                       John Cowan
   this Life's a Fiction                cowan@ccil.org
And is made up of                       http://www.ccil.org/~cowan
   Contradiction.  --William Blake

--
You received this message because you are subscribed to the Google Groups "scheme-reports-wg2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scheme-reports-wg2+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.