> A perfect hash table allows for constant search time.
>
> At runtime, you cannot generally generate a perfect hash table
> (without overhead that is often not worth it), so O(log N) is
> usually ok.
>
> But you're explicitly talking about compile-time constant strings,
> and in this case, you can usually make a perfact hash table. So
> that's your comparison.
>
> As I wrote in PM, have you looked at programs such as gperf?
>
> These are some of the interesting issues when dealing with a lot of
> compile-time constant strings, rather than squeezing out a byte or
> two in return for worse performance and unstable programs.
Making a perfect hash requires 7 additional tables, 3 independent hashes of the string, 3 tables to convert the hash values into summands, and one table to convert the sums into pointers to the strings. (There is a trick that can reduce this to 5 tables, but it makes the first hashing round slower.)
On the other hand, organizing the indices into a heap, as I suggested, requires no extra storage.
As I pointed out in earlier posts, the issue of whether to use a hash, or a tree, or even a hash tree is independent of how the strings are stored. It just happens that my method of storing the strings allows you to create a heap that speeds up searching without using any additional storage.
Frank Rubin