In GBase 8a, TEXT and BLOB columns cannot have Hash indexes. The reason is a fundamental mismatch between the storage characteristics of large objects and the operating principles of Hash indexes.
How Hash Indexes Work
A Hash index computes a hash value (e.g., CRC32) from the indexed column, producing a fixed‑length integer key that points to the actual row. This requires stable, comparable input values that can be hashed quickly — typically columns with controllable length.
Four Conflicts Between TEXT/BLOB and Hash Indexes
Massive, variable‑length data:
TEXTcan hold up to 10,922 characters;BLOBranges from 32 KB to 64 MB. Hashing tens of KB or even MB of data consumes enormous I/O and CPU, defeating the purpose of an index.Fragmented or external storage: Large objects may be split into chunks or stored externally. A Hash index needs contiguous, complete data to guarantee the same hash across different reads. Fragmented storage breaks this stability, causing index misses.
Unpredictable cardinality: Long texts may have high cardinality but costly hashing; short or highly repetitive texts can cause excessive hash collisions, reducing filtering efficiency.
Rarely used with equality queries:
TEXTcolumns are typically used for full‑text search (LIKE '%...%'), andBLOBcolumns store binary streams. Hash indexes support only equality lookups — not range, prefix, or pattern matching — so they would be created and never used.
What GBase 8a Offers Instead
For TEXT, GBase 8a provides a full‑text search index — a purpose‑built solution that tokenizes text and supports LIKE and MATCH ... AGAINST. BLOB columns are usually left unindexed since they rarely appear in query conditions.
Trying to put a Hash index on a large‑object column is like measuring a football field with a caliper — the tool is completely wrong for the job. GBase 8a's restriction here is both reasonable and necessary, steering users toward the right indexing strategy for each data type in a gbase database.
Top comments (0)