public class SymbolTable extends Object
 For best performance, lookups should hit existing symbols most of the
 time (especially after "warm-up") and, as with most hash-based maps and
 sets, hash codes should be uniformly distributed. Collisions are slightly
 more expensive than with HashMap or HashSet, because hash codes are not
 used in resolving collisions; that is, equals() comparison is done against
 the symbols in the same bucket index.
 Rehashing is also more expensive: hash codes are not stored, so
 rehashing requires every entry's hash code to be recalculated.
 Hash codes are not stored in order to reduce memory usage, in the hope
 of better memory locality.
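The trade-off described above can be illustrated with a small, self-contained sketch. It is not the Woodstox implementation: the class name `ToySymbolTable`, the initial sizes, the 3/4 growth threshold, and the use of `String.hashCode()` are all assumptions made for illustration. It only shows the two costs the text mentions: overflow entries carry no hash code, so collisions fall back to `equals()`, and growing the table recomputes every hash.

```java
/** Toy symbol table: primary slots plus overflow chains, with no stored hash codes. */
class ToySymbolTable {
    /** Overflow chain node: holds the symbol only, not its hash code. */
    private static final class Bucket {
        final String symbol;
        final Bucket next;
        Bucket(String symbol, Bucket next) {
            this.symbol = symbol;
            this.next = next;
        }
    }

    private String[] symbols = new String[16]; // primary slots (length is a power of two)
    private Bucket[] buckets = new Bucket[8];  // half as many overflow chains
    private int size;

    int size() { return size; }

    /** Returns the stored instance, adding the symbol first if it is missing. */
    String findSymbol(String str) {
        String existing = findSymbolIfExists(str);
        if (existing != null) return existing;
        if (size >= (symbols.length * 3) / 4) rehash(); // assumed 3/4 threshold
        insert(str);
        return str;
    }

    /** Returns the stored instance, or null; never modifies the table. */
    String findSymbolIfExists(String str) {
        int index = str.hashCode() & (symbols.length - 1);
        String primary = symbols[index];
        if (primary == null) return null;
        if (primary.equals(str)) return primary;
        // Collision: no stored hash to pre-filter with, so equals() runs per entry.
        for (Bucket b = buckets[index >> 1]; b != null; b = b.next) {
            if (b.symbol.equals(str)) return b.symbol;
        }
        return null;
    }

    private void insert(String str) {
        int index = str.hashCode() & (symbols.length - 1);
        if (symbols[index] == null) {
            symbols[index] = str;
        } else {
            buckets[index >> 1] = new Bucket(str, buckets[index >> 1]);
        }
        size++;
    }

    /** Doubles both arrays; every entry's hash is recomputed inside insert(). */
    private void rehash() {
        String[] oldSymbols = symbols;
        Bucket[] oldBuckets = buckets;
        symbols = new String[oldSymbols.length * 2];
        buckets = new Bucket[oldBuckets.length * 2];
        size = 0;
        for (String s : oldSymbols) {
            if (s != null) insert(s);
        }
        for (Bucket head : oldBuckets) {
            for (Bucket b = head; b != null; b = b.next) insert(b.symbol);
        }
    }
}
```

Once the table is warm, repeated calls to `findSymbol` with equal content return the same String instance and allocate nothing.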
The usual pattern is to create a single "master" instance and either use that instance sequentially, or create derived "child" instances which, after use, are asked to return any symbol additions to the master instance. Either way, the benefit is that the symbol table warms up so that later uses are more efficient: eventually every symbol needed is already in the table. From that point on, no more symbol String allocations are needed and the symbol table itself no longer changes.
Note that individual SymbolTable instances are NOT thread-safe (much like the generic collection classes). Distinct "child" instances, however, can be used concurrently with one another without synchronization. The master table can be used concurrently with its child instances only if access to the master instance is read-only (i.e. no modifications are made).
| Modifier and Type | Field and Description | 
|---|---|
| protected static float | DEFAULT_FILL_FACTOR | 
| protected static int | DEFAULT_TABLE_SIZE: Default initial table size; there is no need to make it minuscule, for a couple of reasons: first, the overhead of array reallocation is significant, and second, the overhead of rehashing is also non-negligible. | 
| protected static String | EMPTY_STRING | 
| protected com.ctc.wstx.util.SymbolTable.Bucket[] | mBuckets: Overflow buckets; if primary doesn't match, lookup is done from here. | 
| protected boolean | mDirty: Flag that indicates if any changes have been made to the data; used both to determine if bucket array needs to be copied when (first) change is made, and potentially if updated bucket list is to be resync'ed back to master instance. | 
| protected int | mIndexMask: Mask used to get index from hash values; equal to mBuckets.length - 1, when mBuckets.length is a power of two. | 
| protected boolean | mInternStrings: Flag that determines whether Strings to be added need to be interned before being added or not. | 
| protected int | mSize: Current size (number of entries); needed to know if and when to rehash. | 
| protected int | mSizeThreshold: Limit that indicates maximum size this instance can hold before it needs to be expanded and rehashed. | 
| protected String[] | mSymbols: Primary matching symbols; it's expected that most matches occur from here. | 
| protected int | mThisVersion: Version of this table instance; used when deriving new concurrently used versions from existing 'master' instance. | 
| Constructor and Description | 
|---|
| SymbolTable(): Method for constructing a master symbol table instance; this one will create master instance with default size, and with interning enabled. | 
| SymbolTable(boolean internStrings): Method for constructing a master symbol table instance. | 
| SymbolTable(boolean internStrings, int initialSize): Method for constructing a master symbol table instance. | 
| SymbolTable(boolean internStrings, int initialSize, float fillFactor): Main method for constructing a master symbol table instance; will be called by other public constructors. | 
| Modifier and Type | Method and Description | 
|---|---|
| double | calcAvgSeek() | 
| static int | calcHash(char[] buffer, int start, int len): Implementation of a hashing method for variable length Strings. | 
| static int | calcHash(String key) | 
| String | findSymbol(char[] buffer, int start, int len, int hash): Main access method; will check if actual symbol String exists; if so, returns it; if not, will create, add and return it. | 
| String | findSymbol(String str): Similar to findSymbol(char[],int,int,int); used to either do potentially cheap intern() (if table already has intern()ed version), or to pre-populate symbol table with known values. | 
| String | findSymbolIfExists(char[] buffer, int start, int len, int hash): Similar to findSymbol(char[],int,int,int), but will not add passed in symbol if it is not in symbol table yet. | 
| boolean | isDirectChildOf(SymbolTable t) | 
| boolean | isDirty() | 
| SymbolTable | makeChild(): "Factory" method; will create a new child instance of this symbol table. | 
| void | mergeChild(SymbolTable child): Method that allows contents of child table to potentially be "merged in" with contents of this symbol table. | 
| void | setInternStrings(boolean state) | 
| int | size() | 
| int | version() | 
protected static final int DEFAULT_TABLE_SIZE
Let's use 128 as the default; it allows for up to 96 symbols, and uses about 512 bytes on 32-bit machines.
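The two figures above follow from simple arithmetic, sketched below. The 0.75 fill factor is an assumption inferred from 96 / 128 (this page does not list the value of DEFAULT_FILL_FACTOR), and the 4 bytes per reference is the usual size on a 32-bit JVM. The byte count covers only the reference slots of a 128-entry array, not the array header or the overflow bucket array.

```java
/** Arithmetic behind "up to 96 symbols" and "about 512 bytes" for a 128-slot table. */
class DefaultSizing {
    static final int TABLE_SIZE = 128;
    static final float ASSUMED_FILL_FACTOR = 0.75f; // assumption: inferred from 96 / 128
    static final int BYTES_PER_REFERENCE_32BIT = 4; // typical 32-bit reference size

    /** Number of entries the table may hold before it has to grow. */
    static int sizeThreshold(int tableSize, float fillFactor) {
        return (int) (tableSize * fillFactor);
    }

    /** Bytes taken by the reference slots alone (array header not included). */
    static int slotBytes(int tableSize, int bytesPerReference) {
        return tableSize * bytesPerReference;
    }
}
```

With these inputs, `sizeThreshold(128, 0.75f)` gives 96 and `slotBytes(128, 4)` gives 512.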
protected static final float DEFAULT_FILL_FACTOR
protected static final String EMPTY_STRING
protected boolean mInternStrings
protected String[] mSymbols
protected com.ctc.wstx.util.SymbolTable.Bucket[] mBuckets
Note: Number of buckets is half of number of symbol entries, on assumption there's less need for buckets.
protected int mSize
protected int mSizeThreshold
protected int mIndexMask
Mask used to get index from hash values; equal to mBuckets.length - 1, when mBuckets.length is a power of two.
protected int mThisVersion
protected boolean mDirty
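As a brief illustration of why a mask such as mIndexMask depends on a power-of-two length: when the length is a power of two, `hash & (length - 1)` yields the same index as a non-negative modulo, but with a single bitwise operation, and it is safe for negative hash values. The helper names below are hypothetical and not part of SymbolTable.

```java
/** Sketch of power-of-two sizing and mask-based indexing (hypothetical helpers). */
class IndexMaskSketch {
    /** Smallest power of two that is equal to or bigger than minSize. */
    static int roundUpToPowerOfTwo(int minSize) {
        if (minSize > (1 << 30)) {
            throw new IllegalArgumentException("minSize too large: " + minSize);
        }
        int size = 1;
        while (size < minSize) {
            size <<= 1;
        }
        return size;
    }

    /** Maps any hash value (including negative ones) to 0 .. tableLength - 1. */
    static int indexFor(int hash, int tableLength) {
        int mask = tableLength - 1; // all low bits set when tableLength is a power of two
        return hash & mask;
    }
}
```

For example, a requested size of 100 rounds up to 128, and a hash of -1 maps to index 127 in a 128-slot table.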
public SymbolTable()
public SymbolTable(boolean internStrings)
public SymbolTable(boolean internStrings,
                   int initialSize)
public SymbolTable(boolean internStrings,
                   int initialSize,
                   float fillFactor)
internStrings - Whether Strings to add are intern()ed or not
initialSize - Minimum initial size for bucket array; internally will always use a power of two equal to or bigger than this value.
fillFactor - Maximum fill factor allowed for bucket table; when more entries are added, table will be expanded.
public SymbolTable makeChild()
Note: while the data access part of this method is synchronized, it is generally not safe both to call makeChild/mergeChild on an instance AND to use that instance actively. Instead, a separate 'root' instance should be used on which only makeChild/mergeChild are called, and which is not itself used as a symbol table.
public void mergeChild(SymbolTable child)
Note that caller has to make sure symbol table passed in is really a child or sibling of this symbol table.
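The root/child protocol described for makeChild and mergeChild can be sketched with a simplified copy-on-write model. This is an illustration under stated assumptions, not the actual Woodstox code: the class name `SharedSymbols` is hypothetical, a `HashMap` stands in for the symbol and bucket arrays, version tracking is left out, and the "adopt the child's data only if it is dirty and larger" rule is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified copy-on-write model of a root table and its child instances. */
class SharedSymbols {
    private Map<String, String> symbols; // shared and never mutated while dirty == false
    private boolean dirty;               // true once this instance owns a private copy

    SharedSymbols() {
        this(new HashMap<String, String>());
    }

    private SharedSymbols(Map<String, String> shared) {
        this.symbols = shared;
        this.dirty = false;
    }

    /** Called on the root only; the child starts out sharing the root's data. */
    synchronized SharedSymbols makeChild() {
        return new SharedSymbols(symbols);
    }

    /** Called on the root only; adopts the child's data if it has grown. */
    synchronized void mergeChild(SharedSymbols child) {
        if (child.dirty && child.symbols.size() > symbols.size()) {
            symbols = child.symbols;
            child.dirty = false; // data is shared now; child must copy before changing it
        }
    }

    /** Not synchronized: each child is meant to be used by one thread at a time. */
    String findSymbol(String str) {
        String existing = symbols.get(str);
        if (existing != null) return existing;
        if (!dirty) {
            symbols = new HashMap<String, String>(symbols); // copy on first change
            dirty = true;
        }
        symbols.put(str, str);
        return str;
    }

    boolean isDirty() { return dirty; }

    int size() { return symbols.size(); }
}
```

In this model the root is never queried directly. A worker takes a child, uses it, and hands it back through `mergeChild`, so later children start with the accumulated symbols and usually never need to copy.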
public void setInternStrings(boolean state)
public int size()
public int version()
public boolean isDirty()
public boolean isDirectChildOf(SymbolTable t)
public String findSymbol(char[] buffer, int start, int len, int hash)
public String findSymbolIfExists(char[] buffer, int start, int len, int hash)
public String findSymbol(String str)
Similar to findSymbol(char[],int,int,int); used to either do potentially cheap intern() (if table already has intern()ed version), or to pre-populate symbol table with known values.
public static int calcHash(char[] buffer,
                           int start,
                           int len)
len - Length of String; has to be at least 1 (caller guarantees this pre-condition)
public static int calcHash(String key)
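This page does not spell out the hashing algorithm, so the following is only a sketch that assumes a common multiply-by-31 polynomial hash. The class name `HashSketch` is hypothetical. It illustrates two points from the signatures above: the char[] and String variants must agree for the same characters, so that symbols added through either findSymbol overload are found in the same slot, and reading the first character unconditionally is why len has to be at least 1.

```java
/** Sketch of a pair of hash functions that agree for char ranges and Strings. */
class HashSketch {
    /** Assumed algorithm: hash = c0, then hash = hash * 31 + c for each later char. */
    static int calcHash(char[] buffer, int start, int len) {
        int hash = buffer[start]; // requires len >= 1, as the parameter note states
        for (int i = 1; i < len; ++i) {
            hash = (hash * 31) + buffer[start + i];
        }
        return hash;
    }

    /** Same computation over a String; key must not be empty. */
    static int calcHash(String key) {
        int hash = key.charAt(0);
        for (int i = 1, len = key.length(); i < len; ++i) {
            hash = (hash * 31) + key.charAt(i);
        }
        return hash;
    }
}
```

A parser could call `calcHash(buffer, start, len)` on the name it just scanned and pass the result as the `hash` argument of findSymbol, without first building a String.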
public double calcAvgSeek()
Copyright © 2022 FasterXML. All rights reserved.