LcsHash Class Reference

The LcsHash class is used by LcsClusterAppendExecStream. More...

#include <LcsHash.h>

List of all members.

Public Member Functions

 LcsHash ()
 ~LcsHash ()
void init (PBuffer hashBlockInit, SharedLcsClusterNodeWriter clusterBlockWriterInit, TupleDescriptor const &colTupleDescInit, uint columnIdInit, uint blockSizeInit)
 Initializes the LcsHash object.
void insert (TupleDatum &colTupleDatum, LcsHashValOrd *valOrd, bool *undoInsert)
 Inserts a single column tuple into the hash table.
void insert (PBuffer dataWithLen, LcsHashValOrd *valOrd, bool *undoInsert)
 Inserts a data buffer of a column into the hash table.
void undoInsert (TupleDatum &colTupleDatum)
 Undoes the previous insert of a column tuple.
void undoInsert (PBuffer dataWithLen)
 Undoes the previous insert of a column data buffer.
void prepareFixedOrVariableBatch (uint8_t *rowArray, uint numRows)
 Prepares a fixed or variable batch to be written to the cluster block.
void prepareCompressedBatch (uint8_t *rowArray, uint numRows, uint16_t *numVals, uint16_t *offsetIndexVector)
 Prepares a compressed batch to be written to the cluster block.
void clearFixedEntries ()
 Clears the fixed values from the batch to indicate that the offsets are no longer useful because the key storage can be relocated between batches.
void startNewBatch (uint leftOvers)
 Prepares LcsHash object for a new batch.
void restore (uint numVals, uint16_t lastValOff)
 Sets up hash with values from an existing cluster block.
uint getMaxValueSize ()
 Gets the maximum value length.
bool isHashFull (uint leftOvers=0)
 Checks if the hash table is full.

Private Member Functions

uint computeKey (PBuffer dataWithLen)
 Computes the hash key from a value.
bool search (uint key, PBuffer dataWithLen, LcsHashValOrd *valOrd, LcsHashValueNode **vNode)
 Searches for the ordinal using the hash key and the column data value.

Private Attributes

uint columnId
 Column for which this LcsHash structure is built.
LcsHashTable hash
 LcsHashTable object containing the logic to fit the data structure into one block, as well as the logic to manage the hash entries and hash value nodes in the hash table.
SharedLcsClusterNodeWriter clusterBlockWriter
 Block writer object used to add new value to a cluster block.
TupleDescriptor colTupleDesc
 The column tuple descriptor.
TupleDataWithBuffer colTuple
 The column value currently being compressed.
UnalignedAttributeAccessor attrAccessor
 Attribute accessor of the column.
TupleDataWithBuffer searchTuple
 The column being compared against.
boost::scoped_array< FixedBuffer > colTupleBuffer
 Scratch memory to store the current column value being compressed.
uint16_t valCnt
 Number of unique values in the current batch.
uint maxValueSize
 Largest value size in the current batch.
LcsUndoType undo
 Structure containing info for undoing the most recent hash insert.
uint8_t * magicTable
 Hash seed values.
uint numChecks
 Hash table statistics: number of hash key checks.
uint numMatches
 Hash table statistics: number of hash key matches.
SharedLcsCompareColKeyUsingOffsetIndex compareInst
 Helper class to LcsCompare.


Detailed Description

The LcsHash class is used by LcsClusterAppendExecStream.

LcsClusterAppendExecStream splits up the columns in a cluster and passes column tuples (each consisting of only one column) to LcsHash, which transforms the tuple and sends the LcsClusterNodeWriter data pointers with encoded length information.

Definition at line 447 of file LcsHash.h.
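
The following caller-side sketch is illustrative only and is not taken from the LcsClusterAppendExecStream sources; the helper name, the raw pointer/count parameters, and the omitted row bookkeeping are assumptions. It shows how one compressed batch for a single column might be driven using only the public methods documented on this page, after init() has been called.

    void buildOneBatchSketch(
        LcsHash &lcsHash,
        TupleDatum *columnValues,       // the column's values for the rows of this batch
        uint numInputRows,
        uint8_t *rowArray,              // holds value ordinals when prepareCompressedBatch is called
        uint16_t *offsetIndexVector)
    {
        uint numRows;
        for (numRows = 0; numRows < numInputRows; ++numRows) {
            LcsHashValOrd valOrd;
            bool undoNeeded = false;
            lcsHash.insert(columnValues[numRows], &valOrd, &undoNeeded);
            if (undoNeeded) {
                // Hash table or cluster page is full: roll back and end the batch early.
                lcsHash.undoInsert(columnValues[numRows]);
                break;
            }
            // ... the stream would record valOrd in rowArray for this row ...
        }

        uint16_t numVals;
        lcsHash.prepareCompressedBatch(rowArray, numRows, &numVals, offsetIndexVector);

        // No leftover value nodes are carried into the next batch in this sketch.
        lcsHash.startNewBatch(0);
    }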


Constructor & Destructor Documentation

LcsHash::LcsHash (  )  [explicit]

Definition at line 61 of file LcsHash.cpp.

References columnId, MagicTable, magicTable, maxValueSize, numChecks, numMatches, and valCnt.

00062 {
00063     columnId                = 0;
00064     valCnt                  = 0;
00065     maxValueSize            = 0;
00066     magicTable              = MagicTable;
00067     numMatches              = 0;
00068     numChecks               = 0;
00069 }

LcsHash::~LcsHash (  )  [inline]

Definition at line 559 of file LcsHash.h.

00560     {
00561     }


Member Function Documentation

uint LcsHash::computeKey ( PBuffer  dataWithLen  )  [private]

Computes the hash key from a value.

Parameters:
[in] dataWithLen pointer to a buffer with the value and length info encoded in the first 1 or 2 bytes.
Returns:
hash key

Definition at line 510 of file LcsHash.cpp.

References attrAccessor, UnalignedAttributeAccessor::getStoredByteCount(), hash, magicTable, and LcsHashTable::numHashEntries().

Referenced by insert(), and restore().

00511 {
00512     uint8_t  keyVal[2] = {0,0}, oldKeyVal[2]={0,17};
00513     uint     i, colSize = attrAccessor.getStoredByteCount(dataWithLen);
00514 
00515     /*
00516      * Compute the hash key over all the bytes, including the length
00517      * bytes. This saves the implicit memcpy in loadValue.
00518      */
00519     for (i = 0;
00520          i < colSize;
00521          oldKeyVal[0] = keyVal[0], oldKeyVal[1] = keyVal[1], i++, dataWithLen++)
00522     {
00523         keyVal[0] = magicTable[oldKeyVal[0] ^ *dataWithLen];
00524         keyVal[1] = magicTable[oldKeyVal[1] ^ *dataWithLen];
00525     }
00526 
00527     return ((keyVal[1]<<8) + keyVal[0]) % hash.numHashEntries();
00528 }
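
For readers who want to experiment with the hashing scheme outside of Fennel, the following self-contained sketch reproduces the logic above: two Pearson-style 8-bit hashes are computed over the length-prefixed value buffer and combined into a 16-bit key folded onto the number of hash entries. The seed table below is a stand-in assumption; the real MagicTable in LcsHash.cpp is a fixed 256-byte table.

    #include <cstddef>
    #include <cstdint>

    // Stand-in seed table: any fixed byte table works for this sketch.
    static uint8_t magic[256];

    static void initMagic()
    {
        for (int i = 0; i < 256; ++i) {
            magic[i] = static_cast<uint8_t>((i * 167 + 13) & 0xff);   // a simple permutation
        }
    }

    uint16_t computeKeySketch(const uint8_t *buf, size_t len, size_t numHashEntries)
    {
        uint8_t key0 = 0, key1 = 0;     // current hash bytes
        uint8_t old0 = 0, old1 = 17;    // previous hash bytes, seeded {0, 17}
        for (size_t i = 0; i < len; ++i) {
            key0 = magic[old0 ^ buf[i]];
            key1 = magic[old1 ^ buf[i]];
            old0 = key0;
            old1 = key1;
        }
        // Combine the two 8-bit hashes into one 16-bit value and fold it onto
        // the number of hash entries that fit in the hash block.
        return static_cast<uint16_t>(((key1 << 8) + key0) % numHashEntries);
    }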

bool LcsHash::search ( uint  key,
PBuffer  dataWithLen,
LcsHashValOrd valOrd,
LcsHashValueNode **  vNode 
) [private]

Searches for the ordinal using the hash key and the column data value.

Parameters:
[in] key hash key to locate
[in] dataWithLen pointer to data buffer with length info encoded
[out] valOrd hash value node ordinal number
[out] vNode hash value node if the value was previously inserted
Returns:
true if a match in both the key and the data value is found

Definition at line 295 of file LcsHash.cpp.

References attrAccessor, clusterBlockWriter, colTuple, colTupleDesc, columnId, TupleDescriptor::compareTuples(), LcsHashTable::getFirstValueNode(), LcsHashTable::getNextValueNode(), hash, UnalignedAttributeAccessor::loadValue(), numChecks, numMatches, TupleDataWithBuffer::resetBuffer(), searchTuple, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.

Referenced by insert(), and restore().

00300 {
00301     LcsHashValueNode       *valueNode;
00302     bool    compareRes;
00303 
00304     attrAccessor.loadValue(colTuple[0], dataWithLen);
00305 
00306     for (valueNode = hash.getFirstValueNode(key);
00307          valueNode != NULL;
00308          valueNode = hash.getNextValueNode(valueNode))
00309      {
00310         numChecks++;
00311 
00312         /*
00313          * Skips invalid hash entries.
00314          * Entries were invalidated by clearFixedEntries.
00315          */
00316         if (valueNode->valueOffset == 0) {
00317             continue;
00318         }
00319 
00320         attrAccessor.loadValue(
00321             searchTuple[0],
00322             clusterBlockWriter->getOffsetPtr(columnId, valueNode->valueOffset));
00323 
00324         compareRes = colTupleDesc.compareTuples(colTuple, searchTuple);
00325 
00326         /*
00327          * Prepare for next loadValue.
00328          */
00329         searchTuple.resetBuffer();
00330 
00331         if (compareRes == 0) {
00332             numMatches++;
00333             *valOrd = valueNode->valueOrd;
00334             if (vNode) {
00335                 *vNode = valueNode;
00336             }
00337             colTuple.resetBuffer();
00338             return true;
00339         }
00340     }
00341 
00342     colTuple.resetBuffer();
00343     /*
00344      * No Match.
00345      */
00346     return false;
00347 }

void LcsHash::init ( PBuffer  hashBlockInit,
SharedLcsClusterNodeWriter  clusterBlockWriterInit,
TupleDescriptor const &  colTupleDescInit,
uint  columnIdInit,
uint  blockSizeInit 
)

Initializes the LcsHash object.

Parameters:
[in] hashBlockInit block to hold the hash table
[in] clusterBlockWriterInit reference to the node writer
[in] colTupleDescInit tuple descriptor for the column tuple
[in] columnIdInit ID of the column this LcsHash is compressing.
[in] blockSizeInit block size

Definition at line 71 of file LcsHash.cpp.

References attrAccessor, clusterBlockWriter, colTuple, colTupleBuffer, colTupleDesc, columnId, compareInst, UnalignedAttributeAccessor::compute(), TupleDataWithBuffer::computeAndAllocate(), FixedBuffer, UnalignedAttributeAccessor::getMaxByteCount(), hash, LcsHashTable::init(), maxValueSize, searchTuple, and valCnt.

00077 {
00078     /*
00079      * clears and sets up hash block
00080      */
00081     memset(hashBlockInit,0,blockSizeInit);
00082     hash.init(hashBlockInit,  blockSizeInit);
00083 
00084     columnId                = columnIdInit;
00085     valCnt                  = 0;
00086     maxValueSize            = 0;
00087     clusterBlockWriter      = clusterBlockWriterInit;
00088 
00089     assert(colTupleDescInit.size() == 1);
00090     colTupleDesc            = colTupleDescInit;
00091     attrAccessor.compute(colTupleDesc[0]);
00092 
00093     /*
00094      * Temporary in-memory representation for the tuple this LcsHash
00095      * is working on.
00096      */
00097     colTuple.computeAndAllocate(colTupleDesc);
00098     searchTuple.computeAndAllocate(colTupleDesc);
00099 
00100     /*
00101      * colTupleBuffer provides storage for storing the length as well as the data
00102      * buffer in a compact format described in TupleData.h. The length could take
00103      * up to 2 bytes.
00104      */
00105     colTupleBuffer.reset(
00106         new FixedBuffer[attrAccessor.getMaxByteCount()]);
00107 
00108     compareInst = SharedLcsCompareColKeyUsingOffsetIndex(
00109         new LcsCompareColKeyUsingOffsetIndex(
00110             clusterBlockWriter, &hash, colTupleDesc, columnId, attrAccessor));
00111 }

void LcsHash::insert ( TupleDatum colTupleDatum,
LcsHashValOrd valOrd,
bool *  undoInsert 
)

Inserts a single column tuple into the hash table.

It also causes the column value to be inserted into the cluster block if needed.

Parameters:
[in] colTupleDatum column tuple to insert
[out] valOrd hash value node ordinal
[out] undoInsert true if this insert should be undone

Definition at line 113 of file LcsHash.cpp.

References attrAccessor, colTupleBuffer, and UnalignedAttributeAccessor::storeValue().

00117 {
00118     /*
00119      * gets the data buffer with length encoded
00120      */
00121     PBuffer dataWithLen = colTupleBuffer.get();
00122     attrAccessor.storeValue(colTupleDatum, dataWithLen);
00123 
00124     insert(dataWithLen, valOrd, undoInsert);
00125 }

void LcsHash::insert ( PBuffer  dataWithLen,
LcsHashValOrd valOrd,
bool *  undoInsert 
)

Inserts a data buffer of a column into the hash table.

It also causes the column value to be inserted into the cluster block if needed.

Parameters:
[in] dataWithLen data buffer of column tuple to insert
[out] valOrd hash value node ordinal
[out] undoInsert true if this insert should be undone

Definition at line 128 of file LcsHash.cpp.

References attrAccessor, clusterBlockWriter, columnId, computeKey(), LcsHashTable::getNewValueNode(), UnalignedAttributeAccessor::getStoredByteCount(), hash, LcsHashTable::insertNewValueNode(), LcsHashTable::isFull(), LcsHashValOrd::isValueInBatch(), maxValueSize, NEWBATCHVALUE, NEWENTRY, NOTHING, search(), LcsUndoType::set(), LcsHashValOrd::setValueInBatch(), undo, valCnt, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.

00132 {
00133     uint        key;
00134     uint16_t    newValueOffset;
00135     LcsHashValueNode *vPtr = 0;
00136     TupleStorageByteLength storageLength;
00137 
00138     /*
00139      * Compression mode could change dynamically so we have to check every time.
00140      * If this batch will not be compressed, then there is no reason to
00141      * generate a real hash code.  Hash code generation is expensive, so
00142      * try to avoid it.
00143      */
00144     bool noCompress = clusterBlockWriter->noCompressMode(columnId);
00145     key = noCompress ? 0 : computeKey(dataWithLen);
00146 
00147     *undoInsert     = false;
00148 
00149     /*
00150      * If value is not in hash, or
00151      * if we are not in compress mode
00152      * (in which case we allow duplicates in the hash table),
00153      * then adds value to the hash.
00154      */
00155     if (noCompress || !search(key, dataWithLen, valOrd, &vPtr)) {
00156         LcsHashValueNode       *newNode;
00157 
00158         /*
00159          * If hash table is full,  or
00160          * if the cluster page is full
00161          * then returns and indicates the need to undo the insert.
00162          */
00163         *undoInsert =
00164             hash.isFull() ||
00165             !clusterBlockWriter->addValue(
00166                 columnId, dataWithLen, &newValueOffset);
00167 
00168         if (*undoInsert) {
00169             /*
00170              * Prepares undo action.
00171              */
00172             undo.set(NOTHING, key, maxValueSize, 0);
00173             return;
00174         }
00175 
00176         /*
00177          * Inserts a new node only when the above call does not return
00178          * undoInsert.  If a new node is inserted but the undoInsert above is
00179          * true, the subsequent undoInsert() call will not roll back the new
00180          * node correctly if undo.what is not set to NEWENTRY (the default value
00181          * is NOTHING).
00182          */
00183         newNode = hash.getNewValueNode();
00184         newNode->valueOffset = newValueOffset;
00185         *valOrd = valCnt ++;
00186         valOrd->setValueInBatch();
00187         newNode->valueOrd = *valOrd;
00188 
00189         hash.insertNewValueNode(key, newNode);
00190 
00191         /*
00192          * Prepares undo action.
00193          */
00194         undo.set(NEWENTRY, key, maxValueSize, 0);
00195 
00196         storageLength = attrAccessor.getStoredByteCount(dataWithLen);
00197 
00198         if (storageLength > maxValueSize) {
00199             maxValueSize = storageLength;
00200         }
00201     } else {
00202         /*
00203          * We found the value in the hash (from the search() call above),
00204          * so it is already in the block,
00205          * but it still may not be part of the current batch.
00206          * Whether it is or not, call addValue(), so that we can adjust
00207          * space left in the block.
00208          */
00209         bool bFirstTimeInBatch = !valOrd->isValueInBatch();
00210 
00211         *undoInsert =
00212             !clusterBlockWriter->addValue(columnId, bFirstTimeInBatch);
00213 
00214         if (*undoInsert) {
00215             /*
00216              * Prepares undo action.
00217              */
00218             undo.set(NOTHING, key, maxValueSize, 0);
00219             return;
00220         }
00221 
00222         (vPtr->valueOrd).setValueInBatch();
00223         *valOrd = vPtr->valueOrd;
00224 
00225         /*
00226          * Prepares undo action.
00227          */
00228         if (bFirstTimeInBatch) {
00229             undo.set(NEWBATCHVALUE, key, maxValueSize, vPtr);
00230         } else {
00231             undo.set(NOTHING, key, maxValueSize, 0);
00232         }
00233     }
00234     /*
00235      * Otherwise the value is already in the hash, and the current batch
00236      * already has a pointer to that value, so don't do anything.
00237      */
00238 }

void LcsHash::undoInsert ( TupleDatum colTupleDatum  ) 

Undoes the previous insert of a column tuple.

This is called when we are trying to add all of the columns in a cluster and at least one does not fit; in that case we remove the previously added cluster column values.

Parameters:
[in] colTupleDatum column tuple just inserted

Definition at line 240 of file LcsHash.cpp.

References attrAccessor, colTupleBuffer, and UnalignedAttributeAccessor::storeValue().

00241 {
00242     /*
00243      * gets the data buffer with length encoded
00244      */
00245     PBuffer dataWithLen = colTupleBuffer.get();
00246     attrAccessor.storeValue(colTupleDatum, dataWithLen);
00247 
00248     undoInsert(dataWithLen);
00249 
00250 }
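
The scenario described above can be sketched as follows. This is a hypothetical caller-side illustration, not code from LcsClusterAppendExecStream; the function name and the pointer/count parameters are assumptions. One LcsHash per cluster column receives the row's value, and if any insert reports that it must be undone, the inserts already performed for this row are rolled back.

    bool addClusterRowSketch(
        LcsHash *columnHashes,      // one LcsHash per column in the cluster
        TupleDatum *rowValues,      // this row's value for each column
        uint numColumns)
    {
        LcsHashValOrd valOrd;
        bool undoNeeded = false;
        uint col;
        for (col = 0; col < numColumns; ++col) {
            columnHashes[col].insert(rowValues[col], &valOrd, &undoNeeded);
            if (undoNeeded) {
                break;
            }
        }
        if (undoNeeded) {
            // insert() prepares an undo action even when it reports failure, so the
            // failing column is undone along with the columns inserted before it.
            for (uint i = 0; i <= col; ++i) {
                columnHashes[i].undoInsert(rowValues[i]);
            }
            return false;
        }
        return true;
    }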

void LcsHash::undoInsert ( PBuffer  dataWithLen  ) 

Undoes the previous insert of a column data buffer.

Parameters:
[in] dataWithLen data buffer of the column tuple just inserted

Definition at line 252 of file LcsHash.cpp.

References clusterBlockWriter, columnId, hash, LcsUndoType::key, maxValueSize, NEWBATCHVALUE, NEWENTRY, NOTHING, LcsUndoType::origMaxValueSize, LcsUndoType::reset(), undo, LcsHashTable::undoNewValueNode(), valCnt, LcsHashValueNode::valueOrd, LcsUndoType::vPtr, and LcsUndoType::what.

00253 {
00254     switch (undo.what) {
00255     case NOTHING:
00256         {
00257             /*
00258              * Value already existed in the batch.
00259              */
00260             clusterBlockWriter->undoValue(columnId, NULL, false);
00261             break;
00262         }
00263     case NEWENTRY:
00264         {
00265             /*
00266              * First time in block.
00267              *
00268              * To remove a new value entry
00269              * 1) decrements the total count,
00270              * 2) resets the location where the next value entry will go
00271              * 3) removes entry from hash
00272              * 4) resets maximum value size
00273              * 5) removes value from block
00274              */
00275             valCnt--;
00276             hash.undoNewValueNode(undo.key);
00277             maxValueSize = undo.origMaxValueSize;
00278             clusterBlockWriter->undoValue(columnId, dataWithLen, true);
00279             break;
00280         }
00281     case NEWBATCHVALUE:
00282         {
00283             /*
00284              * Already in block but first time in batch.
00285              * Need to remove value from batch
00286              */
00287             clusterBlockWriter->undoValue(columnId, NULL, true);
00288             (&undo.vPtr->valueOrd)->clearValueInBatch();
00289             break;
00290         }
00291     }
00292     undo.reset();
00293 }

void LcsHash::prepareFixedOrVariableBatch ( uint8_t rowArray,
uint  numRows 
)

Prepares a fixed or variable batch to be written to the cluster block.

Parameters:
[in,out] rowArray upon input, this array holds value node ordinals; on output, the array holds offsets for the values in a batch
[in] numRows number of values in a batch

Definition at line 407 of file LcsHash.cpp.

References hash, and LcsHashTable::valueNodes.

00410 {
00411     uint            i;
00412     uint16_t        *rowWORDArray=(uint16_t*)rowArray;
00413     LcsHashValueNode       *pValueNodes;
00414 
00415     pValueNodes = hash.valueNodes;
00416 
00417     /*
00418      * Stores the offset to the column values in rowWORDArray
00419      */
00420     for (i = 0; i < numRows; i++) {
00421         rowWORDArray[i] = pValueNodes[rowWORDArray[i]].valueOffset;
00422     }
00423 }

void LcsHash::prepareCompressedBatch ( uint8_t rowArray,
uint  numRows,
uint16_t numVals,
uint16_t offsetIndexVector 
)

Prepares a compressed batch to be written to the cluster block.

Parameters:
[in,out] rowArray upon input, this array holds value node ordinals; on output, it holds indices into the offset array of the column values stored on the cluster block.
[in] numRows number of values in a batch
[out] numVals number of distinct values in a batch
[out] offsetIndexVector on output, it holds the offsets of the column values stored on the cluster block.

Definition at line 350 of file LcsHash.cpp.

References compareInst, hash, LcsHashValueNode::sortedOrd, valCnt, LcsHashTable::valueNodes, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.

00355 {
00356     uint16_t    i;
00357     uint16_t    *rowWORDArray=(uint16_t*)rowArray;
00358     *numVals        = 0;
00359 
00360     /*
00361      * Puts all value ordinals in batch in Vals array.
00362      */
00363     for (i = 0; i < valCnt; i++) {
00364         if ((hash.valueNodes[i].valueOrd).isValueInBatch()) {
00365             hash.valueNodes[i].sortedOrd = *numVals;
00366             offsetIndexVector[(*numVals)++] = i;
00367         }
00368     }
00369 
00370     /*
00371      * Sorts the value ordinals based on the key values.
00372      */
00373     std::sort(
00374         offsetIndexVector,
00375         offsetIndexVector + (*numVals),
00376         LcsCompare(compareInst));
00377 
00378     /*
00379      * Now OffsetIndexVector is sorted. Sets sortedOrd, which is basically an
00380      * index into the OffsetIndexVector, in the valueNodes array.
00381      */
00382     for (i = 0; i < *numVals; i++) {
00383         hash.valueNodes[offsetIndexVector[i]].sortedOrd = i;
00384     }
00385 
00386     /*
00387      * Having stored the sortedOrd away, replaces the value ordinals in the
00388      * OffsetIndexVector array with offsets from the valueNodes array. Now
00389      * OffsetIndexVector will contain offsets sorted based on the values they
00390      * point to.
00391      */
00392     for (i = 0; i < *numVals; i++) {
00393         offsetIndexVector[i] =
00394             hash.valueNodes[offsetIndexVector[i]].valueOffset;
00395     }
00396 
00397     /*
00398      * Stores the index to OffsetIndexVector in the Row array. Now the
00399      * rowWORDArray contains indices to the OffsetIndexVector, which contains
00400      * offsets sorted based on the column values they point to.
00401      */
00402     for (i = 0; i < numRows; i++) {
00403         rowWORDArray[i] = hash.valueNodes[rowWORDArray[i]].sortedOrd;
00404     }
00405 }
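
To make the passes above concrete, here is a standalone illustration in plain C++; it is not Fennel code. A std::vector<std::string> stands in for the values stored on the cluster block, an index into it stands in for a value offset, and every value is assumed to be in the current batch (so the isValueInBatch filtering is omitted).

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    struct ValueNode {
        uint16_t valueOffset;   // "offset" of the value (index into valueHeap here)
        uint16_t sortedOrd;     // position of the value in the sorted offset vector
    };

    int main()
    {
        std::vector<std::string> valueHeap = {"mango", "apple", "kiwi"};
        // One node per distinct value ordinal, in insertion order.
        std::vector<ValueNode> valueNodes = {{0, 0}, {1, 0}, {2, 0}};
        // Per-row value ordinals produced while inserting values.
        std::vector<uint16_t> rowArray = {0, 1, 1, 2, 0};

        // Pass 1: collect the ordinals of all values present in the batch.
        std::vector<uint16_t> offsetIndexVector;
        for (uint16_t ord = 0; ord < valueNodes.size(); ++ord) {
            offsetIndexVector.push_back(ord);
        }
        uint16_t numVals = static_cast<uint16_t>(offsetIndexVector.size());

        // Pass 2: sort the ordinals by the column values they refer to.
        std::sort(
            offsetIndexVector.begin(), offsetIndexVector.end(),
            [&](uint16_t a, uint16_t b) {
                return valueHeap[valueNodes[a].valueOffset]
                    < valueHeap[valueNodes[b].valueOffset];
            });

        // Pass 3: record each value's sorted position, then replace the ordinals
        // in offsetIndexVector with the value offsets themselves.
        for (uint16_t i = 0; i < numVals; ++i) {
            valueNodes[offsetIndexVector[i]].sortedOrd = i;
        }
        for (uint16_t i = 0; i < numVals; ++i) {
            offsetIndexVector[i] = valueNodes[offsetIndexVector[i]].valueOffset;
        }

        // Pass 4: rewrite each row as an index into the sorted offset vector.
        for (uint16_t &row : rowArray) {
            row = valueNodes[row].sortedOrd;
        }

        // rowArray is now {2, 0, 0, 1, 2}: "apple" sorts first, "mango" last.
        for (uint16_t row : rowArray) {
            std::cout << row << " -> " << valueHeap[offsetIndexVector[row]] << "\n";
        }
        return 0;
    }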

void LcsHash::clearFixedEntries (  ) 

Clears the fixed values from the batch to indicate that the offsets are no longer useful because the key storage can be relocated between batches.

Definition at line 426 of file LcsHash.cpp.

References clusterBlockWriter, columnId, hash, valCnt, LcsHashTable::valueNodes, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.

00427 {
00428     /*
00429      * Only clears entries if the next batch is not guaranteed to be fixed mode.
00430      */
00431     if (!clusterBlockWriter->noCompressMode(columnId)) {
00432         for (uint i = 0; i < valCnt; i++) {
00433             if ((hash.valueNodes[i].valueOrd).isValueInBatch()) {
00434                 hash.valueNodes[i].valueOffset = 0;
00435             }
00436         }
00437     }
00438 }

void LcsHash::startNewBatch ( uint  leftOvers  ) 

Prepares LcsHash object for a new batch.

Parameters:
[in] leftOvers number of left over hash value nodes from the previous batch. The new batch needs to leave room a priori for these nodes.

Definition at line 488 of file LcsHash.cpp.

References clusterBlockWriter, columnId, hash, LcsHashTable::isFull(), maxValueSize, LcsHashTable::resetBatch(), LcsHashTable::resetHash(), and valCnt.

00489 {
00490     /*
00491      * If the hash is full we need to start over. Otherwise just clear the
00492      * entries used in building the previous batch.
00493      */
00494     if (clusterBlockWriter->noCompressMode(columnId) ||
00495         hash.isFull(leftOvers))
00496     {
00497         hash.resetHash();
00498         valCnt       = 0;
00499         maxValueSize = 0;
00500     } else {
00501         hash.resetBatch();
00502     }
00503 }

void LcsHash::restore ( uint  numVals,
uint16_t  lastValOff 
)

Sets up hash with values from an existing cluster block.

This is called when appending to an existing block.

Parameters:
[in] numVals number of values for this column
[in] lastValOff offset of the last value stored for this column

Definition at line 440 of file LcsHash.cpp.

References attrAccessor, clusterBlockWriter, columnId, computeKey(), dummy(), LcsHashTable::getNewValueNode(), UnalignedAttributeAccessor::getStoredByteCount(), hash, LcsHashTable::insertNewValueNode(), LcsHashTable::isFull(), maxValueSize, search(), valCnt, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.

00441 {
00442     uint            i;
00443     uint            key;
00444     PBuffer         dataWithLen;
00445     LcsHashValueNode      *newNode;
00446     LcsHashValOrd   dummy;
00447     LcsHashValOrd   valOrd;
00448     TupleStorageByteLength storageLength;
00449 
00450     /*
00451      * Compression mode could change dynamically so we have to check every time.
00452      * If this batch will not be compressed, then there is no reason to
00453      * generate a real hash code.  Hash code generation is expensive, so
00454      * try to avoid it.
00455      */
00456     bool noCompress = clusterBlockWriter->noCompressMode(columnId);
00457 
00458     for (i = 0; i < numVals && !(hash.isFull()); i++) {
00459         dataWithLen = clusterBlockWriter->getOffsetPtr(columnId,lastValOff);
00460         key = noCompress ? 0 : computeKey(dataWithLen);
00461 
00462         /*
00463          * If value is not in hash, or if we are not in compress mode (in which
00464          * case we allow duplicates in the hash table), then adds value to the
00465          * hash.
00466          */
00467         if (noCompress || !search(key, dataWithLen, &dummy)) {
00468             newNode = hash.getNewValueNode();
00469 
00470             valOrd = valCnt++;
00471             newNode->valueOrd = valOrd;
00472             newNode->valueOffset = (uint16_t) lastValOff;
00473 
00474             hash.insertNewValueNode(key,  newNode);
00475 
00476             storageLength = attrAccessor.getStoredByteCount(dataWithLen);
00477             if (storageLength > maxValueSize) {
00478                 maxValueSize = storageLength;
00479             }
00480         }
00481 
00482         lastValOff = clusterBlockWriter->getNextVal(
00483             columnId,
00484             (uint16_t)lastValOff);
00485     }
00486 }

uint LcsHash::getMaxValueSize (  )  [inline]

Gets the maximum value length.

Returns:
maximum data value length, including the bytes encoding the length information.

Definition at line 893 of file LcsHash.h.

References maxValueSize.

00894 {
00895     return maxValueSize;
00896 }

bool LcsHash::isHashFull ( uint  leftOvers = 0  )  [inline]

Checks if the hash table is full.

Parameters:
[in] leftOvers number of left over hash value nodes from the previous batch. The next batch needs to leave room a priori for these nodes.
Returns:
true if hash table is full

Definition at line 898 of file LcsHash.h.

References hash, and LcsHashTable::isFull().

00899 {
00900     return hash.isFull(leftOvers);
00901 }


Member Data Documentation

uint LcsHash::columnId [private]

Column for which this LcsHash structure is built.

Definition at line 454 of file LcsHash.h.

Referenced by clearFixedEntries(), init(), insert(), LcsHash(), restore(), search(), startNewBatch(), and undoInsert().

LcsHashTable LcsHash::hash [private]

LcsHashTable object containing the logic to fit the data structure into one block, as well as the logic to manage the hash entries and hash value nodes in the hash table.

Definition at line 461 of file LcsHash.h.

Referenced by clearFixedEntries(), computeKey(), init(), insert(), isHashFull(), prepareCompressedBatch(), prepareFixedOrVariableBatch(), restore(), search(), startNewBatch(), and undoInsert().

SharedLcsClusterNodeWriter LcsHash::clusterBlockWriter [private]

Block writer object used to add new value to a cluster block.

Definition at line 466 of file LcsHash.h.

Referenced by clearFixedEntries(), init(), insert(), restore(), search(), startNewBatch(), and undoInsert().

TupleDescriptor LcsHash::colTupleDesc [private]

The column tuple descriptor.

Definition at line 471 of file LcsHash.h.

Referenced by init(), and search().

TupleDataWithBuffer LcsHash::colTuple [private]

The column value currently being compressed.

Definition at line 476 of file LcsHash.h.

Referenced by init(), and search().

UnalignedAttributeAccessor LcsHash::attrAccessor [private]

Attribute accessor of the column.

Definition at line 481 of file LcsHash.h.

Referenced by computeKey(), init(), insert(), restore(), search(), and undoInsert().

TupleDataWithBuffer LcsHash::searchTuple [private]

The column being compared against.

Definition at line 486 of file LcsHash.h.

Referenced by init(), and search().

boost::scoped_array<FixedBuffer> LcsHash::colTupleBuffer [private]

Scratch memory to store the current column value being compressed.

Definition at line 491 of file LcsHash.h.

Referenced by init(), insert(), and undoInsert().

uint16_t LcsHash::valCnt [private]

Number of unique values in the current batch.

The type of this field should be the same as that of LcsHashValOrd.

Definition at line 497 of file LcsHash.h.

Referenced by clearFixedEntries(), init(), insert(), LcsHash(), prepareCompressedBatch(), restore(), startNewBatch(), and undoInsert().

uint LcsHash::maxValueSize [private]

Largest value size in the current batch.

The size includes the bytes encoding the length information.

Definition at line 503 of file LcsHash.h.

Referenced by getMaxValueSize(), init(), insert(), LcsHash(), restore(), startNewBatch(), and undoInsert().

LcsUndoType LcsHash::undo [private]

Structure containing info for undoing the most recent hash insert.

Definition at line 508 of file LcsHash.h.

Referenced by insert(), and undoInsert().

uint8_t* LcsHash::magicTable [private]

Hash seed values.

Definition at line 513 of file LcsHash.h.

Referenced by computeKey(), and LcsHash().

uint LcsHash::numChecks [private]

Hash table statistics: number of hash key checks.

Definition at line 518 of file LcsHash.h.

Referenced by LcsHash(), and search().

uint LcsHash::numMatches [private]

Hash table statistics: number of hash key matches.

Definition at line 523 of file LcsHash.h.

Referenced by LcsHash(), and search().

SharedLcsCompareColKeyUsingOffsetIndex LcsHash::compareInst [private]

Helper class to LcsCompare.

It stores the comparison context.

Definition at line 528 of file LcsHash.h.

Referenced by init(), and prepareCompressedBatch().


The documentation for this class was generated from the following files:
LcsHash.h
LcsHash.cpp
Generated on Mon Jun 22 04:00:37 2009 for Fennel by  doxygen 1.5.1