#include <LcsHash.h>
Public Member Functions | |
LcsHash () | |
~LcsHash () | |
void | init (PBuffer hashBlockInit, SharedLcsClusterNodeWriter clusterBlockWriterInit, TupleDescriptor const &colTupleDescInit, uint columnIdInit, uint blockSizeInit) |
Initializes the LcsHash object. | |
void | insert (TupleDatum &colTupleDatum, LcsHashValOrd *valOrd, bool *undoInsert) |
Inserts a single column tuple into the hash table. | |
void | insert (PBuffer dataWithLen, LcsHashValOrd *valOrd, bool *undoInsert) |
Inserts a data buffer of a column into the hash table. | |
void | undoInsert (TupleDatum &colTupleDatum) |
Undoes the previous insert of a column tuple. | |
void | undoInsert (PBuffer dataWithLen) |
Undoes the previous insert of a column data buffer. | |
void | prepareFixedOrVariableBatch (uint8_t *rowArray, uint numRows) |
Prepares a fixed or variable batch to be written to the cluster block. | |
void | prepareCompressedBatch (uint8_t *rowArray, uint numRows, uint16_t *numVals, uint16_t *offsetIndexVector) |
Prepares a compressed batch to be written to the cluster block. | |
void | clearFixedEntries () |
Clears the fixed values from the batch to indicate that the offsets are no longer useful because the key storage can be relocated between batches. | |
void | startNewBatch (uint leftOvers) |
Prepares LcsHash object for a new batch. | |
void | restore (uint numVals, uint16_t lastValOff) |
Sets up hash with values from an existing cluster block. | |
uint | getMaxValueSize () |
Gets the maximum value length. | |
bool | isHashFull (uint leftOvers=0) |
Checks if the hash table is full. | |
Private Member Functions | |
uint | computeKey (PBuffer dataWithLen) |
Computes the hash key from a value. | |
bool | search (uint key, PBuffer dataWithLen, LcsHashValOrd *valOrd, LcsHashValueNode **v) |
Searches for the ordinal using the hash key and column data value. | |
Private Attributes | |
uint | columnId |
column for which this LcsHash structure is built. | |
LcsHashTable | hash |
The LcsHashTable object contains logic to fit the data structure into one block, as well as logic to manage the hash entries and hash value nodes in the hash table. | |
SharedLcsClusterNodeWriter | clusterBlockWriter |
Block writer object used to add new values to a cluster block. | |
TupleDescriptor | colTupleDesc |
The column tuple descriptor. | |
TupleDataWithBuffer | colTuple |
The column tuple currently being compressed. | |
UnalignedAttributeAccessor | attrAccessor |
Attribute accessor of the column. | |
TupleDataWithBuffer | searchTuple |
The column being compared against. | |
boost::scoped_array< FixedBuffer > | colTupleBuffer |
Scratch memory to store the current column value being compressed. | |
uint16_t | valCnt |
Number of unique values in the current batch. | |
uint | maxValueSize |
Largest value size in the current batch. | |
LcsUndoType | undo |
Structure containing info for undoing the most recent hash insert. | |
uint8_t * | magicTable |
Hash seed values. | |
uint | numChecks |
Hash table statistics: number of hash key checks. | |
uint | numMatches |
Hash table statistics: number of hash key matches. | |
SharedLcsCompareColKeyUsingOffsetIndex | compareInst |
Helper class to LcsCompare. |
LcsClusterAppendExecStream splits up the columns in a cluster and passes column tuples (each consisting of only one column) to LcsHash, which transforms each tuple and sends the LcsClusterNodeWriter data pointers with encoded length information.
Definition at line 447 of file LcsHash.h.
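As a rough usage illustration (this loop is not taken from LcsClusterAppendExecStream; hashBlock, nodeWriter, colDesc, blockSize, colDatum, rowArray, offsetIndexVector and rowsInBatch are placeholder names supplied by the caller), a single column might be driven roughly like this:

    // Hypothetical single-column driver loop, assuming every row fits.
    LcsHash lcsHash;
    lcsHash.init(hashBlock, nodeWriter, colDesc, 0 /* columnId */, blockSize);

    LcsHashValOrd valOrd;
    bool undoNeeded;
    for (uint row = 0; row < rowsInBatch; row++) {
        // Each colDatum[row] is a single-column TupleDatum built by the caller.
        lcsHash.insert(colDatum[row], &valOrd, &undoNeeded);
        // rowArray[row] is set from valOrd here (exact accessor elided);
        // a true undoNeeded would instead trigger undoInsert() and a new batch.
    }

    // Lay out the batch on the cluster block, then get ready for the next one.
    uint16_t numVals;
    lcsHash.prepareCompressedBatch(
        (uint8_t *) rowArray, rowsInBatch, &numVals, offsetIndexVector);
    lcsHash.startNewBatch(0);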
LcsHash::LcsHash () [explicit]
Definition at line 61 of file LcsHash.cpp.
References columnId, MagicTable, magicTable, maxValueSize, numChecks, numMatches, and valCnt.
{
    columnId = 0;
    valCnt = 0;
    maxValueSize = 0;
    magicTable = MagicTable;
    numMatches = 0;
    numChecks = 0;
}
uint LcsHash::computeKey ( PBuffer dataWithLen ) [private]
Computes the hash key from a value.
[in] | dataWithLen | pointer to a buffer with the value and its length info encoded in the first 1 or 2 bytes. |
Definition at line 510 of file LcsHash.cpp.
References attrAccessor, UnalignedAttributeAccessor::getStoredByteCount(), hash, magicTable, and LcsHashTable::numHashEntries().
Referenced by insert(), and restore().
{
    uint8_t keyVal[2] = {0,0}, oldKeyVal[2] = {0,17};
    uint i, colSize = attrAccessor.getStoredByteCount(dataWithLen);

    /*
     * Compute the hash key over all the bytes, including the length
     * bytes. This saves the implicit memcpy in loadValue.
     */
    for (i = 0;
         i < colSize;
         oldKeyVal[0] = keyVal[0], oldKeyVal[1] = keyVal[1], i++, dataWithLen++)
    {
        keyVal[0] = magicTable[oldKeyVal[0] ^ *dataWithLen];
        keyVal[1] = magicTable[oldKeyVal[1] ^ *dataWithLen];
    }

    return ((keyVal[1] << 8) + keyVal[0]) % hash.numHashEntries();
}
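The loop above is essentially a two-lane Pearson hash: each lane folds the input bytes through a 256-entry permutation table (magicTable), and the two resulting bytes are combined into a 16-bit key that is reduced modulo the number of hash entries. A minimal standalone sketch of the same scheme, using a hypothetical permutation table and helper names (permTable, initPermTable, pearsonKey) rather than the MagicTable built into LcsHash:

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <numeric>
    #include <random>

    // Hypothetical stand-in for MagicTable: any permutation of 0..255 works.
    static uint8_t permTable[256];

    static void initPermTable()
    {
        std::iota(permTable, permTable + 256, 0);
        std::shuffle(permTable, permTable + 256, std::mt19937(42));
    }

    // Two-lane Pearson hash over a byte buffer, reduced to a bucket index,
    // mirroring the structure of LcsHash::computeKey above.
    static unsigned pearsonKey(uint8_t const *buf, size_t len, unsigned nBuckets)
    {
        uint8_t lane0 = 0;
        uint8_t lane1 = 17;     // different seed so the two lanes diverge
        for (size_t i = 0; i < len; i++) {
            lane0 = permTable[lane0 ^ buf[i]];
            lane1 = permTable[lane1 ^ buf[i]];
        }
        return ((unsigned(lane1) << 8) + lane0) % nBuckets;
    }

    int main()
    {
        initPermTable();
        uint8_t const key[] = {0x03, 'a', 'b', 'c'};   // length byte + data
        printf("%u\n", pearsonKey(key, sizeof(key), 1021));
        return 0;
    }

The lane seeds 0 and 17 mirror the {0,0} and {0,17} initializers in computeKey, so the two lanes produce different byte sequences from the same input.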
bool LcsHash::search (uint key, PBuffer dataWithLen, LcsHashValOrd * valOrd, LcsHashValueNode ** v) [private]
Searches for the ordinal using the hash key and column data value.
[in] | key | hash key to locate |
[in] | dataWithLen | pointer to data buffer with length info encoded |
[out] | valOrd | hash value node ordinal number |
[out] | v | hash value node if value is previously inserted |
Definition at line 295 of file LcsHash.cpp.
References attrAccessor, clusterBlockWriter, colTuple, colTupleDesc, columnId, TupleDescriptor::compareTuples(), LcsHashTable::getFirstValueNode(), LcsHashTable::getNextValueNode(), hash, UnalignedAttributeAccessor::loadValue(), numChecks, numMatches, TupleDataWithBuffer::resetBuffer(), searchTuple, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.
Referenced by insert(), and restore().
{
    LcsHashValueNode *valueNode;
    bool compareRes;

    attrAccessor.loadValue(colTuple[0], dataWithLen);

    for (valueNode = hash.getFirstValueNode(key);
         valueNode != NULL;
         valueNode = hash.getNextValueNode(valueNode))
    {
        numChecks++;

        /*
         * Skips invalid hash entries.
         * Entries were invalidated by clearFixedEntries.
         */
        if (valueNode->valueOffset == 0) {
            continue;
        }

        attrAccessor.loadValue(
            searchTuple[0],
            clusterBlockWriter->getOffsetPtr(columnId, valueNode->valueOffset));

        compareRes = colTupleDesc.compareTuples(colTuple, searchTuple);

        /*
         * Prepare for next loadValue.
         */
        searchTuple.resetBuffer();

        if (compareRes == 0) {
            numMatches++;
            *valOrd = valueNode->valueOrd;
            if (vNode) {
                *vNode = valueNode;
            }
            colTuple.resetBuffer();
            return true;
        }
    }

    colTuple.resetBuffer();
    /*
     * No Match.
     */
    return false;
}
void LcsHash::init (PBuffer hashBlockInit, SharedLcsClusterNodeWriter clusterBlockWriterInit, TupleDescriptor const & colTupleDescInit, uint columnIdInit, uint blockSizeInit)
Initializes the LcsHash object.
[in] | hashBlockInit | block to hold the hash table |
[in] | clusterBlockWriterInit | reference to the node writer |
[in] | colTupleDescInit | tuple descriptor for the column tuple |
[in] | columnIdInit | column ID of the column this LcsHash is compressing. |
[in] | blockSizeInit | block size |
Definition at line 71 of file LcsHash.cpp.
References attrAccessor, clusterBlockWriter, colTuple, colTupleBuffer, colTupleDesc, columnId, compareInst, UnalignedAttributeAccessor::compute(), TupleDataWithBuffer::computeAndAllocate(), FixedBuffer, UnalignedAttributeAccessor::getMaxByteCount(), hash, LcsHashTable::init(), maxValueSize, searchTuple, and valCnt.
{
    /*
     * clears and sets up hash block
     */
    memset(hashBlockInit, 0, blockSizeInit);
    hash.init(hashBlockInit, blockSizeInit);

    columnId = columnIdInit;
    valCnt = 0;
    maxValueSize = 0;
    clusterBlockWriter = clusterBlockWriterInit;

    assert(colTupleDescInit.size() == 1);
    colTupleDesc = colTupleDescInit;
    attrAccessor.compute(colTupleDesc[0]);

    /*
     * Temporary in-memory representation for the tuple this LcsHash
     * is working on.
     */
    colTuple.computeAndAllocate(colTupleDesc);
    searchTuple.computeAndAllocate(colTupleDesc);

    /*
     * colTupleBuffer provides storage for the length as well as the data
     * buffer in the compact format described in TupleData.h. The length
     * could take up to 2 bytes.
     */
    colTupleBuffer.reset(
        new FixedBuffer[attrAccessor.getMaxByteCount()]);

    compareInst = SharedLcsCompareColKeyUsingOffsetIndex(
        new LcsCompareColKeyUsingOffsetIndex(
            clusterBlockWriter, &hash, colTupleDesc, columnId, attrAccessor));
}
void LcsHash::insert (TupleDatum & colTupleDatum, LcsHashValOrd * valOrd, bool * undoInsert)
Inserts a single column tuple into the hash table.
It also causes the column value to be inserted into the cluster block if needed.
[in] | colTupleDatum | column tuple to insert |
[out] | valOrd | hash value node ordinal |
[out] | undoInsert | true if this insert should be undone |
Definition at line 113 of file LcsHash.cpp.
References attrAccessor, colTupleBuffer, and UnalignedAttributeAccessor::storeValue().
{
    /*
     * gets the data buffer with length encoded
     */
    PBuffer dataWithLen = colTupleBuffer.get();
    attrAccessor.storeValue(colTupleDatum, dataWithLen);

    insert(dataWithLen, valOrd, undoInsert);
}
void LcsHash::insert (PBuffer dataWithLen, LcsHashValOrd * valOrd, bool * undoInsert)
Inserts a data buffer of a column into the hash table.
It also causes the column value to be inserted into the cluster block if needed.
[in] | dataWithLen | data buffer of column tuple to insert |
[out] | valOrd | hash value node ordinal |
[out] | undoInsert | true if this insert should be undone |
Definition at line 128 of file LcsHash.cpp.
References attrAccessor, clusterBlockWriter, columnId, computeKey(), LcsHashTable::getNewValueNode(), UnalignedAttributeAccessor::getStoredByteCount(), hash, LcsHashTable::insertNewValueNode(), LcsHashTable::isFull(), LcsHashValOrd::isValueInBatch(), maxValueSize, NEWBATCHVALUE, NEWENTRY, NOTHING, search(), LcsUndoType::set(), LcsHashValOrd::setValueInBatch(), undo, valCnt, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.
{
    uint key;
    uint16_t newValueOffset;
    LcsHashValueNode *vPtr = 0;
    TupleStorageByteLength storageLength;

    /*
     * Compression mode could change dynamically, so we have to check every
     * time. If this batch will not be compressed, then there is no reason
     * to generate a real hash code. Hash code generation is expensive, so
     * try to avoid it.
     */
    bool noCompress = clusterBlockWriter->noCompressMode(columnId);
    key = noCompress ? 0 : computeKey(dataWithLen);

    *undoInsert = false;

    /*
     * If the value is not in the hash, or
     * if we are not in compress mode
     * (in which case we allow duplicates in the hash table),
     * then adds the value to the hash.
     */
    if (noCompress || !search(key, dataWithLen, valOrd, &vPtr)) {
        LcsHashValueNode *newNode;

        /*
         * If the hash table is full, or
         * if the cluster page is full,
         * then returns and indicates the need to undo the insert.
         */
        *undoInsert =
            hash.isFull() ||
            !clusterBlockWriter->addValue(
                columnId, dataWithLen, &newValueOffset);

        if (*undoInsert) {
            /*
             * Prepares undo action.
             */
            undo.set(NOTHING, key, maxValueSize, 0);
            return;
        }

        /*
         * Inserts a new node only when the above call does not return
         * undoInsert. If a new node is inserted but the undoInsert above is
         * true, the subsequent undoInsert() call will not roll back the new
         * node correctly if undo.what is not set to NEWENTRY (the default
         * value is NOTHING).
         */
        newNode = hash.getNewValueNode();
        newNode->valueOffset = newValueOffset;
        *valOrd = valCnt++;
        valOrd->setValueInBatch();
        newNode->valueOrd = *valOrd;

        hash.insertNewValueNode(key, newNode);

        /*
         * Prepares undo action.
         */
        undo.set(NEWENTRY, key, maxValueSize, 0);

        storageLength = attrAccessor.getStoredByteCount(dataWithLen);

        if (storageLength > maxValueSize) {
            maxValueSize = storageLength;
        }
    } else {
        /*
         * We found the value in the hash (from the search() call above),
         * so it is already in the block,
         * but it still may not be part of the current batch.
         * Whether it is or not, call addValue(), so that we can adjust
         * the space left in the block.
         */
        bool bFirstTimeInBatch = !valOrd->isValueInBatch();

        *undoInsert =
            !clusterBlockWriter->addValue(columnId, bFirstTimeInBatch);

        if (*undoInsert) {
            /*
             * Prepares undo action.
             */
            undo.set(NOTHING, key, maxValueSize, 0);
            return;
        }

        (vPtr->valueOrd).setValueInBatch();
        *valOrd = vPtr->valueOrd;

        /*
         * Prepares undo action.
         */
        if (bFirstTimeInBatch) {
            undo.set(NEWBATCHVALUE, key, maxValueSize, vPtr);
        } else {
            undo.set(NOTHING, key, maxValueSize, 0);
        }
    }
    /*
     * Otherwise the value is already in the hash, and the current batch
     * already has a pointer to that value, so don't do anything.
     */
}
void LcsHash::undoInsert (TupleDatum & colTupleDatum)
Undoes the previous insert of a column tuple.
This is called when we are trying to add all of the columns in a cluster and at least one cannot fit; in that case we remove the previously added cluster column values (see the sketch after the code listing below).
[in] | colTupleDatum | column tuple just inserted |
Definition at line 240 of file LcsHash.cpp.
References attrAccessor, colTupleBuffer, and UnalignedAttributeAccessor::storeValue().
{
    /*
     * gets the data buffer with length encoded
     */
    PBuffer dataWithLen = colTupleBuffer.get();
    attrAccessor.storeValue(colTupleDatum, dataWithLen);

    undoInsert(dataWithLen);
}
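A minimal sketch of the rollback pattern described above, assuming one LcsHash per cluster column; hashes, rowTuple, numCols and MAX_COLS are hypothetical caller-side names, and the exact treatment of the failing column is left to the caller's protocol:

    // Hypothetical per-row driver, one LcsHash per cluster column.
    LcsHashValOrd valOrd[MAX_COLS];
    bool undoNeeded = false;
    uint col;

    for (col = 0; col < numCols; col++) {
        hashes[col].insert(rowTuple[col], &valOrd[col], &undoNeeded);
        if (undoNeeded) {
            break;
        }
    }

    if (undoNeeded) {
        // At least one column value did not fit in its hash table or on the
        // cluster page: remove the column values already added for this row,
        // then the caller starts a new batch (or page) and retries the row.
        // Whether the failing column itself also needs an undoInsert() call
        // depends on the caller's protocol and is elided here.
        for (uint i = 0; i < col; i++) {
            hashes[i].undoInsert(rowTuple[i]);
        }
    }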
void LcsHash::undoInsert (PBuffer dataWithLen)
Undoes the previous insert of a column data buffer.
[in] | dataWithLen | data buffer to column tuple just inserted |
Definition at line 252 of file LcsHash.cpp.
References clusterBlockWriter, columnId, hash, LcsUndoType::key, maxValueSize, NEWBATCHVALUE, NEWENTRY, NOTHING, LcsUndoType::origMaxValueSize, LcsUndoType::reset(), undo, LcsHashTable::undoNewValueNode(), valCnt, LcsHashValueNode::valueOrd, LcsUndoType::vPtr, and LcsUndoType::what.
{
    switch (undo.what) {
    case NOTHING:
        {
            /*
             * Value already existed in the batch.
             */
            clusterBlockWriter->undoValue(columnId, NULL, false);
            break;
        }
    case NEWENTRY:
        {
            /*
             * First time in block.
             *
             * To remove a new value entry:
             * 1) decrements the total count,
             * 2) resets the location where the next value entry will go,
             * 3) removes the entry from the hash,
             * 4) resets the maximum value size,
             * 5) removes the value from the block.
             */
            valCnt--;
            hash.undoNewValueNode(undo.key);
            maxValueSize = undo.origMaxValueSize;
            clusterBlockWriter->undoValue(columnId, dataWithLen, true);
            break;
        }
    case NEWBATCHVALUE:
        {
            /*
             * Already in the block but first time in the batch.
             * Need to remove the value from the batch.
             */
            clusterBlockWriter->undoValue(columnId, NULL, true);
            (&undo.vPtr->valueOrd)->clearValueInBatch();
            break;
        }
    }
    undo.reset();
}
void LcsHash::prepareFixedOrVariableBatch (uint8_t * rowArray, uint numRows)
Prepares a fixed or variable batch to be written to the cluster block.
[in,out] | rowArray | upon input, this array holds value node ordinals; at output, the array holds offsets for values in a batch |
[in] | numRows | number of values in a batch |
Definition at line 407 of file LcsHash.cpp.
References hash, and LcsHashTable::valueNodes.
{
    uint i;
    uint16_t *rowWORDArray = (uint16_t *) rowArray;
    LcsHashValueNode *pValueNodes;

    pValueNodes = hash.valueNodes;

    /*
     * Stores the offset to the column values in rowWORDArray
     */
    for (i = 0; i < numRows; i++) {
        rowWORDArray[i] = pValueNodes[rowWORDArray[i]].valueOffset;
    }
}
void LcsHash::prepareCompressedBatch (uint8_t * rowArray, uint numRows, uint16_t * numVals, uint16_t * offsetIndexVector)
Prepares a compressed batch to be written to the cluster block.
[in,out] | rowArray | upon input, this array holds value node ordinals; on output, it holds indices into the offset array of the column values stored on a cluster block. |
[in] | numRows | number of values in a batch |
[out] | numVals | number of distinct values in a batch |
[out] | offsetIndexVector | on output, it holds the offsets of the column values stored on a cluster block. |
Definition at line 350 of file LcsHash.cpp.
References compareInst, hash, LcsHashValueNode::sortedOrd, valCnt, LcsHashTable::valueNodes, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.
{
    uint16_t i;
    uint16_t *rowWORDArray = (uint16_t *) rowArray;
    *numVals = 0;

    /*
     * Puts all value ordinals in the batch into the Vals array.
     */
    for (i = 0; i < valCnt; i++) {
        if ((hash.valueNodes[i].valueOrd).isValueInBatch()) {
            hash.valueNodes[i].sortedOrd = *numVals;
            offsetIndexVector[(*numVals)++] = i;
        }
    }

    /*
     * Sorts the value ordinals based on the key values.
     */
    std::sort(
        offsetIndexVector,
        offsetIndexVector + (*numVals),
        LcsCompare(compareInst));

    /*
     * Now offsetIndexVector is sorted. Sets sortedOrd, which is basically
     * the index into offsetIndexVector, in the valueNodes array.
     */
    for (i = 0; i < *numVals; i++) {
        hash.valueNodes[offsetIndexVector[i]].sortedOrd = i;
    }

    /*
     * Having stored the sortedOrd away, replaces the value ordinals in
     * offsetIndexVector with offsets from the valueNodes array. Now
     * offsetIndexVector contains offsets sorted based on the values they
     * point to.
     */
    for (i = 0; i < *numVals; i++) {
        offsetIndexVector[i] =
            hash.valueNodes[offsetIndexVector[i]].valueOffset;
    }

    /*
     * Stores the index into offsetIndexVector in the row array. Now
     * rowWORDArray contains indices into offsetIndexVector, which contains
     * offsets sorted based on the column values they point to.
     */
    for (i = 0; i < numRows; i++) {
        rowWORDArray[i] = hash.valueNodes[rowWORDArray[i]].sortedOrd;
    }
}
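To make the passes above concrete, here is a small self-contained simulation in which a Node struct holding an in-memory string stands in for a block-resident key; Node, inBatch and the sample data are illustrative only and not part of LcsHash:

    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Hypothetical, simplified stand-in for a hash value node; the real
    // valueOffset points into the cluster block, here it is just an id.
    struct Node {
        std::string value;      // what the offset would dereference to
        uint16_t valueOffset;   // stand-in offset
        uint16_t sortedOrd;
        bool inBatch;
    };

    int main()
    {
        // Value ordinals 0..3 were assigned in insertion order.
        std::vector<Node> valueNodes = {
            {"orange", 100, 0, true},
            {"apple",  110, 0, true},
            {"pear",   120, 0, false},   // not part of this batch
            {"banana", 130, 0, true},
        };
        // rowArray holds one value ordinal per row.
        std::vector<uint16_t> rowArray = {0, 1, 1, 3, 0};

        // Pass 1: collect the ordinals of the values used in this batch.
        std::vector<uint16_t> offsetIndexVector;
        for (uint16_t i = 0; i < valueNodes.size(); i++) {
            if (valueNodes[i].inBatch) {
                offsetIndexVector.push_back(i);
            }
        }

        // Pass 2: sort those ordinals by the values they refer to
        // (LcsCompare does this through the cluster block in the real code).
        std::sort(
            offsetIndexVector.begin(), offsetIndexVector.end(),
            [&](uint16_t a, uint16_t b) {
                return valueNodes[a].value < valueNodes[b].value;
            });

        // Pass 3: remember each value's position in the sorted vector, then
        // replace the ordinals in offsetIndexVector with their offsets.
        for (uint16_t i = 0; i < offsetIndexVector.size(); i++) {
            valueNodes[offsetIndexVector[i]].sortedOrd = i;
        }
        for (uint16_t i = 0; i < offsetIndexVector.size(); i++) {
            offsetIndexVector[i] = valueNodes[offsetIndexVector[i]].valueOffset;
        }

        // Pass 4: rows now refer to positions in the sorted offset vector.
        for (uint16_t &row : rowArray) {
            row = valueNodes[row].sortedOrd;
        }

        // offsetIndexVector: 110 130 100  (apple, banana, orange)
        // rowArray:          2 0 0 1 2    (orange, apple, apple, banana, orange)
        for (uint16_t off : offsetIndexVector) printf("%u ", (unsigned) off);
        printf("\n");
        for (uint16_t row : rowArray) printf("%u ", (unsigned) row);
        printf("\n");
        return 0;
    }

The printed offsetIndexVector is in value order (apple, banana, orange), and each entry of rowArray becomes an index into that sorted vector, which is exactly the layout a compressed batch stores on the cluster block.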
void LcsHash::clearFixedEntries ()
Clears the fixed values from the batch to indicate that the offsets are no longer useful because the key storage can be relocated between batches.
Definition at line 426 of file LcsHash.cpp.
References clusterBlockWriter, columnId, hash, valCnt, LcsHashTable::valueNodes, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.
{
    /*
     * Only clears entries if the next batch is not guaranteed to be fixed mode.
     */
    if (!clusterBlockWriter->noCompressMode(columnId)) {
        for (uint i = 0; i < valCnt; i++) {
            if ((hash.valueNodes[i].valueOrd).isValueInBatch()) {
                hash.valueNodes[i].valueOffset = 0;
            }
        }
    }
}
void LcsHash::startNewBatch (uint leftOvers)
Prepares LcsHash object for a new batch.
[in] | leftOvers | number of left over hash value nodes from the previous batch. The new batch needs to leave room a priori for these nodes. |
Definition at line 488 of file LcsHash.cpp.
References clusterBlockWriter, columnId, hash, LcsHashTable::isFull(), maxValueSize, LcsHashTable::resetBatch(), LcsHashTable::resetHash(), and valCnt.
{
    /*
     * If the hash is full we need to start over. Otherwise just clear the
     * entries used in building the previous batch.
     */
    if (clusterBlockWriter->noCompressMode(columnId) ||
        hash.isFull(leftOvers))
    {
        hash.resetHash();
        valCnt = 0;
        maxValueSize = 0;
    } else {
        hash.resetBatch();
    }
}
void LcsHash::restore (uint numVals, uint16_t lastValOff)
Sets up hash with values from an existing cluster block.
This is called when appending to an existing block.
[in] | numVals | number of values for this column |
[in] | lastValOff | offset of the last value stored for this column |
Definition at line 440 of file LcsHash.cpp.
References attrAccessor, clusterBlockWriter, columnId, computeKey(), dummy(), LcsHashTable::getNewValueNode(), UnalignedAttributeAccessor::getStoredByteCount(), hash, LcsHashTable::insertNewValueNode(), LcsHashTable::isFull(), maxValueSize, search(), valCnt, LcsHashValueNode::valueOffset, and LcsHashValueNode::valueOrd.
{
    uint i;
    uint key;
    PBuffer dataWithLen;
    LcsHashValueNode *newNode;
    LcsHashValOrd dummy;
    LcsHashValOrd valOrd;
    TupleStorageByteLength storageLength;

    /*
     * Compression mode could change dynamically, so we have to check every
     * time. If this batch will not be compressed, then there is no reason
     * to generate a real hash code. Hash code generation is expensive, so
     * try to avoid it.
     */
    bool noCompress = clusterBlockWriter->noCompressMode(columnId);

    for (i = 0; i < numVals && !(hash.isFull()); i++) {
        dataWithLen = clusterBlockWriter->getOffsetPtr(columnId, lastValOff);
        key = noCompress ? 0 : computeKey(dataWithLen);

        /*
         * If the value is not in the hash, or if we are not in compress mode
         * (in which case we allow duplicates in the hash table), then adds
         * the value to the hash.
         */
        if (noCompress || !search(key, dataWithLen, &dummy)) {
            newNode = hash.getNewValueNode();

            valOrd = valCnt++;
            newNode->valueOrd = valOrd;
            newNode->valueOffset = (uint16_t) lastValOff;

            hash.insertNewValueNode(key, newNode);

            storageLength = attrAccessor.getStoredByteCount(dataWithLen);
            if (storageLength > maxValueSize) {
                maxValueSize = storageLength;
            }
        }

        lastValOff = clusterBlockWriter->getNextVal(
            columnId,
            (uint16_t) lastValOff);
    }
}
uint LcsHash::getMaxValueSize () [inline]
Gets the maximum value length.
Definition at line 893 of file LcsHash.h.
References maxValueSize.
{
    return maxValueSize;
}
bool LcsHash::isHashFull (uint leftOvers = 0) [inline]
Checks if the hash table is full.
[in] | leftOvers | number of left over hash value nodes from the previous batch. The next batch needs to leave room a priori for these nodes. |
Definition at line 898 of file LcsHash.h.
References hash, and LcsHashTable::isFull().
uint LcsHash::columnId [private] |
column for which this LcsHash structure is built.
Definition at line 454 of file LcsHash.h.
Referenced by clearFixedEntries(), init(), insert(), LcsHash(), restore(), search(), startNewBatch(), and undoInsert().
LcsHashTable LcsHash::hash [private] |
The LcsHashTable object contains logic to fit the data structure into one block, as well as logic to manage the hash entries and hash value nodes in the hash table.
Definition at line 461 of file LcsHash.h.
Referenced by clearFixedEntries(), computeKey(), init(), insert(), isHashFull(), prepareCompressedBatch(), prepareFixedOrVariableBatch(), restore(), search(), startNewBatch(), and undoInsert().
Block writer object used to add new values to a cluster block.
Definition at line 466 of file LcsHash.h.
Referenced by clearFixedEntries(), init(), insert(), restore(), search(), startNewBatch(), and undoInsert().
TupleDescriptor LcsHash::colTupleDesc [private]
The column tuple descriptor.
TupleDataWithBuffer LcsHash::colTuple [private]
The column tuple currently being compressed.
UnalignedAttributeAccessor LcsHash::attrAccessor [private]
Attribute accessor of the column.
Definition at line 481 of file LcsHash.h.
Referenced by computeKey(), init(), insert(), restore(), search(), and undoInsert().
TupleDataWithBuffer LcsHash::searchTuple [private]
The column being compared against.
boost::scoped_array<FixedBuffer> LcsHash::colTupleBuffer [private] |
Scratch memory to store the current column value being compressed.
Definition at line 491 of file LcsHash.h.
Referenced by init(), insert(), and undoInsert().
uint16_t LcsHash::valCnt [private] |
Number of unique values in the current batch.
The type of this field should be the same as that of LcsHashValOrd.
Definition at line 497 of file LcsHash.h.
Referenced by clearFixedEntries(), init(), insert(), LcsHash(), prepareCompressedBatch(), restore(), startNewBatch(), and undoInsert().
uint LcsHash::maxValueSize [private] |
Largest value size in the current batch.
The size includes the bytes encoding the length information.
Definition at line 503 of file LcsHash.h.
Referenced by getMaxValueSize(), init(), insert(), LcsHash(), restore(), startNewBatch(), and undoInsert().
LcsUndoType LcsHash::undo [private] |
Structure containing info for undoing the most recent hash insert.
Definition at line 508 of file LcsHash.h.
Referenced by insert(), and undoInsert().
uint8_t* LcsHash::magicTable [private] |
Hash seed values.
Definition at line 513 of file LcsHash.h.
Referenced by computeKey(), and LcsHash().
uint LcsHash::numChecks [private]
Hash table statistics: number of hash key checks.
uint LcsHash::numMatches [private]
Hash table statistics: number of hash key matches.
SharedLcsCompareColKeyUsingOffsetIndex LcsHash::compareInst [private]
Helper class to LcsCompare.
It stores the comparison context.
Definition at line 528 of file LcsHash.h.
Referenced by init(), and prepareCompressedBatch().