#include <FlatFileExecStream.h>
Public Member Functions
FlatFileExecStreamParams ()
Public Attributes
std::string dataFilePath
Path to the flat file containing tuples to be read.
std::string errorFilePath
Path to the error log used for writing errors encountered while processing tuples.
char fieldDelim
Delimiter used to separate fields in a row.
char rowDelim
Delimiter used to terminate a row.
char quoteChar
Character used to quote data values.
char escapeChar
Ignored outside of quoted values.
bool header
Specifies whether the flat file contains a header.
int numRowsScan
Specifies number of rows to scan when sampling data.
std::string calcProgram
Converts flat file text into typed data.
FlatFileMode mode
Mode in which to run the flatfile scan.
int errorMax
The maximum number of errors to allow before failing.
int errorLogMax
The maximum number of errors to log.
bool lenient
Whether to be lenient when reading flatfile columns.
bool trim
Whether to trim output columns.
bool mapped
Whether to map source columns to target columns by name.
std::vector< std::string > columnNames
Names of the target columns.
TupleDescriptor outputTupleDesc
TupleFormat outputTupleFormat
SharedCacheAccessor pCacheAccessor
CacheAccessor to use for any data access.
SegmentAccessor scratchAccessor
Accessor for segment to use for allocating scratch buffers.
Currently, only ASCII data is supported. More parameters may be needed to support internationalization.
TODO: review whether it is OK to infer parsing and storage behavior from the output tuple type. The field and row delimiters should probably be parsed on the Java side.
Definition at line 53 of file FlatFileExecStream.h.
FlatFileExecStreamParams::FlatFileExecStreamParams () [inline, explicit]
Definition at line 157 of file FlatFileExecStream.h.
References errorFilePath, escapeChar, fieldDelim, FLATFILE_MODE_QUERY, header, mode, numRowsScan, quoteChar, and rowDelim.
{
    errorFilePath = "";
    fieldDelim = ',';
    rowDelim = '\n';
    quoteChar = '"';
    escapeChar = '\\';
    header = true;
    numRowsScan = 0;
    mode = FLATFILE_MODE_QUERY;
}
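As a rough illustration of how a caller might override these defaults for a tab-delimited file without a header, the sketch below uses a hypothetical stand-in struct, FlatFileParamsSketch, instead of the real class, so that it compiles on its own. The helper makeTabDelimitedParams is also hypothetical. The stand-in omits mode, calcProgram, and the inherited members.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for FlatFileExecStreamParams (not the real class).
// Only the members initialized by the constructor above are mirrored here.
struct FlatFileParamsSketch
{
    std::string dataFilePath;
    std::string errorFilePath;
    char fieldDelim;
    char rowDelim;
    char quoteChar;
    char escapeChar;
    bool header;
    int numRowsScan;

    // Same initial values as the constructor listed above.
    FlatFileParamsSketch()
        : errorFilePath(""), fieldDelim(','), rowDelim('\n'),
          quoteChar('"'), escapeChar('\\'), header(true), numRowsScan(0)
    {
    }
};

// Hypothetical helper: parameters for a headerless, tab-delimited file.
FlatFileParamsSketch makeTabDelimitedParams(const std::string &path)
{
    assert(!path.empty());  // assumption: a data file path is always required
    FlatFileParamsSketch params;
    params.dataFilePath = path;
    params.fieldDelim = '\t';
    params.header = false;
    return params;
}
```

Members that are not overridden, such as quoteChar and escapeChar, keep the constructor's values.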
std::string FlatFileExecStreamParams::dataFilePath
Path to the flat file containing tuples to be read.
This path follows conventions of the operating system.
Definition at line 61 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
std::string FlatFileExecStreamParams::errorFilePath
Path to the error log used for writing errors encountered while processing tuples.
If this value is empty, then no logging will be performed.
Definition at line 68 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), and ExecStreamFactory::visit().
char FlatFileExecStreamParams::fieldDelim
Delimiter used to separate fields in a row.
This value is typically ',' (comma) or '\t' (tab). A value of zero signifies no delimiter.
Definition at line 74 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
char FlatFileExecStreamParams::rowDelim
Delimiter used to terminate a row.
This value is typically '\n' (newline), which represents any combination of '\r' and '\n'.
Definition at line 80 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
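One possible reading of the fieldDelim and rowDelim descriptions is sketched below. The function names splitRows and splitFields are hypothetical, and the code is not taken from FlatFileExecStreamImpl. It assumes that a run of '\r' and '\n' characters acts as a single row terminator (so empty rows are dropped) and that a fieldDelim of zero leaves the row as one field. Quoting is ignored here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split text into rows, treating any run of '\r' and '\n' as one terminator.
// Assumption: empty rows are dropped; the real parser may behave differently.
std::vector<std::string> splitRows(const std::string &text)
{
    std::vector<std::string> rows;
    std::string current;
    for (char c : text) {
        if (c == '\r' || c == '\n') {
            if (!current.empty()) {
                rows.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) {
        rows.push_back(current);  // final row without a terminator
    }
    return rows;
}

// Split one row on fieldDelim. A delimiter of zero means "no delimiter",
// so the whole row is returned as a single field.
std::vector<std::string> splitFields(const std::string &row, char fieldDelim)
{
    std::vector<std::string> fields;
    if (fieldDelim == 0) {
        fields.push_back(row);
        return fields;
    }
    std::string current;
    for (char c : row) {
        if (c == fieldDelim) {
            fields.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(current);
    assert(!fields.empty());  // a row always yields at least one field
    return fields;
}
```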
char FlatFileExecStreamParams::quoteChar
Character used to quote data values.
Quoted data values must have an opening and terminating quote character. Special characters, such as delimiter characters, may be quoted. The quote character may be empty, or may be a single character.
Definition at line 88 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
char FlatFileExecStreamParams::escapeChar
Ignored outside of quoted values.
This character quotes the quote character itself, or any other character. Within the context of a quoted data value, if the escape character is the same as the quote character, then a single quote continues to represent a closing quote, but a contiguous pair represents a quote embedded into the data value.
Definition at line 97 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
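The quoting and escaping rules described for quoteChar and escapeChar can be sketched as a small single-row parser. This is only an illustration under stated assumptions, not the parser used by FlatFileExecStreamImpl. The function name parseQuotedRow is hypothetical, "ignored outside of quoted values" is taken to mean that the escape character is ordinary data there, and the quote and delimiter characters are assumed to differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split one row into fields, honoring quoteChar and
// escapeChar as described above.
std::vector<std::string> parseQuotedRow(
    const std::string &row, char fieldDelim, char quoteChar, char escapeChar)
{
    // Assumption of this sketch: distinct quote and delimiter characters.
    assert(quoteChar != fieldDelim || quoteChar == 0);

    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;
    for (std::size_t i = 0; i < row.size(); ++i) {
        char c = row[i];
        if (!inQuotes) {
            if (c == quoteChar) {
                inQuotes = true;               // opening quote
            } else if (c == fieldDelim) {
                fields.push_back(current);     // end of field
                current.clear();
            } else {
                current.push_back(c);          // escapeChar is plain data here
            }
        } else if (c == escapeChar && escapeChar == quoteChar) {
            if (i + 1 < row.size() && row[i + 1] == quoteChar) {
                current.push_back(quoteChar);  // contiguous pair: embedded quote
                ++i;
            } else {
                inQuotes = false;              // single quote: closing quote
            }
        } else if (c == escapeChar && i + 1 < row.size()) {
            current.push_back(row[++i]);       // escaped character, taken literally
        } else if (c == quoteChar) {
            inQuotes = false;                  // closing quote
        } else {
            current.push_back(c);              // delimiters are data inside quotes
        }
    }
    fields.push_back(current);
    return fields;
}
```

With quoteChar and escapeChar both set to '"', the row "say ""hi""" parses to the single value say "hi". With escapeChar set to '\\', the row "a\"b" parses to a"b.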
bool FlatFileExecStreamParams::header
Specifies whether the flat file contains a header.
If a header is specified, it is expected to take up the first line of the file, so this line is skipped. The constructor sets this to true.
Definition at line 104 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
int FlatFileExecStreamParams::numRowsScan
Specifies the number of rows to scan when sampling data.
Definition at line 109 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
std::string FlatFileExecStreamParams::calcProgram
Converts flat file text into typed data.
Definition at line 114 of file FlatFileExecStream.h.
Referenced by ExecStreamFactory::visit().
FlatFileMode FlatFileExecStreamParams::mode
Mode in which to run the flatfile scan.
Definition at line 119 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamParams(), FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
int FlatFileExecStreamParams::errorMax
The maximum number of errors to allow before failing.
Resets when the stream is reopened. A value of -1 indicates that there is no max.
Definition at line 125 of file FlatFileExecStream.h.
int FlatFileExecStreamParams::errorLogMax
The maximum number of errors to log.
Resets when the stream is reopened. A value of -1 indicates that there is no max.
Definition at line 131 of file FlatFileExecStream.h.
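The interaction between errorMax and errorLogMax might be tracked with a small counter like the one below. The struct ErrorBudgetSketch is hypothetical, and two details are assumptions rather than documented behavior: the stream is considered failed only once the error count exceeds errorMax, and the first errorLogMax errors are the ones logged.

```cpp
#include <cassert>

// Hypothetical sketch of error accounting for errorMax and errorLogMax.
struct ErrorBudgetSketch
{
    int errorMax;     // -1 means no maximum
    int errorLogMax;  // -1 means no maximum
    int numErrors;
    int numLogged;

    ErrorBudgetSketch(int maxErrors, int maxLogged)
        : errorMax(maxErrors), errorLogMax(maxLogged),
          numErrors(0), numLogged(0)
    {
        assert(maxErrors >= -1 && maxLogged >= -1);
    }

    // Count one error. Returns true if this error should be written to the log.
    bool recordError()
    {
        ++numErrors;
        if (errorLogMax == -1 || numLogged < errorLogMax) {
            ++numLogged;
            return true;
        }
        return false;
    }

    // True once more errors have been seen than errorMax allows (assumption).
    bool shouldFail() const
    {
        return errorMax != -1 && numErrors > errorMax;
    }

    // Both counts reset when the stream is reopened.
    void reopen()
    {
        numErrors = 0;
        numLogged = 0;
    }
};
```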
bool FlatFileExecStreamParams::lenient
Whether to be lenient when reading flatfile columns.
If columns are missing at the end of a row, they are treated as null. Unexpected columns at the end of a row are ignored.
Definition at line 138 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
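The lenient rule amounts to padding or truncating each parsed row to the expected column count. The sketch below shows that idea with a hypothetical FieldSketch type and fitRowToColumns function. Treating a mismatched row as an error when lenient is false is an assumption, since the description above only covers the lenient case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical field representation: a text value or null.
struct FieldSketch
{
    bool isNull;
    std::string text;
};

// Adjust a parsed row to numColumns fields.
// Returns false if the row has the wrong width and lenient is off
// (assumption: such a row would be reported as an error).
bool fitRowToColumns(
    std::vector<FieldSketch> &row, std::size_t numColumns, bool lenient)
{
    if (row.size() == numColumns) {
        return true;
    }
    if (!lenient) {
        return false;
    }
    FieldSketch nullField = { true, "" };
    // resize() drops unexpected trailing columns, or appends nulls
    // for columns missing at the end of the row.
    row.resize(numColumns, nullField);
    assert(row.size() == numColumns);
    return true;
}
```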
bool FlatFileExecStreamParams::trim
Whether to trim output columns.
Definition at line 143 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
bool FlatFileExecStreamParams::mapped
Whether to map source columns to target columns by name.
Requires a header, to specify source column names, and target column names. Missing columns are filled in with null.
Definition at line 150 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
std::vector<std::string> FlatFileExecStreamParams::columnNames
Names of the target columns.
Definition at line 155 of file FlatFileExecStream.h.
Referenced by FlatFileExecStreamImpl::prepare(), and ExecStreamFactory::visit().
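Name-based mapping, as described for mapped and columnNames, could be computed once from the header row. The sketch below is a minimal illustration with a hypothetical function, buildColumnMap. It assumes exact, case-sensitive name matching and uses -1 to mark a target column that has no source column and would therefore be filled with null.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// For each target column name, return the index of the source column with
// the same name in the header, or -1 if the header has no such column.
std::vector<int> buildColumnMap(
    const std::vector<std::string> &headerNames,
    const std::vector<std::string> &columnNames)
{
    std::map<std::string, int> sourceIndex;
    for (std::size_t i = 0; i < headerNames.size(); ++i) {
        // insert() keeps the first occurrence if a header name repeats
        sourceIndex.insert(
            std::make_pair(headerNames[i], static_cast<int>(i)));
    }

    std::vector<int> mapping;
    for (std::size_t t = 0; t < columnNames.size(); ++t) {
        std::map<std::string, int>::const_iterator it =
            sourceIndex.find(columnNames[t]);
        mapping.push_back(it == sourceIndex.end() ? -1 : it->second);
    }
    assert(mapping.size() == columnNames.size());
    return mapping;
}
```

For a header of id, name, age and target columns age, email, id, the mapping is 2, -1, 0, so the email column would be null in every row.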
TupleDescriptor SingleOutputExecStreamParams::outputTupleDesc [inherited]
Definition at line 37 of file SingleOutputExecStream.h.
Referenced by LcsClusterReplaceExecStreamTest::initClusterAppendParams(), LbmExecStreamTestBase::initValuesExecStream(), LcsClusterReplaceExecStreamTest::loadCluster(), LcsMultiClusterAppendTest::loadClusters(), LcsRowScanExecStreamTest::loadOneCluster(), LbmSearchTest::loadTableAndIndex(), BTreeSearchExecStream::prepare(), BTreeReadExecStream::prepare(), FlatFileExecStreamImpl::prepare(), SingleOutputExecStream::prepare(), MockProducerExecStream::prepare(), ExecStreamFactory::readTupleStreamParams(), LcsClusterReplaceExecStreamTest::replaceCluster(), LcsMultiClusterAppendTest::scanCols(), LcsClusterAppendExecStreamTest::setUpDelIndexScan(), LbmSplicerExecStreamTest::spliceInput(), ExecStreamTestSuite::testBTreeInsertExecStream(), CollectExecStreamTestSuite::testCollectCollectUncollectUncollect(), CollectExecStreamTestSuite::testCollectInts(), CollectExecStreamTestSuite::testCollectUncollect(), CalcExecStreamTestSuite::testConstant(), CorrelationJoinExecStreamTestSuite::testCorrelationJoin(), LhxAggExecStreamTest::testCountImpl(), LcsRowScanExecStreamTest::testFilterCols(), LhxAggExecStreamTest::testGroupCountImpl(), LhxJoinExecStreamTest::testImpl(), LbmLoadBitmapTest::testLoad(), LcsClusterAppendExecStreamTest::testLoadMultiCol(), LcsClusterAppendExecStreamTest::testLoadSingleCol(), ExecStreamTestSuite::testMergeImplicitPullInputs(), ExecStreamTestSuite::testNestedLoopJoinExecStream(), LbmNormalizerExecStreamTest::testNormalizer(), LbmMinusExecStreamTest::testRestartingMinus(), LcsRowScanExecStreamTest::testSampleScanCols(), LcsRowScanExecStreamTest::testScanCols(), LbmSearchTest::testScanIdx(), ExecStreamTestSuite::testSingleValueAggExecStream(), LhxAggExecStreamTest::testSingleValueImpl(), LbmSortedAggExecStreamTest::testSortedAgg(), LhxAggExecStreamTest::testSumImpl(), and LcsClusterReplaceExecStreamTest::verifyCluster().
TupleFormat SingleOutputExecStreamParams::outputTupleFormat [inherited]
Definition at line 39 of file SingleOutputExecStream.h.
Referenced by SingleOutputExecStream::prepare(), and SingleOutputExecStreamParams::SingleOutputExecStreamParams().
SharedCacheAccessor ExecStreamParams::pCacheAccessor [inherited]
CacheAccessor to use for any data access.
This will be singular if the stream should not perform data access.
Definition at line 183 of file ExecStreamDefs.h.
Referenced by ExecStreamFactory::createPrivateScratchSegment(), LbmSearchTest::initBTreeExecStreamParam(), LbmLoadBitmapTest::initBTreeExecStreamParam(), LcsClusterReplaceExecStreamTest::initClusterAppendParams(), LbmSearchTest::initClusterScanDef(), LbmLoadBitmapTest::initClusterScanDef(), ExecStreamGraphEmbryo::initStreamParams(), FtrsTableWriterFactory::loadIndex(), LcsRowScanExecStreamTest::loadOneCluster(), BTreeExecStream::newWriter(), LcsRowScanBaseExecStream::prepare(), BTreeExecStream::prepare(), ExecStream::prepare(), ExecStreamFactory::readBTreeStreamParams(), ExecStreamTestSuite::testBTreeInsertExecStream(), LcsClusterAppendExecStreamTest::testLoadMultiCol(), LcsClusterAppendExecStreamTest::testLoadSingleCol(), LcsClusterAppendExecStreamTest::testScanMultiCol(), LcsClusterAppendExecStreamTest::testScanSingleCol(), and LcsClusterReplaceExecStreamTest::verifyCluster().
SegmentAccessor ExecStreamParams::scratchAccessor [inherited]
Accessor for segment to use for allocating scratch buffers.
This will be singular if the stream should not use any scratch buffers.
Definition at line 189 of file ExecStreamDefs.h.
Referenced by ExecStreamFactory::createPrivateScratchSegment(), LbmSearchTest::initBTreeExecStreamParam(), LbmLoadBitmapTest::initBTreeExecStreamParam(), LcsClusterReplaceExecStreamTest::initClusterAppendParams(), ExecStreamGraphEmbryo::initStreamParams(), FtrsTableWriterFactory::loadIndex(), LcsRowScanExecStreamTest::loadOneCluster(), BTreeExecStream::newWriter(), LcsClusterAppendExecStream::prepare(), BTreeExecStream::prepare(), FlatFileExecStreamImpl::prepare(), MockResourceExecStream::prepare(), ExecStream::prepare(), ExecStreamTestSuite::testBTreeInsertExecStream(), LcsClusterAppendExecStreamTest::testLoadMultiCol(), and LcsClusterAppendExecStreamTest::testLoadSingleCol().