public abstract class DeltaBinaryPackingValuesWriter extends ValuesWriter
delta-binary-packing: <page-header> <block>*
page-header := <block size in values> <number of miniblocks in a block> <total value count> <first value>
block := <min delta> <list of bitwidths of miniblocks> <miniblocks>
min delta : zig-zag var int encoded
bitWidthsOfMiniBlock : 1 byte little endian
blockSizeInValues, miniBlockNum, totalValueCount : unsigned varint
firstValue : zig-zag var int encoded
The algorithm and format are inspired by D. Lemire's paper: http://lemire.me/blog/archives/2012/09/12/fast-integer-compression-decoding-billions-of-integers-per-second/
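The block layout above can be illustrated with a minimal sketch. It is not the writer's actual implementation: the class `DeltaBlockHeaderSketch` and its methods are hypothetical names, the mini block size of 4 is a toy value chosen for readability, and the bit-packed miniblock payload is omitted. It only shows how `<min delta>` (zig-zag varint) and the one-byte bit widths of a single block could be derived from a run of values.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration class; not part of Parquet.
final class DeltaBlockHeaderSketch {

    // Unsigned varint: 7 payload bits per byte, least significant group first;
    // the high bit of each byte is set while more bytes follow.
    static void writeUnsignedVarInt(int value, ByteArrayOutputStream out) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Zig-zag maps signed to unsigned so small magnitudes stay small:
    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static int zigZag(int value) {
        return (value << 1) ^ (value >> 31);
    }

    // Encodes <min delta> <list of bitwidths of miniblocks> for one block.
    // Assumes values.length >= 2 and that no delta overflows an int.
    static byte[] encodeBlockHeader(int[] values, int miniBlockSize) {
        int[] deltas = new int[values.length - 1];
        int minDelta = Integer.MAX_VALUE;
        for (int i = 0; i < deltas.length; i++) {
            deltas[i] = values[i + 1] - values[i];
            minDelta = Math.min(minDelta, deltas[i]);
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVarInt(zigZag(minDelta), out); // min delta: zig-zag varint

        for (int start = 0; start < deltas.length; start += miniBlockSize) {
            int end = Math.min(start + miniBlockSize, deltas.length);
            int mask = 0;
            for (int i = start; i < end; i++) {
                mask |= deltas[i] - minDelta; // non-negative under the assumption above
            }
            // Bits needed to store the largest (delta - minDelta) in this mini block.
            int bitWidth = 32 - Integer.numberOfLeadingZeros(mask);
            out.write(bitWidth); // one byte per mini block
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] header = encodeBlockHeader(new int[] {1, 2, 3, 4, 5, 10, 20, 15, 30}, 4);
        System.out.println(Arrays.toString(header));
    }
}
```

Subtracting the minimum delta before choosing a bit width is what keeps the packed values non-negative, so each mini block only pays for the spread of its own deltas rather than their absolute size.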
| Modifier and Type | Field and Description |
|---|---|
| protected CapacityByteArrayOutputStream | baos |
| protected int[] | bitWidths: bit width for each mini block, reused between flushes |
| protected org.apache.parquet.column.values.delta.DeltaBinaryPackingConfig | config: stores blockSizeInValues, miniBlockNumInABlock and miniBlockSizeInValues |
| static int | DEFAULT_NUM_BLOCK_VALUES |
| static int | DEFAULT_NUM_MINIBLOCKS |
| protected int | deltaValuesToFlush: pointer to the end of deltaBlockBuffer, i.e. the number of values in deltaBlockBuffer that have not yet been flushed to baos; it is reset after each flush |
| protected byte[] | miniBlockByteBuffer: byte buffer for a mini block, reused for each mini block |
| protected int | totalValueCount |
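How the three values held by config relate to deltaValuesToFlush can be sketched as follows. This is an assumption-laden illustration, not the class's code: `MiniBlockMathSketch` and `miniBlocksToFlush` are hypothetical names, the divisibility check is assumed rather than taken from this page, and 128 and 4 are only example inputs.

```java
// Hypothetical illustration class; not part of Parquet.
final class MiniBlockMathSketch {
    final int blockSizeInValues;
    final int miniBlockNumInABlock;
    final int miniBlockSizeInValues;

    MiniBlockMathSketch(int blockSizeInValues, int miniBlockNumInABlock) {
        // Assumed constraint: a block splits evenly into its mini blocks.
        if (miniBlockNumInABlock <= 0 || blockSizeInValues % miniBlockNumInABlock != 0) {
            throw new IllegalArgumentException(
                "block size must be a positive multiple of the mini block count");
        }
        this.blockSizeInValues = blockSizeInValues;
        this.miniBlockNumInABlock = miniBlockNumInABlock;
        this.miniBlockSizeInValues = blockSizeInValues / miniBlockNumInABlock;
    }

    // Number of mini blocks needed to hold the buffered, not yet flushed deltas
    // (integer ceiling division; a partially filled mini block still counts as one).
    int miniBlocksToFlush(int deltaValuesToFlush) {
        return (deltaValuesToFlush + miniBlockSizeInValues - 1) / miniBlockSizeInValues;
    }

    public static void main(String[] args) {
        MiniBlockMathSketch sketch = new MiniBlockMathSketch(128, 4);
        System.out.println(sketch.miniBlockSizeInValues);
        System.out.println(sketch.miniBlocksToFlush(33));
    }
}
```

With these example inputs each mini block holds 32 values, so 33 buffered deltas would occupy two mini blocks.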
| Constructor and Description |
|---|
| DeltaBinaryPackingValuesWriter(int slabSize, int pageSize, ByteBufferAllocator allocator) |
| DeltaBinaryPackingValuesWriter(int blockSizeInValues, int miniBlockNum, int slabSize, int pageSize, ByteBufferAllocator allocator) |
| Modifier and Type | Method and Description |
|---|---|
| void | close(): called to close the values writer |
| long | getAllocatedSize() |
| long | getBufferedSize(): used to decide whether to move on to the next page |
| Encoding | getEncoding(): called after getBytes() and before reset() |
| protected int | getMiniBlockCountToFlush(double numberCount) |
| String | memUsageString(String prefix) |
| void | reset(): called after getBytes() to reset the current buffer and start writing the next page |
| protected void | writeBitWidthForMiniBlock(int i) |
Methods inherited from class ValuesWriter: getBytes, resetDictionary, toDictPageAndClose, writeBoolean, writeByte, writeBytes, writeDouble, writeFloat, writeInteger, writeLong
public static final int DEFAULT_NUM_BLOCK_VALUES
public static final int DEFAULT_NUM_MINIBLOCKS
protected final CapacityByteArrayOutputStream baos
protected final org.apache.parquet.column.values.delta.DeltaBinaryPackingConfig config
protected final int[] bitWidths
protected int totalValueCount
protected int deltaValuesToFlush
protected byte[] miniBlockByteBuffer
public DeltaBinaryPackingValuesWriter(int slabSize,
int pageSize,
ByteBufferAllocator allocator)
public DeltaBinaryPackingValuesWriter(int blockSizeInValues,
int miniBlockNum,
int slabSize,
int pageSize,
ByteBufferAllocator allocator)
public long getBufferedSize()
getBufferedSize in class ValuesWriter
protected void writeBitWidthForMiniBlock(int i)
protected int getMiniBlockCountToFlush(double numberCount)
public Encoding getEncoding()
getEncoding in class ValuesWriter
public void reset()
reset in class ValuesWriter
public void close()
close in class ValuesWriter
public long getAllocatedSize()
getAllocatedSize in class ValuesWriter
public String memUsageString(String prefix)
memUsageString in class ValuesWriter
Copyright © 2019 The Apache Software Foundation. All rights reserved.