Interface PageWriter


  • public interface PageWriter
    a writer for all the pages of a given column chunk
    • Method Detail

      • writePage

        @Deprecated
        void writePage(org.apache.parquet.bytes.BytesInput bytesInput,
                       int valueCount,
                       Statistics<?> statistics,
                       Encoding rlEncoding,
                       Encoding dlEncoding,
                       Encoding valuesEncoding)
                throws IOException
        Deprecated.
        will be removed in 2.0.0. This method does not support writing column indexes; use writePage(BytesInput, int, int, Statistics, Encoding, Encoding, Encoding) instead.
        writes a single page
        Parameters:
        bytesInput - the bytes for the page
        valueCount - the number of values in that page
        statistics - the statistics for that page
        rlEncoding - repetition level encoding
        dlEncoding - definition level encoding
        valuesEncoding - values encoding
        Throws:
        IOException - if there is an exception while writing page data
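        A minimal migration sketch, assuming JDK-only stand-ins: SimplePageWriter and RecordingPageWriter are hypothetical simplifications (byte[] replaces BytesInput; statistics and encodings are left out) and are not part of the Parquet API. It only illustrates that the replacement overload takes rowCount in addition to the arguments of the deprecated one.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class MigrationSketch {
    // Writes one page through the non-deprecated overload and returns the recorded calls.
    static List<String> writeOnePage(RecordingPageWriter writer) {
        byte[] page = {1, 2, 3, 4, 5, 6};
        // Old call (deprecated): writer.writePage(page, 6);
        // New call: additionally pass the number of rows covered by the page.
        writer.writePage(page, 6, 3);
        return writer.log;
    }

    public static void main(String[] args) {
        System.out.println(writeOnePage(new RecordingPageWriter()));
    }
}

// Hypothetical, simplified stand-in for PageWriter (not the Parquet interface).
interface SimplePageWriter {
    @Deprecated
    void writePage(byte[] bytes, int valueCount) throws IOException;

    void writePage(byte[] bytes, int valueCount, int rowCount) throws IOException;
}

// Hypothetical in-memory implementation that records each call instead of writing bytes.
class RecordingPageWriter implements SimplePageWriter {
    final List<String> log = new ArrayList<>();

    @Override
    @Deprecated
    public void writePage(byte[] bytes, int valueCount) {
        log.add("page(values=" + valueCount + ", rows=unknown)");
    }

    @Override
    public void writePage(byte[] bytes, int valueCount, int rowCount) {
        log.add("page(values=" + valueCount + ", rows=" + rowCount + ")");
    }
}
```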
      • writePage

        void writePage(org.apache.parquet.bytes.BytesInput bytesInput,
                       int valueCount,
                       int rowCount,
                       Statistics<?> statistics,
                       Encoding rlEncoding,
                       Encoding dlEncoding,
                       Encoding valuesEncoding)
                throws IOException
        writes a single page
        Parameters:
        bytesInput - the bytes for the page
        valueCount - the number of values in that page
        rowCount - the number of rows in that page
        statistics - the statistics for that page
        rlEncoding - repetition level encoding
        dlEncoding - definition level encoding
        valuesEncoding - values encoding
        Throws:
        IOException - if there is an exception while writing page data
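        The difference between valueCount and rowCount can be sketched as follows. For a repeated field, a repetition level of 0 marks the first value of a new row, so counting zeros gives the row count. The class PageCounts and its helper are hypothetical names for this illustration, and the sketch assumes the page starts on a row boundary.

```java
class PageCounts {
    // A repetition level of 0 marks the first value of a new row.
    static int countRows(int[] repetitionLevels) {
        int rows = 0;
        for (int level : repetitionLevels) {
            if (level == 0) {
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Three rows of a repeated field: [a, b], [c], [d, e, f]
        int[] repetitionLevels = {0, 1, 0, 0, 1, 1};
        int valueCount = repetitionLevels.length; // one level per value
        int rowCount = countRows(repetitionLevels);
        System.out.println("valueCount=" + valueCount + ", rowCount=" + rowCount);
    }
}
```

        For these three rows the sketch yields a valueCount of 6 and a rowCount of 3.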
      • writePageV2

        void writePageV2(int rowCount,
                         int nullCount,
                         int valueCount,
                         org.apache.parquet.bytes.BytesInput repetitionLevels,
                         org.apache.parquet.bytes.BytesInput definitionLevels,
                         Encoding dataEncoding,
                         org.apache.parquet.bytes.BytesInput data,
                         Statistics<?> statistics)
                  throws IOException
        writes a single page in the new format
        Parameters:
        rowCount - the number of rows in this page
        nullCount - the number of null values (out of valueCount)
        valueCount - the number of values in that page (there could be multiple values per row for repeated fields)
        repetitionLevels - the repetition levels encoded in RLE without any size header
        definitionLevels - the definition levels encoded in RLE without any size header
        dataEncoding - the encoding for the data
        data - the data encoded with dataEncoding
        statistics - optional stats for this page
        Throws:
        IOException - if there is an exception while writing page data
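        One way to picture nullCount relative to valueCount is sketched below: an entry whose definition level is below the column's maximum definition level carries no stored value. NullCountSketch and maxDefinitionLevel are names assumed for this illustration; in practice the maximum level would come from the schema.

```java
class NullCountSketch {
    // Counts entries whose definition level is below the maximum, i.e. entries without a value.
    static int countNulls(int[] definitionLevels, int maxDefinitionLevel) {
        int nulls = 0;
        for (int level : definitionLevels) {
            if (level < maxDefinitionLevel) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        // An optional, non-repeated column (max definition level 1): 7, null, 9, null, null
        int[] definitionLevels = {1, 0, 1, 0, 0};
        int valueCount = definitionLevels.length;
        int nullCount = countNulls(definitionLevels, 1);
        System.out.println("valueCount=" + valueCount
                + ", nullCount=" + nullCount
                + ", encoded values=" + (valueCount - nullCount));
    }
}
```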
      • getMemSize

        long getMemSize()
        Returns:
        the current size used in the memory buffer for that column chunk
      • allocatedSize

        long allocatedSize()
        Returns:
        the allocated size of the buffer (greater than or equal to getMemSize())
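        The relationship between getMemSize() and allocatedSize() can be illustrated with a small growable buffer. GrowableBuffer is a hypothetical class, and the doubling growth policy is an assumption of this sketch, not a statement about how any PageWriter implementation allocates memory.

```java
import java.util.Arrays;

class BufferSizeSketch {
    // Hypothetical growable buffer; the doubling policy is an assumption of this sketch.
    static final class GrowableBuffer {
        private byte[] data;
        private int used;

        GrowableBuffer(int initialCapacity) {
            data = new byte[initialCapacity];
        }

        void write(byte[] bytes) {
            int needed = used + bytes.length;
            if (needed > data.length) {
                // Grow to at least the needed size, doubling when that is larger.
                data = Arrays.copyOf(data, Math.max(data.length * 2, needed));
            }
            System.arraycopy(bytes, 0, data, used, bytes.length);
            used += bytes.length;
        }

        // Bytes actually holding data.
        long getMemSize() {
            return used;
        }

        // Bytes reserved, whether or not they hold data yet.
        long allocatedSize() {
            return data.length;
        }
    }

    public static void main(String[] args) {
        GrowableBuffer buffer = new GrowableBuffer(8);
        buffer.write(new byte[10]);
        System.out.println(buffer.getMemSize() + " of " + buffer.allocatedSize() + " bytes used");
    }
}
```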
      • writeDictionaryPage

        void writeDictionaryPage(DictionaryPage dictionaryPage)
                          throws IOException
        writes a dictionary page
        Parameters:
        dictionaryPage - the dictionary page containing the dictionary data
        Throws:
        IOException - if there was an exception while writing
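        As background, dictionary encoding can be sketched like this: the distinct values are collected into a dictionary, and the column values are replaced by indexes into it. DictionarySketch is a hypothetical JDK-only class; it does not build a real DictionaryPage and uses strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DictionarySketch {
    final List<String> dictionary = new ArrayList<>(); // distinct values, in first-seen order
    final List<Integer> ids = new ArrayList<>();       // one dictionary index per input value
    private final Map<String, Integer> index = new HashMap<>();

    void add(String value) {
        Integer id = index.get(value);
        if (id == null) {
            // First occurrence: assign the next free index.
            id = dictionary.size();
            index.put(value, id);
            dictionary.add(value);
        }
        ids.add(id);
    }

    public static void main(String[] args) {
        DictionarySketch sketch = new DictionarySketch();
        for (String v : new String[] {"us", "fr", "us", "de", "fr"}) {
            sketch.add(v);
        }
        System.out.println("dictionary=" + sketch.dictionary + ", ids=" + sketch.ids);
    }
}
```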
      • memUsageString

        String memUsageString(String prefix)
        Parameters:
        prefix - a prefix header to add at every line
        Returns:
        a string presenting a summary of how memory is used
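        A sketch of what a prefixed summary might look like, assuming a made-up two-line format. MemUsageSketch, the extra used and allocated parameters, and the wording of each line are hypothetical; the actual output of an implementation may differ.

```java
class MemUsageSketch {
    // Builds a summary in which every line starts with the given prefix.
    static String memUsageString(String prefix, long used, long allocated) {
        StringBuilder sb = new StringBuilder();
        sb.append(prefix).append("used: ").append(used).append(" bytes\n");
        sb.append(prefix).append("allocated: ").append(allocated).append(" bytes\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(memUsageString("  col_a ", 10, 16));
    }
}
```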