Class Statistics<T extends Comparable<T>>
- java.lang.Object
-
- org.apache.parquet.column.statistics.Statistics<T>
-
- Type Parameters:
T - the Java type described by this Statistics instance
- Direct Known Subclasses:
BinaryStatistics, BooleanStatistics, DoubleStatistics, FloatStatistics, IntStatistics, LongStatistics
public abstract class Statistics<T extends Comparable<T>> extends Object
Statistics class to keep track of statistics in Parquet pages and column chunks.
-
-
Nested Class Summary
Modifier and Type | Class | Description
static class | Statistics.Builder | Builder class to build Statistics objects.
-
Method Summary
Modifier and Type | Method | Description
PrimitiveComparator<T> | comparator() | Returns the PrimitiveComparator implementation to be used to compare two generic values in the proper way (for example, unsigned comparison for UINT_32).
int | compareMaxToValue(T value) | Compares max to the specified value in the proper way.
int | compareMinToValue(T value) | Compares min to the specified value in the proper way.
abstract Statistics<T> | copy() |
static Statistics<?> | createStats(Type type) | Creates an empty Statistics instance for the specified type to be used for reading/writing the new min/max statistics used in the V2 format.
boolean | equals(Object other) | Equality comparison method to compare two statistics objects.
abstract T | genericGetMax() | Returns the max value in the statistics.
abstract T | genericGetMin() | Returns the min value in the statistics.
static Statistics.Builder | getBuilderForReading(PrimitiveType type) | Returns a builder to create new statistics object.
abstract byte[] | getMaxBytes() | Abstract method to return the max value as a byte array
abstract byte[] | getMinBytes() | Abstract method to return the min value as a byte array
long | getNumNulls() | Returns the null count
static Statistics | getStatsBasedOnType(PrimitiveType.PrimitiveTypeName type) | Deprecated. Use createStats(Type) instead
int | hashCode() | Hash code for the statistics object
boolean | hasNonNullValue() | Returns whether there have been non-null values added to this statistics
void | incrementNumNulls() | Increments the null count by one
void | incrementNumNulls(long increment) | Increments the null count by the parameter value
boolean | isEmpty() | Returns a boolean specifying if the Statistics object is empty, i.e. does not contain valid statistics for the page/column yet
boolean | isNumNullsSet() |
abstract boolean | isSmallerThan(long size) | Abstract method to return whether the min and max values fit in the given size.
protected void | markAsNotEmpty() | Sets the page/column as having a valid non-null value (kind of misnomer here)
String | maxAsString() | Returns the string representation of max for debugging/logging purposes.
void | mergeStatistics(Statistics stats) | Method to merge this statistics object with the object passed as parameter.
protected abstract void | mergeStatisticsMinMax(Statistics stats) | Abstract method to merge this statistics min and max with the values of the parameter object.
String | minAsString() | Returns the string representation of min for debugging/logging purposes.
abstract void | setMinMaxFromBytes(byte[] minBytes, byte[] maxBytes) | Deprecated. will be removed in 2.0.0.
void | setNumNulls(long nulls) | Deprecated. will be removed in 2.0.0.
String | toString() |
PrimitiveType | type() |
void | updateStats(boolean value) | updates statistics min and max using the passed value
void | updateStats(double value) | updates statistics min and max using the passed value
void | updateStats(float value) | updates statistics min and max using the passed value
void | updateStats(int value) | updates statistics min and max using the passed value
void | updateStats(long value) | updates statistics min and max using the passed value
void | updateStats(Binary value) | updates statistics min and max using the passed value
-
-
-
Method Detail
-
getStatsBasedOnType
@Deprecated public static Statistics getStatsBasedOnType(PrimitiveType.PrimitiveTypeName type)
Deprecated. Use createStats(Type) instead.
Returns the typed statistics object based on the passed type parameter.
- Parameters:
type - PrimitiveTypeName type of the column
- Returns:
- instance of a typed statistics class
-
createStats
public static Statistics<?> createStats(Type type)
Creates an empty Statistics instance for the specified type to be used for reading/writing the new min/max statistics used in the V2 format.
- Parameters:
type - type of the column
- Returns:
- instance of a typed statistics class
-
getBuilderForReading
public static Statistics.Builder getBuilderForReading(PrimitiveType type)
Returns a builder to create a new statistics object. Used to read the statistics from the Parquet file.
- Parameters:
type - type of the column
- Returns:
- builder to create new statistics object
-
updateStats
public void updateStats(int value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
-
updateStats
public void updateStats(long value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
-
updateStats
public void updateStats(float value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
-
updateStats
public void updateStats(double value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
-
updateStats
public void updateStats(boolean value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
-
updateStats
public void updateStats(Binary value)
Updates statistics min and max using the passed value.
- Parameters:
value - value to use to update min and max
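
The idea behind the updateStats overloads can be illustrated with a minimal, self-contained sketch. IntMinMaxSketch below is a hypothetical stand-in written for this example, not the Parquet class; it only assumes that the first value initializes min and max and that later values widen the range.

```java
// Hypothetical stand-in for illustration only; this is NOT the Parquet class.
final class IntMinMaxSketch {
    private int min;
    private int max;
    private boolean hasNonNullValue = false;

    // Mirrors the idea of updateStats(int): the first value initializes
    // min and max, later values widen the range when needed.
    void updateStats(int value) {
        if (!hasNonNullValue) {
            min = value;
            max = value;
            hasNonNullValue = true;
        } else {
            if (value < min) min = value;
            if (value > max) max = value;
        }
    }

    boolean hasNonNullValue() { return hasNonNullValue; }
    int getMin() { return min; }
    int getMax() { return max; }

    public static void main(String[] args) {
        IntMinMaxSketch stats = new IntMinMaxSketch();
        for (int v : new int[] {7, -3, 12, 5}) {
            stats.updateStats(v);
        }
        System.out.println(stats.getMin() + " .. " + stats.getMax()); // prints "-3 .. 12"
    }
}
```

The sketch uses the natural signed order of int; a real implementation may need a different ordering depending on the column type.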
-
equals
public boolean equals(Object other)
Equality comparison method to compare two statistics objects.
-
hashCode
public int hashCode()
Hash code for the statistics object
-
mergeStatistics
public void mergeStatistics(Statistics stats)
Method to merge this statistics object with the object passed as parameter. Merging keeps the smallest of the min values and the largest of the max values, and adds up the null counts.
- Parameters:
stats - Statistics object to merge with
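
The merge rule described above (smallest min, largest max, summed null counts) can be sketched as follows. MergeSketch and its fields are hypothetical names invented for this example, and the handling of an operand without non-null values is an assumption of the sketch rather than documented behavior.

```java
// Hypothetical stand-in for illustration only; this is NOT the Parquet class.
final class MergeSketch {
    long min;
    long max;
    long numNulls;
    boolean hasNonNullValue;

    // Statistics over at least one non-null value.
    static MergeSketch of(long min, long max, long numNulls) {
        MergeSketch s = new MergeSketch();
        s.min = min;
        s.max = max;
        s.numNulls = numNulls;
        s.hasNonNullValue = true;
        return s;
    }

    // Statistics over nulls only: min and max are meaningless here.
    static MergeSketch onlyNulls(long numNulls) {
        MergeSketch s = new MergeSketch();
        s.numNulls = numNulls;
        return s;
    }

    // Keeps the smallest min and the largest max, and adds up the null counts.
    void merge(MergeSketch other) {
        numNulls += other.numNulls;
        if (!other.hasNonNullValue) {
            return; // assumption: nothing to merge into min/max
        }
        if (!hasNonNullValue) {
            min = other.min;
            max = other.max;
            hasNonNullValue = true;
        } else {
            min = Math.min(min, other.min);
            max = Math.max(max, other.max);
        }
    }

    public static void main(String[] args) {
        MergeSketch a = MergeSketch.of(1, 5, 2);
        a.merge(MergeSketch.of(-4, 3, 1));
        System.out.println(a.min + " " + a.max + " " + a.numNulls); // prints "-4 5 3"
    }
}
```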
-
mergeStatisticsMinMax
protected abstract void mergeStatisticsMinMax(Statistics stats)
Abstract method to merge this statistics min and max with the values of the parameter object. Does not do any checks, only called internally.
- Parameters:
stats - Statistics object to merge with
-
setMinMaxFromBytes
@Deprecated public abstract void setMinMaxFromBytes(byte[] minBytes, byte[] maxBytes)
Deprecated. Will be removed in 2.0.0. Use getBuilderForReading(PrimitiveType) instead.
Abstract method to set min and max values from byte arrays.
- Parameters:
minBytes - byte array to set the min value to
maxBytes - byte array to set the max value to
-
genericGetMin
public abstract T genericGetMin()
Returns the min value in the statistics. The Java natural order of the returned type defined by Comparable.compareTo(Object) might not be the proper one. For example, UINT_32 requires unsigned comparison instead of the natural signed one. Use compareMinToValue(Comparable) or the comparator returned by comparator() to always get the proper ordering.
- Returns:
- the min value
-
genericGetMax
public abstract T genericGetMax()
Returns the max value in the statistics. The Java natural order of the returned type defined by Comparable.compareTo(Object) might not be the proper one. For example, UINT_32 requires unsigned comparison instead of the natural signed one. Use compareMaxToValue(Comparable) or the comparator returned by comparator() to always get the proper ordering.
- Returns:
- the max value
-
comparator
public final PrimitiveComparator<T> comparator()
Returns the PrimitiveComparator implementation to be used to compare two generic values in the proper way (for example, unsigned comparison for UINT_32).
- Returns:
- the comparator for data described by this Statistics instance
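
To see why the natural order can be the wrong one, consider a 32-bit value stored in a Java int but meant to be read as unsigned. The sketch below uses only the JDK; UNSIGNED_INT32 is a hypothetical comparator defined for this example and is not the PrimitiveComparator that comparator() returns.

```java
import java.util.Comparator;

// Illustration only: contrasts signed and unsigned ordering of the same bits.
final class UnsignedOrderSketch {
    // Hypothetical comparator that orders ints as unsigned 32-bit values.
    static final Comparator<Integer> UNSIGNED_INT32 = Integer::compareUnsigned;

    public static void main(String[] args) {
        Integer a = -1; // bit pattern 0xFFFFFFFF, i.e. 4294967295 when read as unsigned
        Integer b = 1;

        // Natural (signed) order: -1 sorts before 1.
        System.out.println(a.compareTo(b) < 0);                // prints "true"

        // Unsigned order: 4294967295 sorts after 1.
        System.out.println(UNSIGNED_INT32.compare(a, b) > 0);  // prints "true"
    }
}
```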
-
compareMinToValue
public final int compareMinToValue(T value)
Compares min to the specified value in the proper way. It does the same as invoking comparator().compare(genericGetMin(), value). The corresponding statistics implementations overload this method so the one with the primitive argument shall be used to avoid boxing/unboxing.
- Parameters:
value - the value which min is to be compared to
- Returns:
- a negative integer, zero, or a positive integer as min is less than, equal to, or greater than value.
-
compareMaxToValue
public final int compareMaxToValue(T value)
Compares max to the specified value in the proper way. It does the same as invoking comparator().compare(genericGetMax(), value). The corresponding statistics implementations overload this method so the one with the primitive argument shall be used to avoid boxing/unboxing.
- Parameters:
value - the value which max is to be compared to
- Returns:
- a negative integer, zero, or a positive integer as max is less than, equal to, or greater than value.
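
One possible use of the two compare methods is a range check: a value can only occur in the described data if min <= value <= max under the proper ordering. The following is a sketch under stated assumptions; RangeCheckSketch and mayContain are hypothetical names, and a plain java.util.Comparator stands in for the comparator returned by comparator().

```java
import java.util.Comparator;

// Hypothetical stand-in for illustration only; this is NOT the Parquet class.
final class RangeCheckSketch<T> {
    private final T min;
    private final T max;
    private final Comparator<T> comparator;

    RangeCheckSketch(T min, T max, Comparator<T> comparator) {
        this.min = min;
        this.max = max;
        this.comparator = comparator;
    }

    // Same contract as described above: comparator.compare(min, value).
    int compareMinToValue(T value) {
        return comparator.compare(min, value);
    }

    // Same contract as described above: comparator.compare(max, value).
    int compareMaxToValue(T value) {
        return comparator.compare(max, value);
    }

    // True when min <= value <= max under the supplied ordering.
    boolean mayContain(T value) {
        return compareMinToValue(value) <= 0 && compareMaxToValue(value) >= 0;
    }

    public static void main(String[] args) {
        // Unsigned range [1, 4294967295]; the max is stored as the int -1.
        RangeCheckSketch<Integer> stats =
                new RangeCheckSketch<>(1, -1, Integer::compareUnsigned);
        System.out.println(stats.mayContain(100)); // prints "true"
        System.out.println(stats.mayContain(0));   // prints "false"
    }
}
```

With the natural signed order the same min and max would describe an empty range (1 > -1), which is why the comparison goes through the comparator.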
-
getMaxBytes
public abstract byte[] getMaxBytes()
Abstract method to return the max value as a byte array.
- Returns:
- byte array corresponding to the max value
-
getMinBytes
public abstract byte[] getMinBytes()
Abstract method to return the min value as a byte array.
- Returns:
- byte array corresponding to the min value
-
minAsString
public String minAsString()
Returns the string representation of min for debugging/logging purposes.
- Returns:
- the min value as a string
-
maxAsString
public String maxAsString()
Returns the string representation of max for debugging/logging purposes.
- Returns:
- the max value as a string
-
isSmallerThan
public abstract boolean isSmallerThan(long size)
Abstract method to return whether the min and max values fit in the given size.
- Parameters:
size - a size in bytes
- Returns:
- true iff the min and max values are less than size bytes
-
incrementNumNulls
public void incrementNumNulls()
Increments the null count by one.
-
incrementNumNulls
public void incrementNumNulls(long increment)
Increments the null count by the parameter value.
- Parameters:
increment - value to increment the null count by
-
getNumNulls
public long getNumNulls()
Returns the null count.
- Returns:
- null count or -1 if the null count is not set
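
Because -1 is a sentinel rather than a real count, callers may want to convert the result before doing arithmetic on it. The helper below is a hypothetical caller-side sketch (NullCountSketch and toOptional are names invented for this example); it relies only on the documented convention that a negative result means "not set".

```java
import java.util.OptionalLong;

// Hypothetical caller-side helper for illustration only.
final class NullCountSketch {
    // Maps the documented sentinel (-1, treated here as any negative value)
    // to an empty OptionalLong so that it is never summed by accident.
    static OptionalLong toOptional(long numNulls) {
        return numNulls < 0 ? OptionalLong.empty() : OptionalLong.of(numNulls);
    }

    public static void main(String[] args) {
        System.out.println(toOptional(-1).isPresent()); // prints "false"
        System.out.println(toOptional(3).getAsLong());  // prints "3"
    }
}
```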
-
setNumNulls
@Deprecated public void setNumNulls(long nulls)
Deprecated. Will be removed in 2.0.0. Use getBuilderForReading(PrimitiveType) instead.
Sets the number of nulls to the parameter value.
- Parameters:
nulls - null count to set the count to
-
isEmpty
public boolean isEmpty()
Returns a boolean specifying whether the Statistics object is empty, i.e. does not contain valid statistics for the page/column yet.
- Returns:
- true if object is empty, false otherwise
-
hasNonNullValue
public boolean hasNonNullValue()
Returns whether there have been non-null values added to this statistics.
- Returns:
- true if the values contained at least one non-null value
-
isNumNullsSet
public boolean isNumNullsSet()
- Returns:
- whether numNulls is set and can be used
-
markAsNotEmpty
protected void markAsNotEmpty()
Sets the page/column as having a valid non-null value (the method name is something of a misnomer).
-
copy
public abstract Statistics<T> copy()
- Returns:
- a new independent statistics instance of this class.
-
type
public PrimitiveType type()
- Returns:
- the primitive type object which this statistics is created for
-
-