Class CorruptStatistics


  • public class CorruptStatistics
    extends Object
    A bug (PARQUET-251) caused the statistics metadata for binary columns to be corrupted in the write path. This class detects whether a file was written by a version affected by that bug, in which case its statistics should not be trusted and should be ignored.
    • Constructor Detail

      • CorruptStatistics

        public CorruptStatistics()
    • Method Detail

      • shouldIgnoreStatistics

        public static boolean shouldIgnoreStatistics(String createdBy,
                                                     PrimitiveType.PrimitiveTypeName columnType)
        Decides whether the statistics from a file written by createdBy (the created_by field of the Parquet file metadata) should be ignored because they are potentially corrupt.
        Parameters:
        createdBy - the created_by string read from the file footer
        columnType - the primitive type of the column whose statistics are being checked
        Returns:
        true if the statistics may be invalid and should be ignored, false otherwise
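
        The decision described above can be illustrated with a simplified, self-contained stand-in. This is a sketch under stated assumptions, not the library's implementation: the class name CorruptStatisticsSketch, the created_by pattern, the use of a plain String for the column type, and the cut-off version 1.8.0 for the PARQUET-251 fix are all assumptions made for illustration. It treats an unparseable or missing created_by string as untrustworthy, which is one conservative choice among several.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CorruptStatisticsSketch {
    // Assumed shape of a created_by string, e.g. "parquet-mr version 1.6.0 (build abcd)".
    private static final Pattern CREATED_BY =
        Pattern.compile("^(.+?) version (\\d{1,9})\\.(\\d{1,9})\\.(\\d{1,9})");

    // Assumption: the writer version that first contains the PARQUET-251 fix.
    private static final int[] FIXED_IN = {1, 8, 0};

    /** Returns true if statistics for this column should not be trusted. */
    static boolean shouldIgnore(String createdBy, String columnType) {
        // Per the class description, only binary columns were affected.
        if (!"BINARY".equals(columnType)) {
            return false;
        }
        // No writer information: be conservative and ignore the statistics.
        if (createdBy == null || createdBy.isEmpty()) {
            return true;
        }
        Matcher m = CREATED_BY.matcher(createdBy);
        if (!m.find()) {
            return true; // unparseable writer string, so do not trust the statistics
        }
        // Assumption: only files written by "parquet-mr" carry this bug.
        if (!"parquet-mr".equals(m.group(1))) {
            return false;
        }
        // Compare major, minor, patch against the assumed fix version.
        for (int i = 0; i < 3; i++) {
            int part = Integer.parseInt(m.group(i + 2));
            if (part != FIXED_IN[i]) {
                return part < FIXED_IN[i];
            }
        }
        return false; // exactly the fix version: statistics are trusted
    }

    public static void main(String[] args) {
        System.out.println(shouldIgnore("parquet-mr version 1.6.0 (build abcd)", "BINARY"));
        System.out.println(shouldIgnore("parquet-mr version 1.8.0 (build abcd)", "BINARY"));
        System.out.println(shouldIgnore("parquet-mr version 1.6.0 (build abcd)", "INT32"));
    }
}
```

        With the real class, a reader would typically make the equivalent call, shouldIgnoreStatistics(createdBy, columnType), once per column before using that column's min/max statistics for filtering.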