A B C D E F G H I L M N O P R S T U V W 

A

ACTIVE_SUFFIX - Static variable in class org.jwat.warc.WarcFileWriter
Suffix used for open files.
addEmptyWarning(String) - Method in class org.jwat.warc.WarcFieldParsers
Add a warning diagnosis on the given entity stating that it is empty.
addErrorDiagnosis(DiagnosisType, String, String...) - Method in class org.jwat.warc.WarcHeader
Add an error diagnosis of the given type on a specific entity with optional extra information.
addErrorDiagnosis(DiagnosisType, String, String...) - Method in class org.jwat.warc.WarcRecord
Add an error diagnosis of the given type on a specific entity with optional extra information.
addHeader(HeaderLine) - Method in class org.jwat.warc.WarcHeader
Identify a (WARC) header name, validate the value and set the header.
addHeader(String, String) - Method in class org.jwat.warc.WarcHeader
Add a String header using the supplied string and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, Integer, String) - Method in class org.jwat.warc.WarcHeader
Add an Integer header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, Long, String) - Method in class org.jwat.warc.WarcHeader
Add a Long header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, WarcDigest, String) - Method in class org.jwat.warc.WarcHeader
Add a Digest header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, ContentType, String) - Method in class org.jwat.warc.WarcHeader
Add a Content-Type header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, Date, String) - Method in class org.jwat.warc.WarcHeader
Add a Date header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, InetAddress, String) - Method in class org.jwat.warc.WarcHeader
Add an InetAddress header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, Uri, String) - Method in class org.jwat.warc.WarcHeader
Add a URI header using the supplied string and object values and return a HeaderLine object corresponding to how the header would be read.
addHeader(String, String, int, Integer, Long, WarcDigest, ContentType, Date, InetAddress, Uri) - Method in class org.jwat.warc.WarcHeader
Add a header with the supplied field name, data type and value and return a HeaderLine corresponding to how the header will be read.
addInvalidExpectedError(String, String...) - Method in class org.jwat.warc.WarcFieldParsers
Add an error diagnosis on the given entity stating that it is invalid and something else was expected.
addWarningDiagnosis(DiagnosisType, String, String...) - Method in class org.jwat.warc.WarcHeader
Add a warning diagnosis of the given type on a specific entity with optional extra information.

B

bBlockDigest - Variable in class org.jwat.warc.WarcReader
Block Digest enabled/disabled.
bClosed - Variable in class org.jwat.warc.WarcRecord
Has record been closed before.
bCompression - Variable in class org.jwat.warc.WarcFileWriterConfig
Compress archive(s).
bExceptionOnContentLengthMismatch - Variable in class org.jwat.warc.WarcWriter
Configuration for throwing exception on content-length mismatch.
bIsCompliant - Variable in class org.jwat.warc.WarcReader
Compliance status for records parsed up to now.
bIsCompliant - Variable in class org.jwat.warc.WarcRecord
Is this record compliant ie.
blockDigestAlgorithm - Variable in class org.jwat.warc.WarcReader
Default block digest algorithm to use if none is present in the record.
blockDigestEncoding - Variable in class org.jwat.warc.WarcReader
Default encoding scheme used to encode block digest into a string, if none is detected from the record.
bMagicIdentified - Variable in class org.jwat.warc.WarcHeader
Was "WARC/" identified while looking for the version string.
bMandatoryMissing - Variable in class org.jwat.warc.WarcHeader
Is the header missing one of the mandatory headers.
bOverwrite - Variable in class org.jwat.warc.WarcFileWriterConfig
Overwrite existing file(s).
bPayloadClosed - Variable in class org.jwat.warc.WarcRecord
Has payload been closed before.
bPayloadDigest - Variable in class org.jwat.warc.WarcReader
Payload Digest enabled/disabled.
bufferSize - Variable in class org.jwat.warc.WarcReaderCompressed
Buffer size, if any, to use on GZip entry InputStream.
bValidVersion - Variable in class org.jwat.warc.WarcHeader
Is the version recognized.
bValidVersionFormat - Variable in class org.jwat.warc.WarcHeader
Is the version format valid.
bVersionParsed - Variable in class org.jwat.warc.WarcHeader
Did the version string include between 2 and 4 substrings delimited by ".".
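
The version-related flags above (bMagicIdentified, bValidVersionFormat, bVersionParsed) suggest a check along the following lines. This is a minimal standalone sketch, not JWAT's implementation: the class and method names are hypothetical, and the only rules taken from the index are the "WARC/" magic and the 2 to 4 substrings delimited by ".".

```java
final class WarcVersionSketch {
    // Magic prefix named in the bMagicIdentified description.
    static final String MAGIC = "WARC/";

    /**
     * Returns the numeric version parts, or null if the line does not start
     * with the magic or does not contain 2 to 4 non-negative integers
     * separated by ".". (Hypothetical helper, for illustration only.)
     */
    static int[] parseVersion(String line) {
        if (line == null || !line.startsWith(MAGIC)) {
            return null; // magic not identified
        }
        // The -1 limit keeps trailing empty parts, so "1.0." is rejected.
        String[] parts = line.substring(MAGIC.length()).split("\\.", -1);
        if (parts.length < 2 || parts.length > 4) {
            return null; // wrong number of substrings
        }
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                numbers[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                return null; // not a number
            }
            if (numbers[i] < 0) {
                return null;
            }
        }
        return numbers;
    }
}
```

With this sketch, "WARC/1.0" yields the parts 1 and 0, while "WARC/1" and "HTTP/1.1" are rejected.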

C

checkFieldPolicy(int, int, Object, String) - Method in class org.jwat.warc.WarcHeader
Given a WARC record type and a WARC field, looks up the policy in a matrix built from the WARC ISO standard.
checkFields() - Method in class org.jwat.warc.WarcHeader
Validate the WARC header relative to the WARC-Type and according to the WARC ISO standard.
close() - Method in class org.jwat.warc.WarcFileWriter
Close writer and release all resources.
close() - Method in class org.jwat.warc.WarcReader
Close current record resource(s) and input stream(s).
close() - Method in class org.jwat.warc.WarcReaderCompressed
 
close() - Method in class org.jwat.warc.WarcReaderUncompressed
 
close() - Method in class org.jwat.warc.WarcRecord
Close resources associated with the WARC record.
close() - Method in class org.jwat.warc.WarcWriter
Close WARC writer and free its resources.
close() - Method in class org.jwat.warc.WarcWriterCompressed
 
close() - Method in class org.jwat.warc.WarcWriterUncompressed
 
closeRecord() - Method in class org.jwat.warc.WarcWriter
Close the WARC record in an implementation-specific way.
closeRecord() - Method in class org.jwat.warc.WarcWriterCompressed
 
closeRecord() - Method in class org.jwat.warc.WarcWriterUncompressed
 
closeRecord_impl() - Method in class org.jwat.warc.WarcWriter
Closes the WARC record by writing two newlines and comparing the amount of payload data streamed with the content-length supplied with the header.
computedBlockDigest - Variable in class org.jwat.warc.WarcRecord
Computed block digest.
computedPayloadDigest - Variable in class org.jwat.warc.WarcRecord
Computed payload digest.
consumed - Variable in class org.jwat.warc.WarcReader
Number of bytes consumed by this reader.
consumed - Variable in class org.jwat.warc.WarcRecord
Uncompressed bytes consumed while validating this record.
CONTENT_TYPE_FORMAT - Static variable in class org.jwat.warc.WarcConstants
Content-type format string as specified in RFC2616.
CONTENT_TYPE_METADATA - Static variable in class org.jwat.warc.WarcConstants
Suggested content-type for metadata records and others.
contentLength - Variable in class org.jwat.warc.WarcHeader
Content-Length converted to a Long object, if valid.
contentLengthStr - Variable in class org.jwat.warc.WarcHeader
Content-Length field string value.
contentType - Variable in class org.jwat.warc.WarcHeader
Content-Type converted to a ContentType object, if valid.
contentTypeStr - Variable in class org.jwat.warc.WarcHeader
Content-Type field string value.
createRecord(WarcWriter) - Static method in class org.jwat.warc.WarcRecord
Create a WarcRecord and prepare it for writing.
createWarcDigest(String, byte[], String, String) - Static method in class org.jwat.warc.WarcDigest
Create an object with the supplied parameters.
CT_APP_WARC_FIELDS - Static variable in class org.jwat.warc.WarcConstants
Suggested content-type/media-type for metadata records and others.
currentEntry - Variable in class org.jwat.warc.WarcReaderCompressed
GZip entry for the current record, if random access methods used.
currentReader - Variable in class org.jwat.warc.WarcReaderCompressed
GZip reader used for the current record, if random access methods used.
currentRecord - Variable in class org.jwat.warc.WarcReader
Current WARC record object.
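
The checkFieldPolicy and checkFields entries above describe a lookup in a (record type x field) policy matrix. The sketch below shows one way such a lookup could work. It is an assumption-laden illustration: the POLICY_* names appear elsewhere in this index, but their numeric values, the record type and field ids, the matrix contents, and the returned messages are all hypothetical and should be checked against the standard before use.

```java
final class FieldPolicySketch {
    // Names taken from the index; the numeric values are hypothetical.
    static final int POLICY_IGNORE = 0;
    static final int POLICY_MANDATORY = 1;
    static final int POLICY_SHALL = 2;
    static final int POLICY_SHALL_NOT = 3;
    static final int POLICY_MAY = 4;
    static final int POLICY_MAY_NOT = 5;

    // Hypothetical ids for two record types and two fields.
    static final int RT_WARCINFO = 0;
    static final int RT_RESPONSE = 1;
    static final int F_TARGET_URI = 0;
    static final int F_FILENAME = 1;

    // Illustrative fragment of a policy matrix: rows are record types,
    // columns are fields. Not copied from the library or the standard.
    static final int[][] FIELD_POLICY = {
        /* warcinfo */ { POLICY_MAY_NOT,   POLICY_MAY },
        /* response */ { POLICY_MANDATORY, POLICY_SHALL_NOT },
    };

    /** Returns a diagnosis message, or null if the field satisfies the policy. */
    static String check(int recordType, int field, Object value) {
        switch (FIELD_POLICY[recordType][field]) {
            case POLICY_MANDATORY:
            case POLICY_SHALL:
                return value == null ? "error: required field is missing" : null;
            case POLICY_SHALL_NOT:
                return value != null ? "error: field must not be present" : null;
            case POLICY_MAY_NOT:
                return value != null ? "warning: field should not be present" : null;
            default:
                return null; // POLICY_MAY and POLICY_IGNORE accept anything
        }
    }
}
```

A table-driven check like this keeps the per-type rules in one place, so validating a header reduces to looping over the fields for the record's type.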

D

date - Variable in class org.jwat.warc.WarcFileNamingDefault
Date component.
dateFormat - Variable in class org.jwat.warc.WarcFileNamingDefault
DateFormat using the format 'yyyyMMddHHmmss'.
dateStr - Variable in class org.jwat.warc.WarcFileNamingDefault
Date component converted into a human-readable string.
DEFAULT_MAX_FILE_SIZE - Static variable in class org.jwat.warc.WarcFileWriterConfig
Standard/default max file size.
diagnostics - Variable in class org.jwat.warc.WarcFieldParsers
Diagnostics used to report diagnoses.
diagnostics - Variable in class org.jwat.warc.WarcHeader
Diagnostics used to report diagnoses.
diagnostics - Variable in class org.jwat.warc.WarcReader
Reader level errors and warnings or when no record is available.
diagnostics - Variable in class org.jwat.warc.WarcRecord
Validation errors and warnings.
diagnostics - Variable in class org.jwat.warc.WarcWriter
Writer level errors and warnings or when writing byte headers.
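
The date, dateFormat and dateStr entries above, together with the prefix, host name and extension components listed elsewhere in this index, hint at how a default file name might be assembled. The following sketch is only a guess at such a scheme: the 'yyyyMMddHHmmss' pattern comes from the index, while the order of the components, the "-" separators, the five-digit sequence number, the ".gz" suffix and the UTC time zone are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class FileNamingSketch {
    /** Formats the date component using the pattern named in the index. */
    static String dateStr(Date date) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: UTC
        return fmt.format(date);
    }

    /**
     * Builds a file name from its components. The layout is hypothetical;
     * the extension is expected to include the leading ".".
     */
    static String filename(String prefix, Date date, int sequenceNr,
                           String hostname, String extension, boolean compressed) {
        String name = prefix + "-" + dateStr(date) + "-"
                + String.format(Locale.ROOT, "%05d", sequenceNr)
                + "-" + hostname + extension;
        return compressed ? name + ".gz" : name;
    }
}
```

For example, a prefix of "crawl", the epoch date, sequence number 1, host "host1" and extension ".warc" would give "crawl-19700101000000-00001-host1.warc" under these assumptions.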

E

endMark - Static variable in class org.jwat.warc.WarcConstants
End mark used after each record consisting of two newlines.
entry - Variable in class org.jwat.warc.WarcWriterCompressed
Current GZip entry.
equals(Object) - Method in class org.jwat.warc.WarcConcurrentTo
 
errors - Variable in class org.jwat.warc.WarcReader
Aggregated number of errors encountered while parsing.
exceptionOnContentLengthMismatch() - Method in class org.jwat.warc.WarcWriter
Does this writer throw an exception if the content-length does not match the payload amount written.
extension - Variable in class org.jwat.warc.WarcFileNamingDefault
Extension component (including leading ".").
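
The endMark and exceptionOnContentLengthMismatch() entries above, together with closeRecord_impl(), describe closing a record by writing two newlines and comparing the payload bytes written with the declared content-length. A minimal standalone sketch of that idea follows. It does not use JWAT: the class and method names are hypothetical, and the end mark is assumed to be two CRLF pairs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class RecordCloseSketch {
    // Assumption: the two newlines of the end mark are CRLF pairs.
    static final byte[] END_MARK = "\r\n\r\n".getBytes(StandardCharsets.US_ASCII);

    private final OutputStream out;
    private final long declaredLength; // content-length from the header
    private long payloadWritten;       // bytes streamed so far

    RecordCloseSketch(OutputStream out, long declaredLength) {
        this.out = out;
        this.declaredLength = declaredLength;
    }

    /** Streams payload bytes and keeps a running total. */
    void writePayload(byte[] data) {
        try {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        payloadWritten += data.length;
    }

    /**
     * Writes the end mark and reports whether the payload length matched
     * the declared content-length. A caller could throw on a false result.
     */
    boolean closeRecord() {
        try {
            out.write(END_MARK);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return payloadWritten == declaredLength;
    }
}
```

Returning a boolean leaves the decision to throw with the caller, which mirrors the idea of a configurable exception on mismatch.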

F

FDT_CONTENTTYPE - Static variable in class org.jwat.warc.WarcConstants
WARC ContentType field datatype identifier.
FDT_DATE - Static variable in class org.jwat.warc.WarcConstants
WARC Date field datatype identifier.
FDT_DIGEST - Static variable in class org.jwat.warc.WarcConstants
WARC Digest field datatype identifier.
FDT_IDX_STRINGS - Static variable in class org.jwat.warc.WarcConstants
WARC field datatype id to field datatype name mapping table.
FDT_INETADDRESS - Static variable in class org.jwat.warc.WarcConstants
WARC InetAddress field datatype identifier.
FDT_INTEGER - Static variable in class org.jwat.warc.WarcConstants
WARC Integer field datatype identifier.
FDT_LONG - Static variable in class org.jwat.warc.WarcConstants
WARC Long field datatype identifier.
FDT_STRING - Static variable in class org.jwat.warc.WarcConstants
WARC String field datatype identifier.
FDT_URI - Static variable in class org.jwat.warc.WarcConstants
WARC URI field datatype identifier.
field_policy - Static variable in class org.jwat.warc.WarcConstants
A (Warc-Types x Warc-Header-Fields) matrix used for policy validation.
fieldNameIdxMap - Static variable in class org.jwat.warc.WarcConstants
Map used to identify known warc field names.
fieldNamesRepeatableLookup - Static variable in class org.jwat.warc.WarcConstants
Lookup table of Warc fields that can have multiple occurrences.
fieldParsers - Variable in class org.jwat.warc.WarcHeader
WARC field parser used.
fieldParsers - Variable in class org.jwat.warc.WarcReader
WARC field parser used.
fieldParsers - Variable in class org.jwat.warc.WarcWriter
WARC field parser used.
filename - Variable in class org.jwat.warc.WarcFileNamingSingleFile
File name to use.
filePrefix - Variable in class org.jwat.warc.WarcFileNamingDefault
Prefix component.
FN_CONTENT_LENGTH - Static variable in class org.jwat.warc.WarcConstants
Content-length field name.
FN_CONTENT_TYPE - Static variable in class org.jwat.warc.WarcConstants
Content-type field name.
FN_IDX_CONTENT_LENGTH - Static variable in class org.jwat.warc.WarcConstants
Warc reader content-length field name id.
FN_IDX_CONTENT_TYPE - Static variable in class org.jwat.warc.WarcConstants
Warc reader content-type field name id.
FN_IDX_DT - Static variable in class org.jwat.warc.WarcConstants
Array to lookup WARC field datatypes.
FN_IDX_STRINGS - Static variable in class org.jwat.warc.WarcConstants
WARC field name id to field name mapping table.
FN_IDX_WARC_BLOCK_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-block-digest field name id.
FN_IDX_WARC_CONCURRENT_TO - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-concurrent-to field name id.
FN_IDX_WARC_DATE - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-date field name id.
FN_IDX_WARC_FILENAME - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-filename field name id.
FN_IDX_WARC_IDENTIFIED_PAYLOAD_TYPE - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-identified-payload-type field name id.
FN_IDX_WARC_IP_ADDRESS - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-ip-address field name id.
FN_IDX_WARC_PAYLOAD_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-payload-digest field name id.
FN_IDX_WARC_PROFILE - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-profile field name id.
FN_IDX_WARC_RECORD_ID - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-record-id field name id.
FN_IDX_WARC_REFERS_TO - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-refers-to field name id.
FN_IDX_WARC_REFERS_TO_DATE - Static variable in class org.jwat.warc.WarcConstants
WARC-Refers-To-Date field name id.
FN_IDX_WARC_REFERS_TO_TARGET_URI - Static variable in class org.jwat.warc.WarcConstants
WARC-Refers-To-Target-URI field name id.
FN_IDX_WARC_SEGMENT_NUMBER - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-segment-number field name id.
FN_IDX_WARC_SEGMENT_ORIGIN_ID - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-segment-origin-id field name id.
FN_IDX_WARC_SEGMENT_TOTAL_LENGTH - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-segment-total-length field name id.
FN_IDX_WARC_TARGET_URI - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-target-uri field name id.
FN_IDX_WARC_TRUNCATED - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-truncated field name id.
FN_IDX_WARC_TYPE - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-type field name id.
FN_IDX_WARC_WARCINFO_ID - Static variable in class org.jwat.warc.WarcConstants
Warc reader warc-warcinfo-id field name id.
FN_INDEX_OF_LAST - Static variable in class org.jwat.warc.WarcConstants
Index of last WARC field (zero-indexed).
FN_NUMBER - Static variable in class org.jwat.warc.WarcConstants
Number of WARC fields.
FN_WARC_BLOCK_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Warc-block-digest field name.
FN_WARC_CONCURRENT_TO - Static variable in class org.jwat.warc.WarcConstants
Warc-concurrent-to field name.
FN_WARC_DATE - Static variable in class org.jwat.warc.WarcConstants
Warc-date field name.
FN_WARC_FILENAME - Static variable in class org.jwat.warc.WarcConstants
Warc-filename field name.
FN_WARC_IDENTIFIED_PAYLOAD_TYPE - Static variable in class org.jwat.warc.WarcConstants
Warc-identified-payload-type field name.
FN_WARC_IP_ADDRESS - Static variable in class org.jwat.warc.WarcConstants
Warc-ip-address field name.
FN_WARC_PAYLOAD_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Warc-payload-digest field name.
FN_WARC_PROFILE - Static variable in class org.jwat.warc.WarcConstants
Warc-profile field name.
FN_WARC_RECORD_ID - Static variable in class org.jwat.warc.WarcConstants
Warc-record-id field name.
FN_WARC_REFERS_TO - Static variable in class org.jwat.warc.WarcConstants
Warc-refers-to field name.
FN_WARC_REFERS_TO_DATE - Static variable in class org.jwat.warc.WarcConstants
WARC-Refers-To-Date field name.
FN_WARC_REFERS_TO_TARGET_URI - Static variable in class org.jwat.warc.WarcConstants
WARC-Refers-To-Target-URI field name.
FN_WARC_SEGMENT_NUMBER - Static variable in class org.jwat.warc.WarcConstants
Warc-segment-number field name.
FN_WARC_SEGMENT_ORIGIN_ID - Static variable in class org.jwat.warc.WarcConstants
Warc-segment-origin-id field name.
FN_WARC_SEGMENT_TOTAL_LENGTH - Static variable in class org.jwat.warc.WarcConstants
Warc-segment-total-length field name.
FN_WARC_TARGET_URI - Static variable in class org.jwat.warc.WarcConstants
Warc-target-uri field name.
FN_WARC_TRUNCATED - Static variable in class org.jwat.warc.WarcConstants
Warc-truncated field name.
FN_WARC_TYPE - Static variable in class org.jwat.warc.WarcConstants
Warc-type field name.
FN_WARC_WARCINFO_ID - Static variable in class org.jwat.warc.WarcConstants
Warc-warcinfo-id field name.
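
The FN_* names, FN_IDX_* ids, FN_IDX_STRINGS and fieldNameIdxMap entries above describe a two-way mapping between field names and numeric ids. The sketch below shows one possible shape for such a mapping. The ids, the subset of fields, the use of 0 for unknown names and the case-insensitive lookup are assumptions for illustration, not values taken from WarcConstants.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class FieldNameLookupSketch {
    // Id-to-name table; index 0 is reserved for "unknown" (hypothetical).
    static final String[] FN_IDX_STRINGS = {
        null, "WARC-Type", "WARC-Record-ID", "WARC-Date", "Content-Length"
    };

    // Name-to-id map keyed by the lower-cased field name.
    static final Map<String, Integer> FIELD_NAME_IDX_MAP = new HashMap<>();

    static {
        for (int i = 1; i < FN_IDX_STRINGS.length; i++) {
            FIELD_NAME_IDX_MAP.put(FN_IDX_STRINGS[i].toLowerCase(Locale.ROOT), i);
        }
    }

    /** Returns the id of a known field name, or 0 for non-standard headers. */
    static int lookup(String name) {
        Integer idx = FIELD_NAME_IDX_MAP.get(name.toLowerCase(Locale.ROOT));
        return idx == null ? 0 : idx;
    }
}
```

A lookup that returns a sentinel for unknown names makes it easy to route non-standard headers to a separate list, as the getHeader and getHeaderList entries on WarcRecord suggest.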

G

getBlockDigestAlgorithm() - Method in class org.jwat.warc.WarcReader
Get the default block digest algorithm.
getBlockDigestEnabled() - Method in class org.jwat.warc.WarcReader
Get the reader's block digest on/off status.
getBlockDigestEncoding() - Method in class org.jwat.warc.WarcReader
Get the default block digest encoding scheme.
getConsumed() - Method in class org.jwat.warc.WarcReader
Get number of bytes consumed by this reader.
getConsumed() - Method in class org.jwat.warc.WarcReaderCompressed
Get number of bytes consumed by the WARC GzipReader.
getConsumed() - Method in class org.jwat.warc.WarcReaderUncompressed
 
getConsumed() - Method in class org.jwat.warc.WarcRecord
Return number of uncompressed bytes consumed validating this record.
getDate(String) - Static method in class org.jwat.warc.WarcDateParser
Parses the date using the format "yyyy-MM-ddTHH:mm:ssZ".
getDateFormat() - Static method in class org.jwat.warc.WarcDateParser
Return a DateFormat object which can be used to string format WARC dates.
getFile() - Method in class org.jwat.warc.WarcFileWriter
Returns the current WARC file object.
getFilename(int, boolean) - Method in interface org.jwat.warc.WarcFileNaming
Return the next file name to use.
getFilename(int, boolean) - Method in class org.jwat.warc.WarcFileNamingDefault
 
getFilename(int, boolean) - Method in class org.jwat.warc.WarcFileNamingSingleFile
 
getHeader(String) - Method in class org.jwat.warc.WarcHeader
Get a header line structure or null, if no header line structure is stored with the given header name.
getHeader(String) - Method in class org.jwat.warc.WarcRecord
Get a non-standard WARC header or null, if nothing is stored for this header name.
getHeaderList() - Method in class org.jwat.warc.WarcHeader
Get a List of all the headers found during parsing.
getHeaderList() - Method in class org.jwat.warc.WarcRecord
Get a List of all the non-standard WARC headers found during parsing.
getHttpHeader() - Method in class org.jwat.warc.WarcRecord
Returns the HttpHeader object if identified in the payload, or null.
getIteratorExceptionThrown() - Method in class org.jwat.warc.WarcReader
Gets an exception thrown in the iterator if any or null.
getNextRecord() - Method in class org.jwat.warc.WarcReader
Parses and gets the next record.
getNextRecord() - Method in class org.jwat.warc.WarcReaderCompressed
 
getNextRecord() - Method in class org.jwat.warc.WarcReaderUncompressed
 
getNextRecordFrom(InputStream, long) - Method in class org.jwat.warc.WarcReader
Parses and gets the next record from an InputStream.
getNextRecordFrom(InputStream, long, int) - Method in class org.jwat.warc.WarcReader
Parses and gets the next record from an InputStream wrapped by a BufferedInputStream.
getNextRecordFrom(InputStream, long) - Method in class org.jwat.warc.WarcReaderCompressed
 
getNextRecordFrom(InputStream, long, int) - Method in class org.jwat.warc.WarcReaderCompressed
 
getNextRecordFrom(InputStream, long) - Method in class org.jwat.warc.WarcReaderUncompressed
 
getNextRecordFrom(InputStream, long, int) - Method in class org.jwat.warc.WarcReaderUncompressed
 
getOffset() - Method in class org.jwat.warc.WarcReader
Get the current offset in the WARC InputStream.
getOffset() - Method in class org.jwat.warc.WarcReaderCompressed
Get the current offset in the WARC GzipReader.
getOffset() - Method in class org.jwat.warc.WarcReaderUncompressed
 
getPayload() - Method in class org.jwat.warc.WarcRecord
Return Payload object.
getPayloadContent() - Method in class org.jwat.warc.WarcRecord
Payload content InputStream getter.
getPayloadDigestAlgorithm() - Method in class org.jwat.warc.WarcReader
Get the default payload digest algorithm.
getPayloadDigestEnabled() - Method in class org.jwat.warc.WarcReader
Get the reader's payload digest on/off status.
getPayloadDigestEncoding() - Method in class org.jwat.warc.WarcReader
Get the default payload digest encoding scheme.
getPayloadHeaderMaxSize() - Method in class org.jwat.warc.WarcReader
Get the max size allowed for a payload header.
getReader(InputStream, int) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream wrapped by a BufferedInputStream.
getReader(InputStream) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream.
getReaderCompressed() - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader without any associated InputStream for random access to GZip compressed records.
getReaderCompressed(InputStream) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream primarily for random access to GZip compressed records.
getReaderCompressed(InputStream, int) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream wrapped by a BufferedInputStream primarily for random access to GZip compressed records.
getReaderUncompressed() - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader without any associated InputStream for random access to uncompressed records.
getReaderUncompressed(InputStream) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream primarily for random access to uncompressed records.
getReaderUncompressed(InputStream, int) - Static method in class org.jwat.warc.WarcReaderFactory
Creates a new WarcReader from an InputStream wrapped by a BufferedInputStream primarily for random access to uncompressed records.
getRecordHeaderMaxSize() - Method in class org.jwat.warc.WarcReader
Get the max size allowed for a record header.
getSequenceNr() - Method in class org.jwat.warc.WarcFileWriter
Returns the current sequence number.
getStartOffset() - Method in class org.jwat.warc.WarcHeader
Returns the starting offset of the record in the containing WARC.
getStartOffset() - Method in class org.jwat.warc.WarcReader
Get the offset of the current WARC record or -1 if none have been read.
getStartOffset() - Method in class org.jwat.warc.WarcReaderCompressed
Get the offset of the current WARC record from the GZip entry or -1 if no records have been read yet.
getStartOffset() - Method in class org.jwat.warc.WarcReaderUncompressed
 
getStartOffset() - Method in class org.jwat.warc.WarcRecord
Get the record offset relative to the start of the WARC file InputStream.
getUriProfile() - Method in class org.jwat.warc.WarcReader
Get the URI profile used to validate URIs.
getUriProfile() - Method in class org.jwat.warc.WarcWriter
Get the URI profile used to validate URIs.
getWarcTargetUriProfile() - Method in class org.jwat.warc.WarcReader
Get the URI profile used to validate WARC-Target URIs.
getWarcTargetUriProfile() - Method in class org.jwat.warc.WarcWriter
Get the URI profile used to validate WARC-Target URIs.
getWarcWriterInstance(WarcFileNaming, WarcFileWriterConfig) - Static method in class org.jwat.warc.WarcFileWriter
Returns a configured WARC file writer.
getWriter() - Method in class org.jwat.warc.WarcFileWriter
Returns the current WARC writer object.
getWriter(OutputStream, boolean) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new unbuffered WarcWriter from an OutputStream.
getWriter(OutputStream, int, boolean) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new buffered WarcWriter from an OutputStream.
getWriterCompressed(OutputStream) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new unbuffered compressing WarcWriter from an OutputStream.
getWriterCompressed(OutputStream, int) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new buffered compressing WarcWriter from an OutputStream.
getWriterUncompressed(OutputStream) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new unbuffered non-compressing WarcWriter from an OutputStream.
getWriterUncompressed(OutputStream, int) - Static method in class org.jwat.warc.WarcWriterFactory
Creates a new buffered non-compressing WarcWriter from an OutputStream.
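
The getDate(String) and getDateFormat() entries above refer to the format "yyyy-MM-ddTHH:mm:ssZ". A standalone sketch of parsing and formatting such dates with the JDK is shown below. It is not the WarcDateParser implementation: the class name, the strict (non-lenient) parsing, the UTC time zone and the null return on failure are assumptions.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class WarcDateSketch {
    /**
     * Creates a fresh format per call, since SimpleDateFormat is not
     * thread-safe. "T" and "Z" are quoted so they are matched literally.
     */
    static SimpleDateFormat newFormat() {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: "Z" means UTC
        fmt.setLenient(false); // reject dates such as February 30
        return fmt;
    }

    /** Returns the parsed date, or null if the text is not a valid date. */
    static Date parse(String text) {
        if (text == null) {
            return null;
        }
        ParsePosition pos = new ParsePosition(0);
        Date date = newFormat().parse(text, pos);
        // Require the whole string to be consumed, so trailing junk fails.
        return (date != null && pos.getIndex() == text.length()) ? date : null;
    }
}
```

Under these assumptions, "1970-01-01T00:00:00Z" parses to the epoch, and formatting the epoch with newFormat() gives the same string back.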

H

hashCode() - Method in class org.jwat.warc.WarcConcurrentTo
 
hasPayload() - Method in class org.jwat.warc.WarcRecord
Specifies whether this record has a payload or not.
header - Variable in class org.jwat.warc.WarcRecord
WARC header.
header - Variable in class org.jwat.warc.WarcWriter
Current WARC header written.
headerBytes - Variable in class org.jwat.warc.WarcHeader
Raw WARC header byte array.
headerBytesOut - Variable in class org.jwat.warc.WarcHeader
Raw WARC header output stream.
headerContentLength - Variable in class org.jwat.warc.WarcWriter
Content-Length from the WARC header.
headerLineReader - Variable in class org.jwat.warc.WarcReader
Header line reader used to read the WARC headers.
headerList - Variable in class org.jwat.warc.WarcHeader
List of parsed header fields.
headerMap - Variable in class org.jwat.warc.WarcHeader
Map of parsed header fields.
hostname - Variable in class org.jwat.warc.WarcFileNamingDefault
Host name component.
httpHeader - Variable in class org.jwat.warc.WarcRecord
HTTP header content parsed from payload.

I

in - Variable in class org.jwat.warc.WarcReaderUncompressed
WARC file InputStream.
in - Variable in class org.jwat.warc.WarcRecord
Input stream used to read this record.
init() - Method in class org.jwat.warc.WarcReader
Method used to initialize a reader's internal state.
init() - Method in class org.jwat.warc.WarcWriter
Method used to initialize a writer's internal state.
initHeader(WarcWriter, Diagnostics<Diagnosis>) - Static method in class org.jwat.warc.WarcHeader
Create and initialize a new WarcHeader for writing.
initHeader(WarcReader, long, Diagnostics<Diagnosis>) - Static method in class org.jwat.warc.WarcHeader
Create and initialize a new WarcHeader for reading.
isClosed() - Method in class org.jwat.warc.WarcRecord
Check to see if the record has been closed.
isCompliant() - Method in class org.jwat.warc.WarcReader
Returns a boolean indicating if all records parsed so far are compliant.
isCompliant() - Method in class org.jwat.warc.WarcRecord
Returns a boolean indicating the ISO compliance status of this record.
isCompressed() - Method in class org.jwat.warc.WarcReader
Is this reader assuming GZip compressed input.
isCompressed() - Method in class org.jwat.warc.WarcReaderCompressed
 
isCompressed() - Method in class org.jwat.warc.WarcReaderUncompressed
 
isCompressed() - Method in class org.jwat.warc.WarcWriter
Is this writer producing compressed output.
isCompressed() - Method in class org.jwat.warc.WarcWriterCompressed
 
isCompressed() - Method in class org.jwat.warc.WarcWriterUncompressed
 
isValidBlockDigest - Variable in class org.jwat.warc.WarcRecord
Is Warc-Block-Digest valid.
isValidPayloadDigest - Variable in class org.jwat.warc.WarcRecord
Is Warc-Payload-Digest valid.
isWarcFile(ByteCountingPushBackInputStream) - Static method in class org.jwat.warc.WarcReaderFactory
Check head of PushBackInputStream for a WARC file identifier.
isWarcRecord(ByteCountingPushBackInputStream) - Static method in class org.jwat.warc.WarcReaderFactory
Check head of PushBackInputStream for a WARC record identifier.
iterator() - Method in class org.jwat.warc.WarcReader
Returns an Iterator over the records as they are being parsed.
iteratorExceptionThrown - Variable in class org.jwat.warc.WarcReader
Exception thrown while using the iterator.
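
The isWarcFile and isWarcRecord entries above check the head of a push-back stream for a WARC identifier. The sketch below illustrates the general peek-and-unread technique with the JDK's java.io.PushbackInputStream instead of JWAT's ByteCountingPushBackInputStream. The class and method names are hypothetical, and the identifier is assumed to be the "WARC/" magic mentioned under bMagicIdentified.

```java
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class WarcMagicSketch {
    static final byte[] MAGIC = "WARC/".getBytes(StandardCharsets.US_ASCII);

    /**
     * Peeks at the head of the stream and pushes the bytes back, so the
     * caller can still read the stream from the start. The stream must be
     * created with a push-back buffer of at least MAGIC.length bytes.
     */
    static boolean startsWithMagic(PushbackInputStream in) {
        byte[] head = new byte[MAGIC.length];
        try {
            int total = 0;
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break; // end of stream before the magic was complete
                }
                total += n;
            }
            if (total > 0) {
                in.unread(head, 0, total); // restore what was consumed
            }
            return total == head.length && Arrays.equals(head, MAGIC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the bytes are unread after the check, the same stream can be handed to a parser afterwards without reopening the input.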

L

lineReader - Variable in class org.jwat.warc.WarcReader
Line reader used to read version lines.

M

major - Variable in class org.jwat.warc.WarcHeader
Major version number from WARC header.
maxFileSize - Variable in class org.jwat.warc.WarcFileWriterConfig
Max file size used to determine when to close the current WARC file and start writing to the next one.
MEDIA_TYPE_METADATA - Static variable in class org.jwat.warc.WarcConstants
Suggested media-type for metadata records and others.
metadata - Variable in class org.jwat.warc.WarcFileWriterConfig
Array of metadata.
minor - Variable in class org.jwat.warc.WarcHeader
Minor version number from WARC header.

N

nextWriter() - Method in class org.jwat.warc.WarcFileWriter
Checks to see whether a new file needs to be created.
nlp - Variable in class org.jwat.warc.WarcRecord
Newline parser for counting/validating trailing newlines.

O

open() - Method in class org.jwat.warc.WarcFileWriter
Open new file with active suffix and prepare for writing.
org.jwat.warc - package org.jwat.warc
 
out - Variable in class org.jwat.warc.WarcWriter
OutputStream used to write WARC records.

P

P_IDX_STRINGS - Static variable in class org.jwat.warc.WarcConstants
WARC profile id to field name mapping table.
parseContentType(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Parse and validate content-type string with optional parameters.
parseDate(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Parses WARC record date.
parseDigest(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Parse and validate WARC digest string.
parseHeader(ByteCountingPushBackInputStream) - Method in class org.jwat.warc.WarcHeader
Try to parse a WARC header and return a boolean indicating the success or failure of this.
parseHeaders(ByteCountingPushBackInputStream) - Method in class org.jwat.warc.WarcHeader
Reads WARC header lines one line at a time until an empty line is encountered.
parseInteger(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Returns an Integer object holding the value of the specified string.
parseIpAddress(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Parse and validate an IP address.
parseLong(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Returns a Long object holding the value of the specified string.
parseRecord(ByteCountingPushBackInputStream, WarcReader) - Static method in class org.jwat.warc.WarcRecord
Given an InputStream it tries to read and validate a WARC header block.
parseString(String, String) - Method in class org.jwat.warc.WarcFieldParsers
Validates that the string is not null.
parseUri(String, boolean, UriProfile, String) - Method in class org.jwat.warc.WarcFieldParsers
Returns a URI object holding the value of the specified string.
parseVersion(ByteCountingPushBackInputStream) - Method in class org.jwat.warc.WarcHeader
Looks forward in the input stream for a valid WARC version line.
parseWarcDigest(String) - Static method in class org.jwat.warc.WarcDigest
Parse and validate the format of a WARC digest header value.
payload - Variable in class org.jwat.warc.WarcRecord
Payload object if any exists.
payloadClosed() - Method in class org.jwat.warc.WarcRecord
Called when the payload object is closed and final steps in the validation process can be performed.
payloadDigestAlgorithm - Variable in class org.jwat.warc.WarcReader
Default payload digest algorithm to use if none is present in the record.
payloadDigestEncoding - Variable in class org.jwat.warc.WarcReader
Default encoding scheme used to encode payload digest into a string, if none is detected from the record.
payloadHeaderMaxSize - Variable in class org.jwat.warc.WarcReader
Max size allowed for a payload header.
payloadWrittenTotal - Variable in class org.jwat.warc.WarcWriter
Total bytes written for current record payload.
POLICY_IGNORE - Static variable in class org.jwat.warc.WarcConstants
Warc header can be ignored.
POLICY_MANDATORY - Static variable in class org.jwat.warc.WarcConstants
Warc header is mandatory (equal to shall).
POLICY_MAY - Static variable in class org.jwat.warc.WarcConstants
Warc header can be present.
POLICY_MAY_NOT - Static variable in class org.jwat.warc.WarcConstants
Warc header should not be present.
POLICY_SHALL - Static variable in class org.jwat.warc.WarcConstants
Warc header must be present.
POLICY_SHALL_NOT - Static variable in class org.jwat.warc.WarcConstants
Warc header must not be present.
processComputedDigest(WarcDigest, String, String, String) - Method in class org.jwat.warc.WarcRecord
Adjust algorithm and encoding information about computed block digest.
processWarcDigest(WarcDigest, WarcDigest, String) - Method in class org.jwat.warc.WarcRecord
Auto-detect encoding used in WARC digest header and compare it to the internal one, if it has been computed.
PROFILE_IDENTICAL_PAYLOAD_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Revisit WARC-Profile id for identical payload digest.
PROFILE_IDX_IDENTICAL_PAYLOAD_DIGEST - Static variable in class org.jwat.warc.WarcConstants
Warc reader id for identical payload digest profile.
PROFILE_IDX_SERVER_NOT_MODIFIED - Static variable in class org.jwat.warc.WarcConstants
Warc reader id for server not modified profile.
PROFILE_IDX_UNKNOWN - Static variable in class org.jwat.warc.WarcConstants
Warc reader id for unknown profile.
PROFILE_SERVER_NOT_MODIFIED - Static variable in class org.jwat.warc.WarcConstants
Revisit WARC-Profile id for server not modified.
profileIdxMap - Static variable in class org.jwat.warc.WarcConstants
Profile lookup map used to identify WARC-Profile values.
PUSHBACK_BUFFER_SIZE - Static variable in class org.jwat.warc.WarcReaderCompressed
Buffer size used by PushbackInputStream.
PUSHBACK_BUFFER_SIZE - Static variable in class org.jwat.warc.WarcReaderFactory
Buffer size used by PushbackInputStream.
PUSHBACK_BUFFER_SIZE - Static variable in class org.jwat.warc.WarcReaderUncompressed
Buffer size used by PushbackInputStream.
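
The digest entries above (parseDigest, parseWarcDigest) deal with header values of the form algorithm:digest-value. The following is a minimal sketch of that split, assuming nothing about how JWAT implements it; the class LabelledDigest and its members are hypothetical names used only for illustration.

```java
/** Hypothetical holder for a parsed "algorithm:digest" value; not part of JWAT. */
final class LabelledDigest {
    final String algorithm;
    final String digestValue;

    private LabelledDigest(String algorithm, String digestValue) {
        this.algorithm = algorithm;
        this.digestValue = digestValue;
    }

    /** Returns null unless both the algorithm and the digest part are non-empty. */
    static LabelledDigest parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        // The first ':' separates the algorithm label from the encoded digest.
        int idx = headerValue.indexOf(':');
        if (idx <= 0 || idx == headerValue.length() - 1) {
            return null;
        }
        String algorithm = headerValue.substring(0, idx).trim();
        String digest = headerValue.substring(idx + 1).trim();
        if (algorithm.isEmpty() || digest.isEmpty()) {
            return null;
        }
        return new LabelledDigest(algorithm, digest);
    }

    public static void main(String[] args) {
        LabelledDigest d = parse("sha1:ABCDEFGH");
        System.out.println(d.algorithm + " / " + d.digestValue);
    }
}
```

Returning null for malformed input is a choice made for this sketch; a real parser would more likely report a diagnosis on the offending field.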

R

reader - Variable in class org.jwat.warc.WarcHeader
Associated WarcReader context.
reader - Variable in class org.jwat.warc.WarcReaderCompressed
WARC file InputStream.
reader - Variable in class org.jwat.warc.WarcRecord
Reader instance used, required for file compliance.
recordClosed() - Method in class org.jwat.warc.WarcReader
Callback method called when the payload has been processed.
recordClosed() - Method in class org.jwat.warc.WarcReaderCompressed
 
recordClosed() - Method in class org.jwat.warc.WarcReaderUncompressed
 
recordHeaderMaxSize - Variable in class org.jwat.warc.WarcReader
Max size allowed for a record header.
records - Variable in class org.jwat.warc.WarcReader
Records parsed.
recordTypeIdxMap - Static variable in class org.jwat.warc.WarcConstants
WARC-Type lookup map.
reset() - Method in class org.jwat.warc.WarcReader
Reset reader for reuse.
RT_CONTINUATION - Static variable in class org.jwat.warc.WarcConstants
WARC-Type continuation id.
RT_CONVERSION - Static variable in class org.jwat.warc.WarcConstants
WARC-Type conversion id.
RT_IDX_CONTINUATION - Static variable in class org.jwat.warc.WarcConstants
Warc reader continuation warc record type id.
RT_IDX_CONVERSION - Static variable in class org.jwat.warc.WarcConstants
Warc reader conversion warc record type id.
RT_IDX_METADATA - Static variable in class org.jwat.warc.WarcConstants
Warc reader metadata warc record type id.
RT_IDX_REQUEST - Static variable in class org.jwat.warc.WarcConstants
Warc reader request warc record type id.
RT_IDX_RESOURCE - Static variable in class org.jwat.warc.WarcConstants
Warc reader resource warc record type id.
RT_IDX_RESPONSE - Static variable in class org.jwat.warc.WarcConstants
Warc reader response warc record type id.
RT_IDX_REVISIT - Static variable in class org.jwat.warc.WarcConstants
Warc reader revisit warc record type id.
RT_IDX_STRINGS - Static variable in class org.jwat.warc.WarcConstants
WARC type id to field name mapping table.
RT_IDX_UNKNOWN - Static variable in class org.jwat.warc.WarcConstants
Warc reader unknown warc record type id.
RT_IDX_WARCINFO - Static variable in class org.jwat.warc.WarcConstants
Warc reader warcinfo warc record type id.
RT_INDEX_OF_LAST - Static variable in class org.jwat.warc.WarcConstants
Index of last WARC type (zero indexed).
RT_METADATA - Static variable in class org.jwat.warc.WarcConstants
WARC-Type metadata id.
RT_NUMBER - Static variable in class org.jwat.warc.WarcConstants
Number of WARC types.
RT_REQUEST - Static variable in class org.jwat.warc.WarcConstants
WARC-Type request id.
RT_RESOURCE - Static variable in class org.jwat.warc.WarcConstants
WARC-Type resource id.
RT_RESPONSE - Static variable in class org.jwat.warc.WarcConstants
WARC-Type response id.
RT_REVISIT - Static variable in class org.jwat.warc.WarcConstants
WARC-Type revisit id.
RT_WARCINFO - Static variable in class org.jwat.warc.WarcConstants
WARC-Type warcinfo id.

S

S_HEADER_WRITTEN - Static variable in class org.jwat.warc.WarcWriter
State after header has been written.
S_INIT - Static variable in class org.jwat.warc.WarcWriter
State after writer has been constructed and before records have been written.
S_PAYLOAD_WRITTEN - Static variable in class org.jwat.warc.WarcWriter
State after payload has been written.
S_RECORD_CLOSED - Static variable in class org.jwat.warc.WarcWriter
State after record has been closed.
seen - Variable in class org.jwat.warc.WarcHeader
Array used for duplicate header detection.
sequenceNr - Variable in class org.jwat.warc.WarcFileWriter
Current sequence number.
setBlockDigestAlgorithm(String) - Method in class org.jwat.warc.WarcReader
Tries to set the default block digest algorithm and returns a boolean indicating whether the algorithm was accepted or not.
setBlockDigestEnabled(boolean) - Method in class org.jwat.warc.WarcReader
Set the reader's block digest on/off status.
setBlockDigestEncoding(String) - Method in class org.jwat.warc.WarcReader
Set the default block digest encoding scheme.
setExceptionOnContentLengthMismatch(boolean) - Method in class org.jwat.warc.WarcWriter
Tell the writer what to do in case of a mismatch between the Content-Length and the amount of payload written.
setPayloadDigestAlgorithm(String) - Method in class org.jwat.warc.WarcReader
Tries to set the default payload digest algorithm and returns a boolean indicating whether the algorithm was accepted or not.
setPayloadDigestEnabled(boolean) - Method in class org.jwat.warc.WarcReader
Set the reader's payload digest on/off status.
setPayloadDigestEncoding(String) - Method in class org.jwat.warc.WarcReader
Set the default payload digest encoding scheme.
setPayloadHeaderMaxSize(int) - Method in class org.jwat.warc.WarcReader
Set the max size allowed for a payload header.
setRecordHeaderMaxSize(int) - Method in class org.jwat.warc.WarcReader
Set the max size allowed for a record header.
setUriProfile(UriProfile) - Method in class org.jwat.warc.WarcReader
Set the URI profile used to validate URIs.
setUriProfile(UriProfile) - Method in class org.jwat.warc.WarcWriter
Set the URI profile used to validate URIs.
setWarcTargetUriProfile(UriProfile) - Method in class org.jwat.warc.WarcReader
Set the URI profile used to validate WARC-Target URIs.
setWarcTargetUriProfile(UriProfile) - Method in class org.jwat.warc.WarcWriter
Set the URI profile used to validate WARC-Target URIs.
startOffset - Variable in class org.jwat.warc.WarcHeader
WARC record starting offset relative to the source WARC file input stream.
startOffset - Variable in class org.jwat.warc.WarcReaderCompressed
Cached start offset used after the reader is closed.
startOffset - Variable in class org.jwat.warc.WarcReaderUncompressed
Start offset of current or next valid record.
startOffset - Variable in class org.jwat.warc.WarcRecord
WARC record parsing start offset relative to the source WARC file input stream.
state - Variable in class org.jwat.warc.WarcWriter
Current state of writer.
stream_copy_buffer - Variable in class org.jwat.warc.WarcWriter
Buffer used by streamPayload() to copy from one stream to another.
streamPayload(InputStream) - Method in class org.jwat.warc.WarcWriter
Stream the content of an input stream to the payload content.
streamPayload(InputStream) - Method in class org.jwat.warc.WarcWriterCompressed
 
supportMultipleFiles() - Method in interface org.jwat.warc.WarcFileNaming
Indicates whether this naming implementation supports multiple files.
supportMultipleFiles() - Method in class org.jwat.warc.WarcFileNamingDefault
 
supportMultipleFiles() - Method in class org.jwat.warc.WarcFileNamingSingleFile
 

T

targetDir - Variable in class org.jwat.warc.WarcFileWriterConfig
Target directory in which to write WARC file(s).
toString() - Method in class org.jwat.warc.WarcDigest
Returns a header representation of the class state.
toStringFull() - Method in class org.jwat.warc.WarcDigest
Returns a full textual string representation of the class state.
trailingNewlines - Variable in class org.jwat.warc.WarcRecord
Number of trailing newlines after record.
truncatedTypeIdxMap - Static variable in class org.jwat.warc.WarcConstants
Lookup map for known truncation reason ids.
TT_DISCONNECT - Static variable in class org.jwat.warc.WarcConstants
WARC-Truncated disconnect id.
TT_IDX_DISCONNECT - Static variable in class org.jwat.warc.WarcConstants
Warc reader disconnect reason id.
TT_IDX_FUTURE_REASON - Static variable in class org.jwat.warc.WarcConstants
Warc reader future reason id.
TT_IDX_LENGTH - Static variable in class org.jwat.warc.WarcConstants
Warc reader length reason id.
TT_IDX_STRINGS - Static variable in class org.jwat.warc.WarcConstants
WARC truncation reason id to field name mapping table.
TT_IDX_TIME - Static variable in class org.jwat.warc.WarcConstants
Warc reader time reason id.
TT_IDX_UNSPECIFIED - Static variable in class org.jwat.warc.WarcConstants
Warc reader unspecified reason id.
TT_LENGTH - Static variable in class org.jwat.warc.WarcConstants
WARC-Truncated length id.
TT_TIME - Static variable in class org.jwat.warc.WarcConstants
WARC-Truncated time id.
TT_UNSPECIFIED - Static variable in class org.jwat.warc.WarcConstants
WARC-Truncated unspecified id.
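
The truncation entries above pair each WARC-Truncated reason string (length, time, disconnect, unspecified) with an integer id through a lookup map. A minimal sketch of that pattern follows; the class name, the constant names, and the integer values are assumptions for this example and do not reflect the actual values in WarcConstants.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reason-to-id lookup; ids are invented for this example. */
final class TruncatedReasonSketch {
    static final int UNRECOGNIZED = 0;
    static final int LENGTH = 1;
    static final int TIME = 2;
    static final int DISCONNECT = 3;
    static final int UNSPECIFIED = 4;

    private static final Map<String, Integer> REASON_IDS = new HashMap<>();

    static {
        REASON_IDS.put("length", LENGTH);
        REASON_IDS.put("time", TIME);
        REASON_IDS.put("disconnect", DISCONNECT);
        REASON_IDS.put("unspecified", UNSPECIFIED);
    }

    /** Maps a WARC-Truncated value to an id; unknown or null values map to UNRECOGNIZED. */
    static int lookup(String value) {
        if (value == null) {
            return UNRECOGNIZED;
        }
        Integer id = REASON_IDS.get(value);
        return id != null ? id : UNRECOGNIZED;
    }

    public static void main(String[] args) {
        System.out.println(lookup("time"));
        System.out.println(lookup("something-else"));
    }
}
```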

U

URI_LTGT - Static variable in class org.jwat.warc.WarcHeader
A URI with encapsulating <> characters.
URI_NAKED - Static variable in class org.jwat.warc.WarcHeader
A URI without encapsulating <> characters.
uriProfile - Variable in class org.jwat.warc.WarcHeader
URI profile.
uriProfile - Variable in class org.jwat.warc.WarcReader
URI profile.
uriProfile - Variable in class org.jwat.warc.WarcWriter
URI profile.

V

versionArr - Variable in class org.jwat.warc.WarcHeader
Array based on the version string split by the "." delimiter and converted to integers.
versionStr - Variable in class org.jwat.warc.WarcHeader
Raw version string.
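
The versionArr entry above describes splitting the version string on "." and converting the parts to integers. A minimal sketch of that conversion, assuming a version string such as "1.0" taken from a "WARC/1.0" line; the class and method names are hypothetical and this is not JWAT's code.

```java
/** Hypothetical helper showing the split-and-convert step; not part of JWAT. */
final class WarcVersionSketch {

    /** Splits e.g. "1.0" into {1, 0}; returns null if any part is missing, non-numeric, or negative. */
    static int[] toVersionArray(String versionStr) {
        if (versionStr == null || versionStr.isEmpty()) {
            return null;
        }
        // Limit -1 keeps trailing empty parts, so "1." is rejected below.
        String[] parts = versionStr.split("\\.", -1);
        int[] result = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                result[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                return null;
            }
            if (result[i] < 0) {
                return null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] v = toVersionArray("1.0");
        System.out.println(v[0] + "." + v[1]);
    }
}
```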

W

WARC_DATE_FORMAT - Static variable in class org.jwat.warc.WarcConstants
WARC date format string as specified by the WARC ISO standard.
WARC_DIGEST_FORMAT - Static variable in class org.jwat.warc.WarcConstants
WARC digest format string as specified by the WARC ISO standard.
WARC_MAGIC_HEADER - Static variable in class org.jwat.warc.WarcConstants
A WARC header block starts with this string including trailing version information.
WARC_MIME_TYPE - Static variable in class org.jwat.warc.WarcConstants
WARC mime type.
WARC_RECORD_TRAILING_NEWLINES - Static variable in class org.jwat.warc.WarcConstants
Trailing newlines after each record as per the WARC ISO standard.
warcBlockDigest - Variable in class org.jwat.warc.WarcHeader
WARC-Block-Digest converted to a WarcDigest object, if valid.
warcBlockDigestStr - Variable in class org.jwat.warc.WarcHeader
WARC-Block-Digest field string value.
WarcConcurrentTo - Class in org.jwat.warc
Simple wrapper for a (non) validated WARC-Concurrent-To header.
WarcConcurrentTo() - Constructor for class org.jwat.warc.WarcConcurrentTo
 
warcConcurrentToList - Variable in class org.jwat.warc.WarcHeader
List of WARC-Concurrent-To field string values and converted URI objects, if valid.
warcConcurrentToStr - Variable in class org.jwat.warc.WarcConcurrentTo
WARC-Concurrent-To string representation.
warcConcurrentToUri - Variable in class org.jwat.warc.WarcConcurrentTo
WARC-Concurrent-To URI object.
WarcConstants - Class in org.jwat.warc
Class containing all relevant WARC constants and structures.
WarcConstants() - Constructor for class org.jwat.warc.WarcConstants
This utility class does not require instantiation.
warcDate - Variable in class org.jwat.warc.WarcHeader
WARC-Date converted to a Date object, if valid.
warcDateFormat - Variable in class org.jwat.warc.WarcHeader
WARC DateFormat as specified by the WARC ISO standard.
warcDateFormat - Variable in class org.jwat.warc.WarcWriter
WARC DateFormat as specified by the WARC ISO standard.
WarcDateParser - Class in org.jwat.warc
WARC-Date parser and format validator.
warcDateStr - Variable in class org.jwat.warc.WarcHeader
WARC-Date field string value.
WarcDigest - Class in org.jwat.warc
This class represents the parsed and format validated information provided from a WARC digest header value.
WarcDigest() - Constructor for class org.jwat.warc.WarcDigest
Package level constructor.
WarcDigest(String, String) - Constructor for class org.jwat.warc.WarcDigest
Construct an object with the supplied parameters.
WarcFieldParsers - Class in org.jwat.warc
Separate class containing all the different types of field parser.
WarcFieldParsers() - Constructor for class org.jwat.warc.WarcFieldParsers
 
warcFileConfig - Variable in class org.jwat.warc.WarcFileWriter
Overall WARC file writer configuration.
warcFilename - Variable in class org.jwat.warc.WarcHeader
WARC-Filename field string value.
WarcFileNaming - Interface in org.jwat.warc
Implementations of this interface are used to name the WARC files written by the WarcFileWriter.
warcFileNaming - Variable in class org.jwat.warc.WarcFileWriter
WARC file naming configuration.
WarcFileNamingDefault - Class in org.jwat.warc
Default WARC file naming implementation used for writing to multiple files.
WarcFileNamingDefault(String, Date, String, String) - Constructor for class org.jwat.warc.WarcFileNamingDefault
Construct file naming instance.
WarcFileNamingSingleFile - Class in org.jwat.warc
Simple WARC file naming implementation used for writing to a single file only.
WarcFileNamingSingleFile(String) - Constructor for class org.jwat.warc.WarcFileNamingSingleFile
Construct a new instance with the filename to return.
WarcFileNamingSingleFile(File) - Constructor for class org.jwat.warc.WarcFileNamingSingleFile
Construct a new instance with the file whose filename to return.
WarcFileWriter - Class in org.jwat.warc
Simple WARC file writer wrapping some of the trivial code related to writing records.
WarcFileWriter() - Constructor for class org.jwat.warc.WarcFileWriter
Constructor for internal and unit test use.
WarcFileWriterConfig - Class in org.jwat.warc
General configuration of WarcFileWriter.
WarcFileWriterConfig() - Constructor for class org.jwat.warc.WarcFileWriterConfig
Construct instance with largely default values, except the targetDir which is null.
WarcFileWriterConfig(File, boolean, long, boolean) - Constructor for class org.jwat.warc.WarcFileWriterConfig
Construct an instance with custom values.
WarcHeader - Class in org.jwat.warc
Central class for working with WARC headers.
WarcHeader() - Constructor for class org.jwat.warc.WarcHeader
Non public constructor to allow unit testing.
warcIdentifiedPayloadType - Variable in class org.jwat.warc.WarcHeader
WARC-Identified-Payload-Type converted to a ContentType object, if valid.
warcIdentifiedPayloadTypeStr - Variable in class org.jwat.warc.WarcHeader
WARC-Identified-Payload-Type field string value.
warcInetAddress - Variable in class org.jwat.warc.WarcHeader
WARC-IP-Address converted to an InetAddress object, if valid.
warcinfoRecordId - Variable in class org.jwat.warc.WarcFileWriter
Generated WARC-Info-Record-ID for the current file.
warcIpAddress - Variable in class org.jwat.warc.WarcHeader
WARC-IP-Address field string value.
warcPayloadDigest - Variable in class org.jwat.warc.WarcHeader
WARC-Payload-Digest converted to a WarcDigest object, if valid.
warcPayloadDigestStr - Variable in class org.jwat.warc.WarcHeader
WARC-Payload-Digest field string value.
warcProfileIdx - Variable in class org.jwat.warc.WarcHeader
WARC-Profile converted to an integer id, if valid.
warcProfileStr - Variable in class org.jwat.warc.WarcHeader
WARC-Profile field string value.
warcProfileUri - Variable in class org.jwat.warc.WarcHeader
WARC-Profile field converted to a URI object, if valid.
WarcReader - Class in org.jwat.warc
Base class for WARC reader implementations.
WarcReader() - Constructor for class org.jwat.warc.WarcReader
 
WarcReaderCompressed - Class in org.jwat.warc
WARC Reader implementation for reading GZip compressed files.
WarcReaderCompressed() - Constructor for class org.jwat.warc.WarcReaderCompressed
This constructor is used to get random access to records.
WarcReaderCompressed(GzipReader) - Constructor for class org.jwat.warc.WarcReaderCompressed
Construct reader using the supplied input stream.
WarcReaderCompressed(GzipReader, int) - Constructor for class org.jwat.warc.WarcReaderCompressed
Construct object using supplied GzipReader.
WarcReaderFactory - Class in org.jwat.warc
Factory used for creating WarcReader instances.
WarcReaderFactory() - Constructor for class org.jwat.warc.WarcReaderFactory
Private constructor to enforce factory methods.
WarcReaderUncompressed - Class in org.jwat.warc
WARC Reader implementation for reading uncompressed files.
WarcReaderUncompressed() - Constructor for class org.jwat.warc.WarcReaderUncompressed
This constructor is used to get random access to records.
WarcReaderUncompressed(ByteCountingPushBackInputStream) - Constructor for class org.jwat.warc.WarcReaderUncompressed
Construct reader using the supplied input stream.
WarcRecord - Class in org.jwat.warc
This class represents a parsed WARC record header block including possible validation and format warnings/errors encountered in the process.
WarcRecord() - Constructor for class org.jwat.warc.WarcRecord
Non public constructor to allow unit testing.
warcRecordIdStr - Variable in class org.jwat.warc.WarcHeader
WARC-Record-ID field string value.
warcRecordIdUri - Variable in class org.jwat.warc.WarcHeader
WARC-Record-ID converted to a URI object, if valid.
warcRefersToDate - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To-Date converted to a Date object, if valid.
warcRefersToDateStr - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To-Date field string value.
warcRefersToStr - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To field string value.
warcRefersToTargetUriStr - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To-Target-URI field string value.
warcRefersToTargetUriUri - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To-Target-URI converted to a URI object, if valid.
warcRefersToUri - Variable in class org.jwat.warc.WarcHeader
WARC-Refers-To converted to a URI object, if valid.
warcSegmentNumber - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Number converted to an Integer object, if valid.
warcSegmentNumberStr - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Number field string value.
warcSegmentOriginIdStr - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Origin-ID field string value.
warcSegmentOriginIdUrl - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Origin-ID converted to a URI object, if valid.
warcSegmentTotalLength - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Total-Length converted to a Long object, if valid.
warcSegmentTotalLengthStr - Variable in class org.jwat.warc.WarcHeader
WARC-Segment-Total-Length field string value.
warcTargetUriProfile - Variable in class org.jwat.warc.WarcHeader
WARC-Target-URI profile.
warcTargetUriProfile - Variable in class org.jwat.warc.WarcReader
WARC-Target-URI profile.
warcTargetUriProfile - Variable in class org.jwat.warc.WarcWriter
WARC-Target-URI profile.
warcTargetUriStr - Variable in class org.jwat.warc.WarcHeader
WARC-Target-URI field string value.
warcTargetUriUri - Variable in class org.jwat.warc.WarcHeader
WARC-Target-URI converted to a URI object, if valid.
warcTruncatedIdx - Variable in class org.jwat.warc.WarcHeader
WARC-Truncated converted to an integer id, if valid.
warcTruncatedStr - Variable in class org.jwat.warc.WarcHeader
WARC-Truncated field string value.
warcTypeIdx - Variable in class org.jwat.warc.WarcHeader
WARC-Type converted to an integer id, if identified.
warcTypeStr - Variable in class org.jwat.warc.WarcHeader
WARC-Type field string value.
warcWarcinfoIdStr - Variable in class org.jwat.warc.WarcHeader
WARC-Warcinfo-ID field string value.
warcWarcinfoIdUri - Variable in class org.jwat.warc.WarcHeader
WARC-Warcinfo-ID converted to a URI object, if valid.
WarcWriter - Class in org.jwat.warc
Base class for WARC writer implementations.
WarcWriter() - Constructor for class org.jwat.warc.WarcWriter
 
WarcWriterCompressed - Class in org.jwat.warc
WARC Writer implementation for writing GZip compressed files.
WarcWriterFactory - Class in org.jwat.warc
Factory used for creating WarcWriter instances.
WarcWriterFactory() - Constructor for class org.jwat.warc.WarcWriterFactory
Private constructor to enforce factory methods.
WarcWriterUncompressed - Class in org.jwat.warc
WARC Writer implementation for writing uncompressed files.
warnings - Variable in class org.jwat.warc.WarcReader
Aggregate number of warnings encountered while parsing.
writeHeader(WarcRecord) - Method in class org.jwat.warc.WarcWriter
Write a WARC header to the WARC output stream.
writeHeader(WarcRecord) - Method in class org.jwat.warc.WarcWriterCompressed
 
writeHeader(WarcRecord) - Method in class org.jwat.warc.WarcWriterUncompressed
 
writeHeader_impl(WarcRecord) - Method in class org.jwat.warc.WarcWriter
Write a WARC header to the WARC output stream.
writePayload(byte[]) - Method in class org.jwat.warc.WarcWriter
Append the content of a byte array to the payload content.
writePayload(byte[], int, int) - Method in class org.jwat.warc.WarcWriter
Append the partial content of a byte array to the payload content.
writePayload(byte[]) - Method in class org.jwat.warc.WarcWriterCompressed
 
writePayload(byte[], int, int) - Method in class org.jwat.warc.WarcWriterCompressed
 
writer - Variable in class org.jwat.warc.WarcFileWriter
Current WARC writer.
writer - Variable in class org.jwat.warc.WarcWriterCompressed
GZip Writer used.
writer_raf - Variable in class org.jwat.warc.WarcFileWriter
Current random access file.
writer_rafout - Variable in class org.jwat.warc.WarcFileWriter
Current random access output stream.
writeRawHeader(byte[], Long) - Method in class org.jwat.warc.WarcWriter
Write a raw WARC header to the WARC output stream.
writeRawHeader(byte[], Long) - Method in class org.jwat.warc.WarcWriterCompressed
 
writerFile - Variable in class org.jwat.warc.WarcFileWriter
Current WARC file.
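
Several entries in this section (WARC_DATE_FORMAT, warcDateFormat, WarcDateParser) refer to the WARC date format. As a minimal sketch, assuming the WARC 1.0 form yyyy-MM-dd'T'HH:mm:ss'Z' in UTC, the following shows strict parsing and formatting with the standard library; the class WarcDateSketch is hypothetical and is not how JWAT necessarily implements it.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Hypothetical WARC-Date helper built on java.text; not part of JWAT. */
final class WarcDateSketch {
    // Assumed pattern for WARC 1.0 dates, always expressed in UTC.
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    /** Returns the parsed date, or null if the whole string is not a valid WARC date. */
    static Date parse(String value) {
        if (value == null) {
            return null;
        }
        ParsePosition pos = new ParsePosition(0);
        Date date = newFormat().parse(value, pos);
        // Reject inputs with trailing characters after an otherwise valid date.
        if (date == null || pos.getIndex() != value.length()) {
            return null;
        }
        return date;
    }

    static String format(Date date) {
        return newFormat().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L)));
    }
}
```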

Copyright © 2011–2015. All rights reserved.