public abstract class WarcWriter extends Object implements Closeable
| Modifier and Type | Field and Description |
|---|---|
| protected boolean | bExceptionOnContentLengthMismatch: Configuration for throwing an exception on content-length mismatch. |
| Diagnostics<Diagnosis> | diagnostics: Writer-level errors and warnings, including those raised when writing byte headers. |
| protected WarcFieldParsers | fieldParsers: WARC field parser used. |
| protected WarcHeader | header: Current WARC header written. |
| protected Long | headerContentLength: Content-Length from the WARC header. |
| protected OutputStream | out: Output stream used to write WARC records. |
| protected long | payloadWrittenTotal: Total bytes written for the current record payload. |
| protected static int | S_HEADER_WRITTEN: State after the header has been written. |
| protected static int | S_INIT: State after the writer has been constructed and before any records have been written. |
| protected static int | S_PAYLOAD_WRITTEN: State after payload has been written. |
| protected static int | S_RECORD_CLOSED: State after the record has been closed. |
| protected int | state: Current state of the writer. |
| protected byte[] | stream_copy_buffer: Buffer used by streamPayload() to copy from one stream to another. |
| protected UriProfile | uriProfile: URI profile. |
| protected DateFormat | warcDateFormat: WARC DateFormat as specified by the WARC ISO standard. |
| protected UriProfile | warcTargetUriProfile: WARC-Target-URI profile. |
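
The stream_copy_buffer field is described as the buffer streamPayload() uses to copy from one stream to another. A minimal sketch of that kind of buffered copy follows. The class StreamCopySketch, its copy method, and the 8192-byte buffer size are hypothetical and are not taken from the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical stand-in; NOT the documented WarcWriter class.
class StreamCopySketch {
    // Assumed buffer size; the size used by the real writer is not stated in this page.
    private final byte[] buffer = new byte[8192];

    /** Copies every byte from in to out and returns the number of bytes copied. */
    long copy(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        int read;
        // InputStream.read returns -1 at end of stream.
        while ((read = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[20000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = new StreamCopySketch().copy(new ByteArrayInputStream(payload), sink);
        System.out.println("bytes copied: " + copied);
    }
}
```

Reusing one buffer per writer avoids allocating a new array for every payload, which is presumably why the buffer is kept as a field.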
| Constructor and Description |
|---|
| WarcWriter() |
| Modifier and Type | Method and Description |
|---|---|
| abstract void | close(): Close the WARC writer and free its resources. |
| protected void | closeRecord_impl(): Closes the WARC record by writing two newlines and comparing the amount of payload data streamed with the content-length supplied with the header. |
| abstract void | closeRecord(): Close the WARC record in an implementation-specific way. |
| boolean | exceptionOnContentLengthMismatch(): Returns whether this writer throws an exception when the content-length does not match the amount of payload written. |
| UriProfile | getUriProfile(): Get the URI profile used to validate URIs. |
| UriProfile | getWarcTargetUriProfile(): Get the URI profile used to validate WARC-Target URIs. |
| protected void | init(): Initializes the writer's internal state. |
| abstract boolean | isCompressed(): Returns whether this writer produces compressed output. |
| void | setExceptionOnContentLengthMismatch(boolean enabled): Tell the writer what to do when the content-length and the amount of payload written do not match. |
| void | setUriProfile(UriProfile uriProfile): Set the URI profile used to validate URIs. |
| void | setWarcTargetUriProfile(UriProfile uriProfile): Set the URI profile used to validate WARC-Target URIs. |
| long | streamPayload(InputStream in): Stream the content of an input stream to the payload content. |
| protected byte[] | writeHeader_impl(WarcRecord record): Write a WARC header to the WARC output stream. |
| abstract byte[] | writeHeader(WarcRecord record): Write a WARC header to the WARC output stream. |
| long | writePayload(byte[] b): Append the content of a byte array to the payload content. |
| long | writePayload(byte[] b, int offset, int len): Append the partial content of a byte array to the payload content. |
| void | writeRawHeader(byte[] header_bytes, Long contentLength): Write a raw WARC header to the WARC output stream. |
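
The summaries above imply a per-record lifecycle: write a header, append payload, then close the record, at which point the payload total is compared with the declared content-length. The sketch below illustrates that flow with a hypothetical stand-in class, RecordLifecycleSketch. The numeric state values, the use of IllegalStateException, the CRLF pair as the "two newlines", and treating a null content-length as a match are all assumptions, not the library's documented behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in that mirrors the documented states; NOT the library class.
class RecordLifecycleSketch {
    // Assumed numeric values; only the names come from the field summary.
    static final int S_INIT = 0;
    static final int S_HEADER_WRITTEN = 1;
    static final int S_PAYLOAD_WRITTEN = 2;
    static final int S_RECORD_CLOSED = 3;

    private final OutputStream out;
    private final boolean exceptionOnMismatch;
    int state = S_INIT;
    Long headerContentLength;
    long payloadWrittenTotal;

    RecordLifecycleSketch(OutputStream out, boolean exceptionOnMismatch) {
        this.out = out;
        this.exceptionOnMismatch = exceptionOnMismatch;
    }

    void writeRawHeader(byte[] headerBytes, Long contentLength) throws IOException {
        if (state == S_HEADER_WRITTEN || state == S_PAYLOAD_WRITTEN) {
            throw new IllegalStateException("previous record has not been closed");
        }
        out.write(headerBytes);
        headerContentLength = contentLength;
        payloadWrittenTotal = 0;
        state = S_HEADER_WRITTEN;
    }

    long writePayload(byte[] b, int offset, int len) throws IOException {
        if (state != S_HEADER_WRITTEN && state != S_PAYLOAD_WRITTEN) {
            throw new IllegalStateException("header has not been written");
        }
        out.write(b, offset, len);
        payloadWrittenTotal += len;
        state = S_PAYLOAD_WRITTEN;
        return len;
    }

    /** Ends the record; returns true if the payload total matched the declared length. */
    boolean closeRecord() throws IOException {
        if (state != S_HEADER_WRITTEN && state != S_PAYLOAD_WRITTEN) {
            return true; // no record is open, nothing to check
        }
        // Assumption: the "two newlines" are two CRLF pairs.
        out.write("\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        state = S_RECORD_CLOSED;
        boolean matches = headerContentLength == null
                || headerContentLength.longValue() == payloadWrittenTotal;
        if (!matches && exceptionOnMismatch) {
            throw new IllegalStateException("content-length " + headerContentLength
                    + " does not match payload bytes written " + payloadWrittenTotal);
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        RecordLifecycleSketch writer = new RecordLifecycleSketch(sink, true);
        byte[] header = "WARC/1.0\r\nContent-Length: 5\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        writer.writeRawHeader(header, 5L);
        writer.writePayload(payload, 0, payload.length);
        System.out.println("length matched: " + writer.closeRecord());
    }
}
```

The boolean passed to the constructor plays the role that setExceptionOnContentLengthMismatch(boolean) is described as playing: it selects between failing loudly and merely reporting a mismatch.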
protected static final int S_INIT
protected static final int S_HEADER_WRITTEN
protected static final int S_PAYLOAD_WRITTEN
protected static final int S_RECORD_CLOSED
protected UriProfile warcTargetUriProfile
protected UriProfile uriProfile
protected DateFormat warcDateFormat
DateFormat as specified by the WARC ISO standard.
protected WarcFieldParsers fieldParsers
protected byte[] stream_copy_buffer
protected boolean bExceptionOnContentLengthMismatch
public final Diagnostics<Diagnosis> diagnostics
protected int state
protected OutputStream out
protected WarcHeader header
protected Long headerContentLength
protected long payloadWrittenTotal
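
The warcDateFormat field is documented only as a DateFormat "as specified by the WARC ISO standard". WARC-Date values are UTC timestamps such as 2006-09-19T17:20:24Z, so a formatter along the following lines would produce them. The class WarcDateFormatSketch and the exact pattern string are assumptions, since this page does not show how the field is initialized.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; the library's own initialization is not shown in this page.
class WarcDateFormatSketch {
    /** Builds a formatter for UTC timestamps of the form yyyy-MM-ddTHH:mm:ssZ. */
    static DateFormat newWarcDateFormat() {
        // The quoted 'T' and 'Z' are literals; the time zone is set explicitly to UTC.
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        return format;
    }

    public static void main(String[] args) {
        // Formats the Unix epoch as a WARC-style date.
        System.out.println(newWarcDateFormat().format(new Date(0L)));
    }
}
```

SimpleDateFormat is not thread-safe, so a formatter held in an instance field like this should not be shared across threads without synchronization.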
public WarcWriter()
protected void init()
public abstract boolean isCompressed()
public void setWarcTargetUriProfile(UriProfile uriProfile)
uriProfile - URI profile to use
public UriProfile getWarcTargetUriProfile()
public void setUriProfile(UriProfile uriProfile)
uriProfile - URI profile to use
public UriProfile getUriProfile()
public boolean exceptionOnContentLengthMismatch()
public void setExceptionOnContentLengthMismatch(boolean enabled)
enabled - boolean indicating exception throwing on/off
public abstract void close() throws IOException
close in interface Closeable
close in interface AutoCloseable
IOException - if an i/o exception occurs while closing the writer
public abstract void closeRecord() throws IOException
IOException - if an i/o exception occurs while closing the record
protected void closeRecord_impl() throws IOException
IOException - if an i/o exception occurs while closing the record
public void writeRawHeader(byte[] header_bytes, Long contentLength) throws IOException
header_bytes - raw WARC header to output
contentLength - the expected content-length to be written and validated
IOException - if an i/o exception occurs while writing header data
public abstract byte[] writeHeader(WarcRecord record) throws IOException
record - WARC record to output
IOException - if an i/o exception occurs while writing header data
protected byte[] writeHeader_impl(WarcRecord record) throws IOException
record - WARC record to output
IOException - if an i/o exception occurs while writing header data
public long streamPayload(InputStream in) throws IOException
in - input stream containing payload data
IOException - if an i/o exception occurs while writing payload data
public long writePayload(byte[] b) throws IOException
b - byte array with data to be written
IOException - if an i/o exception occurs while writing payload data
public long writePayload(byte[] b, int offset, int len) throws IOException
b - byte array with partial data to be written
offset - offset to data to be written
len - length of data to be written
IOException - if an i/o exception occurs while writing payload data
Copyright © 2011–2015. All rights reserved.