public class WarcReaderUncompressed extends WarcReader
| Modifier and Type | Field and Description |
|---|---|
| `protected ByteCountingPushBackInputStream` | `in`: WARC file `InputStream`. |
| `static int` | `PUSHBACK_BUFFER_SIZE`: Buffer size used by `PushbackInputStream`. |
| `protected long` | `startOffset`: Start offset of current or next valid record. |

Fields inherited from class WarcReader: bBlockDigest, bIsCompliant, blockDigestAlgorithm, blockDigestEncoding, bPayloadDigest, consumed, currentRecord, diagnostics, errors, fieldParsers, headerLineReader, iteratorExceptionThrown, lineReader, payloadDigestAlgorithm, payloadDigestEncoding, payloadHeaderMaxSize, recordHeaderMaxSize, records, uriProfile, warcTargetUriProfile, warnings

| Constructor and Description |
|---|
| `WarcReaderUncompressed()`: This constructor is used to get random access to records. |
| `WarcReaderUncompressed(ByteCountingPushBackInputStream in)`: Construct reader using the supplied input stream. |
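
The no-argument constructor is meant for random access, where each record is read from a stream the caller passes to `getNextRecordFrom`. The documentation for that method warns of side-effects when multiple `PushBackInputStream` instances are involved. The sketch below illustrates one such effect using only the JDK's `java.io.PushbackInputStream`, not JWAT's own stream class; the class name `PushbackSideEffectDemo`, its helpers, and the sample bytes are hypothetical and chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not use or reproduce any JWAT code.
class PushbackSideEffectDemo {

    /** Reads up to n bytes and returns them as an ASCII string. */
    static String readAscii(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int r = in.read(buf, total, n - total);
            if (r < 0) {
                break; // end of stream
            }
            total += r;
        }
        return new String(buf, 0, total, StandardCharsets.US_ASCII);
    }

    /**
     * Peeks 8 bytes through one wrapper, pushes them back, then reads
     * 8 bytes through a second wrapper over the same underlying stream.
     * Returns { bytes seen by the second wrapper, bytes seen by the first }.
     */
    static String[] demo() {
        byte[] data = "WARC/1.0\r\nWARC-Type: warcinfo\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        InputStream shared = new ByteArrayInputStream(data);
        try {
            PushbackInputStream first = new PushbackInputStream(shared, 16);
            byte[] peek = new byte[8];
            int n = first.read(peek);
            // The pushed-back bytes live in first's private buffer,
            // not in the shared underlying stream.
            first.unread(peek, 0, n);

            PushbackInputStream second = new PushbackInputStream(shared, 16);
            String viaSecond = readAscii(second, 8); // starts after the peeked bytes
            String viaFirst = readAscii(first, 8);   // served from first's buffer
            return new String[] { viaSecond, viaFirst };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("second wrapper: "
                + r[0].replace("\r", "\\r").replace("\n", "\\n"));
        System.out.println("first wrapper:  " + r[1]);
    }
}
```

In this sketch the second wrapper never sees the version line, because the pushed-back bytes are held by the first wrapper. This is only one illustration of the kind of side-effect the documentation may have in mind.
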
| Modifier and Type | Method and Description |
|---|---|
| `void` | `close()`: Close current record resource(s) and input stream(s). |
| `long` | `getConsumed()`: Get number of bytes consumed by this reader. |
| `WarcRecord` | `getNextRecord()`: Parses and gets the next record. |
| `WarcRecord` | `getNextRecordFrom(InputStream rin, long offset)`: Parses and gets the next record from an `InputStream`. |
| `WarcRecord` | `getNextRecordFrom(InputStream rin, long offset, int buffer_size)`: Parses and gets the next record from an `InputStream` wrapped by a `BufferedInputStream`. |
| `long` | `getOffset()`: Get the current offset in the WARC `InputStream`. |
| `long` | `getStartOffset()`: Get the offset of the current WARC record, or -1 if none have been read. |
| `boolean` | `isCompressed()`: Whether this reader assumes GZip-compressed input. |
| `protected void` | `recordClosed()`: Callback method called when the payload has been processed. |
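
The offset accessors above rely on a pushback stream that counts bytes. As a rough illustration of how such a counter can behave, the sketch below extends the JDK's `PushbackInputStream` so that the offset advances on reads and moves back on `unread`. `OffsetDemo` and `CountingPushbackStream` are hypothetical names; this is not JWAT's `ByteCountingPushBackInputStream`, and that class may track its counters differently.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo; not part of JWAT.
class OffsetDemo {

    /** Pushback stream that tracks a logical offset (assumed semantics). */
    static class CountingPushbackStream extends PushbackInputStream {
        private long offset = 0;

        CountingPushbackStream(InputStream in, int size) {
            super(in, size);
        }

        long getOffset() {
            return offset;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) {
                offset++;
            }
            return b;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int n = super.read(b, off, len);
            if (n > 0) {
                offset += n;
            }
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            long skipped = super.skip(n);
            if (skipped > 0) {
                offset += skipped;
            }
            return skipped;
        }

        @Override
        public void unread(int b) throws IOException {
            super.unread(b);
            offset--; // a pushed-back byte is no longer consumed
        }

        @Override
        public void unread(byte[] b, int off, int len) throws IOException {
            super.unread(b, off, len);
            offset -= len;
        }
    }

    /** Returns the offset before a read, after reading 4 bytes, and after unreading them. */
    static long[] demo() {
        byte[] data = "WARC/1.0\r\n".getBytes(StandardCharsets.US_ASCII);
        try (CountingPushbackStream in =
                 new CountingPushbackStream(new ByteArrayInputStream(data), 16)) {
            long start = in.getOffset();
            byte[] magic = new byte[4];
            int n = in.read(magic);
            long afterRead = in.getOffset();
            in.unread(magic, 0, n);
            long afterUnread = in.getOffset();
            return new long[] { start, afterRead, afterUnread };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Counting at the pushback layer means a parser can peek at bytes and push them back without disturbing the offset it reports, which is the property the sketch is meant to show.
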

Methods inherited from class WarcReader: getBlockDigestAlgorithm, getBlockDigestEnabled, getBlockDigestEncoding, getIteratorExceptionThrown, getPayloadDigestAlgorithm, getPayloadDigestEnabled, getPayloadDigestEncoding, getPayloadHeaderMaxSize, getRecordHeaderMaxSize, getUriProfile, getWarcTargetUriProfile, init, isCompliant, iterator, reset, setBlockDigestAlgorithm, setBlockDigestEnabled, setBlockDigestEncoding, setPayloadDigestAlgorithm, setPayloadDigestEnabled, setPayloadDigestEncoding, setPayloadHeaderMaxSize, setRecordHeaderMaxSize, setUriProfile, setWarcTargetUriProfile

`public static final int PUSHBACK_BUFFER_SIZE`
Buffer size used by `PushbackInputStream`.

`protected ByteCountingPushBackInputStream in`
WARC file `InputStream`.

`protected long startOffset`
Start offset of current or next valid record.

`public WarcReaderUncompressed()`
This constructor is used to get random access to records.

`public WarcReaderUncompressed(ByteCountingPushBackInputStream in)`
Construct reader using the supplied input stream.
Parameters: `in` - WARC file input stream

`public boolean isCompressed()`
Whether this reader assumes GZip-compressed input.
`isCompressed` in class `WarcReader`

`public void close()`
Close current record resource(s) and input stream(s).
`close` in interface `Closeable`; `close` in interface `AutoCloseable`; `close` in class `WarcReader`

`protected void recordClosed()`
Callback method called when the payload has been processed.
`recordClosed` in class `WarcReader`

`public long getStartOffset()`
Get the offset of the current WARC record, or -1 if none have been read.
`getStartOffset` in class `WarcReader`

`public long getOffset()`
Get the current offset in the WARC `InputStream`.
`getOffset` in class `WarcReader`

`public long getConsumed()`
Get number of bytes consumed by this reader.
`getConsumed` in class `WarcReader`

`public WarcRecord getNextRecord() throws IOException`
Parses and gets the next record.
`getNextRecord` in class `WarcReader`
Throws: `IOException` - I/O exception in parsing process

`public WarcRecord getNextRecordFrom(InputStream rin, long offset) throws IOException`
Parses and gets the next record from an `InputStream`. This method is mainly for random access use since there are serious side-effects involved in using multiple PushBackInputStream instances.
`getNextRecordFrom` in class `WarcReader`
Parameters: `rin` - `InputStream` used to read next record; `offset` - offset provided by caller
Throws: `IOException` - I/O exception in parsing process

`public WarcRecord getNextRecordFrom(InputStream rin, long offset, int buffer_size) throws IOException`
Parses and gets the next record from an `InputStream` wrapped by a `BufferedInputStream`. This method is mainly for random access use since there are serious side-effects involved in using multiple PushBackInputStream instances.
`getNextRecordFrom` in class `WarcReader`
Parameters: `rin` - `InputStream` used to read next record; `offset` - offset provided by caller; `buffer_size` - buffer size to use
Throws: `IOException` - I/O exception in parsing process

Copyright © 2011–2015. All rights reserved.