Class SortedDataBag<E>
- java.lang.Object
-
- org.apache.jena.atlas.data.AbstractDataBag<E>
-
- org.apache.jena.atlas.data.SortedDataBag<E>
-
- All Implemented Interfaces:
java.lang.Iterable<E>, DataBag<E>, org.apache.jena.atlas.lib.Closeable, org.apache.jena.atlas.lib.Sink<E>
- Direct Known Subclasses:
DistinctDataBag
public class SortedDataBag<E> extends AbstractDataBag<E>
This data bag will gather items in memory until a size threshold is passed, at which point it will write out all of the items to disk using the supplied serializer.
After adding is finished, call iterator() to set up the data bag for reading back items and iterating over them. The iterator will retrieve the items in sorted order using the supplied comparator.
IMPORTANT: You may not add any more items after this call. You may subsequently call iterator() multiple times, which will give you a new iterator for each invocation. If you do not consume the entire iterator, you should call Iter.close(Iterator) to close any FileInputStreams associated with the iterator.
Additionally, make sure to call close() when you are finished to free any system resources (preferably in a finally block).
Implementation Notes: Data is stored in an ArrayList as it comes in. When it is time to spill, that data is sorted and written to disk. An iterator will read in each file and perform a merge-sort as the results are returned.
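The spill-and-merge behaviour described in the implementation notes can be sketched in plain Java. This is an illustration only, not the Jena implementation: the class name SpillMergeSketch and its methods spill and mergeRuns are hypothetical, the sorted runs are kept as in-memory lists rather than written to files, and the threshold is assumed to be a simple item count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: sorted "spill" runs merged back in comparator order.
public class SpillMergeSketch {

    // One cursor per sorted run: its current head element and the remaining items.
    private static final class Cursor<E> {
        E head;
        final Iterator<E> rest;

        Cursor(Iterator<E> it) {
            this.rest = it;
            this.head = it.next(); // callers only pass non-empty runs
        }
    }

    /** Collect items until the threshold is reached, then sort that batch and store it as a run. */
    public static <E> List<List<E>> spill(List<E> items, int threshold, Comparator<? super E> cmp) {
        List<List<E>> runs = new ArrayList<>();
        List<E> memory = new ArrayList<>();
        for (E item : items) {
            memory.add(item);
            if (memory.size() >= threshold) {
                memory.sort(cmp);      // sort before "writing out"
                runs.add(memory);      // stands in for a spill file
                memory = new ArrayList<>();
            }
        }
        if (!memory.isEmpty()) {
            memory.sort(cmp);          // the items still in memory form the last run
            runs.add(memory);
        }
        return runs;
    }

    /** Merge already-sorted runs by repeatedly taking the smallest head element. */
    public static <E> List<E> mergeRuns(List<List<E>> runs, Comparator<? super E> cmp) {
        Comparator<Cursor<E>> byHead = (a, b) -> cmp.compare(a.head, b.head);
        PriorityQueue<Cursor<E>> queue = new PriorityQueue<>(byHead);
        for (List<E> run : runs) {
            if (!run.isEmpty()) {
                queue.add(new Cursor<>(run.iterator()));
            }
        }
        List<E> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor<E> c = queue.poll();
            out.add(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                queue.add(c);          // re-insert with its new head
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(5, 3, 9, 1, 7, 2, 8);
        Comparator<Integer> cmp = Comparator.naturalOrder();
        List<List<Integer>> runs = spill(items, 3, cmp);
        System.out.println(mergeRuns(runs, cmp)); // prints [1, 2, 3, 5, 7, 8, 9]
    }
}
```

Because each run is sorted before it is stored, the merge step only ever needs to compare the current head of each run, so memory use during iteration grows with the number of runs rather than the number of items.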
-
-
Constructor Summary
Constructors
SortedDataBag(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, java.util.Comparator<? super E> comparator)
-
Method Summary
Modifier and Type    Method    Description
void    add(E item)    Add a tuple to the bag.
void    cancel()    cancel arranges that further comparisons using the supplied comparator will abandon the sort in progress.
void    close()
void    flush()
boolean    isCancelled()    isCancelled is true iff cancel has been called on this bag's comparator.
boolean    isClosed()    isClosed returns true iff this bag has been closed.
boolean    isDistinct()    Find out if the bag is distinct.
boolean    isSorted()    Find out if the bag is sorted.
java.util.Iterator<E>    iterator()    Returns an iterator over a set of elements of type E.
Methods inherited from class org.apache.jena.atlas.data.AbstractDataBag
isEmpty, send, size
-
Constructor Detail
-
SortedDataBag
public SortedDataBag(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, java.util.Comparator<? super E> comparator)
-
-
Method Detail
-
cancel
public void cancel()
cancel arranges that further comparisons using the supplied comparator will abandon the sort in progress.
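One way a comparator can be made to abandon a sort in progress is to wrap it and throw once a flag is set. The sketch below is an assumption about the general technique, not a description of how this class implements it: the class name CancellableComparator is hypothetical, and the choice of java.util.concurrent.CancellationException as the signal is arbitrary.

```java
import java.util.Comparator;
import java.util.concurrent.CancellationException;

// Hypothetical wrapper: delegates to a base comparator until cancel() is called.
public class CancellableComparator<E> implements Comparator<E> {
    private final Comparator<? super E> base;
    // volatile so that a cancel() from another thread is seen by the sorting thread
    private volatile boolean cancelled = false;

    public CancellableComparator(Comparator<? super E> base) {
        this.base = base;
    }

    /** After this call, the next comparison throws and so aborts any running sort. */
    public void cancel() {
        cancelled = true;
    }

    public boolean isCancelled() {
        return cancelled;
    }

    @Override
    public int compare(E a, E b) {
        if (cancelled) {
            throw new CancellationException("sort abandoned");
        }
        return base.compare(a, b);
    }
}
```

A caller running a long sort with such a comparator would catch the exception around the sort call and then release whatever resources the sort was using.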
-
isCancelled
public boolean isCancelled()
isCancelled is true iff cancel has been called on this bag's comparator. (Used in testing.)
-
isClosed
public boolean isClosed()
isClosed returns true iff this bag has been closed. (Used in testing.)
-
isSorted
public boolean isSorted()
Description copied from interface: DataBag
Find out if the bag is sorted.
- Returns:
- true if this is a sorted data bag, false otherwise.
-
isDistinct
public boolean isDistinct()
Description copied from interface: DataBag
Find out if the bag is distinct.
- Returns:
- true if the bag is a distinct bag, false otherwise.
-
add
public void add(E item)
Description copied from interface: DataBag
Add a tuple to the bag.
- Parameters:
item - tuple to add.
-
flush
public void flush()
-
iterator
public java.util.Iterator<E> iterator()
Returns an iterator over a set of elements of type E. If you do not exhaust the iterator, you should call Iter.close(Iterator) to be sure any open file handles are closed.
- Returns:
- an Iterator
-
close
public void close()
-
-