org.crosswire.jsword.passage
Class PassageTally

java.lang.Object
  extended by org.crosswire.jsword.passage.AbstractPassage
      extended by org.crosswire.jsword.passage.PassageTally
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Key>, Iterable<Key>, Key, Passage, VerseKey

public class PassageTally
extends AbstractPassage

Similar to a Passage, but it also stores a ranking for each of the Verses that it contains.

Currently there is no well-defined spec for what the rank of a verse means; it is just an int. Since this number is exposed in 2 places (getNameAndTally() and getTallyOf()), we should specify what the numbers mean. The trouble is that most tallies come from searches, where the numbers only have relative meaning.

This class exactly implements the Passage interface when the ordering is set to Order.BIBLICAL. An additional setting, Order.TALLY, sorts the verses by their rank in this tally.

Calling tally.add(Gen 1:1); tally.add(Gen 1:1); is redundant for a Passage, but a PassageTally will increase the rank of Gen 1:1. The additional methods unAdd() and unAddAll() do the reverse, decreasing the rank of the specified verse(s).
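
The add/unAdd behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use JSword: TallyAddSketch is a hypothetical stand-in that keys ranks by a plain String reference, and dropping a verse once its rank reaches zero is an assumption rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a tally; not the JSword implementation. */
final class TallyAddSketch {
    // Rank per verse reference; a missing key means rank 0 (verse not in the tally).
    private final Map<String, Integer> ranks = new HashMap<>();

    /** Adding a verse that is already present raises its rank by one. */
    void add(String verse) {
        ranks.merge(verse, 1, Integer::sum);
    }

    /** The reverse of add: lowers the rank by one and drops the verse at zero (assumed). */
    void unAdd(String verse) {
        ranks.computeIfPresent(verse, (key, rank) -> rank > 1 ? rank - 1 : null);
    }

    /** Rank of the verse, or 0 if it is not in the tally. */
    int getTallyOf(String verse) {
        return ranks.getOrDefault(verse, 0);
    }

    public static void main(String[] args) {
        TallyAddSketch tally = new TallyAddSketch();
        tally.add("Gen 1:1");
        tally.add("Gen 1:1");
        System.out.println(tally.getTallyOf("Gen 1:1")); // prints 2
        tally.unAdd("Gen 1:1");
        System.out.println(tally.getTallyOf("Gen 1:1")); // prints 1
    }
}
```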

The point is to allow a search for "God loves us, and gave Jesus to die to save us" to correctly identify John 3:16. This relies heavily on fuzzy matching, but I think it will be very useful.

How should we rank VerseRanges? We could use the sum of the ranks of the verses in a range, or the maximum rank in the range. The former would seem to be more mathematically correct, but I think that the latter is better because the concept of a maximum value is preserved, and because a wide blurred match is generally not as good as a sharply defined one.
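
To make the trade-off concrete, here is a small sketch comparing the two candidate range rankings on a plain int[] of verse ranks. RangeRankSketch, sumRank and maxRank are hypothetical names used only for illustration; the board values are made up.

```java
import java.util.Arrays;

/** Hypothetical comparison of the two range-ranking options; not JSword code. */
final class RangeRankSketch {
    /** Sum of the verse ranks in board[start..end], inclusive. */
    static int sumRank(int[] board, int start, int end) {
        return Arrays.stream(board, start, end + 1).sum();
    }

    /** Maximum verse rank in board[start..end], inclusive (0 for an empty range). */
    static int maxRank(int[] board, int start, int end) {
        return Arrays.stream(board, start, end + 1).max().orElse(0);
    }

    public static void main(String[] args) {
        // Ordinals 0..4 form a wide, weakly matched range; ordinal 6 is one sharp match.
        int[] board = {1, 1, 1, 1, 1, 0, 4};
        System.out.println(sumRank(board, 0, 4) + " vs " + sumRank(board, 6, 6)); // prints 5 vs 4
        System.out.println(maxRank(board, 0, 4) + " vs " + maxRank(board, 6, 6)); // prints 1 vs 4
    }
}
```

With the sum, the wide blurred range outranks the single sharp match; with the maximum, the sharp match wins, which is the behaviour the paragraph above argues for.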

Should we be going for a PassageTallyFactory type approach? Of the 3 implementations of Passage, the RangedPassage does not make sense here, and a PassageTally will not have the range of uses that a Passage has, so I think there is more likely to be a single correct answer. So right now the answer is no.

Memory considerations: The BitSet approach will always use an int[31000] = 128k of memory.
The Distinct approach will be n * int[4], where n is the number of verses stored. I expect most searches to have at least n=1000. Also 128k
Given this (a Distinct style PassageTally will usually use more memory than a BitSet style PassageTally) and the intuitive result that the BitSet will be faster, I'm going to start by implementing the latter only.

To think about - I've upped the MAX_TALLY to 20000 to help the new mapper program. I'm not sure why it was originally 100?

LATER(joe): Specify how passage ranks work.

Author:
Joe Walker [joe at eireneh dot com]
See Also:
for license details. The copyright to this program is held by its authors.
Serialized Form

Nested Class Summary
static class PassageTally.Order
          Indicates how this PassageTally is to order its Verses.
private static class PassageTally.OrderedVerseIterator
          Iterate over the Verses in order of their rank in the tally
private static class PassageTally.OrderedVerseRangeIterator
          Iterate over the Ranges in order of their rank in the tally
private static class PassageTally.TalliedVerse
          Hack to make this work with J2SE 1.1 as well as J2SE 1.2. This compared 2 Integers.
private static class PassageTally.TalliedVerseRange
          Hack to make this work with JDK1.1 as well as JDK1.2. This compared 2 Integers.
private  class PassageTally.VerseIterator
          Iterate over the Verses in normal verse order
 
Nested classes/interfaces inherited from class org.crosswire.jsword.passage.AbstractPassage
AbstractPassage.VerseRangeIterator
 
Field Summary
protected  int[] board
          The tally board itself
private static org.slf4j.Logger log
          The log stream
private  int max
          The maximum tally possible
static int MAX_TALLY
          The highest tally possible
private  PassageTally.Order order
          How the verses are ordered
private static long serialVersionUID
          Serialization ID
private  int size
           
private  int total
           
 
Fields inherited from class org.crosswire.jsword.passage.AbstractPassage
BITWISE, DISTINCT, listeners, METHOD_COUNT, originalName, RANGED, REF_ALLOWED_DELIMS, REF_OSIS_DELIM, REF_PREF_DELIM, skipNormalization, suppressEvents
 
Constructor Summary
PassageTally(Versification v11n)
          Create an empty PassageTally
PassageTally(Versification v11n, String refs)
           
PassageTally(Versification v11n, String refs, Key basis)
          Create a PassageTally from a human readable string.
 
Method Summary
 void add(Key that)
          Add/Increment these verses in the rankings
 void add(Key that, int count)
          DON'T USE THIS.
 void addAll(Key that)
          Adds the specified element to this set if it is not already present.
private  void alterVerseBase(Key that, int tally)
          Increment/Decrement these verses in the rankings
 void blur(int verses, RestrictionType restrict)
          Widen the range of the verses/keys in this list.
 void clear()
          Removes all of the elements from this set (optional operation).
 PassageTally clone()
          This needs to be declared here so that it is visible as a method on a derived Key.
 boolean contains(Key that)
          Does this tally contain all the specified verses?
 int countVerses()
          Returns the number of verses in this collection.
 void flatten()
          Take the verses in the tally and give them all an equal rank of 1.
 int getIndexOf(Verse verse)
          What is the index of the given verse in the current ordering scheme
 String getName()
          A Human readable version of the Key.
 String getName(int cnt)
          A Human readable version of the verse list.
 String getNameAndTally()
          A Human readable version of the PassageTally.
 String getNameAndTally(int cnt)
          A Human readable version of the PassageTally.
 PassageTally.Order getOrdering()
          Get how we sort the verses we output.
 int getTallyOf(Verse verse)
          The ranking given to a specific verse
 int getTotal()
           
private  void increment(int ord, int tally)
          Increment a verse by an amount
private  void incrementMax(int tally)
          Increment a verse by an amount
 boolean isEmpty()
          Does this Key have 0 members
 Iterator<Key> iterator()
          Iterate through the verse elements in the current sort order
private  void kill(int ord)
          Wipe the rank of the given verse to zero
 Iterator<Key> rangeIterator(RestrictionType restrict)
          Like verseElements(), but iterates over VerseRanges instead of Verses.
private  void readObject(ObjectInputStream in)
          Call the support mechanism in AbstractPassage
 void remove(Key that)
          Remove these verses from the rankings, ie, set their rank to zero.
 void removeAll(Key key)
          Removes the specified elements from this set if they are present.
private  void resetMax()
          Sometimes we end up not knowing what the max is - this makes sure we know accurately.
 void setOrdering(PassageTally.Order order)
          Set how we sort the verses we output.
 void setTotal(int total)
           
 String toString()
           
 Passage trimVerses(int count)
          Ensures that there are a maximum of count Verses in this Passage.
 void unAdd(Key that)
          Remove/Decrement these verses in the rankings
 void unAddAll(Passage that)
          Remove/Decrement these verses in the rankings
private  void writeObject(ObjectOutputStream out)
          Call the support mechanism in AbstractPassage
 
Methods inherited from class org.crosswire.jsword.passage.AbstractPassage
addPassageListener, addVerses, booksInPassage, canHaveChildren, compareTo, containsAll, countRanges, equals, fireContentsChanged, fireIntervalAdded, fireIntervalRemoved, get, getCardinality, getChildCount, getName, getOsisID, getOsisRef, getOverview, getParent, getRangeAt, getRootName, getVerseAt, getVersification, hashCode, hasRanges, indexOf, lowerEventSuppressionAndTest, lowerNormalizeProtection, normalize, optimizeReads, optimizeWrites, raiseEventSuppresion, raiseNormalizeProtection, readDescription, readObjectSupport, removePassageListener, retainAll, setParent, toVerseRange, toVerseRange, trimRanges, writeDescription, writeObjectSupport
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_TALLY

public static final int MAX_TALLY
The highest tally possible

See Also:
Constant Field Values

size

private int size

total

private int total

board

protected int[] board
The tally board itself


max

private int max
The maximum tally possible


order

private PassageTally.Order order
How the verses are ordered


log

private static final org.slf4j.Logger log
The log stream


serialVersionUID

private static final long serialVersionUID
Serialization ID

See Also:
Constant Field Values
Constructor Detail

PassageTally

public PassageTally(Versification v11n)
Create an empty PassageTally

Parameters:
v11n - The Versification to which this Passage belongs.

PassageTally

public PassageTally(Versification v11n,
                    String refs,
                    Key basis)
             throws NoSuchVerseException
Create a PassageTally from a human readable string. The opposite of toString()

Parameters:
v11n - The Versification to which this Passage belongs.
refs - The text to interpret
basis - The basis by which to interpret refs
Throws:
NoSuchVerseException - If refs is invalid

PassageTally

public PassageTally(Versification v11n,
                    String refs)
             throws NoSuchVerseException
Throws:
NoSuchVerseException
Method Detail

isEmpty

public boolean isEmpty()
Description copied from interface: Key
Does this Key have 0 members

Specified by:
isEmpty in interface Key
Overrides:
isEmpty in class AbstractPassage
Returns:
true if this set contains no elements.

countVerses

public int countVerses()
Description copied from interface: Passage
Returns the number of verses in this collection. Like Collection.size(). This does not mean the Passage needs to use Verses, just that it understands the concept.

Specified by:
countVerses in interface Passage
Overrides:
countVerses in class AbstractPassage
Returns:
the number of Verses in this collection
See Also:
Verse

setOrdering

public void setOrdering(PassageTally.Order order)
Set how we sort the verses we output. The options are Order.BIBLICAL and Order.TALLY.

Parameters:
order - the sort order

getOrdering

public PassageTally.Order getOrdering()
Get how we sort the verses we output.

Returns:
the sort order

getTotal

public int getTotal()
Returns:
the total

setTotal

public void setTotal(int total)
Parameters:
total - the total to set

clone

public PassageTally clone()
Description copied from interface: Key
This needs to be declared here so that it is visible as a method on a derived Key.

Specified by:
clone in interface Key
Overrides:
clone in class AbstractPassage
Returns:
A complete copy of ourselves

toString

public String toString()
Overrides:
toString in class AbstractPassage

getName

public String getName()
Description copied from interface: Key
A Human readable version of the Key. For Biblical passages this uses short book names and the shortest sensible rendering, for example "Mat 3:1-4", "Mar 1:1, 3, 5" and "3Jo, Jude"

Specified by:
getName in interface Key
Overrides:
getName in class AbstractPassage
Returns:
a String containing a description of the Key

getName

public String getName(int cnt)
A Human readable version of the verse list. Uses short book names and the shortest possible rendering, e.g. "Mat 3:1-4, 6"

Parameters:
cnt - The number of matches to return, 0 gives all matches
Returns:
a String containing a description of the verses

getNameAndTally

public String getNameAndTally()
A Human readable version of the PassageTally. Uses short book names and the shortest possible rendering, e.g. "Mat 3:1-4"

Returns:
a String containing a description of the verses

getNameAndTally

public String getNameAndTally(int cnt)
A Human readable version of the PassageTally. Uses short book names and the shortest possible rendering, e.g. "Mat 3:1-4"

Parameters:
cnt - The number of matches to return, 0 gives all matches
Returns:
a String containing a description of the verses

iterator

public Iterator<Key> iterator()
Iterate through the verse elements in the current sort order

Returns:
A verse Iterator
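
As an illustration of what "current sort order" means for iteration, the sketch below derives both orderings from a plain int[] board indexed by verse ordinal. TallyOrderSketch and its methods are hypothetical, and breaking ties between equal ranks by biblical order is an assumption, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of the two orderings; not the JSword iterators. */
final class TallyOrderSketch {
    /** Ordinals with a non-zero rank, in ascending (biblical) order. */
    static List<Integer> biblicalOrder(int[] board) {
        List<Integer> result = new ArrayList<>();
        for (int ord = 0; ord < board.length; ord++) {
            if (board[ord] > 0) {
                result.add(ord);
            }
        }
        return result;
    }

    /** The same ordinals, highest rank first; List.sort is stable, so ties keep biblical order. */
    static List<Integer> tallyOrder(int[] board) {
        List<Integer> result = biblicalOrder(board);
        result.sort(Comparator.comparingInt((Integer ord) -> board[ord]).reversed());
        return result;
    }

    public static void main(String[] args) {
        int[] board = {0, 3, 0, 1, 3, 2};
        System.out.println(biblicalOrder(board)); // prints [1, 3, 4, 5]
        System.out.println(tallyOrder(board));    // prints [1, 4, 5, 3]
    }
}
```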

rangeIterator

public Iterator<Key> rangeIterator(RestrictionType restrict)
Description copied from interface: Passage
Like verseElements(), but iterates over VerseRanges instead of Verses. Exactly the same data will be traversed; however, using rangeIterator() will usually give fewer iterations (and never more)

Specified by:
rangeIterator in interface Passage
Overrides:
rangeIterator in class AbstractPassage
Parameters:
restrict - Do we break ranges over chapters
Returns:
A list enumerator

contains

public boolean contains(Key that)
Does this tally contain all the specified verses?

Specified by:
contains in interface Key
Specified by:
contains in interface Passage
Overrides:
contains in class AbstractPassage
Parameters:
that - The verses to test for
Returns:
true if all the verses exist in this tally

getTallyOf

public int getTallyOf(Verse verse)
The ranking given to a specific verse

Parameters:
verse - The verse to get the ranking of
Returns:
The rank of the verse in question

getIndexOf

public int getIndexOf(Verse verse)
What is the index of the given verse in the current ordering scheme

Parameters:
verse - The verse to get the index of
Returns:
The index of the verse or -1 if the verse was not found

add

public void add(Key that)
Add/Increment these verses in the rankings

Parameters:
that - The verses to add/increment

add

public void add(Key that,
                int count)
DON'T USE THIS. It makes public something of the ratings scheme, which is not generally recommended. This method is likely to be removed at a moment's notice, and is only here to keep Mapper happy. Add/Increment these verses in the rankings

Parameters:
that - The verses to add/increment
count - The amount to increment by

unAdd

public void unAdd(Key that)
Remove/Decrement these verses in the rankings

Parameters:
that - The verses to remove/decrement

remove

public void remove(Key that)
Remove these verses from the rankings, ie, set their rank to zero.

Parameters:
that - The verses to remove/decrement

addAll

public void addAll(Key that)
Description copied from interface: Key
Adds the specified element to this set if it is not already present.

Specified by:
addAll in interface Key
Overrides:
addAll in class AbstractPassage
Parameters:
that - element to be added to this set.

unAddAll

public void unAddAll(Passage that)
Remove/Decrement these verses in the rankings

Parameters:
that - The verses to remove/decrement

removeAll

public void removeAll(Key key)
Description copied from interface: Key
Removes the specified elements from this set if they are present.

Specified by:
removeAll in interface Key
Overrides:
removeAll in class AbstractPassage
Parameters:
key - object to be removed from this set, if present.

clear

public void clear()
Description copied from interface: Key
Removes all of the elements from this set (optional operation). This set will be empty after this call returns (unless it throws an exception).

Specified by:
clear in interface Key
Overrides:
clear in class AbstractPassage

trimVerses

public Passage trimVerses(int count)
Ensures that there are a maximum of count Verses in this Passage. If there were more than count Verses, then a new Passage is created containing the Verses from count + 1 onwards. If there were no more than count Verses in the Passage, then the passage remains unchanged and null is returned.

Specified by:
trimVerses in interface Passage
Overrides:
trimVerses in class AbstractPassage
Parameters:
count - The maximum number of Verses to allow in this collection
Returns:
A new Passage containing the remaining verses or null
See Also:
Verse
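
The trimming contract described above (keep the first count entries, return the rest, or null if nothing was removed) can be sketched on an ordinary mutable list. TrimSketch is a hypothetical helper working on String references; it ignores the sort order, which in a real tally determines which verses come first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the trimVerses contract; not the JSword implementation. */
final class TrimSketch {
    /**
     * Keeps at most count entries in verses (which must be mutable) and returns
     * the removed tail as a new list, or null when nothing had to be removed.
     */
    static List<String> trimVerses(List<String> verses, int count) {
        if (verses.size() <= count) {
            return null;
        }
        List<String> tail = verses.subList(count, verses.size());
        List<String> remainder = new ArrayList<>(tail);
        tail.clear(); // clearing the sub-list view removes the tail from the backing list
        return remainder;
    }

    public static void main(String[] args) {
        List<String> verses = new ArrayList<>(Arrays.asList("Gen 1:1", "Gen 1:2", "Gen 1:3"));
        System.out.println(trimVerses(verses, 2)); // prints [Gen 1:3]
        System.out.println(verses);                // prints [Gen 1:1, Gen 1:2]
        System.out.println(trimVerses(verses, 5)); // prints null
    }
}
```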

flatten

public void flatten()
Take the verses in the tally and give them all an equal rank of 1. After this method has executed then both sorting methods for a.
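
A minimal sketch of flattening, assuming the tally is held as an int[] board in which a rank of 0 means the verse is absent. FlattenSketch is a hypothetical name and this is not the JSword code.

```java
import java.util.Arrays;

/** Hypothetical illustration of flattening a tally board. */
final class FlattenSketch {
    /** Gives every verse that is in the tally (rank > 0) a rank of exactly 1. */
    static void flatten(int[] board) {
        for (int ord = 0; ord < board.length; ord++) {
            if (board[ord] > 0) {
                board[ord] = 1;
            }
        }
    }

    public static void main(String[] args) {
        int[] board = {0, 3, 0, 7};
        flatten(board);
        System.out.println(Arrays.toString(board)); // prints [0, 1, 0, 1]
    }
}
```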


blur

public void blur(int verses,
                 RestrictionType restrict)
Description copied from interface: Key
Widen the range of the verses/keys in this list. This is primarily for "find x within n verses of y" type applications.

Specified by:
blur in interface Key
Overrides:
blur in class AbstractPassage
Parameters:
verses - The number of verses/keys to widen by
restrict - How should we restrict the blurring?
See Also:
Passage
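
The widening idea behind "find x within n verses of y" can be sketched on a set of verse ordinals. This is a simplified illustration only: BlurSketch is hypothetical, it ignores the restrict parameter, and it says nothing about what rank the newly included verses receive in a real PassageTally.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical illustration of blurring a set of verse ordinals. */
final class BlurSketch {
    /** Returns the ordinals widened by verses on each side, clamped to [0, maxOrdinal]. */
    static TreeSet<Integer> blur(TreeSet<Integer> ordinals, int verses, int maxOrdinal) {
        TreeSet<Integer> widened = new TreeSet<>();
        for (int ord : ordinals) {
            int from = Math.max(0, ord - verses);
            int to = Math.min(maxOrdinal, ord + verses);
            for (int i = from; i <= to; i++) {
                widened.add(i);
            }
        }
        return widened;
    }

    public static void main(String[] args) {
        TreeSet<Integer> hits = new TreeSet<>(Arrays.asList(10, 20));
        System.out.println(blur(hits, 2, 100)); // prints [8, 9, 10, 11, 12, 18, 19, 20, 21, 22]
    }
}
```

Two blurred tallies can then be intersected to find places where both search terms occur within n verses of each other.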

resetMax

private void resetMax()
Sometimes we end up not knowing what the max is - this makes sure we know accurately. Same with size.
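
One way such a recalculation could work is a single pass over the board. The sketch below is an assumption about the approach, not the JSword code: ResetMaxSketch is hypothetical, and it takes "size" to mean the number of verses with a non-zero rank.

```java
/** Hypothetical illustration of recomputing max and size from a tally board. */
final class ResetMaxSketch {
    /** Returns {max, size}: the highest rank on the board and the count of non-zero ranks. */
    static int[] recompute(int[] board) {
        int max = 0;
        int size = 0;
        for (int rank : board) {
            if (rank > 0) {
                size++;
            }
            if (rank > max) {
                max = rank;
            }
        }
        return new int[] {max, size};
    }

    public static void main(String[] args) {
        int[] result = recompute(new int[] {0, 3, 0, 1, 5});
        System.out.println(result[0] + " " + result[1]); // prints 5 3
    }
}
```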


alterVerseBase

private void alterVerseBase(Key that,
                            int tally)
Increment/Decrement these verses in the rankings

Parameters:
that - The verses to add/increment
tally - The amount to increment/decrement by

increment

private void increment(int ord,
                       int tally)
Increment a verse by an amount

Parameters:
ord - The verse to increment
tally - The amount to increase by

incrementMax

private void incrementMax(int tally)
Increment a verse by an amount

Parameters:
tally - The amount to increase by

kill

private void kill(int ord)
Wipe the rank of the given verse to zero

Parameters:
ord - The verse whose rank is to be wiped

writeObject

private void writeObject(ObjectOutputStream out)
                  throws IOException
Call the support mechanism in AbstractPassage

Parameters:
out - The stream to write our state to
Throws:
IOException - if the write fails
See Also:
AbstractPassage.writeObjectSupport(ObjectOutputStream)

readObject

private void readObject(ObjectInputStream in)
                 throws IOException,
                        ClassNotFoundException
Call the support mechanism in AbstractPassage

Parameters:
in - The stream to read our state from
Throws:
IOException - if the read fails
ClassNotFoundException - If the read data is incorrect
See Also:
AbstractPassage.readObjectSupport(ObjectInputStream)
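
For readers unfamiliar with the private writeObject/readObject hooks of Java serialization, the following self-contained sketch shows the general pattern on a small stand-in class. SerialTallySketch is hypothetical; it writes its own transient board directly and does not reproduce what writeObjectSupport and readObjectSupport in AbstractPassage actually do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in showing custom serialization hooks; not JSword code. */
final class SerialTallySketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by default serialization and written by hand below
    private transient int[] board = new int[4];

    void set(int ord, int tally) { board[ord] = tally; }

    int get(int ord) { return board[ord]; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(board.length);
        for (int rank : board) {
            out.writeInt(rank);
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        board = new int[in.readInt()];
        for (int ord = 0; ord < board.length; ord++) {
            board[ord] = in.readInt();
        }
    }

    /** Serializes and deserializes the given object in memory. */
    static SerialTallySketch roundTrip(SerialTallySketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerialTallySketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```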

Copyright © 2003-2011