public class DocLengthTable4B extends Object implements DocLengthTable
Object that keeps track of the length of each document in the collection as a four-byte integer (int). Document lengths are measured in number of terms.
Document length data is stored in a serialized data file, in the following format:
Since the documents are numbered sequentially starting at d + 1, each int corresponds unambiguously to a particular document.
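The sequential numbering described above can be illustrated with a small in-memory sketch. This is not the actual implementation of `DocLengthTable4B`: the class name `DocLengthSketch` is hypothetical, and the sketch assumes that `d` is the docno immediately before the first document, so that docno `d + 1` maps to index 0 of an `int[]`.

```java
/** Hypothetical in-memory stand-in; NOT the actual DocLengthTable4B implementation. */
final class DocLengthSketch {
    // Assumption: d is the docno just before the first document in the collection.
    private final int d;
    // lengths[i] holds the length (in terms) of the document with docno d + 1 + i.
    private final int[] lengths;

    DocLengthSketch(int d, int[] lengths) {
        this.d = d;
        this.lengths = lengths.clone();
    }

    /** Number of documents held in the table. */
    int getDocCount() {
        return lengths.length;
    }

    /** Length of the document with the given docno, using the d + 1 numbering. */
    int getDocLength(int docno) {
        int index = docno - d - 1;
        if (index < 0 || index >= lengths.length) {
            throw new IllegalArgumentException("docno out of range: " + docno);
        }
        return lengths[index];
    }

    /** Average document length, or 0 for an empty table. */
    float getAvgDocLength() {
        long total = 0;
        for (int len : lengths) {
            total += len;
        }
        return lengths.length == 0 ? 0f : (float) total / lengths.length;
    }
}
```

With `d = 100` and lengths `{12, 30, 18}`, docno 102 maps to index 1 and therefore has length 30. Because each position in the stream belongs to exactly one docno, no per-document identifier needs to be stored alongside the lengths.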

| Constructor and Description |
|---|
| `DocLengthTable4B(Path file)` Creates a new `DocLengthTable4B`. |
| `DocLengthTable4B(Path file, FileSystem fs)` Creates a new `DocLengthTable4B`. |

| Modifier and Type | Method and Description |
|---|---|
| `float` | `getAvgDocLength()` Returns the average document length. |
| `int` | `getDocCount()` Returns number of documents in the collection. |
| `int` | `getDocLength(int docno)` Returns the length of a document. |
| `int` | `getDocnoOffset()` Returns the first docno in this collection. |

public DocLengthTable4B(Path file) throws IOException
Creates a new DocLengthTable4B.
Parameters: file - document length data file
Throws: IOException

public DocLengthTable4B(Path file, FileSystem fs) throws IOException
Creates a new DocLengthTable4B.
Parameters: file - document length data file; fs - FileSystem to read from
Throws: IOException

public float getAvgDocLength()
Returns the average document length.
Specified by: getAvgDocLength in interface DocLengthTable

public int getDocCount()
Returns number of documents in the collection.
Specified by: getDocCount in interface DocLengthTable

public int getDocLength(int docno)
Returns the length of a document.
Specified by: getDocLength in interface DocLengthTable

public int getDocnoOffset()
Returns the first docno in this collection.
Specified by: getDocnoOffset in interface DocLengthTable