public class SampleTermDocVectors extends Configured implements Tool
A program that samples from a collection of key-value pairs according to a given frequency.
To change the key and value class types, the user must modify the source file: change the input and output class types of the mapper, and update the three static fields accordingly.
Here's a sample invocation:
usage: [input] [output-dir] [number-of-mappers] [sample-freq] ([sample-docnos-path])

hadoop jar ivory.jar ivory.util.SampleDocVectors /umd-lin/fture/pwsim/medline/wt-term-doc-vectors /umd-lin/fture/pwsim/medline/wt-term-doc-vectors-sample 100

If there is a text file containing docnos to be sampled (one docno per line), it should be specified as the fifth and last argument. In this case, the sample frequency argument can be anything since it will be ignored.
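The two sampling modes described above can be illustrated with a minimal plain-Java sketch. It is not the Ivory implementation: the class `SamplingSketch` and both helper methods are hypothetical, and the sketch assumes that a sample frequency of N means "keep docnos divisible by N", which the actual mapper may or may not do. It also leaves out Hadoop entirely and works on an in-memory list of docnos.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SamplingSketch {

    // Hypothetical helper: frequency-based sampling.
    // ASSUMPTION: "sample-freq = N" keeps every docno divisible by N.
    static List<Integer> sampleByFrequency(List<Integer> docnos, int sampleFreq) {
        if (sampleFreq <= 0) {
            throw new IllegalArgumentException("sampleFreq must be positive");
        }
        List<Integer> kept = new ArrayList<>();
        for (int docno : docnos) {
            if (docno % sampleFreq == 0) {
                kept.add(docno);
            }
        }
        return kept;
    }

    // Hypothetical helper: list-based sampling.
    // Keeps only docnos named in the sample set (one docno per line in the
    // text file, per the description above); the frequency is ignored.
    static List<Integer> sampleByDocnoList(List<Integer> docnos, Set<Integer> wanted) {
        List<Integer> kept = new ArrayList<>();
        for (int docno : docnos) {
            if (wanted.contains(docno)) {
                kept.add(docno);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> docnos = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            docnos.add(i);
        }

        // Frequency mode: prints [5, 10]
        System.out.println(sampleByFrequency(docnos, 5));

        // Docno-list mode: prints [2, 7] (42 is not in the collection)
        Set<Integer> wanted = new HashSet<>(Arrays.asList(2, 7, 42));
        System.out.println(sampleByDocnoList(docnos, wanted));
    }
}
```

In a MapReduce setting, the same per-record test would presumably sit inside the mapper, which emits a key-value pair only when the test passes.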
Modifier and Type | Class and Description |
---|---|
static class | SampleTermDocVectors.MyReducer |
Constructor and Description |
---|
SampleTermDocVectors() |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
int | run(String[] args) |
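Methods inherited from class org.apache.hadoop.conf.Configured: getConf, setConf
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable: getConf, setConf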