Ivory

A Hadoop toolkit for web-scale information retrieval research

These regression runs correspond to the experiments presented in Wang et al.'s SIGIR 2010 paper on Learning to Efficiently Rank, specifically the results in Tables 5 and 6 of the paper. Note that Bendersky et al.'s WSD model (WSDM 2010) is a special case of the ESD model: it is ESD without any time constraints.

Wt10g results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2010.xml data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-ql.txt
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-sd.txt
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Wt10g_ESD

description                                 tag           MAP     P10
Dirichlet, full independence                wt10g-ql      0.2151  0.3560
Dirichlet, sequential dependence            wt10g-dir-sd  0.2242  0.3640
Dirichlet, ESD, no time constraints (=WSD)  wt10g-wsd-sd  0.2411  0.3820
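
The evaluation commands above print trec_eval's full summary, while the table only reports MAP and P10. A minimal sketch of a filter that keeps just those two rows is shown here. The function name `summarize` is hypothetical and not part of Ivory. It assumes trec_eval's usual three-column summary (measure, query id, value) and accepts either `P10` or `P_10` as the measure name, since the spelling may differ between trec_eval releases.

```shell
# Hypothetical helper (not part of Ivory): keep only the MAP and P10 rows
# of a trec_eval summary read from standard input.
# Assumption: rows look like "measure <tab> all <tab> value".
summarize() {
  awk '$2 == "all" && ($1 == "map" || $1 == "P10" || $1 == "P_10") { print $1, $3 }'
}

# Usage with the Wt10g files listed above (left commented out because it
# needs trec_eval and the ranking files to be present):
# for tag in ql sd wsd-sd; do
#   echo "wt10g-$tag"
#   etc/trec_eval data/wt10g/qrels.wt10g.all "ranking.sigir2010-wt10g-$tag.txt" | summarize
# done
```

The same filter could be applied to the Gov2 and Web09 commands by swapping in the qrels and ranking file names listed in those sections.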

Gov2 results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2010.xml data/gov2/queries.gov2.title.776-850.xml
    
# evaluating effectiveness
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-ql.txt
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-sd.txt
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-wsd-sd.txt
    
# junit
etc/junit.sh ivory.regression.sigir2010.Gov2_ESD

description                                 tag          MAP     P10
Dirichlet, full independence                gov2-ql      0.3195  0.5573
Dirichlet, sequential dependence            gov2-dir-sd  0.3357  0.5813
Dirichlet, ESD, no time constraints (=WSD)  gov2-wsd-sd  0.3435  0.5827

Web09 category B results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.SIGIR2010.xml data/clue/queries.web09.26-50.xml

# evaluating effectiveness
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Clue_ESD

description                                 tag          MAP     P10
Dirichlet, full independence                clue-ql      0.2098  0.2075
Dirichlet, sequential dependence            clue-sd      0.2208  0.2168
Dirichlet, ESD, no time constraints (=WSD)  clue-wsd-sd  0.2212  0.2243

(Note that results reported in the SIGIR 2010 paper are StatMAP values, not standard MAP values.)
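
When checking a fresh run by hand against the expected values in the tables above, a small tolerance-based comparison can avoid false alarms from rounding. The following is only a sketch: the function name `close_enough` and the default tolerance of 0.0001 are assumptions made for illustration, not values taken from Ivory's junit tests.

```shell
# Hypothetical helper (not part of Ivory): succeed when two scores agree
# within a tolerance. Arguments: expected, actual, optional tolerance.
# Assumption: the default tolerance of 0.0001 is illustrative only.
close_enough() {
  awk -v e="$1" -v a="$2" -v t="${3:-0.0001}" \
    'BEGIN { d = e - a; if (d < 0) d = -d; exit !(d <= t) }'
}

# Usage: compare the expected MAP for wt10g-ql with a measured value.
# close_enough 0.2151 "$measured_map" && echo "wt10g-ql MAP matches"
```

The comparison is delegated to awk because POSIX shell arithmetic does not handle floating-point values.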