A Hadoop toolkit for web-scale information retrieval research
These regression runs correspond to the experiments in Wang et al.'s SIGIR 2010 paper on Learning to Efficiently Rank, specifically the results in Tables 5 and 6 of the paper. Note that Bendersky et al.'s WSD model (WSDM 2010) is a special case of the ESD model (without any time constraints).
# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2010.xml data/wt10g/queries.wt10g.501-550.xml
# evaluating effectiveness
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-ql.txt
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-sd.txt
etc/trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-wsd-sd.txt
# junit
etc/junit.sh ivory.regression.sigir2010.Wt10g_ESD
| description | tag | MAP | P10 |
| Dirichlet, full independence | wt10g-ql | 0.2151 | 0.3560 |
| Dirichlet, sequential dependence | wt10g-dir-sd | 0.2242 | 0.3640 |
| Dirichlet, ESD, no time constraints (=WSD) | wt10g-wsd-sd | 0.2411 | 0.3820 |
# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2010.xml data/gov2/queries.gov2.title.776-850.xml
# evaluating effectiveness
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-ql.txt
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-sd.txt
etc/trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-wsd-sd.txt
# junit
etc/junit.sh ivory.regression.sigir2010.Gov2_ESD
| description | tag | MAP | P10 |
| Dirichlet, full independence | gov2-ql | 0.3195 | 0.5573 |
| Dirichlet, sequential dependence | gov2-dir-sd | 0.3357 | 0.5813 |
| Dirichlet, ESD, no time constraints (=WSD) | gov2-wsd-sd | 0.3435 | 0.5827 |
# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.SIGIR2010.xml data/clue/queries.web09.26-50.xml
# evaluating effectiveness
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
etc/trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt
# junit
etc/junit.sh ivory.regression.sigir2010.Clue_ESD
| description | tag | MAP | P10 |
| Dirichlet, full independence | clue-ql | 0.2098 | 0.2075 |
| Dirichlet, sequential dependence | clue-sd | 0.2208 | 0.2168 |
| Dirichlet, ESD, no time constraints (=WSD) | clue-wsd-sd | 0.2212 | 0.2243 |
(Note that results reported in the SIGIR 2010 paper are StatMAP values, not standard MAP values.)
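The trec_eval commands for each collection follow the same pattern: one qrels file and three ranking files whose names differ only in the model suffix (ql, sd, wsd-sd). As a convenience, they can be generated with a small helper. This is a minimal sketch, not part of Ivory: the function name `eval_cmds` is hypothetical, and it assumes the ranking files are written to the current directory under the names listed above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ivory): print the trec_eval
# command for each of the three runs of one collection.
#   $1 = collection name as it appears in the ranking file names
#   $2 = path to the qrels file
eval_cmds() {
  collection="$1"
  qrels="$2"
  # Model suffixes taken from the ranking file names used on this page.
  for model in ql sd wsd-sd; do
    echo "etc/trec_eval $qrels ranking.sigir2010-$collection-$model.txt"
  done
}

# Dry run: print the three Gov2 evaluation commands.
eval_cmds gov2 data/gov2/qrels.gov2.all
```

The helper only prints the commands, so the output can be inspected first; piping it to `sh` from the Ivory root directory would execute them.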