Experimental Results

This page describes experiments on standard test collections using Ivory. Each experiment comes with a command-line invocation that runs it, command-line invocations that evaluate the results, and a JUnit test case that runs the experiment and checks that the results match the expected values (in terms of standard effectiveness metrics). We enumerate these experiments so that others can replicate our results and build on our work.
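The regression tests boil down to comparing an observed metric against a stored expected value. The sketch below shows that kind of check in POSIX shell with awk. It is not Ivory's JUnit code: the function name `check_metric` and the default tolerance of 0.0001 are assumptions made for illustration.

```shell
# check_metric EXPECTED OBSERVED [TOLERANCE]
# Succeeds (exit status 0) when |EXPECTED - OBSERVED| <= TOLERANCE.
# TOLERANCE defaults to 0.0001, an illustrative value rather than Ivory's.
check_metric() {
  awk -v e="$1" -v o="$2" -v t="${3:-0.0001}" \
    'BEGIN { d = e - o; if (d < 0) d = -d; exit !(d <= t) }'
}
```

For example, `check_metric 0.3063 "$observed_map" || echo "MAP differs from the expected value"` reports a mismatch, where `observed_map` is a hypothetical variable holding the value read from the trec_eval output.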

The experiments fall into the following categories:

  • Experiments on TREC Disks 4-5: baselines, weighted sequential dependence model, latent concept expansion
  • Experiments on Wt10g: baselines
  • Experiments on Gov2: baselines
  • Experiments on first segment of ClueWeb09 (i.e., TREC Web09 track, category B): baselines, incorporating Waterloo spam scores
  • Experiments from Wang et al.'s SIGIR 2010 paper: the efficient sequential dependence model (ESD) on Wt10g, Gov2, and Clue
  • Experiments from Wang et al.'s CIKM 2010 paper: temporally constrained linear models on Wt10g, Gov2, and Clue
  • Experiments from Wang et al.'s SIGIR 2011 paper: cascade model on Wt10g, Gov2, and Clue

TREC Disks 4-5

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.basic.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-base.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-fd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-base.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_Basic

description                       tag                 MAP     P10
Dirichlet, full independence      robust04-dir-base   0.3063  0.4424
Dirichlet, sequential dependence  robust04-dir-sd     0.3194  0.4485
Dirichlet, full dependence        robust04-dir-fd     0.3253  0.4576
bm25, full independence           robust04-bm25-base  0.3033  0.4283
bm25, sequential dependence       robust04-bm25-sd    0.3212  0.4505
bm25, full dependence             robust04-bm25-fd    0.3213  0.4545
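The six evaluation commands for the basic Robust04 runs differ only in the run tag, so they can be generated in a loop. The following is a minimal sketch in POSIX shell. The function names `robust04_eval_cmds` and `map_p10` are hypothetical, and the filter assumes that trec_eval prints the metric name as the first field, spelled `map` and `P_10` (or `P10` in older versions).

```shell
# Print one trec_eval command per basic Robust04 run (dry run; nothing is executed).
robust04_eval_cmds() {
  qrels=data/trec/qrels.robust04.noCRFR.txt
  for tag in dir-base dir-sd dir-fd bm25-base bm25-sd bm25-fd; do
    printf 'trec_eval %s ranking.robust04-%s.txt\n' "$qrels" "$tag"
  done
}

# Keep only the MAP and P10 lines of trec_eval output read from stdin.
# Assumption: the metric name is the first field and the value is the last field.
map_p10() {
  awk '$1 == "map" || $1 == "P_10" || $1 == "P10" { print $1, $NF }'
}
```

Piping the first function into `sh` (`robust04_eval_cmds | sh`) would execute the commands, and a single run can be summarized with `trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-base.txt | map_p10`.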

WSD models

WSD refers to Bendersky et al.'s Weighted Sequential Dependence model (WSDM 2010).

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.wsd.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_WSD

description                            tag                  MAP     P10
Dirichlet, WSD, sequential dependence  robust04-dir-wsd-sd  0.3246  0.4626
Dirichlet, WSD, full dependence        robust04-dir-wsd-fd  0.3283  0.4667

Basic + LCE models

LCE refers to Metzler et al.'s Latent Concept Expansion model (SIGIR 2007).

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.basic.lce.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-rm3-f.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-rm3-s.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-f.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-s.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-bigram.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_Basic_LCE

description                                       tag                         MAP     P10
Dir., full indep., LCE (unigrams) ["RM3"] (fast)  robust04-dir-rm3-f          0.3558  0.4596
Dir., full indep., LCE (unigrams) ["RM3"] (slow)  robust04-dir-rm3-s          0.3557  0.4596
Dir., SD, LCE (unigrams) (fast)                   robust04-dir-sd-lce-f       0.3789  0.4808
Dir., SD, LCE (unigrams) (slow)                   robust04-dir-sd-lce-s       0.3753  0.4657
Dir., SD, LCE (bigrams)                           robust04-dir-sd-lce-bigram  0.3510  0.4535

WSD + LCE models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.wsd.lce.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-lce.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_WSD_LCE

description                       tag                   MAP     P10
Dir., WSD, LCE (unigrams) (fast)  robust04-dir-wsd-lce  0.3941  0.4980

Wt10g

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal \
  data/wt10g/run.wt10g.basic.xml data/wt10g/queries.wt10g.451-500.xml data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-base.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-fd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-base.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Wt10g_Basic

description                       tag              MAP     P10
Dirichlet, full independence      wt10g-dir-base   0.2093  0.3131
Dirichlet, sequential dependence  wt10g-dir-sd     0.2187  0.3192
Dirichlet, full dependence        wt10g-dir-fd     0.2205  0.3242
bm25, full independence           wt10g-bm25-base  0.2105  0.3202
bm25, sequential dependence       wt10g-bm25-sd    0.2248  0.3333
bm25, full dependence             wt10g-bm25-fd    0.2226  0.3394

Gov2

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal \
  data/gov2/run.gov2.basic.xml data/gov2/gov2.title.701-775 data/gov2/gov2.title.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-base.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-fd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-base.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Gov2_Basic

description                       tag             MAP     P10
Dirichlet, full independence      gov2-dir-base   0.3077  0.5631
Dirichlet, sequential dependence  gov2-dir-sd     0.3239  0.6007
Dirichlet, full dependence        gov2-dir-fd     0.3237  0.5933
bm25, full independence           gov2-bm25-base  0.2999  0.5846
bm25, sequential dependence       gov2-bm25-sd    0.3294  0.6081
bm25, full dependence             gov2-bm25-fd    0.3295  0.6094

Web09 category B results

Baseline models

These are the same as our runs submitted to the TREC 2009 web track.

# command-line 
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.xml \
   data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB-bm25.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB-ql.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_Baseline

description  tag              MAP     P10
bm25         UMHOO-BM25-catB  0.2051  0.3720
QL           UMHOO-QL-catB    0.1931  0.3380

Dependence Models

These runs contrast baseline models with dependence models (Dirichlet vs. bm25 term weighting). SD is Metzler and Croft's Sequential Dependence model (SIGIR 2005), and WSD is Bendersky et al.'s Weighted Sequential Dependence model (WSDM 2010). Note that the SD model is not trained, since it has hard-coded parameters. On the other hand, the WSD model is trained with all queries from TREC 2009 (optimizing StatMAP), which makes the WSD figures unrealistically high, since we're testing on the training set.

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.all.xml \
 data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.wsd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.wsd.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_All

description      tag        MAP     P10
Dirichlet        ql-base    0.1931  0.3380
Dirichlet + SD   ql-sd      0.2048  0.3620
Dirichlet + WSD  ql-wsd     0.2120  0.3580
bm25             bm25-base  0.2051  0.3720
bm25 + SD        bm25-sd    0.2188  0.3920
bm25 + WSD       bm25-wsd   0.2205  0.3940
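To put the contrast between baseline and dependence models in relative terms, the gain over a baseline can be computed directly from the MAP values in the table. The helper below is a small sketch and not part of Ivory. The name `rel_gain` is hypothetical, and the percentages it produces are derived here from the table values rather than reported in the original experiments.

```shell
# rel_gain BASELINE VALUE
# Prints the relative change of VALUE over BASELINE as a percentage with two decimals.
rel_gain() {
  awk -v b="$1" -v v="$2" 'BEGIN { printf "%.2f\n", (v - b) / b * 100 }'
}
```

For instance, `rel_gain 0.1931 0.2048` (Dirichlet baseline versus Dirichlet + SD) prints 6.06, meaning a relative MAP improvement of about 6%.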

Dependence Models + Waterloo spam scores

These runs are the same as the set above, except that they include Waterloo spam scores. Training started from the models above, and the parameter space was then explored for the spam weight. Note that these figures are all unrealistically high, since we're testing on the training set.

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.all.spam.xml \
 data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.wsd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.wsd.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_All_Spam

description      tag        MAP     P10
Dirichlet        ql-base    0.2134  0.4540
Dirichlet + SD   ql-sd      0.2223  0.4560
Dirichlet + WSD  ql-wsd     0.2283  0.4160
bm25             bm25-base  0.2167  0.4220
bm25 + SD        bm25-sd    0.2280  0.4420
bm25 + WSD       bm25-wsd   0.2290  0.4340

Results from SIGIR 2010

These regression runs represent experiments presented in:

Lidan Wang, Jimmy Lin, and Donald Metzler. Learning to Efficiently Rank. Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010), pages 138-145, July 2010, Geneva, Switzerland.

Note that Bendersky et al.'s WSD model (WSDM 2010) is a special case of the ESD model (without any time constraints). These runs represent results in Tables 5 and 6 of the paper.

Wt10g results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2010.xml \
  data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-ql.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Wt10g_ESD

description                                 tag           MAP     P10
Dirichlet, full independence                wt10g-ql      0.2151  0.3560
Dirichlet, sequential dependence            wt10g-dir-sd  0.2242  0.3640
Dirichlet, ESD, no time constraints (=WSD)  wt10g-wsd-sd  0.2411  0.3820

Gov2 results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2010.xml  \
  data/gov2/gov2.title.776-850
    
# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-ql.txt
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-wsd-sd.txt
    
# junit
etc/junit.sh ivory.regression.sigir2010.Gov2_ESD

description                                 tag          MAP     P10
Dirichlet, full independence                gov2-ql      0.3195  0.5573
Dirichlet, sequential dependence            gov2-dir-sd  0.3357  0.5813
Dirichlet, ESD, no time constraints (=WSD)  gov2-wsd-sd  0.3435  0.5827

Web09 category B results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.SIGIR2010.xml \
  data/clue/queries.web09.26-50.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Clue_ESD

description                                 tag          MAP     statMAP
Dirichlet, full independence                clue-ql      0.2098  0.2075
Dirichlet, sequential dependence            clue-sd      0.2208  0.2168
Dirichlet, ESD, no time constraints (=WSD)  clue-wsd-sd  0.2212  0.2243

(Note that results reported in the SIGIR 2010 paper are StatMAP values, not standard MAP values.)

Results from CIKM 2010

These regression runs represent experiments presented in:

Lidan Wang, Donald Metzler, and Jimmy Lin. Ranking under Temporal Constraints. Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM 2010), pages 79-88, October 2010, Toronto, Canada.

Wt10g results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.title.indep.xml \
  data/wt10g/wt10g_queries_501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.title.joint.xml \
  data/wt10g/wt10g_queries_501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.desc.indep.xml \
  data/wt10g/wt10g_queries_501-550_desc.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.desc.joint.xml \
  data/wt10g/wt10g_queries_501-550_desc.xml
 
# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Wt10g_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Wt10g_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Wt10g_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Wt10g_Desc_Joint

MAP            Title             Description
Title queries  Indep    Joint    Indep    Joint
QL x 1.0       0.1695   0.2200   0.1753   0.1778
QL x 1.5       0.2143   0.2230   0.1859   0.1937
QL x 2.0       0.2292   0.2325   0.1957   0.2048
QL x 2.5       0.2299   0.2275   0.2048   0.2136
QL x 3.0       0.2333   0.2307   0.2094   0.2126
QL x 3.5       0.2323   0.2358   0.2101   0.2133
QL x 4.0       0.2366   0.2366   0.2155   0.2201
QL x 4.5       0.2413   0.2378   0.2178   0.2226
QL x 5.0       0.2422   0.2387   0.2173   0.2175
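The 36 evaluation commands in this experiment follow a regular pattern: two query types, two models, and nine "x" values from 1.0 to 5.0. A sketch of generating them in POSIX shell is given below. The function name `cikm2010_eval_cmds` and its two-argument interface are hypothetical, and the function only prints the commands rather than running them.

```shell
# cikm2010_eval_cmds COLLECTION QRELS
# Prints the 36 trec_eval commands (2 query types x 2 models x 9 "x" values); dry run only.
cikm2010_eval_cmds() {
  for qtype in title desc; do
    for model in indep joint; do
      for x in 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do
        printf 'trec_eval %s ranking.cikm2010-%s-%s-%s-x%s.txt\n' \
          "$2" "$1" "$qtype" "$model" "$x"
      done
    done
  done
}
```

Calling `cikm2010_eval_cmds wt10g data/wt10g/qrels.wt10g` reproduces the Wt10g list, and appending `| sh` would execute it. The same function covers the other two collections with `gov2 data/gov2/qrels.gov2.all` and `clue data/clue/qrels.web09catB.txt` as arguments.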

Gov2 results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.title.indep.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.title.joint.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.desc.indep.xml \
  data/gov2/gov2.desc.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.desc.joint.xml \
  data/gov2/gov2.desc.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Gov2_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Gov2_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Gov2_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Gov2_Desc_Joint

MAP            Title             Description
Title queries  Indep    Joint    Indep    Joint
QL x 1.0       0.1761   0.3174   0.2505   0.3067
QL x 1.5       0.2724   0.3201   0.2889   0.3089
QL x 2.0       0.3296   0.3361   0.3028   0.3115
QL x 2.5       0.3325   0.3364   0.3104   0.3104
QL x 3.0       0.3379   0.3506   0.3161   0.3277
QL x 3.5       0.3421   0.3522   0.3211   0.3260
QL x 4.0       0.3512   0.3524   0.3238   0.3335
QL x 4.5       0.3540   0.3524   0.3263   0.3302
QL x 5.0       0.3550   0.3524   0.3285   0.3294

Clue results


# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.title.indep.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.title.joint.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.desc.indep.xml \
  data/clue/queries.web09.26-50.desc.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.desc.joint.xml \
  data/clue/queries.web09.26-50.desc.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Web09catB_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Web09catB_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Web09catB_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Web09catB_Desc_Joint

MAP            Title             Description
Title queries  Indep    Joint    Indep    Joint
QL x 1.0       0.1134   0.2256   0.1393   0.1683
QL x 1.5       0.1916   0.2256   0.1625   0.1563
QL x 2.0       0.2097   0.2306   0.1446   0.1573
QL x 2.5       0.2118   0.2315   0.1434   0.1558
QL x 3.0       0.2159   0.2324   0.1461   0.1561
QL x 3.5       0.2152   0.2348   0.1502   0.1567
QL x 4.0       0.2218   0.2360   0.1499   0.1587
QL x 4.5       0.2248   0.2360   0.1492   0.1567
QL x 5.0       0.2250   0.2360   0.1496   0.1561

(Note that the results reported in the CIKM 2010 paper are StatMAP values, not the standard MAP values reported in the table above.)

Results from SIGIR 2011

These regression runs represent experiments presented in:

Lidan Wang, Jimmy Lin, and Donald Metzler. A Cascade Ranking Model for Efficient Ranked Retrieval. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 105-114, July 2011, Beijing, China.

Wt10g results

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.xml \
  data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-QL.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-AdaRank.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-FeaturePrune.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Wt10g_Cascade

description                                    tag                 NDCG20  P20
Baseline query-likelihood (Dirichlet scoring)  Wt10g-QL            0.3407  0.3240
AdaRank                                        Wt10g-AdaRank       0.3549  0.3350
Feature pruning (SIGIR 2010)                   Wt10g-FeaturePrune  0.3486  0.3310
Cascade                                        Wt10g-Cascade       0.3560  0.3380

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/wt10g/queries.wt10g.501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.varying.tradeoff.cascade.xml \
  data/wt10g/queries.wt10g.501-550.xml

# junit
etc/junit.sh ivory.regression.sigir2011.Wt10g_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Wt10g_VaryingTradeoff_Cascade

Gov2 results

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.xml \
  data/gov2/gov2.title.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-QL.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-AdaRank.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-FeaturePrune.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Gov2_Cascade

description                                    tag                NDCG    P20
Baseline query-likelihood (Dirichlet scoring)  Gov2-QL            0.4457  0.5093
AdaRank                                        Gov2-AdaRank       0.4737  0.5360
Feature pruning (SIGIR 2010)                   Gov2-FeaturePrune  0.4716  0.5187
Cascade                                        Gov2-Cascade       0.4744  0.5447

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.varying.tradeoff.cascade.xml \
  data/gov2/gov2.title.776-850

# junit
etc/junit.sh ivory.regression.sigir2011.Gov2_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Gov2_VaryingTradeoff_Cascade

Web09 category B results (i.e. Clue)

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.xml \
  data/clue/queries.web09.26-50.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-QL.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-AdaRank.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-FeaturePrune.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Clue_Cascade

description                                    tag                NDCG    P20
Baseline query-likelihood (Dirichlet scoring)  Clue-QL            0.2750  0.3420
AdaRank                                        Clue-AdaRank       0.3094  0.3740
Feature pruning (SIGIR 2010)                   Clue-FeaturePrune  0.2966  0.3620
Cascade                                        Clue-Cascade       0.3060  0.3740

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.varying.tradeoff.cascade.xml \
  data/clue/queries.web09.26-50.xml

# junit
etc/junit.sh ivory.regression.sigir2011.Clue_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Clue_VaryingTradeoff_Cascade
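
The Figure 3 regression tests for the three collections share one naming pattern, so the six invocations can be listed with a short loop. This is a sketch only: the function name `sigir2011_tradeoff_tests` is hypothetical, and the function prints the commands instead of running them.

```shell
# Print the six Figure 3 regression-test commands (3 collections x 2 models); dry run only.
sigir2011_tradeoff_tests() {
  for coll in Wt10g Gov2 Clue; do
    for model in FeaturePrune Cascade; do
      printf 'etc/junit.sh ivory.regression.sigir2011.%s_VaryingTradeoff_%s\n' \
        "$coll" "$model"
    done
  done
}
```

Running `sigir2011_tradeoff_tests | sh` from the Ivory root directory would execute the tests one after another.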