Ivory: Experimental Results

This page describes experiments on standard test collections using Ivory. Each experiment comes with a command-line invocation that runs it, command-line invocations that evaluate the results, and a JUnit test case that runs the experiment and checks that the results match the expected values on standard effectiveness metrics. We list these experiments so that others can replicate our results and build on our work from a solid foundation.
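As a rough shell counterpart to the "results are as expected" check, the sketch below compares an observed metric against a value from the tables on this page. The function name `metric_matches` and the default tolerance of 0.0001 are assumptions made for illustration; the JUnit test cases may compare values differently.

```shell
# Succeed (exit status 0) when |observed - expected| <= tolerance.
# The name and the default tolerance of 0.0001 are illustrative assumptions.
metric_matches() {
  awk -v o="$1" -v e="$2" -v t="${3:-0.0001}" 'BEGIN {
    d = o - e
    if (d < 0) d = -d
    # A tiny slack absorbs floating-point rounding at the boundary.
    exit !(d <= t + 1e-12)
  }'
}

# Usage:
# metric_matches 0.3063 0.3063 && echo "MAP matches the table"
```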

The experiments are grouped as follows:

Experiments on TREC Disks 4-5: baselines, weighted sequential dependence model, latent concept expansion
Experiments on Wt10g: baselines
Experiments on Gov2: baselines
Experiments on first segment of ClueWeb09 (i.e., TREC Web09 track, category B): baselines, incorporating Waterloo spam scores
Experiments from Wang et al.'s SIGIR 2010 paper: the efficient sequential dependence model (ESD) on Wt10g, Gov2, and Clue
Experiments from Wang et al.'s CIKM 2010 paper: temporally constrained linear models on Wt10g, Gov2, and Clue
Experiments from Wang et al.'s SIGIR 2011 paper: cascade model on Wt10g, Gov2, and Clue

TREC Disks 4-5

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.basic.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-base.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-fd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-base.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_Basic

description	tag	MAP	P10
Dirichlet, full independence	robust04-dir-base	0.3063	0.4424
Dirichlet, sequential dependence	robust04-dir-sd	0.3194	0.4485
Dirichlet, full dependence	robust04-dir-fd	0.3253	0.4576
bm25, full independence	robust04-bm25-base	0.3033	0.4283
bm25, sequential dependence	robust04-bm25-sd	0.3212	0.4505
bm25, full dependence	robust04-bm25-fd	0.3213	0.4545
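Rows like the ones in the table above can be pulled out of trec_eval output with a small filter. This is a minimal sketch, assuming trec_eval prints one measure per line as name, query id, and value; the helper name `summarize_run` is hypothetical, and the precision-at-10 label is matched as either `P_10` or `P10` because it may differ between trec_eval releases.

```shell
# Read trec_eval output on stdin and print "tag<TAB>MAP<TAB>P10".
# summarize_run is a hypothetical helper, not part of Ivory or trec_eval.
summarize_run() {
  awk -v tag="$1" '
    $1 == "map"                 { map = $3 }
    $1 == "P_10" || $1 == "P10" { p10 = $3 }
    END { printf "%s\t%s\t%s\n", tag, map, p10 }
  '
}

# Usage (assumes trec_eval is on PATH and the ranking files exist):
# for tag in robust04-dir-base robust04-dir-sd robust04-dir-fd; do
#   trec_eval data/trec/qrels.robust04.noCRFR.txt "ranking.$tag.txt" | summarize_run "$tag"
# done
```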

WSD models

WSD refers to Bendersky et al.'s Weighted Sequential Dependence model (WSDM 2010).

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.wsd.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-sd.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_WSD

description	tag	MAP	P10
Dirichlet, WSD, sequential dependence	robust04-dir-wsd-sd	0.3246	0.4626
Dirichlet, WSD, full dependence	robust04-dir-wsd-fd	0.3283	0.4667

Basic + LCE models

LCE refers to Metzler and Croft's Latent Concept Expansion model (SIGIR 2007).

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.basic.lce.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-rm3-f.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-rm3-s.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-f.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-s.txt
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-sd-lce-bigram.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_Basic_LCE

description	tag	MAP	P10
Dir., full indep., LCE (unigrams) ["RM3"] (fast)	robust04-dir-rm3-f	0.3558	0.4596
Dir., full indep., LCE (unigrams) ["RM3"] (slow)	robust04-dir-rm3-s	0.3557	0.4596
Dir., SD, LCE (unigrams) (fast)	robust04-dir-sd-lce-f	0.3789	0.4808
Dir., SD, LCE (unigrams) (slow)	robust04-dir-sd-lce-s	0.3753	0.4657
Dir., SD, LCE (bigrams)	robust04-dir-sd-lce-bigram	0.3510	0.4535

WSD + LCE models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/trec/run.robust04.wsd.lce.xml data/trec/queries.robust04.xml

# evaluating effectiveness
trec_eval data/trec/qrels.robust04.noCRFR.txt ranking.robust04-dir-wsd-lce.txt

# junit
etc/junit.sh ivory.regression.basic.Robust04_WSD_LCE

description	tag	MAP	P10
Dir., WSD, LCE (unigrams) (fast)	robust04-dir-wsd-lce	0.3941	0.4980

Wt10g

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal \
  data/wt10g/run.wt10g.basic.xml data/wt10g/queries.wt10g.451-500.xml data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-base.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-dir-fd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-base.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.wt10g-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Wt10g_Basic

description	tag	MAP	P10
Dirichlet, full independence	wt10g-dir-base	0.2093	0.3131
Dirichlet, sequential dependence	wt10g-dir-sd	0.2187	0.3192
Dirichlet, full dependence	wt10g-dir-fd	0.2205	0.3242
bm25, full independence	wt10g-bm25-base	0.2105	0.3202
bm25, sequential dependence	wt10g-bm25-sd	0.2248	0.3333
bm25, full dependence	wt10g-bm25-fd	0.2226	0.3394

Gov2

Basic models

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal \
  data/gov2/run.gov2.basic.xml data/gov2/gov2.title.701-775 data/gov2/gov2.title.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-base.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-dir-fd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-base.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.gov2-bm25-fd.txt

# junit
etc/junit.sh ivory.regression.basic.Gov2_Basic

description	tag	MAP	P10
Dirichlet, full independence	gov2-dir-base	0.3077	0.5631
Dirichlet, sequential dependence	gov2-dir-sd	0.3239	0.6007
Dirichlet, full dependence	gov2-dir-fd	0.3237	0.5933
bm25, full independence	gov2-bm25-base	0.2999	0.5846
bm25, sequential dependence	gov2-bm25-sd	0.3294	0.6081
bm25, full dependence	gov2-bm25-fd	0.3295	0.6094

Web09 category B results

Baseline models

These are the same as our runs submitted to the TREC 2009 web track.

# command-line 
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.xml \
   data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB-bm25.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB-ql.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_Baseline

description	tag	MAP	P10
bm25	UMHOO-BM25-catB	0.2051	0.3720
QL	UMHOO-QL-catB	0.1931	0.3380

Dependence Models

These runs contrast baseline models with dependence models (Dirichlet vs. bm25 term weighting). SD is Metzler and Croft's Sequential Dependence model (SIGIR 2005), and WSD is Bendersky et al.'s Weighted Sequential Dependence model (WSDM 2010). The SD model is not trained, since its parameters are hard-coded. The WSD model, on the other hand, is trained on all queries from TREC 2009 (optimizing StatMAP), so the WSD figures are unrealistically high: we are testing on the training set.

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.all.xml \
 data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.ql.wsd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.all.bm25.wsd.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_All

description	tag	MAP	P10
Dirichlet	ql-base	0.1931	0.3380
Dirichlet + SD	ql-sd	0.2048	0.3620
Dirichlet + WSD	ql-wsd	0.2120	0.3580
bm25	bm25-base	0.2051	0.3720
bm25 + SD	bm25-sd	0.2188	0.3920
bm25 + WSD	bm25-wsd	0.2205	0.3940
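To read the contrast in the table above as relative improvements over a baseline, the arithmetic can be sketched as follows. The helper name `rel_gain` is hypothetical; it only computes 100 * (new - base) / base and says nothing about statistical significance.

```shell
# Print the relative change from a baseline score to a new score, in percent.
# rel_gain is a hypothetical helper introduced for illustration.
rel_gain() {
  awk -v b="$1" -v n="$2" 'BEGIN { printf "%+.2f%%\n", 100 * (n - b) / b }'
}

# Usage, with MAP values from the table above:
# rel_gain 0.1931 0.2048   # Dirichlet -> Dirichlet + SD
# rel_gain 0.2051 0.2188   # bm25 -> bm25 + SD
```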

Dependence Models + Waterloo spam scores

These runs are the same as the set above, except that they include Waterloo spam scores. Training started from the models above, and the parameter space was then explored for the spam weight. Note that these figures are all unrealistically high, since we are testing on the training set.

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.all.spam.xml \
 data/clue/queries.web09.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.ql.wsd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.base.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.web09catB.spam.bm25.wsd.txt

# junit
etc/junit.sh ivory.regression.basic.Web09catB_All_Spam

description	tag	MAP	P10
Dirichlet	ql-base	0.2134	0.4540
Dirichlet + SD	ql-sd	0.2223	0.4560
Dirichlet + WSD	ql-wsd	0.2283	0.4160
bm25	bm25-base	0.2167	0.4220
bm25 + SD	bm25-sd	0.2280	0.4420
bm25 + WSD	bm25-wsd	0.2290	0.4340

Results from SIGIR 2010

These regression runs represent experiments presented in:

Lidan Wang, Jimmy Lin, and Donald Metzler. Learning to Efficiently Rank. Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010), pages 138-145, July 2010, Geneva, Switzerland.

Note that Bendersky et al.'s WSD model (WSDM 2010) is a special case of the ESD model (without any time constraints). These runs represent results in Tables 5 and 6 of the paper.

Wt10g results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2010.xml \
  data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-ql.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-sd.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.sigir2010-wt10g-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Wt10g_ESD

description	tag	MAP	P10
Dirichlet, full independence	wt10g-ql	0.2151	0.3560
Dirichlet, sequential dependence	wt10g-dir-sd	0.2242	0.3640
Dirichlet, ESD, no time constraints (=WSD)	wt10g-wsd-sd	0.2411	0.3820

Gov2 results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2010.xml  \
  data/gov2/gov2.title.776-850
    
# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-ql.txt
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-sd.txt
trec_eval data/gov2/qrels.gov2.all ranking.sigir2010-gov2-wsd-sd.txt
    
# junit
etc/junit.sh ivory.regression.sigir2010.Gov2_ESD

description	tag	MAP	P10
Dirichlet, full independence	gov2-ql	0.3195	0.5573
Dirichlet, sequential dependence	gov2-dir-sd	0.3357	0.5813
Dirichlet, ESD, no time constraints (=WSD)	gov2-wsd-sd	0.3435	0.5827

Web09 category B results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.web09catB.SIGIR2010.xml \
  data/clue/queries.web09.26-50.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
trec_eval data/clue/qrels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-ql.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-sd.txt
perl etc/statAP_MQ_eval_v3.pl data/clue/prels.web09catB.txt ranking.sigir2010-web09catB-wsd-sd.txt

# junit
etc/junit.sh ivory.regression.sigir2010.Clue_ESD

description	tag	MAP	StatMAP
Dirichlet, full independence	clue-ql	0.2098	0.2075
Dirichlet, sequential dependence	clue-sd	0.2208	0.2168
Dirichlet, ESD, no time constraints (=WSD)	clue-wsd-sd	0.2212	0.2243

(Note that results reported in the SIGIR 2010 paper are StatMAP values, not standard MAP values.)

Results from CIKM 2010

These regression runs represent experiments presented in:

Lidan Wang, Donald Metzler, and Jimmy Lin. Ranking under Temporal Constraints. Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM 2010), pages 79-88, October 2010, Toronto, Canada.

Wt10g results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.title.indep.xml \
  data/wt10g/wt10g_queries_501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.title.joint.xml \
  data/wt10g/wt10g_queries_501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.desc.indep.xml \
  data/wt10g/wt10g_queries_501-550_desc.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.CIKM2010.desc.joint.xml \
  data/wt10g/wt10g_queries_501-550_desc.xml
 
# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-indep-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-title-joint-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-indep-x5.0.txt

trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x1.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x1.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x2.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x2.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x3.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x3.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x4.0.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x4.5.txt
trec_eval data/wt10g/qrels.wt10g ranking.cikm2010-wt10g-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Wt10g_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Wt10g_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Wt10g_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Wt10g_Desc_Joint

MAP	Title, Indep	Title, Joint	Description, Indep	Description, Joint
QL x 1.0	0.1695	0.2200	0.1753	0.1778
QL x 1.5	0.2143	0.2230	0.1859	0.1937
QL x 2.0	0.2292	0.2325	0.1957	0.2048
QL x 2.5	0.2299	0.2275	0.2048	0.2136
QL x 3.0	0.2333	0.2307	0.2094	0.2126
QL x 3.5	0.2323	0.2358	0.2101	0.2133
QL x 4.0	0.2366	0.2366	0.2155	0.2201
QL x 4.5	0.2413	0.2378	0.2178	0.2226
QL x 5.0	0.2422	0.2387	0.2173	0.2175
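The evaluation commands for these runs follow a regular naming pattern (query type, model, and the x1.0 through x5.0 suffix), so the list of ranking files can be generated instead of typed out. This is a minimal sketch; the function name `cikm2010_rankings` is hypothetical, and the collection argument is assumed to be the name used in the file names (`wt10g`, `gov2`, or `clue`).

```shell
# Print the 36 CIKM 2010 ranking file names for one collection, one per line,
# in the same order as the trec_eval commands on this page.
# cikm2010_rankings is a hypothetical helper, not part of Ivory.
cikm2010_rankings() {
  collection=$1
  for queries in title desc; do
    for model in indep joint; do
      for factor in 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0; do
        echo "ranking.cikm2010-$collection-$queries-$model-x$factor.txt"
      done
    done
  done
}

# Usage (assumes trec_eval is on PATH and the ranking files exist):
# cikm2010_rankings wt10g | while read -r f; do
#   trec_eval data/wt10g/qrels.wt10g "$f"
# done
```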

Gov2 results

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.title.indep.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.title.joint.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.desc.indep.xml \
  data/gov2/gov2.desc.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.CIKM2010.desc.joint.xml \
  data/gov2/gov2.desc.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-indep-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-title-joint-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-indep-x5.0.txt

trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x1.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x1.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x2.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x2.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x3.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x3.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x4.0.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x4.5.txt
trec_eval data/gov2/qrels.gov2.all ranking.cikm2010-gov2-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Gov2_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Gov2_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Gov2_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Gov2_Desc_Joint

MAP	Title, Indep	Title, Joint	Description, Indep	Description, Joint
QL x 1.0	0.1761	0.3174	0.2505	0.3067
QL x 1.5	0.2724	0.3201	0.2889	0.3089
QL x 2.0	0.3296	0.3361	0.3028	0.3115
QL x 2.5	0.3325	0.3364	0.3104	0.3104
QL x 3.0	0.3379	0.3506	0.3161	0.3277
QL x 3.5	0.3421	0.3522	0.3211	0.3260
QL x 4.0	0.3512	0.3524	0.3238	0.3335
QL x 4.5	0.3540	0.3524	0.3263	0.3302
QL x 5.0	0.3550	0.3524	0.3285	0.3294

Clue results


# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.title.indep.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.title.joint.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.desc.indep.xml \
  data/clue/queries.web09.26-50.desc.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.CIKM2010.desc.joint.xml \
  data/clue/queries.web09.26-50.desc.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-indep-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-title-joint-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-indep-x5.0.txt

trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x1.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x1.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x2.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x2.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x3.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x3.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x4.0.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x4.5.txt
trec_eval data/clue/qrels.web09catB.txt ranking.cikm2010-clue-desc-joint-x5.0.txt

# junit
etc/junit.sh ivory.regression.cikm2010.Web09catB_Title_Indep
etc/junit.sh ivory.regression.cikm2010.Web09catB_Title_Joint
etc/junit.sh ivory.regression.cikm2010.Web09catB_Desc_Indep
etc/junit.sh ivory.regression.cikm2010.Web09catB_Desc_Joint

MAP	Title, Indep	Title, Joint	Description, Indep	Description, Joint
QL x 1.0	0.1134	0.2256	0.1393	0.1683
QL x 1.5	0.1916	0.2256	0.1625	0.1563
QL x 2.0	0.2097	0.2306	0.1446	0.1573
QL x 2.5	0.2118	0.2315	0.1434	0.1558
QL x 3.0	0.2159	0.2324	0.1461	0.1561
QL x 3.5	0.2152	0.2348	0.1502	0.1567
QL x 4.0	0.2218	0.2360	0.1499	0.1587
QL x 4.5	0.2248	0.2360	0.1492	0.1567
QL x 5.0	0.2250	0.2360	0.1496	0.1561

(Note that the results reported in the CIKM 2010 paper are StatMAP values, not the standard MAP values reported in the tables above.)

Results from SIGIR 2011

These regression runs represent experiments presented in:

Lidan Wang, Jimmy Lin, and Donald Metzler. A Cascade Ranking Model for Efficient Ranked Retrieval. Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011), pages 105-114, July 2011, Beijing, China.

Wt10g results

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.xml \
  data/wt10g/queries.wt10g.501-550.xml

# evaluating effectiveness
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-QL.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-AdaRank.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-FeaturePrune.txt
trec_eval data/wt10g/qrels.wt10g.all ranking.SIGIR2011-Wt10g-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Wt10g_Cascade

description	tag	NDCG20	P20
Baseline query-likelihood (Dirichlet scoring)	Wt10g-QL	0.3407	0.3240
AdaRank	Wt10g-AdaRank	0.3549	0.3350
Feature pruning (SIGIR 2010)	Wt10g-FeaturePrune	0.3486	0.3310
Cascade	Wt10g-Cascade	0.3560	0.3380

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/wt10g/queries.wt10g.501-550.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/wt10g/run.wt10g.SIGIR2011.varying.tradeoff.cascade.xml \
  data/wt10g/queries.wt10g.501-550.xml

# junit
etc/junit.sh ivory.regression.sigir2011.Wt10g_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Wt10g_VaryingTradeoff_Cascade

Gov2 results

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.xml \
  data/gov2/gov2.title.776-850

# evaluating effectiveness
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-QL.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-AdaRank.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-FeaturePrune.txt
trec_eval data/gov2/qrels.gov2.all ranking.SIGIR2011-Gov2-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Gov2_Cascade

description	tag	NDCG20	P20
Baseline query-likelihood (Dirichlet scoring)	Gov2-QL	0.4457	0.5093
AdaRank	Gov2-AdaRank	0.4737	0.5360
Feature pruning (SIGIR 2010)	Gov2-FeaturePrune	0.4716	0.5187
Cascade	Gov2-Cascade	0.4744	0.5447

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/gov2/gov2.title.776-850
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/gov2/run.gov2.SIGIR2011.varying.tradeoff.cascade.xml \
  data/gov2/gov2.title.776-850

# junit
etc/junit.sh ivory.regression.sigir2011.Gov2_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Gov2_VaryingTradeoff_Cascade

Web09 category B results (i.e., Clue)

Main results in Table 2:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.xml \
  data/clue/queries.web09.26-50.xml

# evaluating effectiveness
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-QL.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-AdaRank.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-FeaturePrune.txt
trec_eval data/clue/qrels.web09catB.txt ranking.SIGIR2011-Clue-Cascade.txt

# junit
etc/junit.sh ivory.regression.sigir2011.Clue_Cascade

description	tag	NDCG20	P20
Baseline query-likelihood (Dirichlet scoring)	Clue-QL	0.2750	0.3420
AdaRank	Clue-AdaRank	0.3094	0.3740
Feature pruning (SIGIR 2010)	Clue-FeaturePrune	0.2966	0.3620
Cascade	Clue-Cascade	0.3060	0.3740

Results in Figure 3:

# command-line
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.varying.tradeoff.featureprune.xml \
  data/clue/queries.web09.26-50.xml
etc/run.sh ivory.smrf.retrieval.RunQueryLocal data/clue/run.clue.SIGIR2011.varying.tradeoff.cascade.xml \
  data/clue/queries.web09.26-50.xml

# junit
etc/junit.sh ivory.regression.sigir2011.Clue_VaryingTradeoff_FeaturePrune
etc/junit.sh ivory.regression.sigir2011.Clue_VaryingTradeoff_Cascade