Tutorial @ ECML PKDD 2018

This is the home page of the “Efficiency/Effectiveness Trade-offs in Learning to Rank” tutorial by Claudio Lucchese and Franco Maria Nardini at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2018.

email: ltrtutorial [AT] isti.cnr.it

Abstract

In recent years, learning to rank (LtR) has had a significant influence on several data mining tasks, in particular in the Information Retrieval field, with large research efforts coming from both academia and industry. Efficiency requirements must also be fulfilled to make an effective research product deployable within an industrial environment. Evaluating a model can be too expensive because of its size, the features it uses, and several other factors. This tutorial discusses recent solutions for building an effective ranking model that satisfies time budget constraints at evaluation time.

Speakers

Claudio Lucchese is an Associate Professor with the Università Ca’ Foscari di Venezia. Until 2017 he was a researcher with the I.S.T.I. “A. Faedo” – C.N.R., working with the High Performance Computing Lab. His main research activities are in the areas of data mining techniques for information retrieval and large-scale data processing. He has published more than 100 papers on these topics in peer-reviewed international journals, conferences and other venues. He taught courses on data mining and parallel computing in the M.Sc. Computer Science courses at the University of Florence. He taught cloud computing and data management in Ph.D. courses at the University of Pisa and the Institute for Advanced Studies in Lucca. He is currently teaching high performance computing in the M.Sc. Computer Science courses at the University of Venice, and Web Search and Data Mining in the Ph.D. course of the same university.

Franco Maria Nardini is a researcher with the Italian National Research Council. He received his Ph.D. in Information Engineering from the University of Pisa in 2011. His research interests focus on Web Information Retrieval (IR), Data Mining (DM), and Machine Learning. He has served as a program committee member of several top-level IR and DM conferences. He has authored more than 50 papers in peer-reviewed international journals, conferences and other venues. He taught courses on Web mining and parallel computing in the M.Sc. Computer Science and Telecommunication Engineering courses at the University of Pisa. He also taught cloud computing in Ph.D. courses at the University of Pisa and the Institute for Advanced Studies in Lucca. He is currently teaching computer architectures in the M.Sc. Telecommunication Engineering course at the University of Pisa.

Claudio Lucchese and Franco Maria Nardini received the Best Paper Award at SIGIR 2015 for their work on efficient traversal of Learning-to-Rank models. ACM DL.

General Software Requirements

To make it easier to use the material we will supply during the tutorial, we strongly encourage you to install the following software:

  • GCC 5.0 or above.
  • CMake 2.9 or above.
  • Python 2.7 (with IPython and Jupyter Notebook)
  • QuickRank (open source, instructions to install it can be found here)
  • RankEval (open source, instructions to install it can be found here)
    • NOTE: you need to install RankEval v0.5.1
      • with git: git clone --branch v0.5.1 https://github.com/hpclab/rankeval.git
      • or download the zip file (source code) here.
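
The minimum versions listed above can be checked before the tutorial with a few lines of Python. The following is only a minimal sketch, not part of the tutorial material: the helper names (parse_version, meets_minimum, installed_version, check_requirements) are hypothetical, and it assumes that gcc and cmake are on the PATH and print a dotted version number when run with --version. It uses only the standard library and is written to run under Python 2.7 as well as Python 3.

```python
import re
import subprocess


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """True if the version tuple found is at least minimum, e.g. (5, 4, 0) vs (5, 0)."""
    return found is not None and found >= minimum


def installed_version(command):
    """Run a command such as ['gcc', '--version'] and parse its output.

    Returns None if the command is missing or exits with an error.
    """
    try:
        output = subprocess.check_output(command, stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_version(output.decode("utf-8", "replace"))


# Minimum versions taken from the list above; the command names are assumptions.
REQUIREMENTS = [
    ("GCC", ["gcc", "--version"], (5, 0)),
    ("CMake", ["cmake", "--version"], (2, 9)),
]


def check_requirements(requirements=REQUIREMENTS):
    """Map each tool name to True (new enough) or False (missing or too old)."""
    return dict(
        (name, meets_minimum(installed_version(command), minimum))
        for name, command, minimum in requirements
    )
```

Calling check_requirements() returns a dictionary such as {"GCC": True, "CMake": False}, where False means the tool was not found or is older than the listed minimum. The check for Boost is left out because Boost has no single command-line tool that reports its version.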

Specific Software Requirements for QuickScorer

QuickScorer is the state-of-the-art algorithm for scoring forests of regression trees. QuickScorer is undergoing a patent process. The source code of QuickScorer is made available under NDA with Tiscali S.p.A. The attendees of the tutorial can access the source code of QuickScorer by signing this NDA. After receiving a signed copy of the NDA we will send you the source code of QuickScorer.

The software requirements needed to compile and run QuickScorer are:

  • GCC 5.0 or above.
  • CMake 2.9 or above.
  • Boost 1.60 or above.

Useful Software

Our hands-on sessions also use:

  • perf
  • source code of VPred by Asadi et al., available here.

Tutorial Material

  • Slides, part 1: here.
  • Slides, part 2: here.
  • Slides, part 3: here.
  • Hands-On 1, Jupyter Notebook: GitHub.
  • Hands-On 2, Jupyter Notebook: GitHub.
  • Hands-On 1/2, QuickRank models used: here.