Course on inverted index

2017-01-17 | computer science, conference

I gave a three-hour course on inverted indexes to students from Telecom SudParis, an engineering school here in... Paris :) It was fun to refresh my knowledge of all the fundamental structures that make Lucene what it is.

I covered quite a lot of ground for a three-hour course (a bit too much, to be honest). Amongst other things: B-trees, inverted indexes, how analyzers and filters do most of the magic (synonyms, n-grams, phonetic approximation, stemming, etc.), how fuzzy search works in Lucene (state machine based), scoring, log-structured merge, the actual physical representation of a Lucene index, and a few of the tricks the Lucene developers came up with. My list of reference links is pretty rich too.
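
For readers who have not met the structure before, here is a minimal sketch of an inverted index in Python. It is not taken from the slides: the `analyze`, `build_index`, and `search` names are made up for illustration, and the toy analyzer only lowercases and splits on non-alphanumeric characters, which is far less than what Lucene's analyzers and filters do.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (an assumption, not Lucene's): lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """AND query: intersect the posting lists of all query terms."""
    terms = analyze(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


docs = [
    "Lucene is a search library",
    "An inverted index maps terms to documents",
    "Lucene builds an inverted index",
]
index = build_index(docs)
print(search(index, "Inverted INDEX"))  # [1, 2]
print(search(index, "lucene index"))    # [2]
```

The same analyzer runs at index time and at query time, which is why the mixed-case query still matches. Swapping in a richer `analyze` (stemming, synonyms, n-grams) changes what matches without touching the index structure itself.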

Without further ado, here is the presentation. I tend to keep my slides sparse, so make sure to press s to see the speaker notes. The presentation is released under Creative Commons and the sources are on GitHub.

It is a first revision and can definitely benefit from a few improvements, but there are only so many hours in a day :)


Name: Emmanuel Bernard
Bio tags: French, Open Source actor, Hibernate, (No)SQL, JCP, JBoss, Snowboard, Economy
Employer: JBoss by Red Hat
Resume: LinkedIn
Team blog: in.relation.to
Personal blog: No relation to
Microblog: Twitter, Google+
Geoloc: Paris, France
