Timeline

What we did
First, we did a literature review to find out which visualization type is best for displaying frequencies. A bar chart turned out to be the most suitable, so that is what we used. We used D3.js to create a bar chart that shows how many documents were retrieved for each year. We decided to group documents by year, since our collection was too small to group them by month. We also made the bars clickable, so the user can easily refine their query to a single year. This is one example of faceted search in our engine (other examples are the date range and the sender frequency table).
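The grouping and the click-to-refine step can be illustrated with a minimal sketch. This is not the code of our engine (the chart itself is drawn with D3.js); it is a Python illustration under the assumption that every document carries an optional date. The function names count_per_year and year_to_range are hypothetical and chosen only for this example.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Optional, Tuple


def count_per_year(dates: Iterable[Optional[date]]) -> Dict[int, int]:
    """Count documents per calendar year, skipping documents without a date."""
    counts = Counter(d.year for d in dates if d is not None)
    # Sort by year so the bars appear in chronological order.
    return dict(sorted(counts.items()))


def year_to_range(year: int) -> Tuple[date, date]:
    """Translate a clicked bar into a half-open date range [start, end)."""
    return date(year, 1, 1), date(year + 1, 1, 1)
```

Each key of the returned dictionary corresponds to one bar, and each value to its height. When a bar is clicked, the range from year_to_range could be used as the date filter of the refined query, in the same way as the date range facet.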

What works well
We're happy that we did research on this, because the bar chart indeed turned out to be very effective for this visualization. Instead of scanning a table, the user can immediately see which year has the most documents, and it's also easy to compare years visually without having to read the actual numbers. A simple table would not have offered this. Also, the visualization renders quickly (thanks to D3's efficiency), the bar chart is accurate, and the clickable bars add a nice faceted search aspect to the search engine.

What has to be improved
The dates were extracted from the XML files. However, some XML files didn't contain a date. Because the indexer required us to specify a date, we set the date for these files to null, so that Elasticsearch would exclude them from the counts. This does mean, however, that documents without a date are not represented in the bar chart.
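One possible way to make this gap visible is to report the number of undated documents next to the chart. The sketch below is an assumption and not part of our current engine: it builds a request body that combines Elasticsearch's date_histogram aggregation with its missing aggregation, and reads both results from the response. The field name "date", the aggregation names and both function names are hypothetical, and the calendar_interval parameter assumes a recent Elasticsearch version (older versions use interval instead).

```python
from typing import Any, Dict, Tuple


def build_request(date_field: str = "date") -> Dict[str, Any]:
    """Build a search body that counts documents per year and undated documents."""
    return {
        "size": 0,  # only the aggregations are needed, not the hits themselves
        "aggs": {
            "per_year": {
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": "year",
                    "format": "yyyy",  # bucket labels become e.g. "2001"
                }
            },
            # Counts the documents in which the date field is null or absent.
            "undated": {"missing": {"field": date_field}},
        },
    }


def summarize(response: Dict[str, Any]) -> Tuple[Dict[int, int], int]:
    """Extract (documents per year, number of undated documents) from a response."""
    aggs = response["aggregations"]
    per_year = {
        int(bucket["key_as_string"]): bucket["doc_count"]
        for bucket in aggs["per_year"]["buckets"]
    }
    return per_year, aggs["undated"]["doc_count"]
```

The undated count could then be shown as a short note below the bar chart, so the user knows how many retrieved documents are not represented by any bar.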

Evaluation of quality
The bar chart is a great way to visualize the date distribution of the documents. It's easy for the user to understand, and the clickable bars make it easy to refine the query. A drawback is that not all documents are included, because not all documents have a date specified. However, this is out of our hands, since the dates are missing from the XML files themselves.