Swiss Transport in Real Time: Tribulations in the Big Data Stack


A lot of data are available in real time on Swiss public transportation: vehicle positions, station boards (with delays), etc.

We use these data to illustrate a common pattern and build a proof-of-concept project. The idea is to address the question: "Is it possible to build a simple, scalable infrastructure to dispatch, transform and visualize massive data in near real time, and to support a posteriori analysis?"

We will describe such an infrastructure, focusing on its building blocks:

  • streaming events with Kafka and Logstash;
  • flow transformation with Akka or Play Streaming;
  • storage in Elasticsearch;
  • real-time visualization with ReactJS and d3.js;
  • a posteriori analysis with Python and Jupyter;
  • not to forget DevOps with Docker, GCE and AWS.
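
As a rough illustration of the transformation and storage steps listed above, here is a minimal Python sketch using only the standard library. It parses one station-board event, derives the delay, and serializes the result as newline-delimited JSON in the shape expected by the Elasticsearch bulk API. The event schema, the field names and the "stationboard" index name are assumptions made for this example, not the actual feed format; a real pipeline would read events from Kafka and post the payload to a cluster.

```python
import json
from datetime import datetime

# Hypothetical event: the field names below are illustrative assumptions,
# not the schema of the real Swiss transport feed.
RAW_EVENT = (
    '{"station": "Lausanne", "line": "IC 1", '
    '"scheduled": "2016-03-01T08:15:00", "expected": "2016-03-01T08:18:30"}'
)


def transform(raw):
    """Parse one station-board event and add the delay in seconds."""
    event = json.loads(raw)
    scheduled = datetime.fromisoformat(event["scheduled"])
    expected = datetime.fromisoformat(event["expected"])
    event["delay_seconds"] = int((expected - scheduled).total_seconds())
    return event


def to_bulk_payload(docs, index="stationboard"):
    """Serialize documents as NDJSON: one action line, then one source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_bulk_payload([transform(RAW_EVENT)]), end="")
```

Keeping the transformation a pure function of a single event makes it easy to test offline and to reuse in a notebook for a posteriori analysis.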

Figure: real-time visualization example