Cruising in a Data Lake: From zero to scale

As part of the Highly Automated Driving (HAD) group at HERE Technologies, we build a High-Definition Map (HDMap) of the real world to make autonomous driving possible. Given the complexity of our data enrichment pipelines and the petabyte scale of rich, unstructured content, we need a mechanism that avoids data silos and provides one centralized way to access, evaluate, and analyze data across multiple systems. In this talk we will outline the principles and technology behind our approach to building a data lake that addresses these challenges. We will also provide guidelines for implementing and scaling up a data lake using Apache Spark in the cloud.