How to leverage vehicle data

“Internet of Things: There are lots of things, but they’re not talking to each other,” as Heise recently headlined.

As a data lover with project experience in the automotive sector, I have dealt extensively with concepts for connecting vehicles. In this blog article, however, I want to take a step back and first look at the connectedness within a vehicle, the data that originates there, and whether and how automakers and their suppliers already use the web to collect comprehensive data from cars for profiling.

As it happens, the issue is much more complicated than you might think. These days it’s not uncommon for a new car to feature 70 or more ECUs (electronic control units), each of them constantly collecting data. And it’s this huge variety of control units that complicates matters: on the one hand they are made by various manufacturers and operate autonomously, but on the other hand they share various in-vehicle bus systems ranging from tried-and-trusted CAN buses to FlexRay, MOST, and modern Ethernet-based systems for transmitting camera data. These bus systems in turn connect via gateways controlled by the automaker.
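To make the bus layer a little more concrete, here is a minimal Python sketch of how a raw frame from a CAN bus is unpacked on a Linux system using the SocketCAN wire format. The frame contents are made up for illustration:

```python
import struct

# Linux SocketCAN wire format: 4-byte CAN ID, 1-byte payload length (DLC),
# 3 padding bytes, then up to 8 data bytes.
CAN_FRAME_FMT = "<IB3x8s"

def parse_can_frame(raw: bytes):
    """Split a 16-byte SocketCAN frame into (arbitration ID, payload)."""
    can_id, dlc, data = struct.unpack(CAN_FRAME_FMT, raw)
    return can_id & 0x1FFFFFFF, data[:dlc]  # mask off the flag bits

# Hypothetical frame: ID 0x123 carrying two data bytes 0xDE 0xAD
frame = struct.pack(CAN_FRAME_FMT, 0x123, 2, bytes([0xDE, 0xAD]) + bytes(6))
can_id, payload = parse_can_frame(frame)
print(hex(can_id), payload.hex())  # 0x123 dead
```

Note that this only yields raw bytes; what those bytes *mean* is exactly the proprietary-format problem discussed below.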

Data storage: Then as now, a trade-off with losses

Another factor is the cost of permanent data storage. Since solutions such as EEPROM were always expensive in the past, ECUs were given very little storage capacity to keep costs down. Consequently, data had to be – and still is – stored in a highly aggregated form, in which the information loses much of its granularity and context. This means that often only the most recent fault can be identified, and data such as how often the vehicle has performed an emergency stop is stored with no relationship to other events and no timestamp. What’s more, each ECU stores its data in a proprietary format dictated by the software it is running.
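To illustrate what such aggregated storage looks like, here is a small Python sketch with an entirely hypothetical, vendor-specific fault record – a fault code and an occurrence counter, but no timestamp and no link to other events:

```python
import struct

# Hypothetical, vendor-specific fault record: a 2-byte fault code and a
# 2-byte occurrence counter. Note what is missing: no timestamp, no
# relationship to other events -- the aggregation described above.
def decode_fault_record(blob: bytes):
    code, count = struct.unpack(">HH", blob)
    return {"fault_code": hex(code), "occurrences": count}

record = decode_fault_record(bytes.fromhex("C1210007"))
print(record)  # {'fault_code': '0xc121', 'occurrences': 7}
```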

This heterogeneity leads to an almost unfathomable wealth of differently coded data formats, each of which can usually only be decoded by the relevant department of the ECU software manufacturer in question. Only the statutory diagnostics readings laid out in ISO 15031 are generally accessible – but these make up just a fraction of all the stored data.
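The statutory readings, by contrast, have publicly documented scalings. As a small Python illustration, this is how the engine-speed reading (service 0x01, PID 0x0C) is decoded from a raw diagnostic response – per the standard scaling, RPM = (256 · A + B) / 4 for the two data bytes A and B:

```python
# Decoding one of the statutory readings: service 0x01, PID 0x0C (engine RPM).
# A positive response echoes 0x41 followed by the PID and two data bytes.
def decode_engine_rpm(response: bytes) -> float:
    assert response[0] == 0x41 and response[1] == 0x0C, "not a PID 0x0C reply"
    a, b = response[2], response[3]
    return (256 * a + b) / 4  # standard scaling for this PID

print(decode_engine_rpm(bytes([0x41, 0x0C, 0x1A, 0xF8])))  # 1726.0
```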

So ECU suppliers cannot access all the data coded into other suppliers’ ECUs, while vehicle manufacturers have even less access – despite the fact that it is they who are the central gateway for all a vehicle’s ECUs. While there are historical technological reasons for this, it’s also the case that full access to all ECU data is a valuable prize – and one that I suspect no one is keen to share at the present time. I think it will be a while before both sides recognize that this could be a win-win situation.

Now I would like to briefly present the specific way in which the Bosch Group deals with this issue. I helped develop this approach in a project on systematic field data retrieval, and my hope is that it might ultimately enable us to increase our reliability in the field and offer even better customer service.

Challenge: Heterogeneous data and anonymization

Our first step was to look for ways to solve the problems posed by the heterogeneity of existing ECU data. ODX (Open Diagnostic data eXchange) seemed an appropriate solution; developed by the Association for Standardization of Automation and Measuring Systems (ASAM), this format specifies exactly how to describe decoding and structure rules. But a description is not enough on its own; an ‘interpreter’ is needed to translate the memory dumps from the ECUs into readable objects using the decoding and structure rules described in ODX. So we developed a code generator that turns ODX definitions into Java classes for data structures and generates executable decoding code with which to fill these data structures. In this way, we were able to create a kind of lingua franca for any kind of data structure of the sort commonly found in microcontrollers. This means objects can be stored along with their metadata in a MongoDB-based database without altering their structure.
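Our actual solution generates Java classes from ODX definitions; as a toy analogue, the following Python sketch (field names and layout are invented, and real ODX is far richer) shows the underlying idea of a declarative structure description plus a generic interpreter that turns a raw memory dump into a plain object, ready to be stored with its metadata in a document database:

```python
import struct

# Invented, ODX-style structure description for a hypothetical battery ECU:
# (field name, byte offset, struct format, scale factor)
BATTERY_RECORD = [
    ("voltage_mv",    0, ">H", 1),
    ("temperature_c", 2, ">b", 1),
    ("cycle_count",   3, ">H", 1),
]

def interpret(dump: bytes, description):
    """Generic interpreter: apply a structure description to a memory dump."""
    obj = {}
    for name, offset, fmt, scale in description:
        (raw,) = struct.unpack_from(fmt, dump, offset)
        obj[name] = raw * scale
    return obj

dump = struct.pack(">HbH", 12600, -5, 341)  # hypothetical ECU memory dump
print(interpret(dump, BATTERY_RECORD))
```

The point of the design is that only the description changes per ECU; the interpreter (or, in our case, the generated decoding code) stays generic.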

Unlike proprietary solutions that are not tied to the vehicle, such as those that work via a smartphone connection to the car’s OBD box, our solution is not limited to reading the statutory – and hence generally accessible – static diagnostics data. We also developed a way to guarantee that each car’s unique, internationally recognized vehicle identification number (VIN) remains anonymous: with our system, the VIN is recoded using various algorithms to make it impossible to trace any data back to the vehicle’s owner.
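I won’t go into the actual recoding algorithms here, but as one plausible Python sketch of the idea, a keyed hash (HMAC) turns the VIN into a stable pseudonym: without the secret key it cannot be traced back to the vehicle, yet the same VIN always maps to the same pseudonym, so records stay linkable for analysis:

```python
import hmac, hashlib

# Hypothetical key -- in practice this would be securely stored and managed.
SECRET_KEY = b"replace-with-a-securely-stored-key"

def pseudonymize_vin(vin: str) -> str:
    """Recode a VIN into an irreversible but stable pseudonym."""
    return hmac.new(SECRET_KEY, vin.encode("ascii"), hashlib.sha256).hexdigest()

# Example-format VIN, not a real vehicle
p1 = pseudonymize_vin("WVWZZZ1JZ3W386752")
p2 = pseudonymize_vin("WVWZZZ1JZ3W386752")
print(p1 == p2, len(p1))  # True 64
```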

Future challenge: Gaining added value by using data

We have created a technical solution for reading and saving any data from any ECU. But now we face new challenges: ECU suppliers and automakers have recognized that ECU data is extremely valuable. Use case #1: connected validation, which could save automakers cost-intensive product recalls. Use case #2: access to comprehensive, tamper-proof vehicle profiles (such as automatically collected mileage data), which would also be a great unique selling point for a used-car portal, say. But a functioning data market for this type of application has yet to develop in the Internet of Things. By that I don’t mean there should be a trade in this data to benefit Big Brother; rather, it should create added value for the automotive ecosystem: more reliability in the field and superior customer service and experience. One possible solution might be to form a joint organization to act as a data broker.

Do you have any other suggestions?


About the author

Alexander Rieger

In 1997 I was among the six computer scientists who co-founded Bosch Software Innovations. Right from the beginning, I was responsible for the development of the supply chain management system of Europe’s third-largest retailer, dealing with billions of data sets. I was also responsible for the initial development of Singapore’s eMobility charging infrastructure platform. I have many years of experience in software engineering and architecture, especially when it comes to artificial intelligence and data management issues. Ever since I started studying, I have been engaged with programming genetic algorithms and neural networks. After graduating, I was involved in the development of a data mining application for one of the world’s leading integrated financial services providers. Currently I am developing an enterprise-wide concept for NoSQL data storage topped by a KDD (Knowledge Discovery in Databases) process.