Annotell, one of the first startups invited to MobilityXlab, trains computers by producing millions of images describing how a vehicle should interpret the surrounding world. This is an essential step if autonomous, self-driving vehicles are to function in the future of mobility. I met with Oscar Petersson, one of the co-founders of Annotell, at MobilityXlab’s office in the center of Lindholmen, which is one of the fastest-growing automotive clusters in the world. He’s a very busy man nowadays, but it was quite easy to steal some of his time when I said we would be talking with him about his passion: building a leading global supplier of consistent, high-quality training data.
Every entrepreneur has a passion for what they do. How did training data become your passion?
I have always been interested in finding a better solution to things that don't work well. There is a special feeling when you come across a problem and realize that there is room for improvement: your creativity kicks in and you start to create your own solution.
One of these problems was that training data is a major bottleneck for most machine learning teams. Training data is what you use to train your algorithms; it’s basically examples of how the vehicle should interpret the world, almost like a picture book for children.
The problem is that you need a lot of training data, millions of images, and it is very hard to produce such a volume of training data cost-efficiently with high consistency and high quality. After using different global providers of training data ourselves, we couldn’t find any that we were satisfied with. Either the quality was too low, or the price was too high. The reason is that the approach they use is wrong. Most providers of training data focus on having access to large numbers of people or having the quickest annotation tool to deal with the large volume of data. These aspects are very important, but the core problem that they miss is ensuring consistent interpretations among a large group of people. It is very hard to ensure that 500 people interpret, for example, what counts as drivable snow in a consistent way. But if you want to create an application, like an autonomous vehicle, which should behave in a consistent way, then consistency in the training data is very important. Realizing that this is the core problem, and how to solve it, is the reason why training data has become a passion and why Annotell was founded.
Do you ever think more broadly than “just” training data?
Of course. Annotell’s mission is to provide the world’s best platform for agreement at scale, which extends beyond just training data for the automotive industry. In addition, a lot of our customers ask if we can provide more parts of the value chain, like data gathering, algorithms and end-user applications, but to become really good at something, you need to focus and conquer one thing at a time. There are so many other companies that are much better at providing algorithms and we don’t want to compete with them. However, it’s important that you understand the entire value chain. Everything from collection of data to validation of the application is tightly connected, which we take into account when providing the training data.
Annotell started out in the automotive sector. Why?
The automotive industry is one of the most mature industries when it comes to understanding the requirements of developing high-performing machine learning applications. The requirements on the training data are also very high, since training data determines the performance of your application. If your mobile phone does not unlock because it does not recognize your face, that’s an inconvenience, but if your vehicle drives off the road, the consequences could be fatal. So, the focus on safety and the quality of the training data go hand in hand. In addition, the volume of data that needs to be annotated to create a fully autonomous vehicle is tremendously high, so it’s a natural industry for us to be in.
Artificial intelligence, machine learning, deep learning and training data are new words for many people. What are your insights?
First of all, I think there is an inflation in using these words. The buzz that is created when you use these words has made it hard to distinguish the truly interesting AI applications from those that just want to be part of the hype. That’s why we chose not to use AI in our company name, even though we use machine learning in our platform. Secondly, there are also extremely high expectations of what AI is and what it can accomplish. AI can refer to so many things, and for someone not working in the field of AI, it’s easy to think that AI means “human-like”, but to be honest, computers are still quite stupid compared to us humans. It’s a great tool for many applications, but it’s very hard to simulate what we humans are good at. There are so many pieces that need to be in place in order to create really advanced applications. Access to high-quality training data is one of them.
Let’s dream a bit. Where will you be in five years?
I want to look back at the first five years and see that our approach to producing training data has made a big impact on the overall development and performance of machine learning applications. That also means that we are a truly global provider of world-leading, consistent, high-quality training data and that we have expanded into new industries in addition to automotive.
If you started today, what would you do more or less of?
Well, we are still quite a young company and we learn new things every day. However, what has been a key success factor so far is that we try to engage and talk to our potential customers early. When you start a company, it’s easy to fall into the trap of thinking that you know what the customers want, so you start programming for six months and then meet customers with a “finished product”. It’s a dangerous approach because you never know what the customer wants unless you ask them. So we ask them directly before we develop anything. Hence, everything we do is based on an accumulated view of what our customers actually want. This is something we would absolutely do again and that’s why we take every chance there is to ask questions!
Annotell is a part of MobilityXlab’s community. What is the greatest value of that for you?
The best part of belonging to MobilityXlab’s community is the network. I think it’s a really interesting and rather unique set-up to have multiple companies in the same field managing an initiative like this together. In addition, since the community is focused on automotive, it is highly relevant for us: at least 5 of the 6 partners are potential customers. We are truly grateful for the opportunity to be part of MobilityXlab, which is so much more than just a very nice office space. I doubt that we would have come so far in creating our company without being part of the MobilityXlab community.
MobilityXlab’s mission is, among other things, to facilitate meetings between startups and partners. What do you see as the greatest value in these meetings?
As a small company, it can be difficult to get attention and navigate within big organizations. Being a part of MobilityXlab gives us a spontaneous and natural dialogue with these companies. It starts as a dialogue rather than a sales pitch. I also think MobilityXlab is a great platform for the partner companies to learn from and inspire each other on how they can work with startups. Working with a startup is quite different from working with a big organization, so there needs to be an internal structure in place to deal with that. At the same time, we learn how large organizations work. So, the dialogue is not all about training data; it’s about learning from each other and how to create something great together.
Headquarters: Gothenburg, Sweden
Business: Delivers high-quality training data for machine learning
Employees: 8 + a network of full-time annotation experts
Co-Founders: Oscar Petersson, Daniel Langkilde