I believe that in a few years a new business will emerge, similar to today’s cryptocurrency mining but focused on recognizing data from real-world sensors, such as video frames or audio clips. A company like Google or Yandex will be able to perform the primary classification of objects, using its vast cluster and a neural network to determine that a picture shows, say, a crocodile and a face. Third-party “contractors” will then be able to focus on recognizing types of crocodiles or facial expressions, and earn money from this specialization. Of course, for the first example the contractor’s neural network will need to be trained on crocodiles and nothing else. Because there are so many object classes, a single network, even a large one owned by a giant like Google or Yandex, will not be able to handle everything. It can, however, delegate the right sub-task to another company with a narrow specialization, and together they will give the client the best result.
In principle, a recognition task can be assigned to more than one “contractor” and the results merged. For example, face recognition might be performed by several networks, each trained on different races and nationalities. If a contractor’s prediction turns out to be incorrect, its rating decreases; if it is confirmed, the rating increases.
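To make the merging and rating idea a bit more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the contractor names, the rating-weighted vote, and the fixed step by which a rating moves are hypothetical, and a real marketplace would likely need something more careful.

```python
from collections import defaultdict


def merge_predictions(predictions, ratings):
    """Return the label with the highest rating-weighted vote.

    predictions: {contractor: label}; ratings: {contractor: positive weight}.
    """
    scores = defaultdict(float)
    for contractor, label in predictions.items():
        scores[label] += ratings[contractor]
    return max(scores, key=scores.get)


def update_ratings(predictions, ratings, confirmed_label, step=0.1, floor=0.1):
    """Raise the rating of contractors whose label was confirmed, lower the rest.

    `step` and `floor` are arbitrary illustrative values, not a tuned scheme.
    """
    updated = dict(ratings)
    for contractor, label in predictions.items():
        delta = step if label == confirmed_label else -step
        updated[contractor] = max(floor, updated[contractor] + delta)
    return updated


# Hypothetical contractors answering the same facial-expression task.
ratings = {"net_a": 1.0, "net_b": 1.0, "net_c": 0.5}
predictions = {"net_a": "smile", "net_b": "frown", "net_c": "smile"}

merged = merge_predictions(predictions, ratings)
ratings = update_ratings(predictions, ratings, confirmed_label="smile")
```

Here the two contractors that agree outweigh the third, and once the answer is confirmed their ratings rise while the dissenting network loses a little weight in future votes.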
Most likely, the system will be structured not as a tree, with Google or Yandex at the top and smaller companies below, but as a network in which one company enriches the input stream with new knowledge that helps a neighboring network do its specialized work better. For instance, knowing a crocodile’s species might narrow down the recognition of the face of the local inhabitant standing next to it, since species are tied to particular regions.
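One way to picture this kind of enrichment is as a re-weighting step: the crocodile specialist attaches a species to the stream, and the face specialist uses it to favor candidates from matching regions. The sketch below is only an assumption about how that could look; the species, regions, candidates, and the `boost` factor are placeholders, not real data.

```python
# Hypothetical lookup published by the crocodile specialist: species -> regions.
SPECIES_REGIONS = {
    "species_a": {"region_1"},
    "species_b": {"region_2", "region_3"},
}


def narrow_candidates(face_scores, candidate_regions, species, boost=2.0):
    """Re-weight face candidates using the regions implied by the species.

    face_scores: {candidate: positive score}; candidate_regions: {candidate: region}.
    Candidates from a matching region have their score multiplied by `boost`,
    then all scores are normalized to sum to 1. An unknown species changes
    nothing except the normalization.
    """
    regions = SPECIES_REGIONS.get(species, set())
    weighted = {
        name: score * (boost if candidate_regions[name] in regions else 1.0)
        for name, score in face_scores.items()
    }
    total = sum(weighted.values())
    return {name: score / total for name, score in weighted.items()}


# Two equally likely candidates before the species hint arrives.
face_scores = {"person_x": 0.5, "person_y": 0.5}
candidate_regions = {"person_x": "region_1", "person_y": "region_2"}

narrowed = narrow_candidates(face_scores, candidate_regions, "species_a")
best = max(narrowed, key=narrowed.get)
```

The face network on its own cannot separate the two candidates, but the hint from the neighboring network tips the balance toward the one from the matching region.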
What do you think?
