The Human-Powered Companies That Make AI Work

Making AI WorkThe hidden secret of artificial intelligence is that much of it is actually powered by humans. Well, to be specific, the supervised learning algorithms that have gained much of the attention recently are dependent on humans to provide well-labeled training data that can be used to train machine learning algorithms. Since machines have to first be taught, they can’t teach themselves (yet), so it falls upon the capabilities of humans to do this training. This is the secret achilles heel of AI: the need for humans to teach machines the things that they are not yet able to do on their own.

Machine learning is what powers today’s AI systems. Organizations are implementing one or more of the seven patterns of AI, including computer vision, natural language processing, predictive analytics, autonomous systems, pattern and anomaly detection, goal-driven systems, and hyperpersonalization across a wide range of applications. However, in order for these systems to be able to create accurate generalizations, these machine learning systems must be trained on data. The more advanced forms of machine learning, especially deep learning neural networks, require significant volumes of data to be able to create models with desired levels of accuracy. It goes without saying then, that the machine learning data needs to be clean, accurate, complete, and well-labeled so the resulting machine learning models are accurate. Whereas it has always been the case that garbage in is garbage out in computing, it is especially the case with regards to machine learning data.

According to analyst firm Cognilytica, over 80% of AI project time is spent preparing and labeling data for use in machine learning projects:


Percentage of time allocated to machine learning tasks (Source: Cognilytica) COGNILYTICA
(Disclosure: I’m a principal analyst at Cognilytica)

Fully one quarter of this time is spent providing the necessary labels on data so that supervised machine learning approaches will actually achieve their learning objectives. Customers have the data, but they don’t have the resources to label large data sets, nor do they have a mechanism to insure accuracy and quality. Raw labor is easy to come by, but it’s much harder to guarantee any level of quality from a random, mostly transient labor force. Third party managed labeling solution providers address this gap by providing the labor force to do the labeling combined with the expertise in large-scale data labeling efforts and an infrastructure for managing labeling workloads and achieving desired quality levels.

Read more at Cognitive World

Ronald Schmelzer
Author: Ronald SchmelzerWebsite: http://www.cognilytica.com

Ronald Schmelzer, columnist, is senior analyst and founder of the Artificial Intelligence-focused analyst and advisory firm Cognilytica, and is also the host of the AI Today podcast, SXSW Innovation Awards Judge, founder and operator of TechBreakfast demo format events, and an expert in AI, Machine Learning, Enterprise Architecture, venture capital, startup and entrepreneurial ecosystems, and more. Prior to founding Cognilytica, Ron founded and ran ZapThink, an industry analyst firm focused on Service-Oriented Architecture (SOA), Cloud Computing, Web Services, XML, & Enterprise Architecture, which was acquired by Dovel Technologies in August 2011.

Ron is the lead author of XML And Web Services Unleashed (SAMS 2002) and co-author of Service-Orient or Be Doomed (Wiley 2006) with Jason Bloomberg. Ron received a B.S. degree in Computer Science and Engineering from Massachusetts Institute of Technology (MIT) and MBA from Johns Hopkins University.