Branches of Artificial Intelligence
Like ice cream, artificial intelligence comes in a variety of flavors. Just as different people have different favorite flavors of ice cream, different business problems call for different flavors of AI, and it’s important to choose the right one (or ones) for your AI business transformation project. Let’s explore the various branches of artificial intelligence and how you might use each to solve real-world business problems.
Expert Systems
Expert systems, first fashionable in the 1980s though still in use in some areas today, use a database of encoded rules and knowledge to capture human expertise. For example, MYCIN was an expert system built to diagnose bacterial infections and recommend antibiotics. It’s unlikely that an expert system is the right solution for your business problem, as they require exhaustive, labor-intensive hand-coding of rules and knowledge. So, let’s move on down the list.
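To show what hand-coded rules look like, here is a toy rule engine in Python. The rules and symptoms are invented for illustration and bear no resemblance to MYCIN’s actual knowledge base.

```python
# A minimal, hypothetical rule-based "expert system" sketch.
# Every rule below is invented for illustration only.
RULES = [
    ({"fever", "stiff_neck"}, "possible bacterial meningitis"),
    ({"fever", "cough"}, "possible respiratory infection"),
    ({"burning_urination"}, "possible urinary tract infection"),
]

def diagnose(symptoms: set) -> list:
    """Fire every rule whose conditions are all present in the symptoms."""
    return [conclusion for conditions, conclusion in RULES
            if conditions <= symptoms]

print(diagnose({"fever", "stiff_neck", "headache"}))
# ['possible bacterial meningitis']
```

Every rule must be written by hand, which is exactly why this approach stops scaling: each new disease, exception, or interaction means another rule coded and tested by a human expert.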
Machine Learning
In contrast to expert systems, machine learning is an AI technique that enables machines to learn from data rather than relying on a set of human-coded, pre-programmed rules. Machine learning systems create their own rules and build knowledge by using algorithms and statistical models to draw inferences from patterns in data. They learn from the data itself.
Machine learning is a broad umbrella term that covers a wide range of AI techniques, used for everything from recommendation systems (e.g., Netflix “watch next” content or Amazon purchase suggestions) to image recognition and natural language processing (think Alexa, Siri, and Google Assistant). Machine learning is also used to optimize the operation of complex business processes or systems. For example, an oil company might use machine learning to optimize gas lift in a well, precisely controlling the amount of gas being pumped into the well at various depths to maximize production. Machine learning is also used for route optimization (both on roads and in computer networks), demand forecasting, dynamic price optimization, inventory management, and predictive maintenance.
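As a minimal sketch of the learn-from-data idea, the toy demand forecast below fits a trend from made-up monthly sales figures using scikit-learn. No rules are hand-coded; the model infers the relationship from the data alone.

```python
# Toy demand forecast: the model infers the trend from data
# rather than from hand-coded rules. All figures are made up.
import numpy as np
from sklearn.linear_model import LinearRegression

months = np.arange(1, 13).reshape(-1, 1)   # month number as the input feature
units_sold = np.array([110, 115, 123, 130, 138, 145,
                       151, 160, 166, 175, 181, 190])

model = LinearRegression().fit(months, units_sold)
print(model.predict([[13]]))   # forecast demand for month 13
```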
Just as vanilla ice cream comes in sub-flavors—old-fashioned vanilla, French vanilla, vanilla bean—machine learning comes in several flavors of its own. The flavor you choose will depend on your application and the data you have available.
Supervised, Semi-supervised, and Unsupervised Learning
These are all subtypes of machine learning. Supervised learning operates on labeled data, while unsupervised learning operates on unlabeled data. For example, you might train a spam filter using supervised learning on a data set of historical emails in which undesired emails are labeled as spam. Over time, the AI learns to categorize new emails as either spam or safe. This is how the spam filter in your email program works.
Supervised learning finds a mapping or relationship between inputs and outputs so you can make predictions (using a technique known as regression) or classify new input data. Examples of supervised learning applications are classification (spam detection, fraud detection, image classification, medical diagnosis) and regression (predicting future sales revenue based on historical data and market signals, predicting patient outcomes).
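Here is a minimal sketch of the spam-filter example in scikit-learn. The handful of training emails is invented, and a real filter would learn from millions of messages.

```python
# Minimal supervised spam classifier (illustrative data only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting agenda attached",
          "cheap pills online", "lunch on thursday?"]
labels = ["spam", "safe", "spam", "safe"]    # the supervision signal

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)         # bag-of-words features
clf = MultinomialNB().fit(X, labels)         # learn word-to-label mapping

print(clf.predict(vectorizer.transform(["free prize pills"])))
# ['spam']
```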
Unsupervised learning extracts patterns and features from a data set to enable clustering, dimensionality reduction, and pattern identification in transaction data. Examples of unsupervised learning applications are clustering (customer segmentation, content personalization, and document grouping—for example, organizing news articles by common topic), association (security anomaly detection, or suggesting related products in an e-commerce application—this blouse would go nicely with the pants you just put in your shopping cart), and dimensionality reduction, which is used to compress data while preserving essential information.
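As a minimal sketch of clustering, here is customer segmentation with k-means in scikit-learn; the customer figures are invented.

```python
# Customer segmentation with k-means: no labels are given, and the
# algorithm groups similar customers on its own. Data is invented.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [annual spend ($k), visits per month]
customers = np.array([[2, 1], [3, 2], [2.5, 1.5],     # light users
                      [20, 8], [22, 9], [19, 7]])     # heavy users

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)   # e.g., [0 0 0 1 1 1]: two discovered segments
```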
Semi-supervised learning is like supervised learning, but instead of training exclusively on labeled data, it combines labeled and unlabeled data. It’s often used in situations where there is plenty of unlabeled data but only a small amount of labeled data available. Semi-supervised learning is used for image and video analysis, speech recognition, web content classification, and labeling unlabeled data.
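Scikit-learn ships a simple self-training wrapper that illustrates the idea: fit on the few labeled points, pseudo-label the unlabeled ones the model is confident about, and refit. The one-dimensional data below is synthetic.

```python
# Semi-supervised sketch: a few labeled points plus unlabeled ones
# (marked -1, scikit-learn's convention). Data is synthetic.
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.linear_model import LogisticRegression

X = np.array([[0.1], [0.2], [0.9], [1.0],        # labeled examples
              [0.15], [0.3], [0.85], [0.95]])    # unlabeled examples
y = np.array([0, 0, 1, 1, -1, -1, -1, -1])       # -1 means "no label"

clf = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print(clf.predict([[0.25], [0.8]]))   # [0 1]
```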
Reinforcement Learning
Another flavor of machine learning is reinforcement learning, a technique where an AI agent learns to make decisions by interacting with its environment and learning from each encounter. Reinforcement learning is used in robotics, industrial control, recommendation engines, and some autonomous vehicles. DeepMind used a version of reinforcement learning, deep reinforcement learning, to build AlphaGo, the first AI agent able to beat the world champion in the ancient game of Go.
An agent learns over time by trial and error. Each time the agent takes an action, a ‘reward function’ generates a digital reward signal. Examples of actions might be turning the steering wheel left or right, placing a stone on the board, or adjusting a setting on a complex control system. The reward function’s parameters are set so that the agent develops desired behaviors and abilities, and the agent’s goal is to find a policy that maximizes the cumulative reward over time. For example, an autonomous driving agent would receive a digital reward for keeping the car on the road, following driving etiquette, and providing a smooth ride for passengers. It would receive a big digital penalty for hitting another object. For obvious reasons, many reinforcement learning AIs are trained inside simulators before they are unleashed in the real world. The learning process involves two key strategies: exploration, which tries out different actions to discover their effects, and exploitation, which uses what the agent already knows to choose the action expected to yield the highest reward.
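The toy sketch below shows tabular Q-learning, one of the simplest reinforcement learning algorithms, on an invented five-state corridor where the only reward comes from reaching the rightmost state. Note the epsilon-greedy rule trading off exploration and exploitation.

```python
# Tabular Q-learning on a toy corridor: states 0..4, with a reward
# only at state 4. Epsilon-greedy balances exploration/exploitation.
import random

N_STATES, ACTIONS = 5, [-1, +1]           # actions: step left or right
alpha, gamma, epsilon = 0.1, 0.9, 0.2     # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)] # value of each action in each state

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        if random.random() < epsilon:
            a = random.randrange(2)                       # explore
        else:
            a = 0 if Q[state][0] >= Q[state][1] else 1    # exploit
        next_state = max(0, min(N_STATES - 1, state + ACTIONS[a]))
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: nudge Q toward reward + discounted future value
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([round(max(q), 2) for q in Q])   # learned values rise toward the goal
```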
Example use cases for reinforcement learning include predictive maintenance, energy management (for example, optimizing grid operations), city traffic flow optimization, portfolio optimization, dynamic resource allocation, and gaming. DeepMind collaborated with the Swiss Plasma Center at EPFL to develop a reinforcement learning system for their experimental fusion reactor. The AI controls 19 powerful magnets inside the reactor to contain and optimize the shape of the high-temperature plasma and maximize energy production.
Natural Language Processing
Natural language processing (NLP) is a broad, catch-all term that describes technologies that facilitate the interaction of humans and computers using natural language. The field includes speech recognition, natural language understanding, and language generation. So, voice assistants, language translators, sentiment analysis, and text classification would all fall under the NLP umbrella. NLP systems are built using many of the branches of artificial intelligence: machine learning (supervised, semi-supervised, and unsupervised), reinforcement learning, and generative AI. NLP is like an ice cream sundae, I suppose.
Deep Learning
Deep learning is a subset of machine learning that uses an artificial neural network to learn complex relationships in data sets. Machine learning systems may or may not be built on neural networks—though most are today—but all deep learning systems have a neural network at their core. And usually a big one. What makes a deep learning system ‘deep’? It has to do with the number of layers in its neural network. Deep neural networks have many layers of interconnected nodes—artificial neurons—between their input and output layers, which makes them particularly good at learning from large amounts of data and identifying complex patterns within that data. Examples of deep learning applications include image recognition, fraud detection, algorithmic trading, quality assurance, yield enhancement, language translation, recommendation engines, financial market analysis, medical diagnosis from imaging, and the prediction of patient outcomes from medical data.
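To make ‘deep’ concrete, here is a minimal multi-layer network in PyTorch. The layer sizes are arbitrary choices for illustration; real deep networks can have dozens or hundreds of layers.

```python
# A small "deep" network in PyTorch: several hidden layers between
# input and output. Layer sizes are arbitrary, for illustration.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),    # input layer -> hidden layer 1
    nn.Linear(128, 128), nn.ReLU(),   # hidden layer 2
    nn.Linear(128, 64), nn.ReLU(),    # hidden layer 3
    nn.Linear(64, 10),                # output layer (e.g., 10 classes)
)
print(model(torch.randn(1, 64)).shape)   # torch.Size([1, 10])
```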
Convolutional Neural Networks
Convolutional neural networks (CNNs) are specialized neural networks designed primarily to process image data and are used extensively in the field of computer vision. CNNs are considered a branch of deep learning, and their architecture is inspired by the human visual cortex. CNNs are used for image and video recognition, object detection, facial recognition, medical image analysis, and recommender systems that analyze visual information. CNNs are also used in brain-computer interfaces and in financial time series analysis to predict stock prices or analyze market trends.
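Here is a minimal CNN sketch in PyTorch, sized for 28x28 grayscale images; the architecture choices are arbitrary and for illustration only.

```python
# A minimal CNN in PyTorch for 28x28 grayscale images.
# Architecture choices are arbitrary, for illustration.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # learn local image features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # learn higher-level features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                    # classify into 10 classes
)
print(cnn(torch.randn(1, 1, 28, 28)).shape)       # torch.Size([1, 10])
```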
Generative AI
We began this post with the vanilla spectrum of AI ice cream flavors. Now we move on to the chocolate flavors: Belgian chocolate, malted chocolate, and double chocolate chip. All the AI flavors we have covered so far fall into the broad category known as discriminative AI. Discriminative AIs are used for sensing, optimization, and recommendation: they use input data to classify or cluster that data, or to optimize a process. For example, they classify emails as spam or safe, and they control jet engine settings to maximize thrust and minimize fuel consumption.
Since late 2022, people have become really excited about the other major category of AI: generative AI. When ChatGPT crashed onto the scene in November of that year, people started losing their minds about the possibilities and the potential future impact on work, education, healthcare, and society. They were probably right to do so—generative AI will usher in tremendous business transformation and societal change in the coming decades. But generative AI is not a new thing. Early chatbots date back to the 1960s, though they were nothing compared with the impressive chatbots available today. Generative AI last gained attention back in 2014 with the invention of generative adversarial networks.
Generative Adversarial Networks
Generative adversarial networks (GANs) were invented by Ian Goodfellow, who earned his PhD at the Université de Montréal and has worked at Google Brain, OpenAI, and Apple. He currently works at Google DeepMind. GANs are built from two neural networks that compete against each other (hence the ‘adversarial’ in the name) to generate new synthetic data that resembles real-world data. For example, a GAN can be used to generate images that resemble photographs.
A GAN has two main parts: a discriminator and a generator. A discriminator is an AI classifier trained to identify real versus fake data. In our image-generator example, the discriminator is trained with a labeled data set of images, some of which are real photographs and some of which are not. The generator produces fake instances, images in this case, and feeds them to the discriminator. The discriminator provides feedback to the generator, which updates its process to produce better fakes. This iterative process of generation, discrimination, and feedback continues until the generator becomes so good at producing realistic data that it defeats the discriminator. At this stage, the GAN can produce realistic synthetic data. GANs are used for image generation, super-resolution of images, style transfer, face aging, audio generation, weather prediction, and generating synthetic data sets that are then used to augment real-world data and train machine learning models. GANs have begun to be replaced by large language models and diffusion techniques, though GANs offer a far more energy-efficient way to create some types of content.
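Here is a skeletal version of that generate, discriminate, feedback loop in PyTorch. The tiny fully connected networks and the random ‘real’ batch are stand-ins for illustration; a real GAN would train on actual data such as photographs.

```python
# Skeletal GAN training loop (illustrative; real_batch would come
# from a dataset of real examples, e.g., photographs).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))   # generator
D = nn.Sequential(nn.Linear(64, 32), nn.ReLU(),
                  nn.Linear(32, 1), nn.Sigmoid())                    # discriminator
loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for step in range(1000):
    real_batch = torch.randn(8, 64)        # stand-in for real data
    fake_batch = G(torch.randn(8, 16))     # generator produces fakes from noise

    # 1. Train the discriminator to tell real from fake
    d_loss = (loss_fn(D(real_batch), torch.ones(8, 1)) +
              loss_fn(D(fake_batch.detach()), torch.zeros(8, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2. Train the generator to fool the discriminator (the feedback step)
    g_loss = loss_fn(D(fake_batch), torch.ones(8, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```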
Large Language Models
Large Language Models (LLMs) are the double chocolate chip fudge brownie chocolate chocolate chunk of AI ice cream flavors. They underpin all the major frontier and foundation models on the market today—ChatGPT, GPT-4o, Claude, Gemini, Llama, Mistral, etc. These LLMs are built around an AI architecture known as the transformer, which Google invented and described in a famous 2017 research paper, “Attention Is All You Need.” Large language models are trained on vast amounts of data and require enormous computing resources and crazy amounts of energy to train and operate.
An important concept in the field of LLMs is the notion of emergent properties. Fundamentally, LLMs operate by accurately guessing the next word (or token) in a sequence. What’s amazing about LLMs is that powerful capabilities emerge as you scale them up. They can answer questions, write short essays, summarize information, and solve simple problems. As LLMs scale further (with more parameters in their deep neural networks) and are trained on more data, their capabilities continue to improve. Over the last decade or so, researchers have scaled the frontier LLMs by a factor of 10 every couple of years. Leading researchers believe that further scaling will yield continued improvements in accuracy and intelligence.
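You can watch next-token prediction in action with a small, dated model. The sketch below assumes the Hugging Face transformers library and the open GPT-2 weights are installed; frontier LLMs apply exactly the same principle at vastly larger scale.

```python
# Next-token prediction in action with a small, dated model (GPT-2).
# Frontier LLMs apply the same principle at vastly larger scale.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
out = generator("Artificial intelligence comes in a variety of",
                max_new_tokens=10, do_sample=False)
print(out[0]["generated_text"])   # the prompt plus the predicted continuation
```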
Multimodal Models
Early LLMs were trained on text data gathered from the internet. More recently, so-called multimodal models have been trained on a wide range of data types, including text, images, videos, depth maps, and audio. As AI researchers continue to push the scale of models in search of more emergence and increased levels of intelligence, they are turning to new data types to ‘feed the beast.’ Training models with new data types also gives them a better understanding of the world. There’s only so much one can learn through text information alone. Some models, like DeepMind’s Gato, are also trained with physical/haptic data from robotic systems, including sensor and actuator data. Other multimodal models are being trained with olfactory, gustatory, medical imaging, physiological, and other data types.
Agentic AI
The next major step in AI is the development of autonomous AI agents. These agents can take independent actions and use tools to achieve specific goals or objectives. To do this, agentic AI must understand and manage complex workflows, break objectives down into tasks, and choose and use tools to perform those tasks. Agentic AIs will understand the intent of their users and integrate with a wide range of third-party systems and platforms. As humans, we assess the intelligence of other species by their ability to choose and use tools: chimpanzees, gorillas, elephants, dolphins, and crows have all been observed using them. By this measure, agentic AI can be seen as a major step forward from traditional chatbot AIs. Agentic AIs will become personal assistants, research assistants, financial advisors, and coworkers. Worker productivity will increase as agentic AI takes on more repetitive workflows, elevating employee capabilities and allowing people to focus on higher-value and more enjoyable work. Agentic AI is widely seen as the future of the field, and the first agentic products (that actually work well) will likely hit the market in 2025.
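At its core, the agentic pattern is a loop: plan, pick a tool, act, repeat. Here is a toy sketch of that loop in Python; the plan is hard-coded where a real agent would query an LLM, and every tool and function name here is hypothetical.

```python
# Toy agent loop: plan -> pick tool -> act. The "LLM" is stubbed
# out; all tool names and the plan itself are hypothetical.
def search_web(query):     return f"results for {query!r}"
def send_email(to, body):  return f"emailed {to}"

TOOLS = {"search_web": search_web, "send_email": send_email}

def llm_plan(objective):
    # A real agent would ask an LLM to decompose the objective into
    # tool calls; here we hard-code a plausible plan for illustration.
    return [("search_web", {"query": "best CRM vendors"}),
            ("send_email", {"to": "boss@example.com", "body": "shortlist"})]

def run_agent(objective):
    for tool_name, args in llm_plan(objective):   # break goal into tasks
        result = TOOLS[tool_name](**args)         # choose and use a tool
        print(f"{tool_name} -> {result}")

run_agent("shortlist CRM vendors and email the boss")
```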
Physical AI
Physical intelligence is another major category of AI. Advanced robotic systems like those from Figure, Agility Robotics, Boston Dynamics, and Sanctuary AI use a range of AI models to allow their robots to navigate the world safely, understand human instructions, perform smooth movements, and execute tasks. Some AI researchers believe that the best and perhaps only path to artificial general intelligence is through ‘embodiment’: placing an AI inside a physical body so that it can experience, navigate, and interact with its environment and learn about the world in much the same way a child does. Figure is partnering with OpenAI to integrate its LLM technology into a humanoid robot, enabling the robot to understand commands and explain its decisions and actions. Most physical AI to date has been bespoke, created for the specific physical ‘body’ it inhabits; the startup Skild AI is developing a general robot brain model that is body-agnostic.
Artificial General Intelligence
Seen as the raison d’être of AI research labs like DeepMind and OpenAI, artificial general intelligence (AGI) has many definitions. Perhaps the easiest way to think about AGI is that it’s an AI that can perform any economically valuable task that a human can. The path to reaching AGI is unclear—some believe that scaling large models will be enough, some believe that embodiment is the best path, and some believe that deep reinforcement learning must play a role. It’s likely that more research breakthroughs are required and that a combination of existing AI techniques will be needed. Industry insiders expect that AGI-level intelligence will be reached in the 2027-2029 timeframe, meaning that some of the massive AI data centers (more accurately, compute centers) being planned and built today may house the very first AGIs.
Artificial Superintelligence
What’s beyond AGI? Artificial superintelligence (ASI) is the term for AI that is more intelligent than humans at every task. That might be 1% more intelligent or 1 million times more intelligent. One of the tasks imagined for future AGIs is AI research itself. Once an AGI can automate the process of AI research, it could rapidly evolve into an ASI in a process referred to as an intelligence explosion, in which the intelligence of machines increases very rapidly.
Our world is changing so fast. But the AI revolution has only just begun.