AI Policy Guide

Glossary

This section of the AI Policy Guide provides clear definitions for common terms in artificial intelligence.

Updated September 2024 

A

Accuracy: An evaluation metric that measures how often a system’s inferences are correct.
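
For readers who want to see the arithmetic, a minimal sketch in Python, using invented predictions and answers:

```python
# Hypothetical model predictions and the correct answers (illustrative only).
predictions = [1, 0, 1, 1, 0, 1]
actuals     = [1, 0, 0, 1, 0, 0]

correct = sum(p == a for p, a in zip(predictions, actuals))
print(correct / len(actuals))   # 4 of 6 inferences are correct -> accuracy of about 0.67
```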

Activation function: The mathematical function that transforms a neuron’s data inputs into outputs. It shapes the final predictions that are made and serves as the algorithmic trigger that input data must trip for a given prediction to be made.
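
As an illustration, a minimal sketch of two widely used activation functions; the choice of sigmoid and ReLU here is a standard example, not something prescribed by this guide:

```python
import math

def sigmoid(x):
    # Squashes any input into the range (0, 1), often read as a probability-like score.
    return 1 / (1 + math.exp(-x))

def relu(x):
    # Passes positive inputs through unchanged and zeroes out negative inputs.
    return max(0.0, x)

print(round(sigmoid(2.0), 2))   # 0.88
print(relu(-1.5))               # 0.0
```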

Adversarial examples: Data inputs maliciously designed to trick AI systems during inference.

Adversarial machine learning: Refers generally to the study and design of machine learning cyberattacks and defenses.

AI alignment: In the context of artificial general intelligence, alignment of AI systems refers to their correspondence with generally accepted human values (do not harm, do not kill, protect the vulnerable, allocate human rights equitably, and so on).

AI chips or AI accelerators: A range of chips designed specifically for the unique processing needs of AI.

AI triad: The three primary input technologies that yield artificial intelligence: microchips, data, and algorithms. 

Algorithm: A logical sequence of steps to accomplish a task such as solving a problem. For a more complete definition, see “Algorithms.”

Alignment imbalance: A state in which AI is generally misaligned with human values. The term supposes that AI systems can, in principle, be brought into balance with human values; however, imbalance may be inherent to all AI systems and baked into their design.

Application-specific integrated circuits (ASICs): The fastest and least flexible form of AI chip. ASICs are single-purpose chips and cannot be rewritten; the algorithms they use are hardwired into their silicon.

Architecture (model): An AI model design scheme that dictates how data interact with and flow through a model.

Artificial general intelligence: A general-purpose AI system that can adapt and learn any task. It is not designed for a specific narrow purpose or set of purposes.

Artificial intelligence (AI): The goal of automating tasks normally performed by humans. To reach this goal, one uses “machine-based system[s] that can, for a given set of human-defined objectives, make predictions, recommendations or decisions influencing real or virtual environments.” For a more complete definition, see “What is AI?.”

Artificial narrow intelligence: AI built for a narrow purpose such as a specific application. This AI can do one or a few tasks and do so with high accuracy, but it cannot transfer to other applications outside of its design mandate. 

Artificial neural network (ANN): A type of model formed from networks of interconnected artificial neurons. Neurons take in data, divide that data, and parse these divisions to discover patterns. Patterns are then assembled to form increasingly advanced patterns and ultimately inform the network’s final predictions. 

Artificial neurons: Individual components of ANNs that take in data and look for specific patterns that, during the training process, they have learned are significant.

Attention mechanism: A component of certain neural networks that allows the model to “pay attention” to key features in data and remember how those features in the data relate to others.
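
A toy sketch of the underlying idea, with made-up numbers: a query is compared against each key, the comparison scores are normalized, and the result determines how much attention each feature receives. Real attention layers operate on learned, high-dimensional vectors rather than single numbers.

```python
import math

def softmax(scores):
    # Turns raw comparison scores into attention weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

query = 1.0                 # stands in for the feature the model is currently focused on
keys = [0.9, 0.1, -0.5]     # stand in for the other features in the data

scores = [query * k for k in keys]
weights = softmax(scores)
print([round(w, 2) for w in weights])   # the first feature draws the most attention
```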
 


B

Bayesian methods: Models that are coded with prior information that provides context and shrinks the overall learning task and, by extension, the amount of required training data.

Benchmarks: Common datasets paired with evaluation metrics that can allow researchers to compare the quality of models. 

Bias: Defined generally as the difference between desired outcomes and measured outcomes. Often it refers to human biases inherited by AI systems through model or data design choices.

Bias value: The threshold that the weighted data must surpass for a neuron to activate. Mathematically, this serves as the intercept that orients the activation function toward the “shape” of reality. 
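
To make the weight and bias value concrete, here is a minimal sketch of a single artificial neuron with a simple threshold activation; all numbers are invented for illustration:

```python
inputs  = [0.5, 0.8]     # incoming data values
weights = [1.2, -0.7]    # each weight amplifies or suppresses one input's importance
bias    = -0.1           # shifts the threshold the weighted inputs must surpass

weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias

# Threshold activation: the neuron "fires" only if the weighted, biased sum is positive.
fired = weighted_sum > 0
print(round(weighted_sum, 2), fired)   # -0.06 False -- the threshold is not surpassed
```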

Big data: Big data AI systems are trained on large, representative, and diverse datasets. The theory is that, by training an AI system on such a dataset, the system should capture and learn all the corner cases and details of a given problem.

Binary: A numerical system that represents values as sequences of just 1s and 0s. Most data in computer science and AI are represented in this form.

Bit: The smallest unit of data that represents a binary choice between a 1 and a 0. 

Black box: The often-opaque decision-making processes behind deep neural networks.

Byte: A data unit the size of 8 bits.
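
A short Python illustration of binary, bits, and bytes; the particular number is arbitrary:

```python
value = 200

print(bin(value))                 # '0b11001000' -- the value written in binary 1s and 0s
print(value.bit_length())         # 8 -- eight bits are needed, which is exactly one byte
print(value.to_bytes(1, "big"))   # b'\xc8' -- the same value stored as a single byte
```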
 


C

Central processing units (CPUs): A type of general-purpose chip designed to handle all standard computation. 

Chatbots: A class of AI technology that takes language-based prompts and responds in a language- or text-driven, often conversational manner. Examples include ChatGPT and even AI assistants such as Amazon’s Alexa. Chatbots have been around for decades; however, today’s chatbots often wield large language models (LLMs) such as GPT-4 to ensure advanced flexibility, fluidity, and generalization. For sensitive or regulated applications, such as finance, more stringent technologies such as inflexible decision trees remain common.

Circuits: Electronic components linked together to enable certain computational functions such as addition, subtraction, or memory storage.

Cloud computing: A general computing concept in which computing resources (both memory and processors) are stored remotely.

Code: The set of instructions given to a computer system.

Computer program or software: Code for the operation of a computer application. 

Convolutional neural networks (CNNs): A form of neural network that uses convolutional layers, which act as data filters trained to spot and separate patterns that are highly correlated with a specific result. These layers simplify data and accentuate the most important features. CNNs can be useful in many applications such as image analysis, financial time series analysis, and natural language processing.
 


D

Data: In the context of computer science, data are pieces of discrete information that can be encoded, stored, and computed. For a more complete definition, see “Data.”

Data cleaning: The process by which data are prepared for use by an AI algorithm. 

Data poisoning attacks: Attacks on AI systems caused by the malicious manipulation of data. 

Data standards: Industry- and application-specific standards that dictate in certain circumstances what data must be recorded and how those data must be recorded.

Data warehouses: Large, centralized warehouses holding hundreds of servers on which vast lakes of data are stored and large-scale computations are run. 

Deep learning: A type of machine learning that specifically uses deep, multilayered neural networks.

Deposition: A process used in chip fabrication that blankets chips with materials to add components.

Dopants: Intentional impurities that lace the silicon in transistors, changing when and how transistors switch between conducting or insulating electric current. 
 


E

Electronic design automation (EDA): The software used by hardware engineers to design computer systems and chips. 

Etching: A process used in chip fabrication that uses chemicals to remove unwanted material and shape the design of the chip.

Evaluation metrics: Metrics that can be used to assess AI system quality. These are diverse, and the metrics selected should match application needs and engineering goals.

Execution units: Microprocessor subsystems that package related circuits together with memory and other tools to enable basic functions. 

Explainable AI or white box AI: An emerging class of AI that seeks to provide explanations of how the system’s decisions and predictions are made. 
 


F

F1 score: An evaluation metric that assesses how well a model minimizes both false negatives and false positives. 
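
Because the F1 score combines precision and recall (both defined later in this glossary), a minimal sketch with made-up error counts may help:

```python
# Hypothetical counts from evaluating a classifier (illustrative only).
true_positives  = 40
false_positives = 10   # flagged as positive, actually negative
false_negatives = 20   # flagged as negative, actually positive

precision = true_positives / (true_positives + false_positives)   # 0.80
recall    = true_positives / (true_positives + false_negatives)   # about 0.67
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))   # about 0.73; the score drops if either kind of error grows
```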

Feed-forward neural network: A type of neural network in which data flow in one direction through the network’s layers.

Federated learning: A training technique that trains AI models on a web of disconnected servers or processors, rather than on a centralized server, often to eliminate data aggregation and preserve privacy. 

Few-shot learning: The ability of a model to form accurate inferences after being trained on only a few explicit examples of the problem at hand.

Field-programmable gate arrays (FPGAs): Task-specific chips that can be written and rewritten for a single-purpose algorithm. Given their task specificity, FPGAs are faster than GPUs. They are still slower than ASICs, because their ability to be rewritten comes with certain speed costs. 

File formats: A type of data standard that defines how data are digitally represented. 

Fine-tuning: The process of refining a general-purpose foundation model toward specific goals or tasks. Fine-tuning often involves additional training on task- or domain-specific data, training in controls to limit certain undesirable behaviors, or further training to align models toward certain desired behaviors.

Floating point operations per second (FLOPS): A measure of computational speed and performance that clocks floating point operations, the number of mathematical operations a processor can complete in a second. Confusingly, floating point operations (FLOPs, with a lowercase “s”) are also used to measure model size based on how many operations that model requires.
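
Back-of-the-envelope arithmetic can illustrate the distinction; the figures below are purely hypothetical:

```python
# FLOPs (lowercase s): total operations a model needs to produce one inference.
model_flops = 2e11            # a hypothetical 200-billion-operation model

# FLOPS (uppercase S): operations a processor can complete each second.
chip_flops_per_second = 1e14  # a hypothetical 100-teraFLOPS processor

# Ignoring memory and other overhead, a rough lower bound on inference time:
print(model_flops / chip_flops_per_second, "seconds")   # 0.002 seconds
```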

Foundation models: Large-scale machine learning models trained on broad sets of data that can be easily adapted to a wide range of downstream tasks.
 


G

Generalization: A system’s ability to “adapt properly to new, previously unseen data.” Generalization is highly desirable and a marker of AI quality. 

General-purpose technology: Innovations that “[have] the potential to affect the entire economic system.”

Generative adversarial neural networks (GANs): A form of neural network in which competing agents seek to outcompete each other. Through competition, each party improves, ultimately improving its overall predictive qualities. GANs are noted for their generative modeling, or creative, abilities. This specifically means that they use pattern recognition to predict how to best generate novel output content such as images. 

Generative AI: AI systems trained to create high-quality text, media, or other data. Generative AI is not limited to media. Protein folding systems, materials discovery systems, code generation, and other science, technology, engineering, and mathematics (STEM) applications can be considered generative AI.  

Graphics processing units (GPUs): Limited-purpose processors that were originally designed for graphics processing but that have been reappropriated for AI. GPUs excel at matrix multiplication, a function central to AI, giving them a speed advantage over traditional CPUs.
 


H

Hallucinations: Generative AI outputs that are incorrect, unrelated to the prompt, or inconsistent with reality. 

Hyperparameters: High-level settings that can be adjusted by engineers to control the model’s functions. 
 


I

Inference: A probabilistic guess made by an AI system on the basis of patterns or trends observed in data.

Inherently interpretable: Models that by design are simple to interpret or understand.

Integrated circuits (ICs) or microprocessors: Devices that can perform basic operations of software commands. 

Internet of things (IoT): Networks of diverse internet-connected devices. IoT devices often act as key data inputs to AI systems. 
 


L

Large language models (LLMs): Generative models trained to understand, generate, and process human language. Machine translation systems and chatbots are common LLM applications. To enable prompting, LLMs can also be integrated into systems such as image generators.

Layers: Collections of neurons that data must pass through simultaneously in a network. 

Libraries: Databases of functions that can be plugged into computer programs. There are many free-to-use libraries of machine learning models that are commonly appropriated for AI. 

Loss: In machine learning, this is the mathematical difference between a model’s predicted outcome and the correct, desired outcome.
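
As a small illustration, one common loss function, squared error, measures how far a prediction lands from the correct outcome (numbers invented):

```python
prediction = 3.5   # what the model guessed
target     = 5.0   # the correct, desired outcome

loss = (prediction - target) ** 2   # squared error: bigger mistakes are penalized more
print(loss)                         # 2.25
```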
 


M

Machine learning: A method for iteratively refining the process a model uses to form inferences by feeding it stored or real-time data.

Memory units: Devices that use transistors and other components to store information. Memory units can be subcomponents of a chip or standalone chips depending on their size and function. 

Microchip architecture: The “blueprint” configuration of chip components, including circuits, execution units, and input/output devices. AI chips depend on architectural changes for performance gains. 

Model: The software configuration that results from machine learning. Once fed new data, the model can produce inferences in the form of predictions, decisions, and other outputs.

Moore’s law: An observation stating that the number of transistors per chip doubles roughly every two years. More than an empirical observation, it was an expectation that came to organize the efforts of the microchip industry and was a self-fulfilling prophecy for a long time.
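
The arithmetic behind the observation is simple compounding; the starting figure below is a rough, illustrative count for an early-1970s microprocessor:

```python
starting_transistors = 2_300   # roughly the count on an early commercial microprocessor
years = 50

# Doubling every two years means 25 doublings over 50 years.
projected = starting_transistors * 2 ** (years / 2)
print(f"{projected:.2e}")      # about 7.7e10 -- tens of billions of transistors
```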

Multimodality: The ability of a model to understand multiple types of data, often including text, image, audio, and various computer file types. 

 


O

Overfitting: A situation in which a model is tuned so precisely to the training data that it cannot adequately account for new data. 
 


P

Parallelism: The ability of a chip to perform certain functions in parallel rather than sequentially, allowing faster processing.

Parameters: The values that shape a model’s analytical processes.

Photolithography: A process used in chip fabrication by which light is shined through a “circuit stencil” known as a photomask, printing the design onto the chip’s wafer.

Precision: An evaluation metric that measures the percentage of a model’s positive results that are true positives.
 


R

Recall: An evaluation metric that states the percentage of actual positive cases that a model correctly identifies as positive.

Recurrent neural networks (RNNs): Neural networks defined by their ability to remember past information and connect that information to future data. This “memory” is necessary for complex, time-dependent applications such as video analysis, natural language processing, and others.

Reinforcement learning: A type of machine learning that uses trial and error to learn the best process to achieve a given goal. To learn, an AI model is given a scenario and tasked with maximizing a reward or achieving a goal. When its process improves, it receives a reward signal that instructs it to reinforce the processes that led to that improvement.
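
A toy sketch of the trial-and-error loop, assuming an invented “environment” of two actions with hidden payoff rates; real reinforcement learning systems are far more elaborate:

```python
import random

random.seed(0)
payoff_rates = [0.3, 0.7]      # hidden probability that each action yields a reward
value_estimates = [0.0, 0.0]   # the agent's learned sense of each action's worth
counts = [0, 0]

for _ in range(1000):
    # Mostly pick the action currently believed best, but occasionally explore.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = value_estimates.index(max(value_estimates))

    reward = 1 if random.random() < payoff_rates[action] else 0   # the reward signal
    counts[action] += 1
    # Reinforce: nudge the estimate for this action toward the observed reward.
    value_estimates[action] += (reward - value_estimates[action]) / counts[action]

print([round(v, 2) for v in value_estimates])   # estimates approach the hidden payoff rates
```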

Reinforcement learning from human feedback (RLHF): A prominent fine-tuning technique aimed at aligning models with human preferences. During RLHF, systems are tested on or produce outputs for human users; when those users react positively, a reward signal is sent to the system, thereby helping it improve its outputs.

Representation: The concept of translating observable objects (images, words, sounds) into digital code.
 


S

Semiconductor devices: A class of devices that uses the unique switching properties of semiconductor materials to alter the flow of electricity. Example devices include LEDs and transistors. Microchips, ICs, and microprocessors are all made of semiconductor materials.

Semiconductor materials: Materials such as silicon that can act as either insulators or conductors of electricity.

Semi-supervised learning: A hybrid of unsupervised and supervised learning in which a portion of labeled data are provided to the model on top of a larger amount of unlabeled data. This approach can provide a light touch of supervision.

Small data: An alternative strategy to big data approaches that uses a variety of techniques to train AI algorithms on smaller datasets when information is poor, lacking, or unavailable. 

Stale data: Outdated data that are no longer representative of a given problem. 

Superintelligence: An AI system that is smarter than humans in almost every domain.

Supervised learning: A type of machine learning that uses a guess-and-check methodology by which the model takes in data, makes a prediction about those data, and compares that prediction to a labeled answer key. If the inference is incorrect, the algorithm adjusts itself to improve performance. 
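
A minimal guess-and-check sketch, assuming a one-parameter model and a handful of invented labeled examples; real supervised learning follows the same loop at far larger scale:

```python
# Labeled examples: inputs paired with the answer key (here the rule is output = 3 * input).
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

weight = 0.0           # the model's single adjustable parameter
learning_rate = 0.05

for _ in range(200):
    for x, correct in examples:
        prediction = weight * x
        error = prediction - correct          # compare the guess to the labeled answer
        weight -= learning_rate * error * x   # adjust the model to shrink the error

print(round(weight, 2))   # approaches 3.0, the relationship hidden in the labels
```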

Symbolic methods: An alternative and complementary technique to machine learning. Under symbolic AI, engineers try to build intelligence by treating knowledge as a collection of symbols—essentially core definitions, objects, labels, and representations that describe the world in human terms.

Synthetic data: Data that are artificially created either by human or machine generation but still thought to be generally representative of a problem. Training AI on artificial data can supplement real-world data when quality data resources are limited.
 


T

Test data: The unique set of data reserved in machine learning for testing the model’s final accuracy and effectiveness. Test data must be separate from the training data.

Three V’s: Key characteristics that define the quality of a dataset. Variety refers to the diversity of the data. Volume refers to the size of the dataset. Velocity refers to the usability and speed by which the data can be applied. Other publications may list four, five, or even six V’s. The list tends to vary depending on context and purpose.

Training: The process by which models take in stored or real-time data to refine their processes and improve their inferences. 

Training data: The unique set of data reserved for the model training process in machine learning. 

Transfer learning: One small data approach that allows models to inherit learning from previously trained big data models. 

Transformers: An emerging class of neural networks that uses a so-called attention mechanism that allows the model to pay attention to key features and remember how those features in the data relate to others.

Transistor: A device built from a combination of silicon and dopants, impurities that alter the properties of conductivity to provide discrete control by engineers over electric currents.
 


U

Underfitting: A situation in which a model has not been properly tuned to the problem because of poor design or data quality.

Unsupervised learning: A type of machine learning that focuses on sorting unlabeled, unsorted data and discovering patterns in those data. This method does not focus on specific outcomes but rather on discovering the meaning and patterns in data. 
 


V

Validation: The process by which the engineer uses a dedicated validation dataset to tune the hyperparameters of the model. Generally, this is done after training but before testing.

Validation data: The unique set of data used during machine learning validation. These data are used specifically to tune the model’s hyperparameters. 
 


W

Wafer: The thin disk of semiconductor materials that acts as the base of a computer chip. 

Weight: A numerical value that amplifies or suppresses the importance of a pattern found in data. 

 


Z

Zero-shot learning: The ability of a model to form accurate inferences without having been trained on explicit examples of the problem at hand. 

 

 


About the Author

Matthew Mittelsteadt is a technologist and research fellow at the Mercatus Center whose work focuses on artificial intelligence policy. Prior to joining Mercatus, Matthew was a fellow at the Institute of Security, Policy, and Law where he researched AI judicial policy and AI arms control verification mechanisms. Matthew holds an MS in Cybersecurity from New York University, an MPA from Syracuse University, and a BA in both Economics and Russian Studies from St. Olaf College.

Read Matt's Substack on AI Policy