AI Policy Guide


This section of the AI Policy Guide provides clear definitions for common terms in artificial intelligence.


Accuracy: An evaluation metric that measures the share of a system’s inferences that are correct.

Activation function: The mathematical function that transforms data inputs into outputs. This both shapes the final predictions that are made and serves as the algorithmic trigger that needs to be tripped by input data for a given prediction to be made.
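
As a rough illustration of the entry above, the sketch below shows three commonly used activation functions. The choice of these three and the sample inputs are assumptions made for illustration; the guide does not name any specific function.

```python
import math


def step(x: float) -> float:
    """Acts as a trigger: outputs 1.0 only when the input is positive."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x: float) -> float:
    """Squashes any input into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Passes positive inputs through unchanged and zeroes out the rest."""
    return max(0.0, x)


# Compare how each function transforms the same inputs.
for x in (-2.0, 0.0, 2.0):
    print(x, step(x), round(sigmoid(x), 3), relu(x))
```

The step function shows the "trigger" behavior most directly, while sigmoid and ReLU shape the output more gradually.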

Adversarial examples: Data inputs maliciously designed to trick AI systems during inference.

Adversarial machine learning: Refers generally to the study and design of machine learning cyberattacks and defenses.

AI alignment: In the context of artificial general intelligence, alignment of AI systems refers to their correspondence with generally accepted human values (do not harm, do not kill, protect the vulnerable, allocate human rights equitably, and so on).

AI chips or AI accelerators: A range of chips designed specifically for the unique processing needs of AI.

AI triad: The three primary “input” technologies that yield artificial intelligence: microchips, data, and algorithms.

Algorithm: A logical sequence of steps to accomplish a task such as solving a problem.

Alignment imbalance: A state in which AI is generally misaligned with human values. This imbalance supposes that AI systems can possibly be balanced with human values. However, imbalance may be inherent to all AI systems and baked into their design.

Application-specific integrated circuits (ASICs): The fastest and least flexible form of AI chip. ASICs are single-purpose chips and cannot be rewritten; the algorithms they use are hardwired into their silicon.

Artificial data: Data that are artificially created but still thought to be generally representative of a problem. Training AI on artificial data can supplement real-world data when those data are poor.

Artificial general intelligence: A general-purpose AI system that can adapt and learn any task. It is not designed for a specific narrow purpose or set of purposes.

Artificial intelligence (AI): The goal of automating tasks normally performed by humans. To reach this goal, one uses “machine-based system[s] that can, for a given set of human-defined objectives, make predictions, recommendations or decisions influencing real or virtual environments.”

Artificial narrow intelligence: AI built for a narrow purpose such as a specific application. This AI can do one or a few tasks with high accuracy, but it cannot transfer to other applications outside of its design mandate.

Artificial neural network (ANN): A type of model formed from networks of interconnected artificial neurons. Neurons take in data, divide that data, and parse these divisions to discover patterns. Patterns are then assembled to form increasingly advanced patterns and ultimately inform the network’s final predictions.

Artificial neurons: Individual components of ANNs that take in data and look for specific patterns in that data that they have learned are significant during the training process.


Bayesian methods: Models that are coded with prior information that provides context and shrinks the overall learning task and, by extension, the needed training data.
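
One way to picture how prior information shrinks the learning task is a Beta-Binomial update, sketched below. This specific technique, the function names, and all numbers are assumptions chosen for illustration and are not taken from the guide.

```python
def update_beta(prior_a: float, prior_b: float, successes: int, failures: int):
    """Combine a Beta(prior_a, prior_b) prior with newly observed outcomes."""
    return prior_a + successes, prior_b + failures


def beta_mean(a: float, b: float) -> float:
    """Expected success rate under a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: earlier knowledge suggests a success rate near 80%.
prior = (8.0, 2.0)

# Only five new observations are available: 3 successes and 2 failures.
posterior = update_beta(*prior, successes=3, failures=2)

print(beta_mean(*prior))      # estimate from the prior alone
print(3 / 5)                  # estimate from the five observations alone
print(beta_mean(*posterior))  # estimate that blends both
```

With so little data, the raw frequency alone would be a shaky estimate. The prior supplies the context that the small sample lacks.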

Benchmarks: Common datasets paired with evaluation metrics that can allow researchers to compare the quality of models.

Bias: Defined generally as the difference between desired outcomes and measured outcomes. Often it refers to human biases inherited in AI systems through model or data design choices.

Bias value: The threshold that the weighted data must surpass for a neuron to activate. Mathematically, this serves as the intercept that orients the activation function toward the “shape” of reality.
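
The threshold behavior described above can be sketched with a single artificial neuron. This is a minimal illustration under assumed values: the function name, inputs, weights, and biases are all hypothetical.

```python
def neuron_fires(inputs, weights, bias):
    """Return True when the weighted inputs plus the bias exceed zero.

    A bias of -t means the weighted sum must surpass t for the neuron to fire.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum + bias > 0


inputs = [1.0, 0.5]
weights = [0.6, 0.4]  # the weighted sum here is roughly 0.8

print(neuron_fires(inputs, weights, bias=-1.0))  # threshold of 1.0 is not met
print(neuron_fires(inputs, weights, bias=-0.5))  # threshold of 0.5 is met
```

Changing only the bias moves the point at which the same weighted data trips the neuron.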

Big data: Big-data AI systems are trained on large, representative, and diverse datasets that are expected to capture all the corner cases and details of a given problem. The theory is that by training an AI system on such a dataset, the system should hopefully capture and learn all the needed details of a given problem.

Binary: A numerical system that represents values in series of just 1s and 0s. Most data in computer science and artificial intelligence are represented in this form.

Bit: The smallest unit of data that represents a binary choice between a 1 and a 0.

Black box: A term that refers to the often-opaque decision-making processes behind deep neural networks.

Byte: A data unit the size of 8 bits.
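
The binary, bit, and byte entries can be tied together with a short sketch. The value 77 and the letter "M" are arbitrary examples chosen for illustration.

```python
value = 77

# Write the number as 8 bits (one byte), padding with leading zeros.
bits = format(value, "08b")
print(bits)  # 01001101

# Reading the bits back as a base-2 number recovers the original value.
print(int(bits, 2))  # 77

# Text is stored the same way: the letter "M" occupies one byte,
# and that byte holds the number 77.
encoded = "M".encode("utf-8")
print(len(encoded), encoded[0])  # 1 77
```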


Central processing units (CPUs): A type of general-purpose chip designed to handle all standard computation.

Circuits: Electronic components linked together to enable certain computational functions such as addition, subtraction, or memory storage.

Cloud computing: A general computing concept in which computing resources (both memory and processors) are stored remotely.

Code: The set of instructions given to a computer system.

Computer program or software: Code for the operation of a computer application.

Convolutional neural networks (CNNs): A form of neural network that uses convolutional layers, which act as data filters trained to spot and separate patterns that are highly correlated with a specific result. These layers simplify data and accentuate the most important features. CNNs can be useful in many applications such as image analysis, financial time series analysis, and natural language processing.
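
To make the filter idea concrete, the sketch below slides a small filter across a one-dimensional signal. It is a simplification under stated assumptions: the filter values are set by hand, whereas a CNN learns them during training, and the function and variable names are hypothetical.

```python
def convolve_1d(signal, kernel):
    """Slide the kernel across the signal, taking a weighted sum at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# A flat signal that jumps from 0 to 1 partway through.
signal = [0, 0, 0, 1, 1, 1]

# A hand-made filter that responds only where neighboring values differ.
edge_kernel = [-1, 1]

print(convolve_1d(signal, edge_kernel))  # [0, 0, 1, 0, 0]
```

The output is shorter than the input and is nonzero only at the jump, which is the sense in which such layers simplify data and accentuate important features.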


Data: In the context of computer science, data are pieces of discrete information that can be encoded, stored, and computed.

Data cleaning: The process by which data are prepared for use by an AI algorithm.

Data poisoning attacks: Attacks on AI systems caused by the malicious manipulation of data.

Data standards: Industry- and application-specific standards that dictate in certain circumstances what data must be recorded and how those data must be recorded.

Data warehouses: Large, centralized warehouses holding hundreds of servers on which vast lakes of data are stored and large-scale computations are run.

Deep learning: A type of machine learning that specifically uses deep, multilayered neural networks.

Deposition: A process used in chip fabrication that blankets chips with materials to add components.

Dopants: Intentional impurities that lace the silicon in transistors, changing when and how transistors switch between conducting or insulating electric current.


Electronic design automation: The software used by hardware engineers to design computer systems and chips.

Etching: A process used in chip fabrication that uses chemicals to remove unwanted material and shape the design of the chip.

Evaluation metrics: Metrics that can be used to assess AI system quality. These are diverse and the metrics selected should match application needs and engineering goals.

Execution units: Microprocessor subsystems that package related circuits together with memory and other tools to enable basic functions.

Explainable AI or white box AI: An emerging class of AI that seeks to provide explanations of how the system’s decisions and predictions are made.


F1 score: An evaluation metric that assesses how well a model minimizes both false negatives and false positives.
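
As a hedged illustration, the sketch below computes accuracy, precision, recall, and the F1 score from the four counts of a binary classifier's results, using the standard textbook formulas. The function name and the example counts are hypothetical, and the code assumes none of the denominators is zero.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common evaluation metrics from binary classification counts.

    tp/fp = true/false positives, fn/tn = false/true negatives.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all results that are correct
    precision = tp / (tp + fp)                  # share of positive results that are correct
    recall = tp / (tp + fn)                     # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical results for 100 test cases.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Because F1 combines precision and recall, it falls when either false positives or false negatives grow.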

Feed-forward neural network: A type of neural network in which data flow in one direction through the network’s layers.

Field-programmable gate arrays (FPGAs): Task-specific chips that can be written and rewritten for a single-purpose algorithm. Given their task specificity, FPGAs are faster than GPUs. They are still slower than application-specific integrated circuits, because their ability to be rewritten comes with certain speed costs.

File formats: A type of data standard that defines how data are digitally represented.

Foundation models: Large-scale machine learning models trained on broad sets of data that can be easily adapted to a wide range of downstream tasks.


General-purpose technology: Innovations that “[have] the potential to affect the entire economic system.”

Generative adversarial neural networks (GANs): A form of neural network in which competing agents seek to outcompete each other. Through competition, each party improves, ultimately improving its overall predictive qualities. GANs are noted for their generative modeling, or creative, abilities. This means specifically they use pattern recognition to predict how to best generate novel output content such as images.

Graphics processing units (GPUs): Limited-purpose processors that were originally designed for graphics processing but that have been reappropriated for AI. GPUs excel at matrix multiplication, a function central to AI, giving them a speed advantage over traditional CPUs.
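
For readers unfamiliar with matrix multiplication, the plain-Python sketch below shows the operation itself. It is written for clarity, not speed, and the function name and example matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply matrix a (rows x inner) by matrix b (inner x cols)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    # Each output cell depends only on one row of a and one column of b,
    # so every cell could in principle be computed at the same time.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

The independence of the output cells is what lets highly parallel hardware compute many of them at once.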


Hyperparameters: High-level settings that can be adjusted by engineers to control the model’s functions. 


Inference: A probabilistic guess made by an AI system on the basis of patterns or trends observed in data.

Inherently interpretable: Models that by design are simple to interpret or understand.

Integrated circuits or microprocessors: Devices that can perform basic operations of software commands.

Internet of things (IoT): Networks of diverse internet-connected devices. IoT devices often act as key data inputs to AI systems.


Layers: Collections of neurons that data must pass through simultaneously in a network.

Libraries: Collections of prewritten functions that can be plugged into computer programs. Many free-to-use libraries of machine-learning models are commonly used to build AI.

Loss: In machine learning, this is the mathematical difference between the model’s predicted outcome and the correct outcome.
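
One common way to measure loss is mean squared error, sketched below as the gap between a model's predictions and the correct outcomes. The choice of this particular loss function and the sample numbers are assumptions for illustration; many other loss functions exist.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared gaps between predicted and correct values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


predictions = [2.5, 0.0, 2.0]  # hypothetical model outputs
targets = [3.0, -0.5, 2.0]     # the correct answers

print(round(mean_squared_error(predictions, targets), 4))  # 0.1667
```

A loss of zero would mean every prediction matched its target exactly.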


Machine learning: A method for iteratively refining the process a model uses to form inferences through feeding it stored or real-time data.

Memory units: Devices that use transistors and other components to store information. Memory units can be subcomponents of a chip or standalone chips depending on their size and function.

Model: The software configuration that results from machine learning. Once fed new data, the model can produce inferences in the form of predictions, decisions, and other outputs.

Moore’s law: An observation stating that the number of transistors per chip doubles roughly every two years. More than an empirical observation, it was an expectation that came to organize the efforts of the microchip industry and was a self-fulfilling prophecy for a long time.
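
The arithmetic behind a two-year doubling can be sketched as follows. The starting count of 1,000 transistors is a made-up figure, and the function name is hypothetical.

```python
def projected_transistors(start_count: float, years: float, doubling_period: float = 2.0) -> float:
    """Project a transistor count that doubles once every doubling_period years."""
    return start_count * 2 ** (years / doubling_period)


# Ten years contains five doubling periods, so the count grows 2**5 = 32 times.
print(projected_transistors(1_000, years=10))  # 32000.0
```

The compounding is what makes the trend so powerful: twenty years of doubling multiplies the count by roughly a thousand.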


Overfitting: A situation where a model is tuned so precisely to the training data that it cannot adequately account for new data.


Parallelism: The ability of a chip to perform certain functions in parallel rather than sequentially, allowing faster processing.

Parameters: The values that shape a model’s analytical processes.

Photolithography: A process used in chip fabrication by which light is shined through a “circuit stencil” known as a photomask, printing the design onto the chip’s wafer.

Precision: An evaluation metric that states the percentage of a model’s positive results that are true positives.


Recall: An evaluation metric that states the percentage of actual positive cases that a model correctly identifies as positive.

Recurrent neural networks: Neural networks defined by their ability to remember past information and connect that information to future data. This “memory” is necessary in complex, time-dependent data such as video analysis, natural language processing, and other applications.

Reinforcement learning: A type of machine learning that uses trial and error to learn the best process to achieve a given goal. To learn, an AI is placed in a scenario and tasked with maximizing a reward or achieving a goal. When its process improves, it receives a reward signal that instructs it to reinforce the processes that led to that improvement.
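
A minimal sketch of trial-and-error learning is the "multi-armed bandit" below, in which an agent must discover which of three options pays out most often. The epsilon-greedy strategy, the payout probabilities, and all names are assumptions chosen for illustration and are not described in the guide.

```python
import random


def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which option yields the most reward."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payout_probs)  # learned value of each option
    pulls = [0] * len(payout_probs)        # how often each option was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Occasionally explore: try an option at random.
            arm = rng.randrange(len(payout_probs))
        else:
            # Otherwise exploit: pick the option that looks best so far.
            arm = max(range(len(payout_probs)), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        # Nudge the estimate toward the reward signal just received.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


# Hypothetical environment: the third option pays out most often.
estimates, pulls = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(best_arm, [round(e, 2) for e in estimates], pulls)
```

The agent is never told which option is best. It infers this only from the reward signals it collects along the way.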

Representation: The concept of translating observable objects (images, words, sounds) into digital code.


Semiconductor devices: A class of devices that uses the unique switching properties of semiconductor materials to alter the flow of electricity. Example devices include LEDs and transistors. Microchips, integrated circuits, and microprocessors are all made of semiconductor materials.

Semiconductor materials: Materials such as silicon that can act as either insulators or conductors of electricity.

Semi-supervised learning: A hybrid of unsupervised and supervised learning in which a portion of labeled data are provided to the model on top of a larger amount of unlabeled data. This approach can provide a light touch of supervision.

Small data: An alternative strategy to big data approaches that uses a variety of techniques to train AI algorithms on smaller datasets when information is poor, lacking, or unavailable.

Stale data: Outdated data that are no longer representative of a given problem.

Stochastic parrots: A term that describes AI systems that randomly rearrange and regurgitate learned data rather than provide true insight or understanding.

Superintelligence: An AI system that is smarter than humans in almost every domain.

Supervised learning: A type of machine learning that uses a guess-and-check methodology by which the model takes in data, makes a prediction about that data, and compares that prediction to a labeled answer key. If the inference is incorrect, the algorithm adjusts itself to improve performance.
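
The guess-and-check loop described above can be sketched with a perceptron, one of the simplest supervised learners. The choice of a perceptron, the tiny labeled dataset, and all names are assumptions made for illustration.

```python
def predict(inputs, weights, bias):
    """Guess 1 when the weighted sum of inputs plus bias is positive, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Repeatedly guess, check against the label, and adjust when wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            guess = predict(inputs, weights, bias)
            error = label - guess  # 0 when the guess matches the answer key
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Labeled "answer key": the output should be 1 only when both inputs are 1.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(examples)
print([predict(inputs, weights, bias) for inputs, _ in examples])
```

When a guess is correct, the error is zero and nothing changes. Only wrong guesses move the weights.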


Test data: The unique set of data reserved in machine learning for testing the model’s final accuracy and effectiveness. Test data must be separate from the training data.

Three Vs: Key characteristics that define the quality of a dataset. Variety refers to the diversity of the data. Volume refers to the size of the dataset. And velocity refers to the usability and speed by which the data can be applied. Other publications may list four, five, or even six Vs. The term tends to vary depending on context and purpose.

Training: The process by which models take in stored or real-time data to refine their processes and improve their inferences.

Training data: The unique set of data reserved for the model training process in machine learning.

Transfer learning: One small-data approach that allows models to inherit learning from previously trained big-data models.

Transformers: An emerging class of neural networks that uses a so-called attention mechanism that allows the model to pay attention to key features and remember how those features in the data relate to others.
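
The guide does not specify how the attention mechanism is computed. The sketch below assumes the widely used scaled dot-product form, reduced to a single query in plain Python; the names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weight each value by how closely its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # the first key matches the query best
values = [[10.0], [20.0], [30.0]]

weights, output = attend(query, keys, values)
print([round(w, 3) for w in weights], [round(o, 2) for o in output])
```

The first item receives the largest weight because its key is most similar to the query, which is one way to picture a model "paying attention" to the features most related to the one at hand.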

Transistor: A device built from a combination of silicon and dopants, impurities that alter the properties of conductivity to enable engineers’ discrete control over electric currents.


Underfitting: A situation where a model has not been properly tuned to the problem because of poor design or data quality.

Unsupervised learning: A type of machine learning that focuses on sorting unlabeled, unsorted data and discovering patterns in those data. This method does not focus on specific outcomes but rather on discovering the meaning and patterns in data.
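
As one illustration of discovering patterns without labels, the sketch below groups numbers into two clusters with a simplified one-dimensional version of k-means. The algorithm choice, the starting centers, and the data are all assumptions made for illustration.

```python
def two_means_1d(points, iterations=10):
    """Group unlabeled numbers into two clusters around learned centers."""
    # Assumed starting guess: the smallest and largest points.
    centers = [min(points), max(points)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            # Assign each point to whichever center is closer.
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[nearest].append(p)
        # Move each center to the average of its assigned points.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# No labels are provided; the two groups emerge from the data alone.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = two_means_1d(points)
print([round(c, 2) for c in centers], clusters)
```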


Validation: The process by which the engineer uses a dedicated validation dataset to tune the hyperparameters of the model. Generally, this is done after training but before testing.

Validation data: The unique set of data used during machine-learning validation. These data are used specifically to tune the model’s hyperparameters.
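
The separation of training, validation, and test data can be sketched as a simple shuffled split. The 70/15/15 proportions, the function name, and the stand-in records are assumptions for illustration; the right proportions depend on the application.

```python
import random


def split_dataset(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the records and divide them into three non-overlapping sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                # used to fit the model
    val = shuffled[n_train:n_train + n_val]   # used to tune hyperparameters
    test = shuffled[n_train + n_val:]         # held back for the final check
    return train, val, test


records = list(range(100))  # stand-in for 100 data records
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))  # 70 15 15
```

Because each record lands in exactly one set, the test data stay unseen until the final evaluation.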


Wafer: The thin disk of semiconductor materials that acts as the base of a computer chip.

Weight: A numerical value that amplifies or suppresses the importance of a pattern found in data.

About the Author

Matthew Mittelsteadt is a technologist and research fellow at the Mercatus Center whose work focuses on artificial intelligence policy. Prior to joining Mercatus, Matthew was a fellow at the Institute of Security, Policy, and Law where he researched AI judicial policy and AI arms control verification mechanisms. Matthew holds an MS in Cybersecurity from New York University, an MPA from Syracuse University, and a BA in both Economics and Russian Studies from St. Olaf College.

Read Matt's Substack on AI Policy