AI Glossary | 人工智能词汇表

A

Algorithm (算法)

English: A set of step-by-step instructions or rules designed to perform a specific task. In AI, algorithms process data and learn patterns that enable decision-making or predictions.
中文：一组分步骤的指令或规则，用于执行特定任务。在人工智能中，算法对数据进行处理并学习模式，从而实现决策或预测。

Artificial Intelligence (AI) (人工智能)

English: A field of computer science focused on creating machines or software capable of exhibiting human-like intelligence—such as reasoning, learning, and problem-solving.
中文：计算机科学的一个领域，旨在创建能够表现出类似人类智能（如推理、学习和解决问题）能力的机器或软件。

Artificial Neural Network (ANN) (人工神经网络)

English: A computational model inspired by biological neural networks in the human brain. ANNs are composed of interconnected nodes (“neurons”) that process information in layers to learn patterns.
中文：一种受人类大脑生物神经网络启发的计算模型。人工神经网络由相互连接的节点（“神经元”）组成，通过分层处理信息来学习模式。

Artificial General Intelligence (AGI)

Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks. This contrasts with narrow AI, which is limited to specific tasks.[1] Artificial superintelligence (ASI), by contrast, refers to AGI that greatly exceeds human cognitive capabilities. AGI is considered one of the definitions of strong AI.

Agentic AI

Agentic AI uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems.

B

Backpropagation (反向传播)

English: An algorithm for training neural networks by propagating the error backward from the output layer to the input layer, adjusting the weights to minimize error.
中文：一种用于训练神经网络的算法，通过将误差从输出层向输入层反向传播，并调整权重以最小化误差。
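
The backward pass can be illustrated on the smallest possible network. The following is a minimal sketch, assuming a single sigmoid neuron and a squared-error loss; the function names and default values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """One sigmoid neuron: prediction = sigmoid(w * x + b)."""
    return sigmoid(w * x + b)

def loss(w, b, x, y):
    """Squared error between the prediction and the target y."""
    return 0.5 * (forward(w, b, x) - y) ** 2

def backward(w, b, x, y):
    """Apply the chain rule from the loss back toward the input."""
    p = forward(w, b, x)
    dloss_dp = p - y               # derivative of 0.5 * (p - y)^2 w.r.t. p
    dp_dz = p * (1.0 - p)          # derivative of the sigmoid w.r.t. z
    dloss_dz = dloss_dp * dp_dz
    return dloss_dz * x, dloss_dz  # gradients for w and for b

def train(w, b, x, y, learning_rate=0.5, steps=200):
    """Repeatedly step the weight and bias against their gradients."""
    for _ in range(steps):
        grad_w, grad_b = backward(w, b, x, y)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b
```

In a multi-layer network the same chain-rule step is repeated layer by layer, reusing the gradient already computed for the layer above.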

Batch (批次)

English: A subset of data used to train a model at one time. In mini-batch gradient descent, the training set is divided into small batches for updates to model parameters.
中文：一次用于训练模型的数据子集。在小批量梯度下降中，训练集被分割成小批次，以便对模型参数进行更新。
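
As a rough illustration of how a training set is divided into batches, the sketch below slices a list into consecutive chunks. The helper name `iterate_batches` and the toy data are assumptions made for this example.

```python
def iterate_batches(dataset, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]

examples = list(range(10))  # stand-in for 10 training examples
batches = list(iterate_batches(examples, batch_size=4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The last batch is smaller when the dataset size is not a multiple of the batch size. Processing every batch once amounts to one epoch.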

Bias (偏差)

Model bias (模型偏差): Systematic error that causes a model to consistently deviate from the true values.
中文：导致模型持续偏离真实值的系统性误差。
Bias term (in neurons) (神经元中的偏置项): A constant value added to the weighted input of a neuron to shift the activation function.
中文：加到神经元加权输入中的一个常数值，用于对激活函数进行平移。

C

Classification (分类)

English: A supervised learning task where the goal is to predict a discrete label (e.g., “spam” vs. “not spam” or identifying which category an image belongs to).
中文：一种监督学习任务，目标是预测离散标签（例如，“垃圾邮件”与“非垃圾邮件”或识别图像所属的类别）。

Clustering (聚类)

English: An unsupervised learning task that involves grouping data points such that points in the same group (cluster) are more similar to each other than to those in other clusters.
中文：一种无监督学习任务，将数据点进行分组，使同一组（簇）内的点彼此之间比与其他簇的点更加相似。
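
One widely used clustering algorithm is k-means. The glossary does not name a specific method, so the choice here is an assumption. This minimal sketch works on one-dimensional points, and the function name and starting centroids are hypothetical.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the nearest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[k]
                     for k, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
```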

Convolutional Neural Network (CNN) (卷积神经网络)

English: A type of neural network especially suited for tasks involving image and spatial data. It uses filters or “kernels” to detect patterns such as edges and textures in images.
中文：一种特别适合处理图像和空间数据的神经网络。它使用滤波器或“卷积核”来检测图像中的边缘、纹理等模式。
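
To make the idea of a kernel detecting an edge concrete, here is a minimal sketch of the sliding-window operation in plain Python, assuming no padding and a stride of 1. The kernel below is hand-written for illustration; in a real CNN the kernel values are learned during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# 4x4 image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical kernel that responds where brightness rises from left to right.
vertical_edge = [[-1, 1],
                 [-1, 1]]
feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the output marks where the dark and bright halves meet. As in most deep learning frameworks, the kernel is applied without flipping, which is strictly a cross-correlation.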

Cosmos/WFM model

Cosmos is a family of world foundation models (WFMs): neural networks that can predict and generate physics-aware virtual environments. In the headline keynote at CES 2025, Nvidia CEO Jensen Huang said the Nvidia Cosmos models will be open sourced. "The ChatGPT moment for robotics is coming. Like large language models, world foundation models are fundamental to advancing robot and AV development," said Huang. "We created Cosmos to democratize physical AI and put general robotics in reach of every developer."

According to Nvidia, world foundation models will be as important as large language models, but physical AI developers have been underserved. WFMs use data, text, images, video and movement to generate and simulate virtual worlds that accurately model environments and physical interactions. Nvidia said 1X, Agility Robotics and XPENG, and autonomous vehicle developers Uber, Waabi and Wayve, are among the companies using Cosmos.

D

Data Mining (数据挖掘)

English: The process of discovering patterns and insights from large data sets, using methods like clustering, classification, and association rule mining.
中文：从大型数据集中发现模式和洞见的过程，通常使用聚类、分类和关联规则挖掘等方法。

Dataset (数据集)

English: A collection of data points or examples used to train, validate, and/or test machine learning models.
中文：用于训练、验证和/或测试机器学习模型的一组数据点或示例的集合。

Deep Learning (深度学习)

English: A subset of machine learning that involves neural networks with multiple layers (deep neural networks). These deeper architectures can automatically learn high-level representations from data.
中文：机器学习的一个分支，使用具有多层结构的神经网络（深度神经网络）。这些深层架构能够自动从数据中学习高层次表示。

Dimensionality Reduction (降维)

English: Techniques like Principal Component Analysis (PCA) or t-SNE that reduce the number of input features while preserving as much information as possible.
中文：例如主成分分析（PCA）或 t-SNE 等技术，用于在尽可能保留信息的前提下减少输入特征的数量。

E

Epoch (轮次)

English: One complete pass through the entire training dataset during the training process of a machine learning model.
中文：在机器学习模型的训练过程中，对整个训练数据集进行一次完整遍历的过程。

Evaluation Metrics (评估指标)

English: Methods or formulas used to measure the performance of a model. Examples include accuracy, precision, recall, F1 score, and ROC AUC.
中文：用于衡量模型性能的方法或公式。例如准确率（accuracy）、精确率（precision）、召回率（recall）、F1 值（F1 score）以及 ROC AUC。

F

Feature (特征)

English: An individual measurable property or characteristic of a phenomenon being observed (e.g., the pixel intensity in an image, or the temperature reading in a climate dataset).
中文：对所观察现象的可测量属性或特性（例如图像中的像素强度，或气候数据集中的温度读数）。

Feature Engineering (特征工程)

English: The process of using domain knowledge to create new features from existing raw data to improve model performance.
中文：运用领域知识从原始数据中创建新特征以提升模型性能的过程。

Fine-tuning (微调)

English: The process of taking a model that has already been pretrained (often on a large dataset) and adjusting its parameters on a new, related task or dataset.
中文：对已经在大型数据集上预训练过的模型进行参数调整，以适应新的、相关任务或数据集的过程。

G

Generalization (泛化)

English: A model’s ability to perform accurately on new, unseen data, rather than just on the training data.
中文：指模型在未见过的新数据上仍能准确表现的能力，而不仅仅是在训练数据上表现良好。

Generative Adversarial Network (GAN) (生成对抗网络)

English: A system of two neural networks: a generator that creates synthetic data and a discriminator that tries to distinguish real data from fake. They train each other in a competitive process.
中文：由两个神经网络组成的系统：一个生成器用于生成合成数据，另一个判别器用于区分真实数据与虚假数据。它们在竞争的过程中相互训练。

Gradient Descent (梯度下降)

English: An optimization algorithm that iteratively adjusts parameters to minimize a loss function by moving in the direction of the steepest descent in the parameter space.
中文：一种优化算法，通过在参数空间沿最陡下降方向迭代调整参数，来最小化损失函数。
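
The iterative update can be sketched in a few lines. This example assumes a one-dimensional loss f(x) = (x - 3)^2, whose gradient is 2(x - 3). The function name, learning rate, and step count are illustrative choices rather than recommended values.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 6))  # 3.0
```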

Generative AI

Generative AI enables users to quickly generate new content based on a variety of inputs. Inputs and outputs to these models can include text, images, sounds, animation, 3D models, or other types of data.

H

Hyperparameters (超参数)

English: Configuration settings external to a model that are not learned from the training data (e.g., learning rate, number of layers, or number of neurons in each layer).
中文：模型外部的配置设置，不从训练数据中学习（例如学习率、层数或每层神经元的数量）。

Heuristic (启发式方法)

English: A practical method or approach to problem-solving that isn’t guaranteed to be optimal but is fast and good enough for many situations.
中文：一种解决问题的实用方法或策略，无法保证绝对最优，但在很多情况下速度快且效果足够好。

I

Inference (推理)

English: The process of using a trained model to make predictions or decisions on new, unseen data.
中文：使用训练好的模型对新的、未见过的数据进行预测或决策的过程。

Interpretability (可解释性)

English: Refers to how understandable the decisions or reasoning of a model are to humans. Interpretable models help users trust the outputs.
中文：指模型的决策或推理对于人类而言的可理解程度。可解释的模型能帮助用户信任其输出。

L

Label (标签)

English: The correct answer or category assigned to a data point in supervised learning (e.g., the numeric value in a regression problem or class category in classification).
中文：在监督学习中分配给数据点的正确答案或类别（例如回归问题中的数值，或分类中的类别标签）。

Latent Space (潜在空间)

English: A compressed, abstract representation of the data typically learned by deep neural networks. It captures the underlying patterns and structure in a lower-dimensional space.
中文：深度神经网络通常学习到的对数据的压缩和抽象表示，以较低维度来捕捉数据的内在模式和结构。

Learning Rate (学习率)

English: A hyperparameter that controls how much to adjust the model’s parameters in response to the estimated error each time model weights are updated.
中文：一种超参数，用于控制在每次模型权重更新时，针对估计误差调整模型参数的幅度。

Loss Function (Cost Function) (损失函数/代价函数)

English: A function that measures the disparity between a model’s predictions and the actual ground truth labels. Training aims to minimize this function.
中文：衡量模型预测值与实际真实标签之间差异的函数。训练的目标就是使该函数的值最小化。
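
Mean squared error is one common loss function for regression. It is picked here only as an example, since the glossary does not single out a particular loss.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)

# (0.25 + 0.25 + 0.0) / 3
error = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```

A loss of zero means every prediction matches its target exactly. Larger disagreements are penalized quadratically.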

Large Language Model (LLM)

A large language model (LLM) is a type of AI system designed to understand and generate human-like text. These models learn statistical patterns from massive amounts of text data (e.g., webpages, books, and articles) and use those patterns to predict and generate coherent text.

Large language models are changing how people interact with computers. They serve as the foundation for chatbots, virtual assistants, code-generation tools, content-creation tools, and more. Because they can interpret natural language and generate human-like responses, LLMs make more intuitive AI applications possible.

M

Machine Learning (ML) (机器学习)

English: A subset of AI focusing on algorithms that learn from data to make predictions or decisions without being explicitly programmed.
中文：人工智能的一个分支，侧重于通过数据学习来做出预测或决策，而不是基于显式编程。

Model (模型)

English: A representation of the learned patterns or relationships in data—often implemented as a mathematical function or computational architecture.
中文：对数据中所学到的模式或关系的表示，通常以数学函数或计算架构的形式实现。

Momentum (动量)

English: In gradient-based optimization, a method that accelerates gradient descent by accumulating an exponentially decaying average of past gradients, so that updates keep moving in consistent directions and oscillations are damped.
中文：在基于梯度的优化中，一种通过累积过去梯度的指数衰减平均值来加速梯度下降的方法，使更新沿一致的方向持续推进，并减少振荡。
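
A minimal sketch of the classical (heavy-ball) form of momentum follows. This particular formulation, the coefficient `beta`, and the other default values are assumptions made for illustration, and frameworks differ in the exact update they use.

```python
def momentum_descent(grad, x0, learning_rate=0.05, beta=0.9, steps=300):
    """Gradient descent with a velocity term that remembers past gradients."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        # Keep a fraction beta of the old velocity, then add the new step.
        velocity = beta * velocity - learning_rate * grad(x)
        x += velocity
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = momentum_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With `beta=0.0` the update reduces to plain gradient descent.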

N

Natural Language Processing (NLP) (自然语言处理)

English: A field of AI that deals with understanding, interpreting, and generating human language. Applications include text classification, machine translation, and chatbots.
中文：人工智能的一个领域，涉及对人类语言的理解、解释和生成。应用包括文本分类、机器翻译和聊天机器人等。

Normalization (归一化/标准化)

English: Transforming data to have specific statistical properties (like a mean of 0 and a standard deviation of 1) or scaling data to a specific range. This can improve training stability and speed.
中文：将数据转换为具有特定统计特性的形式（如均值为0、标准差为1）或将数据缩放至特定范围，可提高训练的稳定性和速度。
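
The two transformations mentioned above can be sketched with the standard library. The function names are hypothetical, and both functions assume the input values are not all identical (otherwise the divisor would be zero).

```python
import statistics

def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    smallest, largest = min(values), max(values)
    return [low + (v - smallest) * (high - low) / (largest - smallest)
            for v in values]

z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])  # mean 5, standard deviation 2
scaled = min_max_scale([10, 20, 30])
print(scaled)  # [0.0, 0.5, 1.0]
```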

O

Overfitting (过拟合)

English: When a model learns the training data too well, including noise or random fluctuations, and performs poorly on new, unseen data.
中文：模型对训练数据（包括噪声或随机波动）学习得过于透彻，导致在新数据上的表现不佳。

One-Hot Encoding (独热编码)

English: A method of representing categorical variables as binary vectors, where exactly one element is “1” and all others are “0.”
中文：将类别型变量表示为二进制向量的方法，其中仅有一个元素为“1”，其他均为“0”。
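
A minimal sketch of the encoding, assuming categories are ordered alphabetically (any fixed order works, and the function name is hypothetical):

```python
def one_hot_encode(labels):
    """Map each label to a binary vector with a single 1."""
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for label in labels:
        vector = [0] * len(categories)
        vector[index[label]] = 1
        vectors.append(vector)
    return categories, vectors

categories, vectors = one_hot_encode(["red", "green", "blue", "green"])
print(categories)  # ['blue', 'green', 'red']
print(vectors)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```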

P

Parameter (参数)

English: A variable in a model (like a weight or bias in a neural network) that is learned during training.
中文：模型中的一个变量（如神经网络中的权重或偏置），在训练过程中被学习。

Pretraining (预训练)

English: Initial training on a large, generic dataset (such as billions of words of text), allowing the model to learn general features. This model can then be fine-tuned on a task-specific dataset.
中文：在大型、通用数据集（如海量文本）上进行的初始训练，使模型学习到通用特征。随后可以在特定任务的数据集上进行微调。

Precision (精确率)

English: In a classification context, the proportion of positive predictions that are actually correct. Formula:

$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$

中文：在分类任务中，模型预测为正类的实例中真正为正类所占的比例。公式：

$\text{Precision} = \frac{\text{真正例}}{\text{真正例} + \text{假正例}}$
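
As a worked example, precision can be computed directly from paired lists of true and predicted labels. The function name is hypothetical, and returning 0.0 when there are no positive predictions is a convention assumed here, not something the definition prescribes.

```python
def precision(y_true, y_pred, positive=1):
    """True positives divided by all positive predictions."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)
    if true_pos + false_pos == 0:
        return 0.0  # assumed convention: no positive predictions were made
    return true_pos / (true_pos + false_pos)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
print(precision(y_true, y_pred))  # 0.6 (3 true positives, 2 false positives)
```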

Q

Q-Learning (Q学习)

English: A type of reinforcement learning algorithm that learns an action-value function $Q$ indicating the expected cumulative reward (return) for taking a given action in a particular state.
中文：一种强化学习算法，通过学习一个动作价值函数 $Q$ ，来指示在特定状态下执行给定动作的期望回报。
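
The core of tabular Q-learning is a single update applied after each observed transition. The sketch below assumes the Q-table is stored as a nested dictionary. The parameter names `alpha` (step size) and `gamma` (discount factor), and the toy states and actions, are illustrative.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max over next actions."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

# Hypothetical two-state problem with two actions per state.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
new_value = q_update(q_table, "s0", "right", reward=0.5, next_state="s1")
# The target is 0.5 + 0.9 * 1.0 = 1.4, so the entry moves from 0.0 to about 0.14.
```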

R

Recall (Sensitivity) (召回率/敏感度)

English: In a classification context, the proportion of actual positives that are correctly identified. Formula:

$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$

中文：在分类任务中，实际为正类的实例中被正确识别为正类的比例。公式：

$\text{Recall} = \frac{\text{真正例}}{\text{真正例} + \text{假负例}}$
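
As a worked example, recall can be computed from paired lists of true and predicted labels. The function name is hypothetical, and returning 0.0 when the data contains no actual positives is an assumed convention.

```python
def recall(y_true, y_pred, positive=1):
    """True positives divided by all actual positives."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        return 0.0  # assumed convention: the data has no actual positives
    return true_pos / (true_pos + false_neg)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
print(recall(y_true, y_pred))  # 0.75 (3 true positives, 1 false negative)
```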

Regularization (正则化)

English: Techniques to prevent overfitting by penalizing large weights or complex models (e.g., L1 and L2 regularization, dropout in neural networks).
中文：通过惩罚过大的权重或过于复杂的模型来防止过拟合的技术（例如 L1 与 L2 正则化，以及神经网络中的 dropout）。
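
The weight penalties can be sketched as extra terms added to the data loss. The function names and the `l1` and `l2` strength parameters are hypothetical. Some libraries scale the L2 term by one half, which is not done here.

```python
def l1_penalty(weights, strength):
    """Sum of absolute weights, scaled by the regularization strength."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """Sum of squared weights, scaled by the regularization strength."""
    return strength * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    """Data loss plus the optional L1 and L2 penalties."""
    return data_loss + l1_penalty(weights, l1) + l2_penalty(weights, l2)

weights = [0.5, -2.0, 1.0]
total = regularized_loss(1.0, weights, l2=0.1)  # 1.0 + 0.1 * 5.25
```

Because the penalty grows with the size of the weights, minimizing the total loss pushes the model toward smaller weights.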

Reinforcement Learning (RL) (强化学习)

English: A learning paradigm where an agent learns by interacting with an environment, receiving rewards or penalties for actions, and aims to maximize the cumulative reward.
中文：一种学习范式，智能体通过与环境交互并根据行动获得奖励或惩罚来学习，目标是最大化累积奖励。

S

Semi-Supervised Learning (半监督学习)

English: A learning approach that uses a small amount of labeled data and a large amount of unlabeled data to build better models.
中文：一种学习方法，使用少量有标签数据和大量无标签数据来构建更好的模型。

Stochastic Gradient Descent (SGD) (随机梯度下降)

English: A variant of gradient descent where parameters are updated for each training example (or a small batch) rather than the entire dataset, speeding up computations.
中文：梯度下降的一种变体，每个训练样本（或小批量）都会更新参数，而不是等待处理整个数据集，从而加快计算速度。
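
The per-example update can be illustrated by fitting a line with squared error. This is a minimal sketch: the function name, learning rate, epoch count, and toy data are assumptions chosen so the example converges quickly.

```python
import random

def sgd_linear_fit(xs, ys, learning_rate=0.05, epochs=500, seed=0):
    """Fit y = w * x + b, updating after every single training example."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error^2 with respect to w and b.
            w -= learning_rate * error * xs[i]
            b -= learning_rate * error
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = sgd_linear_fit(xs, ys)
```

On this noise-free data the estimates approach w = 2 and b = 1. On real data the per-example updates are noisier than full-batch updates, but each update is much cheaper.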

Supervised Learning (监督学习)

English: A machine learning task that uses labeled data to train models. The algorithm learns by comparing its predictions with the known ground truth.
中文：一种使用带标签数据来训练模型的机器学习任务。算法通过将预测结果与已知真实值进行比较来学习。

T

Tensor (张量)

English: A multi-dimensional array used in numerical computing frameworks like TensorFlow or PyTorch for representing data.
中文：一种多维数组，用于在诸如 TensorFlow 或 PyTorch 等数值计算框架中表示数据。

Transfer Learning (迁移学习)

English: A technique where a model trained on one task is repurposed on a second, related task, leveraging the learned features.
中文：将已经在某一任务上训练好的模型应用于另一个相关任务，并利用其已学到的特征的技术。

Transformer (Transformer模型)

English: A deep learning architecture primarily used in NLP (e.g., BERT, GPT). It relies on self-attention mechanisms rather than convolutions or recurrent modules.
中文：一种主要用于自然语言处理（如 BERT、GPT）领域的深度学习架构，依赖于自注意力机制，而不是卷积或循环结构。

Training (训练)

English: The process of feeding data to a model so it can learn the relationships or patterns within that data.
中文：将数据输入模型的过程，以便模型学习数据中的关系或模式。

U

Underfitting (欠拟合)

English: When a model is too simple or has not learned enough structure from the data, resulting in poor performance even on training data.
中文：模型过于简单或没有充分学习数据的结构，即使在训练数据上也表现不佳。

Unsupervised Learning (无监督学习)

English: A family of algorithms that discover patterns in data without labeled examples, such as clustering or dimensionality reduction.
中文：无需标签示例而在数据中发现模式的一类算法，如聚类或降维。

V

Validation Set (验证集)

English: A portion of data held out from training, used to tune model hyperparameters and to detect overfitting.
中文：从训练中留出的一部分数据，用来调节模型的超参数并检测过拟合。
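
Holding out a validation set can be as simple as shuffling the examples and slicing off a fraction. The function name, the 20% default, and the fixed seed are illustrative assumptions.

```python
import random

def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

train_set, validation_set = train_validation_split(list(range(10)))
print(len(train_set), len(validation_set))  # 8 2
```

The model is fitted only on `train_set`. Performance on `validation_set` then guides hyperparameter choices.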

Variance (方差)

English: How much a model’s predictions for a given point vary across different training sets. High variance may indicate the model is overfitting.
中文：模型对同一个数据点在不同训练集中预测结果的变化程度。高方差可能表明模型出现过拟合。

Vectorization (向量化)

English: Expressing computations (such as model operations) in terms of vectors and matrices for efficient parallel processing, often used in deep learning frameworks.
中文：以向量和矩阵的形式表达计算（如模型操作）以实现高效的并行处理，这在深度学习框架中常被使用。

W

Weight (权重)

English: A parameter in a neural network that is multiplied by an input value. Adjusting weights is central to training neural networks.
中文：神经网络中的一种参数，用于与输入值相乘。调整权重是训练神经网络的核心。

X

XAI (Explainable AI) (可解释的人工智能)

English: Methods and techniques focused on understanding and interpreting the decisions and inner workings of AI models, ensuring transparency.
中文：专注于理解和解释人工智能模型决策及其内部机制的方法和技术，以确保透明度。