AI Glossary: From Basic to Technical Terms

Basic Terms

Essential concepts for understanding the basics of AI.

Artificial Intelligence (AI): computer systems that can perform tasks that typically require human intelligence, such as learning, problem-solving, and decision-making

Machine Learning (ML): a subfield of AI where systems improve their performance on tasks by learning from data rather than being explicitly programmed

Natural Language Processing (NLP): a subfield of AI that focuses on allowing machines to understand, interpret, and generate human language

Generative AI (GenAI): a type of AI built on machine learning techniques that creates content like text, images, audio, and more, based on patterns learned from existing examples

Large Language Models (LLMs): generative AI models that use natural language processing techniques to generate human-like text based on patterns in the huge amounts of text they were trained on

Model: a program trained on large amounts of data that processes inputs and produces outputs based on learned patterns. LLMs are one type of AI model, but there are also models for image generation, speech recognition, and many other tasks.

Training Data: data such as text, images, videos, or audio that are used as examples to teach AI systems. The quality, quantity, and diversity of the training data significantly affect how the system performs: if the training data is bad, the outputs will be bad (garbage in, garbage out).

Chatbot: a computer program designed to simulate conversation with users

AI Agent: an autonomous or semi-autonomous AI system that can make decisions and take actions with some amount of independence. An example of an AI agent today is a virtual assistant that can book appointments.

Prompt: the input a user provides to an AI model to generate an output

Bias: a systematic skew in an AI system's outputs caused by skewed training data. Biased outputs can be false or offensive and can reproduce or amplify prejudices present in the training data. This is why critical evaluation of AI outputs is essential.

Hallucination: when an AI model generates plausible-sounding but factually incorrect or nonsensical information. This is why it is so important to always fact-check AI outputs, especially for academic or professional use.

Deepfakes: highly realistic fake images, audio, and videos generated using AI. These have serious implications for misinformation, privacy, and consent that make media literacy more important than ever.

Guardrails: mechanisms to filter the inputs or outputs of generative AI to ensure the model is used ethically and safely. Because much of the training data for LLMs comes from publicly available web content that can be biased, harmful, or inappropriate, guardrails are essential to prevent unsafe or harmful outputs.

Technical Terms

Terms for understanding more about how AI works.

Neural Network: a model loosely inspired by the human brain, made up of interconnected layers of nodes (like neurons in the brain) that work together to process and analyze complex data
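
As a rough illustration of "interconnected layers of nodes," the sketch below passes two input numbers through a hidden layer of three nodes and then a single output node. The weights, biases, and the names sigmoid and layer are made up for this example; a real network learns its weights from data and is far larger.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of all inputs, adds a bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Made-up weights for illustration; a real network learns these from data.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]  # 3 hidden nodes, 2 inputs each
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.7, -0.5, 0.2]]                       # 1 output node, 3 inputs
output_biases = [0.05]

inputs = [1.0, 2.0]
hidden = layer(inputs, hidden_weights, hidden_biases)   # input layer -> hidden layer
output = layer(hidden, output_weights, output_biases)   # hidden layer -> output layer
print(output)  # a list holding one number between 0 and 1
```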

Deep Learning: a subset of machine learning that uses multilayered neural networks, made up of an input layer, multiple hidden layers (sometimes hundreds), and an output layer

Supervised Learning: an approach to training AI models where the model learns from a set of labeled examples called a training set
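
A minimal sketch of learning from labeled examples, assuming one of the simplest possible methods (nearest neighbor): a new item gets the label of the most similar training example. The data, labels, and the function name predict are invented for illustration.

```python
# Labeled training set: (hours studied, hours slept) -> outcome.
# The numbers and labels are invented for illustration.
training_set = [
    ((1.0, 4.0), "fail"),
    ((2.0, 5.0), "fail"),
    ((6.0, 7.0), "pass"),
    ((8.0, 8.0), "pass"),
]

def squared_distance(a, b):
    """How far apart two points are (no square root needed for comparisons)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(point):
    """Return the label of the closest training example (1-nearest-neighbor)."""
    nearest_features, nearest_label = min(
        training_set, key=lambda example: squared_distance(example[0], point)
    )
    return nearest_label

print(predict((7.0, 6.0)))  # pass
print(predict((1.5, 4.0)))  # fail
```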

Unsupervised Learning: an approach to training AI models where the model is given unlabeled data and finds patterns in it on its own, such as clusters of similar items
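
To show what finding groups in unlabeled data can look like, here is a small sketch of k-means clustering with two groups. The function name two_means and the sample data are assumptions for this example, and the code expects at least two distinct values.

```python
def two_means(values, steps=10):
    """Split numbers into two clusters by repeatedly updating two centers (k-means, k=2)."""
    centers = [min(values), max(values)]  # simple starting guess
    clusters = [[], []]
    for _ in range(steps):
        clusters = [[], []]
        for v in values:
            # Assign each value to whichever center is closer.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Move each center to the average of its cluster.
        centers = [sum(c) / len(c) for c in clusters]
    return clusters

# Unlabeled data: nobody told the program that there are two groups here.
heights_cm = [150, 152, 155, 180, 183, 185]
print(two_means(heights_cm))  # [[150, 152, 155], [180, 183, 185]]
```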

Reinforcement Learning: an approach to training AI models where the model learns through trial and error: actions that lead to good outcomes are rewarded and actions that lead to bad outcomes are penalized
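
The trial-and-error loop can be sketched with a toy agent that chooses between two actions. Everything here is hypothetical: the actions "left" and "right", the reward of +1 or -1, the 10% exploration rate, and the update step of 0.5.

```python
import random

rng = random.Random(0)                    # fixed seed so the run is repeatable
rewards = {"left": -1.0, "right": 1.0}    # hypothetical environment: "right" is the correct action
values = {"left": 0.0, "right": 0.0}      # the agent's current estimate of each action's payoff

for step in range(100):
    if rng.random() < 0.1:
        action = rng.choice(["left", "right"])   # occasionally explore at random
    else:
        action = max(values, key=values.get)     # otherwise pick the best-looking action
    reward = rewards[action]                      # reward or penalty from the environment
    values[action] += 0.5 * (reward - values[action])  # nudge the estimate toward what happened

print(max(values, key=values.get))  # right
```

After enough trials the agent's estimate for "right" is higher than for "left", so it prefers the rewarded action without ever being told the rule directly.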

Attention: a mechanism that allows AI models to focus on the most relevant parts of an input when processing information. This is one of the most important breakthroughs in AI as it allows models to capture relationships between words regardless of distance, resulting in the ability to understand context and write coherently.
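
One widely used form of attention is scaled dot-product attention. The sketch below applies it to a single word: it scores how well the word's "query" matches each other word's "key," turns the scores into weights that sum to 1, and blends the "value" vectors accordingly. The tiny vectors are invented for illustration; real models learn them and use far more dimensions.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)  # how much to focus on each position
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output

# Toy 2-number vectors for three words; real models learn these.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.0]  # matches the first and third keys better than the second

weights, output = attend(query, keys, values)
print([round(w, 2) for w in weights])  # the second word receives the least attention
```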

Transformers: a type of architecture in deep learning that uses attention to process entire sequences of data (like a sentence or paragraph) simultaneously. This is the “T” in GPT (Generative Pre-trained Transformer), as in ChatGPT, and the foundation of most of the AI models you interact with today.

Foundation Model: an AI model that is pre-trained on large amounts of data to perform a wide range of tasks and can then be fine-tuned for specific tasks. This accelerates the development of AI, as developers can adapt foundation models rather than starting from scratch.

Fine-tuning: the process of taking a pre-trained AI model and training it on a smaller, specific dataset for use on a certain task

Reinforcement Learning from Human Feedback (RLHF): a technique used to fine-tune a pre-trained model through human feedback on the model’s outputs. This process teaches AI systems to respond in ways that align with human preferences and values.

Retrieval-Augmented Generation (RAG): a method where the AI model retrieves information from external sources like documents, PDFs, and other user-uploaded material to help generate responses. This is what allows you to get AI assistance on your own research materials that were not included in the model’s training data.
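
The retrieve-then-generate idea can be sketched in two steps: find the most relevant source text, then place it in the prompt alongside the question. This example assumes a very crude retrieval method (counting shared words); real systems typically use embeddings and a search index. The documents, the prompt wording, and the function names retrieve and build_prompt are hypothetical.

```python
def retrieve(question, documents):
    """Return the document sharing the most words with the question (a stand-in for real search)."""
    question_words = set(question.lower().split())
    return max(documents, key=lambda doc: len(question_words & set(doc.lower().split())))

def build_prompt(question, documents):
    """Place the retrieved text in front of the question before sending it to a model."""
    context = retrieve(question, documents)
    return f"Use this source to answer.\nSource: {context}\nQuestion: {question}"

# Hypothetical uploaded notes.
documents = [
    "The library opens at 9 am on weekdays.",
    "Lab reports are due every Friday at noon.",
]
prompt = build_prompt("When are lab reports due?", documents)
print(prompt)
```

The resulting prompt contains the lab report note, so a model could answer from that text even though it never appeared in its training data.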

Parameters: the internal variables of an AI model learned from training data. While the companies behind proprietary models like ChatGPT and Claude do not publicly share parameter counts, developers of openly released models often do. According to Meta, Llama 4 Scout uses 17 billion active parameters (the ones used for any given input) out of 109 billion in total.
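
To make parameter counts concrete, the sketch below counts the weights and biases in a small fully connected neural network. The function name count_parameters and the layer sizes are made up for illustration; large models combine many more layers of several different types.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of nodes per layer, e.g. [2, 3, 1].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per node in the receiving layer
    return total

print(count_parameters([2, 3, 1]))  # 13
```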

Temperature: a setting that controls the randomness of a model’s responses: the higher the temperature value, the more random and unpredictable the outputs. Unlike the learned parameters of a model, it is chosen by the user or developer.
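
A common way temperature is applied, sketched here under the assumption that the model produces a raw score for each candidate next word: the scores are divided by the temperature before being converted to probabilities. The words, scores, and the function name next_token_probabilities are hypothetical.

```python
import math

def next_token_probabilities(scores, temperature):
    """Convert raw model scores into probabilities, scaled by temperature."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
words = ["cat", "dog", "pizza"]
scores = [2.0, 1.0, 0.0]

for t in (0.5, 1.0, 2.0):
    probs = next_token_probabilities(scores, t)
    print(t, [round(p, 2) for p in probs])
```

At a low temperature nearly all the probability goes to the top-scoring word; at a high temperature the probabilities even out, so unlikely words are picked more often.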

Embeddings: representations of words or phrases as vectors in high-dimensional space where the location and distance between words indicates their semantic similarity. This allows AI models to recognize synonyms, analogies, and subtler aspects of language like sentiment or tone.
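
Semantic similarity between embeddings is often measured with cosine similarity, which compares the directions of two vectors. The three-number vectors below are invented for illustration; real embeddings are produced by a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Close to 1.0 means very similar direction; close to 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented embeddings for three words.
embeddings = {
    "happy":  [0.9, 0.1, 0.0],
    "joyful": [0.8, 0.2, 0.1],
    "table":  [0.0, 0.1, 0.9],
}

print(cosine_similarity(embeddings["happy"], embeddings["joyful"]))  # close to 1
print(cosine_similarity(embeddings["happy"], embeddings["table"]))   # close to 0
```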

Token: the smallest unit of text an AI model processes. In English, a token is typically about 4 characters, or about ¾ of a word. Because models process text as tokens, AI usage limits are often defined in terms of tokens.
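
The rule of thumb above can be turned into a quick estimate. This is only an approximation under that rule; actual counts depend on the tokenizer each model uses, and the function names here are made up for the example.

```python
def estimate_tokens(text):
    """Rough token estimate: about 4 characters per token in English."""
    return max(1, round(len(text) / 4))

def estimate_words(token_count):
    """Rough word estimate: about 3/4 of a word per token."""
    return round(token_count * 0.75)

print(estimate_tokens("a" * 400))  # 100
print(estimate_words(1000))        # 750
```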

Context Window: the maximum number of tokens that an AI model can process at once when generating a response. This is essentially the “memory” capacity of an AI model during an interaction. The larger the context window, the more information the model can “remember” while responding to prompts.
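
One simple way an application might keep a long conversation within the context window is to drop the oldest messages first. The sketch below assumes that strategy and a rough estimate of about 4 characters per token; the function name fit_to_context and the sample messages are hypothetical, and real applications may summarize older messages instead.

```python
def fit_to_context(messages, max_tokens):
    """Keep the most recent messages that fit within a token budget."""
    kept = []
    used = 0
    for message in reversed(messages):      # start from the newest message
        cost = max(1, len(message) // 4)    # rough estimate: about 4 characters per token
        if used + cost > max_tokens:
            break                           # older messages no longer fit
        kept.append(message)
        used += cost
    return list(reversed(kept))             # restore chronological order

conversation = ["a" * 40, "b" * 40, "c" * 40]   # three messages of about 10 tokens each
print(len(fit_to_context(conversation, 25)))    # 2
```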

Application Programming Interface (API): an interface that allows developers to access the functionality of AI models and build it into their own applications

References

Definitions in this glossary have been adapted from the following sources.

Attewell, Sue. “HE Generative AI Literacy Definition.” Jisc AI in Universities and Colleges, July 23, 2024. https://nationalcentreforai.jiscinvolve.org/wp/2024/07/23/he-generative-ai-literacy-definition/.

Center for Teaching, Learning, and Technology, University of British Columbia. “Glossary of GenAI Terms.” Accessed July 21, 2025. https://ai.ctlt.ubc.ca/resources/glossary-of-genai-terms/.

MIT Sloan Teaching & Learning Technologies. “Glossary of Terms: Generative AI Basics.” Accessed July 21, 2025. https://mitsloanedtech.mit.edu/ai/basics/glossary/.

Stryker, Cole, and Eda Kavlakoglu. “What Is Artificial Intelligence (AI)?” IBM, August 9, 2024. https://www.ibm.com/think/topics/artificial-intelligence.

September 2025