Erica Johnson, '03
Recognizing Useful Advice and Using it Efficiently In a Reinforcement Learning Framework.
Reinforcement learning is an artificial intelligence technique in which an artificial agent learns how to act by exploring its environment and analyzing the rewards it receives. The technique has drawbacks that make it difficult to apply in some real-world situations (e.g. the agent may injure itself while exploring and learning about its environment). Incorporating human advice has been proposed as a way of overcoming some of these drawbacks, but relatively little research has been done in this area. We created a classification scheme for advice in a reinforcement learning framework. Within this framework, we investigated the effects of positive advice of varying quality on a reinforcement learning agent in a standard reinforcement learning problem. We showed that "mediocre" advice -- advice that prevents the agent from receiving significant rewards, positive or negative -- keeps the agent safe at the expense of learning about its environment, and we devised and tested a method that allows the agent to recognize such advice and ignore it in order to learn on its own.
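The advice-filtering idea can be illustrated with a minimal Q-learning sketch. Everything here -- the toy chain environment, the advisor, the 50-step window, and the 0.05 threshold -- is invented for illustration and is not the thesis's actual method: the agent simply tracks the rewards it earns while following advice and stops trusting an advisor whose suggestions keep it pinned near zero reward.

```python
import random

random.seed(0)

# Toy chain: states 0..4; state 0 is a "cliff" (-1), state 4 a goal (+1).
def step(state, action):                      # action 1 = right, 0 = left
    nxt = max(0, min(4, state + (1 if action == 1 else -1)))
    reward = {0: -1.0, 4: 1.0}.get(nxt, 0.0)
    return nxt, reward

ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(5) for a in (0, 1)}

# Hypothetical "mediocre" advisor: it steers away from both the cliff
# and the goal, so advised actions always earn zero reward.
def advisor(state):
    return 0 if state >= 2 else 1

advised_rewards = []     # |reward| earned on steps where advice was followed
trust_advice = True

state = 2
for t in range(2000):
    if trust_advice and random.random() > EPS:
        action, followed = advisor(state), True
    else:
        action = max((0, 1), key=lambda a: Q[(state, a)])    # greedy
        if random.random() < EPS:
            action = random.choice((0, 1))                   # explore
        followed = False
    nxt, r = step(state, action)
    best_next = max(Q[(nxt, 0)], Q[(nxt, 1)])
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
    if followed:
        advised_rewards.append(abs(r))
        # Advice that keeps |reward| near zero is "mediocre": safe but
        # uninformative.  Once detected, stop following it.
        if len(advised_rewards) >= 50 and sum(advised_rewards[-50:]) / 50 < 0.05:
            trust_advice = False
    state = nxt if nxt not in (0, 4) else 2    # restart after terminal states
```

With this advisor the agent never sees a nonzero reward while obeying it; after 50 advised steps the running average trips the threshold, the advice is dropped, and ordinary epsilon-greedy Q-learning resumes on its own.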
Byron Boots, '03
"Robot Localization and Abstract Mapping in Dynamic Environments"
Byron Boots, '03
Chunking: A Modified Dynamic Programming Approach to Solving Stochastic Satisfiability Problems.
Stochastic Boolean satisfiability (SSAT) is a generalization of satisfiability (SAT) that is similar to quantified Boolean satisfiability (QSAT). The ordered variables of the Boolean formula in an SSAT problem, however, instead of being existentially or universally quantified as in a QSAT problem, are existentially or randomly quantified. A randomly quantified variable is True with a certain probability, and an SSAT instance is satisfiable with some probability that depends on the ordering of and interplay between the existential and randomized variables. The goal is to choose values for the existentially quantified variables that maximize the probability of satisfying the formula. In this work, Byron Boots and Professor Stephen Majercik explored a new technique for solving SSAT problems that works by breaking the problem into "chunks" according to the structure of the particular instance, partially solving those chunks, and using these partial solutions to construct the solution to the original problem. They continued to work on this project after Boots graduated in 2003; so far, their work has resulted in a paper ("DC-SSAT: A Divide-and-Conquer Approach to Solving Stochastic Satisfiability Problems Efficiently") that was accepted at the Twentieth National Conference on Artificial Intelligence (AAAI-05), one of the top two artificial intelligence conferences in the country. They are currently working on extending their results in preparation for a journal version of the paper.
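The SSAT semantics can be made concrete with a small brute-force evaluator. This sketch computes the same value an SSAT solver computes, but by naive enumeration over the quantifier prefix -- it omits the chunking optimization entirely, and the input conventions are invented for illustration:

```python
# An SSAT instance: a quantifier prefix plus a CNF formula.  Variables
# appear in quantifier order as ("E", name) for an existential choice
# or ("R", name, p) for a variable that is True with probability p.
# Clauses are lists of variable names; a "-" prefix means negation.

def ssat_value(prefix, clauses, assignment=None):
    """Maximum probability of satisfying the formula, by enumeration."""
    assignment = assignment or {}
    remaining = []
    for clause in clauses:
        lits = [(lit.lstrip("-"), not lit.startswith("-")) for lit in clause]
        if any(assignment.get(v) == sign for v, sign in lits if v in assignment):
            continue                       # clause already satisfied
        if not any(v not in assignment for v, _ in lits):
            return 0.0                     # clause fully falsified
        remaining.append(clause)
    if not remaining:
        return 1.0                         # all clauses satisfied
    if not prefix:
        return 0.0
    head, rest = prefix[0], prefix[1:]
    if head[0] == "E":                     # existential: take the better branch
        _, v = head
        return max(ssat_value(rest, remaining, {**assignment, v: b})
                   for b in (True, False))
    _, v, p = head                         # randomized: average over outcomes
    return (p * ssat_value(rest, remaining, {**assignment, v: True})
            + (1 - p) * ssat_value(rest, remaining, {**assignment, v: False}))
```

For example, with prefix "exists x, then y random with p = 0.5" and formula x AND y, setting x to True leaves the formula satisfied exactly when y comes up True, so the instance's value is 0.5.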
Andrew Rusczek, '02
Faster Probabilistic Planning Through More Efficient Stochastic Satisfiability Problem Encodings.
The propositional contingent planner ZANDER solves finite-horizon, partially observable, probabilistic planning problems at state-of-the-art speeds by converting the planning problem to a stochastic satisfiability (SSAT) problem and solving that problem instead. ZANDER obtains these results using a relatively inefficient SSAT encoding of the problem (a linear action encoding with classical frame axioms). In this work, Andrew Rusczek and Professor Stephen Majercik described and analyzed three alternative SSAT encodings for probabilistic planning problems: a linear action encoding with simple explanatory frame axioms, a linear action encoding with complex explanatory frame axioms, and a parallel action encoding. Rusczek developed and tested these encodings on a suite of planning problems; his results indicated that linear action encodings with simple explanatory frame axioms and parallel action encodings show particular promise, improving ZANDER's efficiency by as much as three orders of magnitude. This work has been published in the Proceedings of the Sixth International Conference on Artificial Intelligence Planning and Scheduling (AIPS-02), the leading international conference on artificial intelligence planning.
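The efficiency gap between the encodings can be sketched for the frame axioms, following the standard SAT-planning clause shapes (the toy domain and exact clause forms here are illustrative assumptions, not the thesis's encodings). Classical frame axioms need one clause per (action, untouched fluent) pair, while explanatory frame axioms need roughly one clause per fluent:

```python
# Hypothetical toy domain: fluents and actions with add/delete effects.
fluents = ["holding", "door_open", "light_on"]
actions = {
    "drop":       {"deletes": ["holding"], "adds": []},
    "close_door": {"deletes": ["door_open"], "adds": []},
    "toggle":     {"deletes": ["light_on"], "adds": ["light_on"]},
}

def classical_frame_axioms(t):
    # One clause per (action, untouched fluent):  a_t & f_t -> f_{t+1}
    clauses = []
    for a, eff in actions.items():
        for f in fluents:
            if f not in eff["deletes"] and f not in eff["adds"]:
                clauses.append([f"-{a}@{t}", f"-{f}@{t}", f"{f}@{t+1}"])
    return clauses

def explanatory_frame_axioms(t):
    # One clause per fluent:  f_t & -f_{t+1} -> (some deleter occurred).
    # (Only the positive-persistence direction is shown; a full encoding
    # has a symmetric clause for fluents that become true.)
    clauses = []
    for f in fluents:
        deleters = [a for a, eff in actions.items() if f in eff["deletes"]]
        clauses.append([f"-{f}@{t}", f"{f}@{t+1}"] + [f"{a}@{t}" for a in deleters])
    return clauses
```

Even in this three-fluent, three-action domain the classical encoding emits six frame clauses per time step to the explanatory encoding's three, and the gap widens as the action set grows.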
Joshua Peteet, '02
Incorporating Advice in a Reinforcement Learning Framework.
Douglas Vail '01
"Genetic Algorithms As A Search Strategy And A Novel Means Of Potential Function Discovery In The Protein Folding Problem"
Eric Forbell '00
"A Connectionist Model of Competition at the Phonological-lexical Interface during Speech Perception" (Joint honors in Computer Science and Neuroscience)
A connectionist architecture composed of Hebbian cell assemblies was developed and applied to the problem of speech recognition at the phonemic-lexical interface. Speech was encoded in the model as the sequential activation of phoneme representations connected to higher-level linguistic structures. An architectural decision concerning the spatial organization of the top-level lexical map was supported by psycholinguistic evidence suggesting a topological layout in which cognitive distance was related to initial phonological structure. Through computer simulation, lateral inhibition at this lexical level was shown to be a necessary and sufficient mechanism to replicate the findings of a series of lexical priming experiments. The results of these experiments provided added support for a more general theory of cognition based upon Hebb's original cell assembly hypothesis.
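The lateral-inhibition mechanism can be sketched abstractly. The parameters and update rule below are hypothetical, chosen only to show the competition effect at the lexical level, not the model's actual dynamics:

```python
# Each lexical unit receives fixed bottom-up support from its phonemes
# and is inhibited by the summed activity of its competitors.
INHIBITION, DECAY, RATE, STEPS = 0.4, 0.1, 0.1, 30

def compete(bottom_up):
    act = [0.0] * len(bottom_up)
    for _ in range(STEPS):
        total = sum(act)
        act = [max(0.0, a + RATE * (inp - DECAY * a - INHIBITION * (total - a)))
               for a, inp in zip(act, bottom_up)]
    return act

# Two word candidates sharing an initial phoneme: the better-supported
# candidate pulls ahead, and the inhibition amplifies its lead.
acts = compete([1.0, 0.8])
```

With equal bottom-up input the units tie indefinitely; any asymmetry in support is amplified by the mutual inhibition, which is the winner-take-all behavior the priming simulations above depend on.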
Anthony Roy '00
"An Adaptive Approach to Reinforcement Learning: Prospects and Initial Progress"
A generalized environment for testing reinforcement learning algorithms was created. This testbed, in turn, was used to compare several algorithms -- including Q-learning, prioritized sweeping, and Parr's hierarchical model -- across a range of different environments.
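The core of such a testbed is a fixed environment interface that any learning algorithm can plug into unchanged. A minimal sketch follows; the interface names, the toy chain environment, and the two sample policies are invented for illustration:

```python
import random

class ChainEnv:
    """A tiny example environment exposing the reset/step interface a
    generalized RL testbed would standardize."""
    def __init__(self, length=6):
        self.length = length
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):                     # 0 = left, 1 = right
        self.pos = max(0, min(self.length - 1,
                              self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length - 1      # reward only at the far end
        return self.pos, (1.0 if done else 0.0), done

def evaluate(env, policy, episodes=100, horizon=50):
    """Run any state -> action policy and report its average return.
    The same harness serves Q-learning, prioritized sweeping, etc."""
    total = 0.0
    for _ in range(episodes):
        state, steps = env.reset(), 0
        while steps < horizon:
            state, reward, done = env.step(policy(state))
            total += reward
            steps += 1
            if done:
                break
    return total / episodes

random.seed(0)
go_right = lambda s: 1                          # a directed policy
coin_flip = lambda s: random.choice((0, 1))     # a random-walk baseline
```

Because the harness only sees the reset/step interface and a state-to-action mapping, swapping in a different learner or a different environment requires no changes to the evaluation code itself.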
Nils Nieuwejaar '92
"Algorithmic Search for Extremal Graphs"
James Carenzo '93
"An Empirical Study of Hill-Tracking Heuristics for Combinatorial Search Problems"
Joshua Introne '93
"Explorations in Cognitive Modeling"
Andrew Kinley '93
"Computer Vision: A Comparison of Symbolic and Neural Network Approaches to Edge Detection"
Weihua Yan '93
"Statistical Methods for the Derivation of Part-of-Speech Tag Sets from an Unfamiliar Text Corpus"
Jem Lewis '95
"Using a Genetic Algorithm to Extend Transformation-Based Learning for Part-of-Speech Tagging"
Timothy Aron '96
"Monte Carlo Simulations of Phonon Imaging Experiments" (Joint honors in Computer Science and Physics)
Steven Deitz '98
"An Examination of the Insertion Methods for Approximately Solving the Traveling Salesman Problem"