Early artificial intelligence (AI) research typically assumed that the agent always has perfect knowledge of its environment and knows the effects of its actions with certainty. This assumption is, of course, unrealistic in many real-world situations. An agent may be uncertain about its initial conditions, its actions may not always produce their intended effects, and, because its observations may be incomplete and/or inaccurate, it may not always know what state it is in once it starts acting. Much work has been done during the past decade to develop techniques for planning, reasoning, and learning that can operate effectively under such conditions.
The courses taught in the Bowdoin Computer Science department are aligned with Assistant Professor Stephen Majercik's research.
Stephen Majercik has developed a new approach to efficiently solve planning problems that have the following characteristics:
1) goal-oriented: the agent is trying to reach one of a set of goal states from its initial state,
2) finite-horizon: the agent has a limited amount of time in which to act,
3) probabilistic: the effects of actions are uncertain, but a probability distribution over possible effects is known for each action,
4) partially observable: observations may be incomplete and/or inaccurate, and
5) contingent: the optimal plan requires that actions be taken contingent on previous observations.
The solution to such a problem is a contingent plan that reaches a goal state with the highest possible probability.

His approach, implemented in the planner Zander, is based on the fact that this type of planning problem can be converted into an instance of a stochastic satisfiability (SSAT) problem, such that the solution to the SSAT instance yields a plan that will reach the goal with the highest probability. Zander is competitive with many other techniques currently being developed for this type of planning.
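As a rough illustration of what an SSAT instance looks like, the sketch below evaluates a tiny one by brute force. It is not Zander's algorithm or its encoding of planning problems, neither of which is described here; the function names (`satisfied`, `ssat_value`), the tuple-based representation, and the two-variable example are all assumptions made for illustration. The idea it shows is the standard one: variables are quantified in order, an existential variable is set to whichever value gives the higher probability of satisfaction, and a randomized variable is true with a given probability.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding (not Zander's): each prefix entry is
# (variable name, kind, probability of True), where kind is
# "E" (existential: the planner chooses) or "R" (randomized: chance decides).
Prefix = List[Tuple[str, str, float]]
# A CNF formula is a list of clauses; a clause is a list of (variable, polarity) literals.
Clause = List[Tuple[str, bool]]


def satisfied(clauses: List[Clause], assignment: Dict[str, bool]) -> bool:
    """True if every clause has at least one literal that matches the assignment."""
    return all(any(assignment[var] == pol for var, pol in clause) for clause in clauses)


def ssat_value(prefix: Prefix, clauses: List[Clause],
               assignment: Optional[Dict[str, bool]] = None) -> float:
    """Maximum probability of satisfying the formula, by exhaustive enumeration."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(prefix):
        return 1.0 if satisfied(clauses, assignment) else 0.0
    var, kind, p_true = prefix[len(assignment)]
    v_true = ssat_value(prefix, clauses, {**assignment, var: True})
    v_false = ssat_value(prefix, clauses, {**assignment, var: False})
    if kind == "E":
        return max(v_true, v_false)                    # choose the better value
    return p_true * v_true + (1.0 - p_true) * v_false  # average over chance


# Formula: z must equal y, written as (not y or z) and (y or not z).
clauses = [[("y", False), ("z", True)], [("y", True), ("z", False)]]

# Chance variable first: z is chosen after y is known (a contingent choice).
observe_then_choose = [("y", "R", 0.5), ("z", "E", 0.0)]
# Choice first: z must be fixed before y is known.
choose_blind = [("z", "E", 0.0), ("y", "R", 0.5)]

print(ssat_value(observe_then_choose, clauses))  # 1.0
print(ssat_value(choose_blind, clauses))         # 0.5
```

The two orderings differ only in whether the choice is made before or after the chance outcome is known, which mirrors the role of contingency in the planning problems described above. Exhaustive enumeration takes time exponential in the number of variables, so this is only suitable for toy instances.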
Professor Majercik is also interested in human-computer collaboration to facilitate learning under uncertainty. There are significant benefits to be gained from combining the strong points of humans (e.g. an ability to focus the search for a solution in interesting areas of the solution space) with those of computers (e.g. enormous speed in generating and evaluating potential solutions). Currently, he is exploring the potential benefit of incorporating human advice in a reinforcement learning framework. In reinforcement learning, the agent learns from the results of its actions as it explores its environment. This type of learning can be time-consuming, since the results of actions are often not immediately clear, and dangerous, since an agent can harm, or even destroy, itself in the process of learning. Negative advice, i.e. advice about what not to do, has the potential to both speed up the learning process and make it safer, but there are many issues that need to be addressed in order to effectively incorporate such advice.
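One simple way to picture negative advice is as a set of (state, action) pairs that the learner is never allowed to try. The sketch below applies that idea to ordinary tabular Q-learning on a made-up five-cell corridor with a pit at one end and a goal at the other. It is a minimal illustration under stated assumptions, not Professor Majercik's method: the environment, the `train` and `allowed_actions` functions, the hyperparameters, and the treatment of advice as a hard constraint are all hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (-1, +1)          # step left or right along a corridor
PIT, START, GOAL = 0, 2, 4  # hypothetical layout: pit at one end, goal at the other


def allowed_actions(state, forbidden):
    """Actions left after removing those the advisor has ruled out in this state."""
    return [a for a in ACTIONS if (state, a) not in forbidden]


def train(forbidden=frozenset(), episodes=200, alpha=0.5, gamma=0.9,
          epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning; returns (Q-table, number of times the agent fell in the pit)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    pit_falls = 0
    for _episode in range(episodes):
        state = START
        for _step in range(max_steps):
            options = allowed_actions(state, forbidden)
            if rng.random() < epsilon:
                action = rng.choice(options)  # explore, but only among permitted actions
            else:
                best = max(q[(state, a)] for a in options)
                action = rng.choice([a for a in options if q[(state, a)] == best])
            next_state = state + action
            if next_state == PIT:
                reward, done = -1.0, True
                pit_falls += 1
            elif next_state == GOAL:
                reward, done = 1.0, True
            else:
                reward, done = 0.0, False
            # Value of the best permitted action in the next state (0 if the episode ended).
            future = 0.0 if done else max(
                q[(next_state, a)] for a in allowed_actions(next_state, forbidden))
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            if done:
                break
            state = next_state
    return q, pit_falls


advice = frozenset({(1, -1)})  # negative advice: "in state 1, do not step left"
_, falls_without = train()
_, falls_with = train(forbidden=advice)
print("pit falls without advice:", falls_without)
print("pit falls with advice:", falls_with)
```

With the advice in place, the only transition into the pit is never available, so the agent cannot fall in during learning. Treating advice as a hard mask sidesteps several of the issues mentioned above, such as advice that is wrong, incomplete, or only approximately applicable, so a practical system would likely need something more flexible.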