## Saturday, October 29, 2011

### Next Week's Readings: Smart Grid and Propagators

On Monday we will be discussing Agent-Based Control for Decentralised Demand Side Management in the Smart Grid by Sarvapali D. Ramchurn, Perukrishnen Vytelingum, Alex Rogers, Nicholas R. Jennings from the AAMAS 2011 proceedings. You can follow that link to find all the other papers from that conference. Remember that one option for your final project is to implement and/or improve upon one of the algorithms in one of these papers so, start reading.

On Wednesday we will be discussing The Art of the Propagator by Sussman, Gerald Jay and Radul, Alexey. You will also want to view Sussman's talk on this topic.

No class on Friday.

## Tuesday, October 25, 2011

### HW4: Learning Correlated Equilibria

For this homework you will implement the learning algorithm in this paper by Cigler and Faltings. The algorithm in question allows the agents to reach a correlated equilibrium by using only a learning mechanism, that is, without the need for a separate coordination device.

The game is a broadcasting coordination game. There are N agents and C channels (both sliders), where N >= C. At each time step all agents choose a channel for broadcasting but, due to physical limitations, only one agent can actually transmit on each channel at a time. Thus, if an agent uniquely chooses a channel then he gets a utility of 1, if 2 or more agents choose the same channel they all receive a utility of 0.

You will first solve this problem by having all the agents implement the Q-learning algorithm. Note that there is only 1 state in this game, so the Q-learning will only happen over the actions of the players. Run it for a while and plot the sum of the utilities to see how well they do as a group.

Then, you will implement the algorithm from the paper (use the same NetLogo model, just add a toggle switch so the user can choose between the two). The algorithm is described in Section 2 and works roughly as follows (but, see the paper):

1. At each tick, there is a randomly generated integer (signal), from 1 to K, that all agents can observe. They use this signal in their learning.
2. Each agent keeps a table f(k), for k = 1..K and where f(k) is either the channel the agent will transmit if it sees signal k, or 0 if will not transmit at all when it sees that signal. Table f(k) is initialized to random channels.
3. At each tick, with signal k, the agent transmits on channel f(k). If f(k) = 0 then the agent chooses a random channel to monitor.
4. The agents get their utility based on the collisions. If the agent gets 0 utility then it sets f(k) = 0 with probability p (slider).
5. If the agent was monitoring channel c then if no one transmitted on it the agent will set f(k) = c.

The .nlogo model is due on Monday, November 21 @9am.