CSE 516 — Multi-Agent Systems

1) LAX (Security Screening) Example

Targets: $T$, $|T|=n$
Defender chooses screening probabilities $q_t\in[0,1]$ for each target $t$
$V_t$: attacker payoff from a successful attack on target $t$
Budget: $\sum_{t\in T} q_t \le B$
Attacker observes $q=(q_1,\dots,q_n)$ and best-responds.

Defender objective (minimize worst-case attacker payoff): $$ \min_{q}\ \max_{t\in T} (1-q_t)V_t \quad\text{s.t.}\quad \sum_{t} q_t\le B,\ 0\le q_t\le 1. $$

Introduce an epigraph variable $v$ for the max: $$ \min_{q,v}\ v \quad\text{s.t.}\quad v\ge (1-q_t)V_t\ \forall t,\ \sum_t q_t\le B,\ 0\le q_t\le 1. $$ This is a linear program in $(q,v)$.

2) Linear Programs (LP)

Canonical inequality form

$$ \min_x\ c^\top x \quad \text{s.t.}\quad Ax\le b. $$ Feasible set is a polyhedron $P=\lbrace x:Ax\le b\rbrace $.

Standard form

\[ \min_x\ c^\top x \quad \text{s.t.}\quad Ax=b,\ x\ge 0. \]

Slack variables convert inequalities to equalities:
If $a_i^\top x\le b_i$, write $a_i^\top x + s_i = b_i$ with $s_i\ge 0$.

3) Bases, BFS, and Simplex Geometry

Assume standard form with $A\in\mathbb{R}^{m\times n}$ (often $n\gg m$).

Pick a basis $Q\subseteq\lbrace 1,\dots,n\rbrace $ of size $m$ s.t. columns $A_Q$ are linearly independent. - Let $B:=A_Q\in\mathbb{R}^{m\times m}$, and $A=(B,N)$. - Split $x=(x_B,x_N)$.

Setting nonbasic $x_N=0$ gives the basic solution $$ x_B=B^{-1}b,\qquad x_N=0. $$ If $x_B\ge 0$, this is a basic feasible solution (BFS).

4) Reduced Costs and Optimality (Simplex Test)

Given a basis $B$ and its cost vector $c_B$, define for any variable/column $j$: $$ \bar c_j \;=\; c_j - c_B^\top B^{-1}A_j. $$

For a minimization LP in standard form:
If $x_B=B^{-1}b\ge 0$ and $\bar c_j\ge 0\ \forall j$, then the BFS is optimal.

5) ILP / MILP and LP Relaxation

A typical mixed-integer LP: $$ \min_{x,y}\ c^\top x + d^\top y \quad\text{s.t.}\quad Ax + By \le b,\ x\in\mathbb{R}^n,\ y\in\lbrace 0,1\rbrace ^k. $$

LP relaxation: replace integrality by box constraints: $$ y\in[0,1]^k. $$ - For minimization: LP relaxation gives a lower bound on the MILP optimum. - If the relaxed optimum is integral, it is MILP-optimal.

Knapsack (ILP)

Items $k=1,\dots,K$ with value $v_k$, weight $w_k$, capacity $W$: $$ \max_{x\in\lbrace 0,1\rbrace ^K}\ v^\top x \quad\text{s.t.}\quad w^\top x \le W. $$ LP relaxation: $x\in[0,1]^K$ (fractional knapsack).

6) LP Duality (general rules)

Standard form dual pair (most used)

Primal (P): $$ \min_x\ c^\top x \quad \text{s.t.}\quad Ax=b,\ x\ge 0. $$ Dual (D): $$ \max_p\ b^\top p \quad \text{s.t.}\quad A^\top p \le c,\ p\ \text{free}. $$

Duality theorems

Weak duality: if $x$ feasible for (P) and $p$ feasible for (D), then $$ b^\top p \le c^\top x. $$
Strong duality: if both have optimal solutions, then for optima $(x^\star,p^\star)$, $$ b^\top p^\star = c^\top x^\star. $$
Dual of the dual: returns the primal (up to equivalent reformulation).

7) Complementary Slackness

For primal (P): $\min c^\top x$ s.t. $Ax\ge b,\ x\ge 0$
Dual (D): $\max b^\top p$ s.t. $A^\top p \le c,\ p\ge 0$

Complementary slackness conditions: - For each primal constraint $i$: $$ p_i\,(a_i^\top x - b_i)=0. $$ - For each primal variable $j$: $$ \bigl(c_j - p^\top A_j\bigr)\,x_j=0. $$

Interpretation:
- Either constraint $i$ is tight $(a_i^\top x=b_i)$ or the dual variable $p_i$ is zero.
- Either variable $x_j$ is positive or the corresponding dual inequality is tight $(p^\top A_j=c_j)$.

Reduced costs = dual slack (simplex ↔ duality)

If you define $$ p^\top := c_B^\top B^{-1}, $$ then $$ c_j - p^\top A_j \;=\; c_j - c_B^\top B^{-1}A_j \;=\; \bar c_j, $$ so complementary slackness is literally “$x_j \bar c_j = 0$” at optimum.

8) Normal-Form Games (new material from the photos)

A (finite) normal-form game is: $$ \Gamma = \bigl(I,\ \lbrace A_i\rbrace {i\in I},\ \lbrace u_i(\cdot)\rbrace \bigr), $$ where: - $I=\lbrace 1,\dots,n\rbrace $ is the set of players, $n=|I|$ - $A_i$ is player $i$’s action set (pure strategies) - $u_i: A \to \mathbb{R}$ is player $i$’s utility/payoff function - The joint action space is the Cartesian product: $$ A = A_1\times A_2\times\cdots\times A_n. $$

A joint action is $a=(a_1,\dots,a_n)\in A$.
Write: - $a_{-i}=(a_1,\dots,a_{i-1},a_{i+1},\dots,a_n)$ - $a=(a_i,a_{-i})$ - $u_i(a)=u_i(a_i,a_{-i})$

9) Mixed Strategies

A mixed strategy for player $i$ is a distribution over $A_i$: $$ s_i: A_i \to [0,1],\qquad s_i(a_i)=\Pr(a_i). $$ The mixed-strategy space is the simplex: $$ S_i = \Delta(A_i) = \left\lbrace s_i:\ \sum_{a_i\in A_i} s_i(a_i)=1,\ s_i(a_i)\ge 0\right\rbrace . $$

A strategy profile is: $$ s=(s_1,\dots,s_n)\in S,\qquad S=S_1\times\cdots\times S_n, $$ and similarly $s=(s_i,s_{-i})$.

Under independent mixing, the probability of a joint action $a\in A$ is: $$ s(a)=\prod_{i=1}^n s_i(a_i). $$

10) Expected Utility Under Mixed Strategies

\[ u_i(s) \;=\; \mathbb{E}_{a\sim s}[u_i(a)] \;=\; \sum_{a\in A} s(a)\,u_i(a) \;=\; \sum_{a\in A}\left(\prod_{j=1}^n s_j(a_j)\right)u_i(a). \]

11) Nash Equilibrium (NE)

A strategy profile $s\in S$ is a Nash equilibrium if no player can improve by a unilateral deviation: $$ \forall i\in I,\ \forall s_i'\in S_i:\quad u_i(s_i',s_{-i}) \le u_i(s). $$

Equivalent “check only pure deviations” form (finite games): $$ \forall i\in I,\ \forall a_i'\in A_i:\quad u_i(a_i',s_{-i}) \le u_i(s). $$

12) Best Responses

Player $i$’s best response to others’ strategies $s_{-i}$ is any $s_i^\star\in S_i$ such that: $$ u_i(s_i^\star,s_{-i}) \;=\; \max_{s_i\in S_i} u_i(s_i,s_{-i}). $$

The best response correspondence: $$ BR_i(s_{-i}) := \arg\max_{s_i\in S_i} u_i(s_i,s_{-i}). $$

Then $s$ is a Nash equilibrium iff: $$ \forall i,\ s_i \in BR_i(s_{-i}). $$

13) Where LP connects back to games (context)

For zero-sum matrix games, computing an optimal mixed strategy is an LP, and the opposing player’s LP is the dual.
Strong duality corresponds to the minimax equality (game value).

Quick sanity map (what you should remember)

LP dual variables $p$ act like “shadow prices.”
Complementary slackness tells you exactly which constraints/variables are active at optimum.
In games, mixed strategies live on simplices $\Delta(A_i)$, and NE is “no profitable unilateral deviation.”
Best responses are argmax sets; NE is a fixed point of the best-response correspondences.