GAME THEORY STRATEGIC DECISION MAKING

What is Game Theory?
Game theory (GT) is an analytical tool for the social sciences that is used to model strategic interactions or conflict situations.
Strategic interaction: when the actions of one player influence the payoffs to other players.

GT: science or art?
GT is the science of rational behavior in interactive situations.
Good strategists mix the science of GT with their own experience.

How to use GT
Explanation: What is the game to be played?
Prediction: What outcome will prevail?
Advice or prescription: Which strategies are likely to yield good results in which situations?

Why is GT important?
Facilitates strategic thinking.
Provides a standard taxonomy that is needed for a scientific approach to analyzing strategic interactions.
Helps confirm long-held beliefs.
Provides new insights.
To be literate in the modern age, you need to have a general understanding of GT. (P. Samuelson)

GT in the News
GT, long an intellectual pastime, came into its own as a business tool. Forbes, July 3, 1995.
FCC hired game theorists to construct rules of an auction for new wireless phone systems licenses. In response, communications companies hired game theorists first to negotiate with FCC and then to help prepare optimal bids given the rules of the auction. Business Week, Mar. 14, 1994.
GT is hot. The Wall Street Journal, Feb. 13, 1995.

GT in the News (cont.)
Lately game theorists have focused on real-world issues: how to raise auction proceeds by revealing the bids and how Wal-Mart can coexist with local retailers. Consultants are jumping in. McKinsey & Co. has set up a game theory unit. Forbes, Nov. 7, 1994.

GT in the News (cont.)
Get In The Game: Enhanced Scheduling System Increases Business Profitability
Accurate scheduling of people, products and equipment can mean the difference between profit and loss for a business. Most scheduling approaches assume that there is a single decision maker that has complete information about the activity, but this is rarely the case. University of Arkansas researcher Erhan Kutanoglu has developed an approach using incentive-compatible scheduling and game theory to ensure selection of the most profitable schedule. University of Arkansas News Release, May 2000.

Where can we use GT?
Any situation that requires us to anticipate our rival's response to our action is a potential context for GT.
Games: checkers, poker, chess, tennis, soccer, etc.
Economics: industrial organization, micro/macro/international/labor/natural resource economics, and public finance
Political science: war/peace (Cuban missile crisis)
Law: designing laws that work
Biology: animal behavior, evolution
Information systems: system competition/evolution

Where can we use GT? (cont.)
Business:
Games against rival firms: pricing, advertising, marketing, auctions, R&D, joint ventures, investment, location, quality, takeovers, etc.
Games against other players: employee/employer, managers/stockholders, supplier/buyer, producer/distributor, firm/government

Some Terminology
Strategy
Payoffs
Rationality
Common knowledge of rules
Equilibrium

Strategic (Normal) Form Games
Static Games of Complete and Imperfect Information

What is a Normal Form Game?
A normal (strategic) form game consists of:
Players: the list of players
Strategies: all actions available to each player
Payoffs: a payoff assigned to every contingency (every possible strategy profile as the outcome of the game)

Prisoners' Dilemma
Two suspects are caught and put in different rooms (no communication). They are offered the following deal:
If both of you confess, you will both get 5 years in prison (payoff -5 each).
If one of you confesses and the other does not, the one who confesses gets 0 years (payoff 0) and the other gets 10 years in prison (payoff -10).
If neither of you confesses, you will both get 2 years in prison (payoff -2 each).

Easy-to-Read Format of the Prisoners' Dilemma
(payoffs listed as Prisoner 1, Prisoner 2)

                                  Prisoner 2
                             Confess    Don't Confess
Prisoner 1   Confess         -5, -5       0, -10
             Don't Confess   -10, 0      -2, -2

Assumptions in Static Normal Form Games
All players are rational.
Rationality is common knowledge.
Players move simultaneously. (They do not know what the other player has chosen.)
Players have complete but imperfect information.

Solution of a Static Normal Form Game
Equilibrium in strictly dominant strategies
A strictly dominant strategy is one that yields a strictly higher payoff than each of the player's other strategies, whatever the other players choose.
Rational players will always play their strictly dominant strategies.

Solution of a Static Normal Form Game
Iterated elimination of strictly dominated strategies
Rational players will never play their dominated strategies.
Eliminating dominated strategies may solve the game.

Solution of a Static Normal Form Game (cont.)
Nash Equilibrium (NE):
In equilibrium neither player has an incentive to deviate from his/her strategy, given the equilibrium strategies of rival players.
Neither player can unilaterally change his/her strategy and increase his/her payoff, given the strategies of other players.

Solution of Prisoners' Dilemma: Dominant Strategy Equilibrium

                                  Prisoner 2
                             Confess    Don't Confess
Prisoner 1   Confess         -5, -5       0, -10
             Don't Confess   -10, 0      -2, -2

Confess is strictly dominant for each prisoner: -5 > -10 if the other confesses, and 0 > -2 if the other does not.

Solution of Prisoners' Dilemma: Iterated Elimination Procedure
In the same payoff matrix, Don't Confess is strictly dominated by Confess for each prisoner. Eliminating it for both leaves the single cell (Confess, Confess).

Solution of Prisoners' Dilemma: Cell-by-cell Inspection
In the same payoff matrix, check each cell for a profitable unilateral deviation. Only (Confess, Confess) survives: from any other cell, a prisoner playing Don't Confess gains by switching to Confess.

NE of Prisoners' Dilemma
The strategy profile {confess, confess} is the unique pure strategy NE of the game.
In equilibrium both players get a payoff of -5.
Inefficient equilibrium: (don't confess, don't confess) yields higher payoffs for both.
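The cell-by-cell inspection can be sketched in a few lines of Python. This is only an illustration: the function name `pure_nash_equilibria` and the dictionary layout are assumptions made here, while the payoffs are the ones from the slides.

```python
from itertools import product

# Prisoners' Dilemma payoffs from the slides: (Prisoner 1, Prisoner 2).
STRATEGIES = ["Confess", "Don't Confess"]
PAYOFFS = {
    ("Confess", "Confess"): (-5, -5),
    ("Confess", "Don't Confess"): (0, -10),
    ("Don't Confess", "Confess"): (-10, 0),
    ("Don't Confess", "Don't Confess"): (-2, -2),
}


def pure_nash_equilibria(strategies, payoffs):
    """Return every cell from which neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(strategies, repeat=2):
        u1, u2 = payoffs[(row, col)]
        # Row player: no other row does better against this column.
        row_ok = all(payoffs[(r, col)][0] <= u1 for r in strategies)
        # Column player: no other column does better against this row.
        col_ok = all(payoffs[(row, c)][1] <= u2 for c in strategies)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


print(pure_nash_equilibria(STRATEGIES, PAYOFFS))
```

The same function applies to any two-player game in which both players share the same strategy labels, for example Matching Pennies, where it returns an empty list.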

A Pricing Example
(payoffs listed as Firm 1, Firm 2)

                             Firm 2
                      High Price    Low Price
Firm 1   High Price    100, 100     -10, 140
         Low Price     140, -10       0, 0

3x3 Game Using Iterated Elimination
(payoffs listed as Player 1, Player 2)

                           Player 2
                    Left     Center    Right
Player 1   Top      1, 0     1, 3      3, 0
           Middle   0, 2     0, 1      3, 0
           Bottom   0, 2     2, 4      5, 3
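A minimal sketch of iterated elimination of strictly dominated strategies, applied to the 3x3 game above. The function name and the nested-dictionary payoff layout are assumptions of this sketch, and only domination by pure strategies is checked.

```python
ROWS = ["Top", "Middle", "Bottom"]
COLS = ["Left", "Center", "Right"]
# U1[row][col] is Player 1's payoff, U2[row][col] is Player 2's payoff.
U1 = {
    "Top": {"Left": 1, "Center": 1, "Right": 3},
    "Middle": {"Left": 0, "Center": 0, "Right": 3},
    "Bottom": {"Left": 0, "Center": 2, "Right": 5},
}
U2 = {
    "Top": {"Left": 0, "Center": 3, "Right": 0},
    "Middle": {"Left": 2, "Center": 1, "Right": 0},
    "Bottom": {"Left": 2, "Center": 4, "Right": 3},
}


def iterated_elimination(rows, cols, u1, u2):
    """Repeatedly delete strategies strictly dominated by another pure strategy."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows:
            # r is dominated if some other row d beats it in every remaining column.
            if any(all(u1[d][c] > u1[r][c] for c in cols) for d in rows if d != r):
                rows.remove(r)
                changed = True
                break
        for c in cols:
            # c is dominated if some other column d beats it in every remaining row.
            if any(all(u2[r][d] > u2[r][c] for r in rows) for d in cols if d != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols


print(iterated_elimination(ROWS, COLS, U1, U2))
```

For this game the procedure removes Right, then Middle and Left, then Top, leaving (Bottom, Center) with payoffs (2, 4).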

A Coordination Game: Battle of the Sexes
(payoffs listed as row player, Husband)

                 Husband
             Opera    Movie
Opera        2, 1     0, 0
Movie        0, 0     1, 2

Battle of the Sexes: After 30 Years of Marriage
(payoffs listed as row player, Husband)

                 Husband
             Opera    Movie
Opera        3, 2     0, 0
Movie        0, 0     1, 2

A Strictly Competitive Game: Matching Pennies
(payoffs listed as row player, Player 2)

                 Player 2
             Heads     Tails
Heads        1, -1    -1, 1
Tails       -1, 1      1, -1

No NE in pure strategies.

International Investment Game
Three Turkish firms are investing in a Turkic Republic.
A new law is being debated in the Turkic Republic, and they all want the law to be favorable for Turkish firms.
The president is very powerful. He promises to match the total donation made to a state university in terms of favorable tax cuts for Turkish firms.
The three firms have to decide whether to contribute or not. The more they contribute, the more favorable the law.

International Investment Game
Firm 1 is the row player. Firm 2 is the column player. Firm 3 is the page player. Payoffs are listed as (Firm 1, Firm 2, Firm 3).

Firm 3 Donates
                       Firm 2
                  Donate       Don't
Firm 1   Donate   5, 5, 5      3, 6, 3
         Don't    6, 3, 3      4, 4, 1

Firm 3 does not Donate
                       Firm 2
                  Donate       Don't
Firm 1   Donate   3, 3, 6      1, 4, 4
         Don't    4, 1, 4      2, 2, 2

NE is: (Don't, Don't, Don't)

Extensive Form Games
Dynamic Games of Complete and Perfect Information

What is a Game Tree?
[Figure: game tree. Player 1 chooses Left or Right. Player 2 then chooses A or B after Left, or C or D after Right. The four end nodes carry the payoffs (P11, P21), (P12, P22), (P13, P23) and (P14, P24) of Players 1 and 2.]

An Advertising Example
[Figure: game tree.] Migros moves first and chooses Aggressive or Normal advertising. Wal-Mart observes the choice and then chooses Enter or Stay out. Payoffs (Migros, Wal-Mart):
Aggressive, Enter:      680, -50
Aggressive, Stay out:   730, 0
Normal, Enter:          700, 400
Normal, Stay out:       800, 0

Assumptions in Dynamic Extensive Form Games
All players are rational.
Rationality is common knowledge.
Players move sequentially. (Therefore, these are also called sequential games.)
Players have complete and perfect information:
Players can see the full game tree including the payoffs.
Players can observe and recall all previous moves.

Solution of an Extensive Form Game
Subgame Perfect Equilibrium (SPE): For an equilibrium to be subgame perfect, it has to be a NE for all the subgames as well as for the entire game.
A subgame is a decision node from the original game together with all the decision nodes and end nodes that follow it.
Backward induction is used to find the SPE.

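Advertising Example: 3 proper subgames
[Figure: the advertising game tree with its three subgames marked: the whole game starting at Migros's move, and the two subgames starting at Wal-Mart's decision nodes.]

Solution of the Advertising Game
Subgame 1 (after Aggressive): Wal-Mart gets -50 from Enter and 0 from Stay out, so it stays out. Payoffs: (730, 0).
Subgame 2 (after Normal): Wal-Mart gets 400 from Enter and 0 from Stay out, so it enters. Payoffs: (700, 400).

Solution of the Advertising Game (cont.)
Migros compares Aggressive, which leads to (730, 0), with Normal, which leads to (700, 400), and chooses Aggressive.
SPE of the game is the strategy profile: {aggressive, (stay out, enter)}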
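The two backward-induction steps can be sketched as follows. The payoffs are those of the advertising example; the function name `backward_induction` and the nested-dictionary tree are assumptions of this sketch, which only handles a two-stage game with one leader and one follower.

```python
# Payoffs (Migros, Wal-Mart) at each end node of the advertising game.
TREE = {
    "Aggressive": {"Enter": (680, -50), "Stay out": (730, 0)},
    "Normal": {"Enter": (700, 400), "Stay out": (800, 0)},
}


def backward_induction(tree):
    """Solve a two-stage game: the follower best-responds, then the leader chooses."""
    # Step 1: in each subgame the follower (Wal-Mart) maximises its own payoff.
    replies = {
        move: max(branch, key=lambda action: branch[action][1])
        for move, branch in tree.items()
    }
    # Step 2: the leader (Migros) anticipates those replies.
    best_move = max(tree, key=lambda move: tree[move][replies[move]][0])
    return best_move, replies, tree[best_move][replies[best_move]]


move, replies, payoffs = backward_induction(TREE)
print(move, replies, payoffs)
```

The `replies` dictionary is Wal-Mart's full contingent plan (one action per subgame), which is what distinguishes an SPE strategy profile from a mere description of the equilibrium path.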

Properties of SPE
The outcome that is selected by the backward induction procedure is always a NE of the game with perfect information.
SPE is a stronger equilibrium concept than NE.
SPE eliminates NE that involve non-credible threats.

Suppose Wal-Mart threatens to enter no matter what Migros does. Is this a credible threat?
[Figure: the advertising game tree, repeated. Payoffs (Migros, Wal-Mart): Aggressive, Enter: 680, -50; Aggressive, Stay out: 730, 0; Normal, Enter: 700, 400; Normal, Stay out: 800, 0.]

A 3 Player Sequential Game
[Figure: game tree.] Player 1 (P1) chooses Left, Middle, or Right.
Left ends the game with payoffs (2, 1, 1).
After Middle, Player 2 (P2) chooses X or Y. After X, Player 3 (P3) chooses A or B; after Y, P3 chooses C, D, or E.
After Right, P3 chooses F or G.
Payoffs (P1, P2, P3) at the end nodes:
A: 3, 2, 2
B: -2, 1, 5
C: 4, 4, 6
D: -3, 3, 3
E: 10, 2, 5
F: 3, 3, 3
G: -9, 2, 1

Backwards Induction
Player 3's choices are B, C, and F in the three last-period subgames.
Eliminating the non-equilibrium strategies makes the game tree simpler. The game tree reduces to:

Reduced Game Tree
[Figure: reduced game tree.] P1 chooses Left, Middle, or Right. Left gives (2, 1, 1). After Middle, P2 chooses X, which gives (-2, 1, 5), or Y, which gives (4, 4, 6). Right gives (3, 3, 3).
SPE is when player 1 plays Middle, player 2 plays Y, and player 3 plays C.

A Critique of SPE
What do you think player 1 would do if he is not certain whether player 2 is rational, but he is certain that player 3 is rational?
What do you think player 1 would do if he is not certain whether player 3 is rational, but he is certain that player 2 is rational?

A 3 Player Sequential Game (cont.)
[Figure: the full game tree of "A 3 Player Sequential Game", repeated for the discussion.]

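Limit Pricing Game
[Figure: game tree.] The Entrant chooses Enter or Stay out. The Incumbent then chooses Maximize or Limit Pricing. Payoffs (Entrant, Incumbent):
Enter, Maximize:            540, 540
Enter, Limit Pricing:       loss, 265
Stay out, Maximize:         0, 1275
Stay out, Limit Pricing:    0, 865

Commitment Game
[Figure: game tree.] The Incumbent chooses Inflexible Technology or Flexible Technology. The Entrant then chooses Enter or Stay out. Payoffs (Incumbent, Entrant):
Inflexible, Enter:       1000, -100
Inflexible, Stay out:    2000, 0
Flexible, Enter:         500, 500
Flexible, Stay out:      3000, 0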

Establishing Credibility
Establish and use a reputation. Example: Do not negotiate with terrorists.
Write contracts. Example: Supplier agrees to a punishment if he fails to deliver on time.
Cut off communication. Example: Mesut Yilmaz government cutting off communication with EU.
Burn bridges behind you. Example: Firm investing in inflexible technology.

Establishing Credibility (cont.)
Leave the outcome to chance. Example: Automatic response to nuclear attacks.
Move in small steps. Example: $1 million agreement versus 1000 sequential transactions limited to $1000.
Develop credibility through teamwork. Example: Army requires soldiers to shoot deserters. Failing to kill a deserter gets the death sentence.
Employ negotiating agents. Example: Union leader negotiating a wage increase instead of the worker.

JUST PLAYING!
Repeated Games

Game 1
(payoffs listed as row player, Firm 2)

                         Firm 2
                  Low Output    High Output
Low Output          40, 40        60, 30
High Output         30, 60        50, 50

Game 2
(payoffs listed as row player, Greece)

              Greece
           War      Peace
War        1, 1     3, 0
Peace      0, 3     2, 2

REPEATED GAMES
Repeated Normal Form Games

Prisoners' Dilemma Revisited
Suppose that the two suspects play the same game every time they get caught.
Can they coordinate their choices in order to get the best outcome for both of them?
Finitely repeated game
Infinitely repeated game

Twice-repeated PD: First stage payoff matrix after adding NE payoffs of the second stage

                                  Prisoner 2
                             Confess      Don't Confess
Prisoner 1   Confess         -10, -10      -5, -15
             Don't Confess   -15, -5       -7, -7
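The matrix above is the stage-game matrix with the second-stage NE payoff (-5, -5) added to every cell. A short sketch of that arithmetic, with the helper name `first_stage_matrix` chosen here for illustration:

```python
# Stage-game payoffs of the Prisoners' Dilemma: (Prisoner 1, Prisoner 2).
STAGE = {
    ("Confess", "Confess"): (-5, -5),
    ("Confess", "Don't Confess"): (0, -10),
    ("Don't Confess", "Confess"): (-10, 0),
    ("Don't Confess", "Don't Confess"): (-2, -2),
}
SECOND_STAGE_NE = ("Confess", "Confess")


def first_stage_matrix(stage, ne_cell):
    """Add the second-stage NE payoffs to every first-stage cell."""
    ne1, ne2 = stage[ne_cell]
    return {cell: (u1 + ne1, u2 + ne2) for cell, (u1, u2) in stage.items()}


for cell, payoff in first_stage_matrix(STAGE, SECOND_STAGE_NE).items():
    print(cell, payoff)
```

Because the same constant is added to every cell, best responses in the first stage are unchanged, which is why (confess, confess) is again the only NE.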

N-times repeated PD
In a finitely repeated PD game (repeated n times, where n >= 2), the cooperative outcome (don't confess, don't confess) cannot be enforced.
In the last stage (the nth stage) the NE is (confess, confess) and all players know this, so by backward induction the same NE will prevail in all previous stages.

Infinitely Repeated PD
When the game is played infinitely many times, or players do not know when the game is going to end, backward induction breaks down.
The following trigger strategies can enforce the cooperative outcome.
Trigger strategy: A player cooperates as long as the other players cooperate, but any defection from cooperation on the part of the rivals triggers the player to behave noncooperatively for a specified period of time (period of punishment).

Trigger Strategies
Grim strategy: A trigger strategy in which the punishment period lasts until the end of the game.
Grim strategy for PD game: Play don't confess in the first period. In period t, play don't confess if the outcome was (don't confess, don't confess) in all preceding t-1 periods, and play confess otherwise.

Trigger Strategies (cont.)
Tit-for-tat (TFT): A trigger strategy in which the punishment period lasts as long as the rival keeps on cheating (returning to cooperative play is possible).
TFT strategy for PD game: Play don't confess in the first period. In period t, play don't confess if the rival's most recent play was don't confess, and play confess otherwise.
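The difference between the two trigger strategies shows up when a rival cheats exactly once. The sketch below encodes both rules as functions of the two move histories; the `play` helper and the `confess_once` rival are hypothetical, introduced only for this illustration.

```python
C, D = "Don't Confess", "Confess"  # cooperate / defect in the PD


def grim(my_history, rival_history):
    """Cooperate until anyone has confessed, then confess forever."""
    if all(move == C for move in my_history + rival_history):
        return C
    return D


def tit_for_tat(my_history, rival_history):
    """Cooperate in the first period, then copy the rival's most recent move."""
    return rival_history[-1] if rival_history else C


def confess_once(my_history, rival_history):
    """Hypothetical rival: cheats in period 2 only."""
    return D if len(my_history) == 1 else C


def play(strategy1, strategy2, rounds):
    """Run the repeated game and return both players' move histories."""
    h1, h2 = [], []
    for _ in range(rounds):
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return h1, h2


print(play(grim, confess_once, 4)[0])
print(play(tit_for_tat, confess_once, 4)[0])
```

Against this rival, grim punishes from period 3 onward, while tit-for-tat punishes in period 3 only and returns to cooperation in period 4.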

Axelrod's Tournament
Axelrod's 4 rules for successful repeated PD game play:
i) Don't be envious
ii) Don't be the first to defect
iii) Reciprocate both cooperation and defection
iv) Don't be too clever

REPEATED GAMES (cont.)
Repeated Extensive Form Games

Chain Store Example
A chain store has branches in K towns.
There is a potential entrant (k) in each town.
The chain store has to decide between fighting or accommodating entry in each town.
The rival in the next town can observe how the chain store behaved in previous towns.

Game Tree in Town K
[Figure: game tree.] Entrant k chooses Enter or Stay out. If k stays out, payoffs are (0, 5). If k enters, the chain store chooses Prey, which gives (-1, -1), or Accommodate, which gives (2, 2). Payoffs are listed as (entrant, chain store).

SPE of Chain Store Game
In the last town the entrant will solve for the SPE and choose to enter. In town K-1 the entrant will do the same, and for that matter the outcome will be the same in all previous towns.
Solution: In every town, entry will occur and the chain store will accommodate.

Chain Store Paradox
The incumbent has an incentive to prey on the first entrant and hence to scare off the entrants in other towns by establishing a predatory reputation.
However, the second potential entrant will not be impressed (expecting rational behavior from the chain store). In that case, there is no incentive for the chain store to prey in the first town.

The Paradox
The result is counterintuitive.
The result is due to strict reliance on backward induction.
Infinitely repeated version: predatory behavior is an equilibrium strategy.

A Critique of Backward Induction
Longer chains of backward induction are more sensitive to small changes in the information structure of the game.
Backward induction rules out any behavior that is contingent upon an event to which the theory assigns zero probability.

Managers May Choose
Which game to play
With whom to play
Which strategies are available to each player
What payoff each outcome will yield
Whether to play or not

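TWO-PERSON ZERO-SUM GAME
Games that have a saddle point (pure strategy solution)
EXAMPLE: Consider the following zero-sum game (c = 0). Entries are payoffs to the row player.

                            Column player's strategy
Row player's strategy    Column 1   Column 2   Column 3   Row min
Row 1                        4          4         10         4
Row 2                        2          3          1         1
Row 3                        6          5          7         5
Column max                   6          5         10

The largest row minimum and the smallest column maximum are both 5, so the saddle point is (Row 3, Column 2) and the value of the game is 5.

TWO-PERSON CONSTANT-SUM GAME
Games that have a saddle point (pure strategy solution)
EXAMPLE: An audience of 100 million is shared by Network 1 and Network 2, so c = 100. Entries are Network 1's audience in millions.

                          Network 2
Network 1       Western   Soap opera   Comedy   Row min
Western            35         15         60        15
Soap opera         45         58         50        45
Comedy             38         14         70        14
Column max         45         58         70

The largest row minimum and the smallest column maximum are both 45, so the saddle point is (Soap opera, Western): Network 1 gets 45 million viewers and Network 2 gets the remaining 55 million.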
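The saddle-point test used in both examples (largest row minimum equals smallest column maximum) can be sketched as below. The function name `saddle_point` and the 0-based indices it returns are choices made for this sketch.

```python
def saddle_point(matrix):
    """Return (value, row index, column index) of a saddle point, or None.

    Entries are payoffs to the row player; indices are 0-based.
    """
    row_mins = [min(row) for row in matrix]
    col_maxs = [max(col) for col in zip(*matrix)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin != minimax:
        return None  # no pure-strategy solution
    return maximin, row_mins.index(maximin), col_maxs.index(minimax)


ZERO_SUM = [[4, 4, 10], [2, 3, 1], [6, 5, 7]]
NETWORKS = [[35, 15, 60], [45, 58, 50], [38, 14, 70]]
ODDS_AND_EVENS = [[-1, 1], [1, -1]]

print(saddle_point(ZERO_SUM))
print(saddle_point(NETWORKS))
print(saddle_point(ODDS_AND_EVENS))
```

When the function returns None, as for Odds and Evens, the players need mixed strategies.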

GRAPHICAL SOLUTION
When there is no saddle point
Odds and Evens: Two players (Odd and Even) simultaneously choose the number of fingers (1 or 2) to put out. If the sum of fingers put out by both players is odd, Odd wins $1 from Even. If the sum is even, Even wins $1 from Odd. Entries are payoffs to Odd.

                          COLUMN PLAYER (EVEN)
ROW PLAYER (ODD)      1 Finger    2 Fingers    Row minimum
1 Finger                 -1          +1            -1
2 Fingers                +1          -1            -1
Column maximum           +1          +1

x1 = Probability that Odd puts out 1 finger
x2 = Probability that Odd puts out 2 fingers
y1 = Probability that Even puts out 1 finger
y2 = Probability that Even puts out 2 fingers

FINDING ODD'S OPTIMAL STRATEGY
x1 + x2 = 1, so x2 = 1 - x1.
If Even puts out 1 finger, Odd's expected reward = (-1)x1 + (+1)(1 - x1) = 1 - 2x1 (line AC).
If Even puts out 2 fingers, Odd's expected reward = (+1)x1 + (-1)(1 - x1) = 2x1 - 1 (line DE).
Odd should choose the mixed strategy x1 = 1/2, x2 = 1/2.
[Figure: expected reward to Odd plotted against x1. Line AC (Even picks 1) runs from A = (0, 1) to C = (1, -1); line DE (Even picks 2) runs from D = (0, -1) to E = (1, 1); the two lines cross at B.]

FINDING EVEN'S OPTIMAL STRATEGY
y1 + y2 = 1, so y2 = 1 - y1.
If Odd puts out 1 finger, Odd's expected reward = (-1)y1 + (+1)(1 - y1) = 1 - 2y1 (line AC).
If Odd puts out 2 fingers, Odd's expected reward = (+1)y1 + (-1)(1 - y1) = 2y1 - 1 (line DE).
Even should choose the mixed strategy y1 = 1/2, y2 = 1/2.
[Figure: expected reward to Odd plotted against y1. Line AC (Odd picks 1) runs from A = (0, 1) to C = (1, -1); line DE (Odd picks 2) runs from D = (0, -1) to E = (1, 1); the two lines cross at B.]
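The crossing point B can also be found algebraically by setting the two expected-reward lines equal. The sketch below does this for a general 2x2 game without a saddle point; the function name `equalizing_mix` and its argument order are assumptions of this sketch, and it is not meant for games that do have a saddle point.

```python
from fractions import Fraction


def equalizing_mix(a11, a12, a21, a22):
    """Probability on row 1 that equalises the row player's expected reward
    against both columns, together with that expected reward.

    Solves x1*a11 + (1 - x1)*a21 = x1*a12 + (1 - x1)*a22 for x1.
    """
    denom = a11 - a12 - a21 + a22
    x1 = Fraction(a22 - a21, denom)
    value = x1 * a11 + (1 - x1) * a21
    return x1, value


# Odds and Evens, payoffs to Odd: rows are Odd's 1 or 2 fingers.
x1, value = equalizing_mix(-1, 1, 1, -1)
print(x1, value)

# Even's mix follows from the transposed matrix (swap a12 and a21).
y1, _ = equalizing_mix(-1, 1, 1, -1)
print(y1)
```

For Odds and Evens this gives x1 = y1 = 1/2 and a game value of 0, matching the graphical solution.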

LINEAR PROGRAMMING AND ZERO-SUM GAMES
Stone-Paper-Scissors Game (entries are payoffs to the row player)

                        COLUMN PLAYER
ROW PLAYER         Stone    Paper    Scissors    Row minimum
Stone                0       -1        +1           -1
Paper               +1        0        -1           -1
Scissors            -1       +1         0           -1
Column maximum      +1       +1        +1

THE ROW PLAYER'S LP
Let x1, x2, x3 be the probabilities with which the row player chooses Stone, Paper, and Scissors.
The column player will choose a strategy that makes the row player's expected reward equal to min (x2 - x3, -x1 + x3, x1 - x2).
The row player will then choose (x1, x2, x3) to make min (x2 - x3, -x1 + x3, x1 - x2) as large as possible.
Max z = v
s.t. v <= x2 - x3 (Stone constraint)
     v <= -x1 + x3 (Paper constraint)
     v <= x1 - x2 (Scissors constraint)
     x1 + x2 + x3 = 1
     x1, x2, x3 >= 0; v urs (unrestricted in sign)

THE COLUMN PLAYER'S LP
Let y1, y2, y3 be the probabilities with which the column player chooses Stone, Paper, and Scissors.
The row player is assumed to know (y1, y2, y3), so she will choose a strategy that ensures her an expected reward of max (-y2 + y3, y1 - y3, -y1 + y2).
Thus the column player should choose (y1, y2, y3) to make max (-y2 + y3, y1 - y3, -y1 + y2) as small as possible.
Min z = w
s.t. w >= -y2 + y3 (Stone constraint)
     w >= y1 - y3 (Paper constraint)
     w >= -y1 + y2 (Scissors constraint)
     y1 + y2 + y3 = 1
     y1, y2, y3 >= 0; w urs (unrestricted in sign)
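The quantities that the two LPs optimise can be evaluated directly for any candidate mix. The sketch below does not solve the LPs; it only computes the row player's guaranteed floor and the column player's guaranteed ceiling, with function names chosen here for illustration.

```python
from fractions import Fraction as F

# Row player's rewards; rows and columns are ordered Stone, Paper, Scissors.
REWARD = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def row_guarantee(x):
    """min(x2 - x3, -x1 + x3, x1 - x2): the v that the row player's LP maximises."""
    return min(sum(x[i] * REWARD[i][j] for i in range(3)) for j in range(3))


def column_ceiling(y):
    """max(-y2 + y3, y1 - y3, -y1 + y2): the w that the column player's LP minimises."""
    return max(sum(y[j] * REWARD[i][j] for j in range(3)) for i in range(3))


uniform = [F(1, 3), F(1, 3), F(1, 3)]
print(row_guarantee(uniform), column_ceiling(uniform))
print(row_guarantee([F(1, 2), F(1, 2), 0]))
```

With the uniform mix both numbers are 0, so v = w = 0 and the uniform mix is optimal for both players. Any lopsided mix, such as (1/2, 1/2, 0), gives the row player a negative floor.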

LINEAR PROGRAMMING AND ZERO-SUM GAMES
EXAMPLE: Solve the following zero-sum game.

                        COLUMN PLAYER
ROW PLAYER         Column 1   Column 2   Column 3   Row minimum
Row 1                 30         40         36          30
Row 2                 60         10         36          10
Column maximum        60         40         36

ANSWER: This game has no saddle point and no dominated strategies, so we set up the row and the column player's LPs.

ROW PLAYER'S LP
Max v
s.t. v <= 30x1 + 60x2
     v <= 40x1 + 10x2
     v <= 36x1 + 36x2
     x1 + x2 = 1
     x1, x2, v >= 0
Substituting x2 = 1 - x1 into the row player's LP yields
Max v
s.t. (a) v + 30x1 <= 60 (y1, or column 1, constraint)
     (b) v - 30x1 <= 10 (y2, or column 2, constraint)
     (c) v <= 36 (y3, or column 3, constraint)
     x1, v >= 0

COLUMN PLAYER'S LP
Min w
s.t. (a) w >= 30y1 + 40y2 + 36y3
     (b) w >= 60y1 + 10y2 + 36y3
     y1 + y2 + y3 = 1
     y1, y2, y3, w >= 0
Substituting y3 = 1 - y1 - y2 into the column player's LP yields
Min w
s.t. (a) w + 6y1 - 4y2 >= 36 (x1, or row 1, constraint)
     (b) w - 24y1 + 26y2 >= 36 (x2, or row 2, constraint)
     y1, y2, w >= 0

Solving ROW PLAYER'S LP
SOLUTION (v = 35, x1 = 5/6, x2 = 1/6)
Solving COLUMN PLAYER'S LP
SOLUTION (w = 35, y1 = 1/2, y2 = 1/2, y3 = 0)

LINEAR PROGRAMMING AND ZERO-SUM GAMES
EXAMPLE: Solve the following zero-sum game.

                        COLUMN PLAYER
ROW PLAYER         Column 1   Column 2   Column 3   Row minimum
Row 1                 30         40         36          30
Row 2                 50         30         30          30
Row 3                 60         10         36          10
Column maximum        60         40         36

ANSWER: This game has no saddle point and no dominated strategies, so we set up the row and the column player's LPs.

ROW PLAYER'S LP
Max v
s.t. v <= 30x1 + 50x2 + 60x3
     v <= 40x1 + 30x2 + 10x3
     v <= 36x1 + 30x2 + 36x3
     x1 + x2 + x3 = 1
     x1, x2, x3, v >= 0

COLUMN PLAYER'S LP
Min w
s.t. (a) w >= 30y1 + 40y2 + 36y3
     (b) w >= 50y1 + 30y2 + 30y3
     (c) w >= 60y1 + 10y2 + 36y3
     y1 + y2 + y3 = 1
     y1, y2, y3, w >= 0

RESULTS
Row player's LP: v = 35.45, x1 = 0.787, x2 = 0.091, x3 = 0.121
Column player's LP: w = 35.45, y1 = 0.272, y2 = 0.272, y3 = 0.454
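A reported LP solution can be checked without an LP solver: a pair of mixes is optimal exactly when the row player's guaranteed floor equals the column player's guaranteed ceiling. In the sketch below the exact fractions 26/33, 3/33, 4/33 and 3/11, 3/11, 5/11 are an assumption of this sketch, chosen because they match the decimals reported for the 3x3 example; the function name `guarantees` is also illustrative.

```python
from fractions import Fraction as F


def guarantees(matrix, x, y):
    """Return (row player's floor under x, column player's ceiling under y)."""
    n_cols = len(matrix[0])
    row_floor = min(
        sum(xi * row[j] for xi, row in zip(x, matrix)) for j in range(n_cols)
    )
    col_ceiling = max(sum(yj * a for yj, a in zip(y, row)) for row in matrix)
    return row_floor, col_ceiling


GAME_3X3 = [[30, 40, 36], [50, 30, 30], [60, 10, 36]]
x = [F(26, 33), F(3, 33), F(4, 33)]   # assumed exact form of 0.787, 0.091, 0.121
y = [F(3, 11), F(3, 11), F(5, 11)]    # assumed exact form of 0.272, 0.272, 0.454
print(guarantees(GAME_3X3, x, y))

GAME_2X3 = [[30, 40, 36], [60, 10, 36]]
print(guarantees(GAME_2X3, [F(5, 6), F(1, 6)], [F(1, 2), F(1, 2), 0]))
```

Both numbers come out as 390/11 (about 35.45) for the 3x3 game and 35 for the 2x3 game, so floor and ceiling coincide and the reported solutions are consistent.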

TWO-PERSON NON-CONSTANT-SUM GAMES
EXAMPLE: Prisoner's Dilemma (payoffs listed as Prisoner 1, Prisoner 2)

                               PRISONER 2
PRISONER 1           Confess       Don't confess
Confess              (-5, -5)       (0, -20)
Don't confess        (-20, 0)       (-1, -1)

TWO-PERSON NON-CONSTANT-SUM GAMES
EXAMPLE: Hot Dog King and Hot Dog Chef advertisement budget
Profit after advertisement budget (payoffs listed as Hot Dog King, Hot Dog Chef)

                                  HOT DOG CHEF
HOT DOG KING           Spend $10 million    Spend $6 million
Spend $10 million           (2, 2)              (9, -1)
Spend $6 million            (-1, 9)             (6, 6)

TWO-PERSON NON-CONSTANT-SUM GAMES
EXAMPLE: USA and USSR arms race
Benefit matrix after deducting missile cost (payoffs listed as USA, USSR)

                                     USSR
USA                     Develop new missile    Maintain status quo
Develop new missile         (-10, -10)             (10, -100)
Maintain status quo         (-100, 10)             (0, 0)

Saddle-Point Games and Pure Strategies
Minimax-Maximin Criterion
A very conservative (cautious) criterion, called the minimax-maximin criterion, is used to solve this kind of decision problem. To take into account that each opponent works against the other's advantage, the minimax criterion selects each player's (mixed or pure) strategy so that it yields the best of the worst possible outcomes. If no player finds it beneficial to change strategy, an optimal solution has been reached. The game is then said to be in equilibrium, or an equilibrium is said to have been reached.
The game matrix is usually expressed in terms of the payoffs (gains) to player A, whose strategies are shown as the rows. The conservative criterion requires player A to choose the strategy (pure or mixed) that maximizes his minimum gain, where the minimum is taken over all of player B's strategies. By the same reasoning, player B chooses the strategy that minimizes his maximum loss, where the maximum is taken over all of player A's strategies.

Saddle Point
The simplest games are those with a saddle point. A game has a saddle point if the largest of the row minima of the game matrix equals the smallest of the column maxima. The saddle-point entry is also the value of the game. A game may have more than one saddle point, or none at all. If the game has no saddle point, each player's optimal strategy is mixed. In general the following relation holds:
maximin value <= value of the game <= minimax value
that is,
$\max_i \min_j a_{ij} \le v \le \min_j \max_i a_{ij}$.
This inequality gives the lower and upper bounds on the value of the game. When the two bounds coincide at a saddle point, the resulting strategy pair is a Nash equilibrium.

Example
Consider the following payoff matrix, which shows the gains of player A. The example illustrates the minimax and maximin calculations for a game.

                          Player B
Player A          B1     B2     B3     B4     Row minimum
A1                 8      2      9      5          2
A2                 6      5      7     18          5 *
A3                 7      3     -4     10         -4
Column maximum     8      5 *    9     18

Maximin (lower) value = 5, value of the game = 5, minimax (upper) value = 5.
In this example maximin = 5 and minimax = 5. This means that the game has a saddle point, given by entry (2, 2) of the matrix, and the value of the game is 5. As can be seen, neither player can do better by choosing a different strategy.

Games Without a Saddle Point and Mixed Strategies
If an m x n game has no saddle point, it can be hard to solve, especially when the game is large. In such games, players who want to optimize their payoffs have to use mixed strategies. Not every game has a pure-strategy Nash equilibrium in which each player chooses a single strategy with probability 100%. In some applications players choose among their available pure strategies according to probabilities. Such games have mixed strategies. A mixed strategy specifies the probability with which a player plays each available pure strategy.

Example: this game has no saddle point
As seen in the previous section, the existence of a saddle point immediately gives the optimal pure strategies for the game. Some games, however, have no saddle point. Consider, for example, the following zero-sum game.

                         B
A                1      2      3      4     Row min
1                5    -10      9      0      -10
2                6      7      8      1        1
3                8      7     15      2        2  * maximin
4                3      4     -1      4       -1
Column max       8      7     15      4  * minimax

Here the minimax value (4) is greater than the maximin value (2), so the game has no saddle point and the pure maximin-minimax strategies are not optimal. This is as expected, because each player can improve his payoff by choosing a different strategy.

Mixed Strategies
Let x_1, x_2, ..., x_m denote the row probabilities and y_1, y_2, ..., y_n the column probabilities with which A and B, respectively, select their pure strategies. If a_ij denotes the (i, j) entry of the game, then x_i and y_j appear as in the following matrix.

                    B
A         y_1     y_2     ...    y_n
x_1       a_11    a_12    ...    a_1n
x_2       a_21    a_22    ...    a_2n
...       ...     ...     ...    ...
x_m       a_m1    a_m2    ...    a_mn

The solution of the mixed-strategy problem is also based on the minimax criterion seen earlier. The only difference is that A chooses the x_i that maximize the smallest expected payoff in a column, while B chooses the y_j that minimize the largest expected payoff in a row. Mathematically, the minimax criterion for mixed strategies is as follows. Player A chooses the x_i (with $x_i \ge 0$ and $\sum_{i=1}^{m} x_i = 1$) that yield
$\max_{x_i} \min \left( \sum_{i=1}^{m} a_{i1} x_i, \; \sum_{i=1}^{m} a_{i2} x_i, \; \ldots, \; \sum_{i=1}^{m} a_{in} x_i \right)$

Mixed Strategies
In the same way, player B chooses the y_j (with $y_j \ge 0$ and $\sum_{j=1}^{n} y_j = 1$) that yield
$\min_{y_j} \max \left( \sum_{j=1}^{n} a_{1j} y_j, \; \sum_{j=1}^{n} a_{2j} y_j, \; \ldots, \; \sum_{j=1}^{n} a_{mj} y_j \right)$
These values are called the maximin and minimax expected values, respectively. As with pure strategies, the minimax expected payoff is at least as large as the maximin expected payoff. When x_i and y_j correspond to the optimal solution, equality holds and the resulting values equal the optimal expected value of the game. This result follows from the minimax theorem and is stated here without proof. If x_i* and y_j* are the optimal solutions for the two players, each payoff a_ij is associated with the probability x_i* y_j*. The optimal expected value of the game is therefore
$V^* = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i^* y_j^*$

Solution Methods for Two-Person Zero-Sum Games
Several methods exist for finding the optimal values of x_i and y_j in two-person zero-sum games. Only two are considered here: the graphical method, used for (2 x n) or (m x 2) games, and the linear programming method, which solves any (m x n) game.

Graphical Solution Method (for m x 2 or 2 x n games)
Graphical solutions apply to games in which at least one player has only two strategies. Consider the following (2 x n) game, which is assumed to have no saddle point.

                          B
A                  y_1     y_2     ...    y_n
x_1                a_11    a_12    ...    a_1n
x_2 = 1 - x_1      a_21    a_22    ...    a_2n

Since A has only two strategies, x_2 = 1 - x_1 with x_1 >= 0 and x_2 >= 0. A's expected payoffs corresponding to B's pure strategies are as follows.

B's pure strategy    A's expected payoff
1                    (a_11 - a_21) x_1 + a_21
2                    (a_12 - a_22) x_1 + a_22
...                  ...
n                    (a_1n - a_2n) x_1 + a_2n

As this shows, A's expected payoff varies linearly with x_1. Under the minimax criterion for mixed-strategy games, player A should choose the value of x_1 that maximizes his minimum expected payoff. This is done by plotting the lines above as functions of x_1.

Example: consider the following (2 x 4) game.

                 B
A         1     2     3     4
1         2     2     3    -1
2         4     3     2     6

This game has no saddle point. A's expected payoffs corresponding to B's pure strategies are therefore as follows.

B's pure strategy    A's expected payoff
1                    -2x1 + 4
2                    -x1 + 3
3                    x1 + 2
4                    -7x1 + 6

These four lines are then plotted as functions of x1. Each line is fixed by its values at x1 = 0 and x1 = 1:

Line    Equation      Value at x1 = 0    Value at x1 = 1
(1)     -2x1 + 4            4                  2
(2)     -x1 + 3             3                  2
(3)     x1 + 2              2                  3
(4)     -7x1 + 6            6                 -1

As the plot shows, the maximin occurs at x1* = 1/2, which is the intersection of lines (3) and (4): setting x1 + 2 = -7x1 + 6 gives x1 = 1/2. Line (2) also passes through this point. A's optimal strategy is therefore x1* = 1/2, x2* = 1/2. The value of the game is obtained by substituting x1 into the equation of any line that passes through the maximin point:
V* = 1/2 + 2 = 5/2, or -7(1/2) + 6 = 5/2.
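The graphical method can be mimicked numerically. The lower envelope of the lines is concave and piecewise linear, so its maximum lies at x1 = 0, x1 = 1, or a crossing of two lines. The sketch below checks exactly those candidates with exact fractions; the function name `solve_2xn` is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import combinations


def solve_2xn(matrix):
    """Maximin mix for the row player of a 2 x n zero-sum game.

    Column j gives the line (a_1j - a_2j) * x1 + a_2j, where x1 is the
    probability of row 1. Returns (optimal x1, value of the game).
    """
    top, bottom = matrix
    lines = [(Fraction(a1 - a2), Fraction(a2)) for a1, a2 in zip(top, bottom)]

    # Candidate points: the two ends of [0, 1] and every crossing inside it.
    candidates = {Fraction(0), Fraction(1)}
    for (m1, b1), (m2, b2) in combinations(lines, 2):
        if m1 != m2:
            x = (b2 - b1) / (m1 - m2)
            if 0 <= x <= 1:
                candidates.add(x)

    def envelope(x):
        # Lowest line at x: A's worst-case expected payoff.
        return min(m * x + b for m, b in lines)

    best_x = max(candidates, key=envelope)
    return best_x, envelope(best_x)


print(solve_2xn([[2, 2, 3, -1], [4, 3, 2, 6]]))
```

For the (2 x 4) example this returns x1 = 1/2 and V = 5/2, the point read off the plot.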

Solving (m x n) Games by Linear Programming
Game theory is closely connected to LP: every finite two-person zero-sum game can be expressed as an LP and, conversely, every LP can be represented as a game. Solving games by LP is necessary and useful for games with large matrices. As shown in the section on mixed strategies, A's optimal mixed strategy satisfies
$\max_{x_i} \min \left( \sum_{i=1}^{m} a_{i1} x_i, \; \sum_{i=1}^{m} a_{i2} x_i, \; \ldots, \; \sum_{i=1}^{m} a_{in} x_i \right)$
subject to $\sum_{i=1}^{m} x_i = 1$ and $x_i \ge 0$, $i = 1, 2, \ldots, m$.
This problem can be put into LP form as follows. Let
$V = \min \left( \sum_{i=1}^{m} a_{i1} x_i, \; \sum_{i=1}^{m} a_{i2} x_i, \; \ldots, \; \sum_{i=1}^{m} a_{in} x_i \right)$.
The problem then becomes
max Z = V
subject to
$\sum_{i=1}^{m} a_{ij} x_i \ge V$, $j = 1, 2, \ldots, n$
$\sum_{i=1}^{m} x_i = 1$
$x_i \ge 0$
Here V is the value of the game.

Example 1: Consider the following (3 x 3) game. The matrix of the game played between players A and B is given below. Find the value of the game and the optimal strategy vector of each player.

                  Player B
Player A      B1     B2     B3
A1             3      2      3
A2             2      3      4
A3             5      4      2

Solution: The linear programming problem to be solved for player B, in terms of y_j, is
Maximize y0 = y1 + y2 + y3
subject to
3y1 + 2y2 + 3y3 <= 1
2y1 + 3y2 + 4y3 <= 1
5y1 + 4y2 + 2y3 <= 1
y1, y2, y3 >= 0

To solve the problem by the simplex method, first add the slack variables S1, S2, S3 to the left-hand sides of the constraints.
Maximize y0 = y1 + y2 + y3 + 0S1 + 0S2 + 0S3
subject to
3y1 + 2y2 + 3y3 + S1 = 1
2y1 + 3y2 + 4y3 + S2 = 1
5y1 + 4y2 + 2y3 + S3 = 1
y1, y2, y3, S1, S2, S3 >= 0
The initial simplex tableau is as follows.

cj                   1     1     1     0     0     0
       Basis         y1    y2    y3    S1    S2    S3    Solution
0      S1            3     2     3     1     0     0     1
0      S2            2     3     4     0     1     0     1
0      S3            5     4     2     0     0     1     1 *
       zj            0     0     0     0     0     0     0
       cj - zj = y0  1 *   1     1     0     0     0

The second simplex solution tableau, shown below, gives the optimal solution.

cj                   1     1        1     0     0        0
       Basis         y1    y2       y3    S1    S2       S3      Solution
0      S1            0     -19/16   0     1     -9/16    -3/8    1/16
1      y3            0     7/16     1     0     5/16     -1/8    3/16
1      y1            1     5/8      0     0     -1/8     1/4     1/8
       zj            1     17/16    1     0     3/16     1/8     5/16
       cj - zj = y0  0     -1/16    0     0     -3/16    -1/8

Accordingly, the values that determine player B's optimal strategy vector are y1 = 1/8 and y3 = 3/16 (y2 = 0). S1 is a slack variable and has no real value for the strategy. The value of the game is V = 1/y0 = 1/(5/16) = 16/5 = 3.2.
Player B's actual optimal strategy is found by substituting into the formulas q1 = y1 * V and q3 = y3 * V:
q1 = 1/8 * 16/5 = 0.4
q3 = 3/16 * 16/5 = 0.6
If player B plays strategy B1 40% of the time and strategy B3 60% of the time, he fixes the value of the game at a level that his opponent, player A, has to accept. That value is V = 3.2.
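The conversion from the LP variables back to a mixed strategy can be checked with exact fractions. The sketch below takes the optimal y values from the tableau as given (it does not run the simplex method) and uses variable names chosen for illustration.

```python
from fractions import Fraction as F

A = [[3, 2, 3], [2, 3, 4], [5, 4, 2]]   # payoffs to player A
y = [F(1, 8), F(0), F(3, 16)]           # optimal y1, y2, y3 from the tableau

y0 = sum(y)                             # objective value of B's LP
value = 1 / y0                          # value of the game, V = 1 / y0
q = [yj * value for yj in y]            # B's mixed strategy, q_j = y_j * V

# A's expected payoff from each pure strategy against q; none exceeds V.
expected = [sum(qj * a for qj, a in zip(q, row)) for row in A]

print(value, q, expected)
```

The result is V = 16/5 with q = (2/5, 0, 3/5). Against this mix A earns 3 from A1 and 16/5 from A2 or A3, so A cannot get more than 3.2.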