Bits Randômicos: Othello e Minimax com poda alpha beta

No post anterior criamos o framework básico do game e vimos (de forma bem simplificada) alguns conceitos sobre o Minimax. Neste post vamos colocar esses conceitos em prática, além de utilizar poda alfa - beta e ordenação de lances para melhorar o desempenho da engine, fundamental caso você esteja desenvolvendo para alguma plataforma mobile.

Classe MinimaxOthello

Entender essa classe é o objetivo deste tópico. O Minimax com poda alfa-beta é um desses casos onde é fácil entender o algoritmo, mas é difícil visualizar o mecanismo.

Ao longo do algoritmo, mantemos referência a variável de instância bestFound, que representa o melhor lance encontrado até então. Por ser recursivo e consumir bastante CPU, a cada lance da AI criamos uma nova Thread e esperamos run() retornar, após isso obtemos o melhor lance com getBestFound(). Repare no construtor, que precisa saber qual posição queremos analisar, qual a profundidade máxima que iremos atingir, e se queremos ou não analisar até o fim do jogo. Obviamente, analisar a árvore de posições até o fim só é possível na fase final do jogo, e o caller do algoritmo deve decidir quando isso acontece.

No começo do algoritmo, checamos se atingimos uma posição final ou a profundidade máxima. No primeiro caso, retornamos um valor evalGameOver na faixa -INFINITO < evalGameOver < INFINITO, que será positivo caso currentPlayer tenha vencido, negativo caso tenha perdido, ou zero no empate. Caso a profundidade máxima seja atingida e solve == false, retornamos o valor da posição atingida, caso solve == true, seguimos expandindo a árvore até chegar em um nó terminal.

Seguindo em frente, obtemos uma lista com os lances válidos para a posição atual e os ordenamos, logo após setar a variável eval de cada lance, com o método estático scoreMoves da classe OthelloHeuristics. Em seguida vem a parte mágica: para cada lance da lista, aplicamos esse lance na posição atual, e invocamos recursivamente o algoritmo. Ao invocar a recursão, o objeto board será alterado, e é por isso que na próxima linha desfazemos o lance com undoMove. Podiamos criar uma cópia profunda do objeto board e usar esta cópia na recursão, eliminando undoMove, mas criando milhares (ou milhões) de novos objetos a cada chamada. Nem sempre é fácil implementar undoMove, mas vale bastante a pena em termos de memória (mais tarde vamos olhar a variável calls). O algoritmo é intrincado, mas a estrutura é a mesma para qualquer aplicação. Segue a classe:


import java.util.*;
import java.util.concurrent.*;

public class MinimaxOthello implements Runnable
{
    private CountDownLatch doneSignal;    
    private int maxDepth;
    private int calls;    
    private OthelloMove bestFound;
    private OthelloBoard board;
    private static float INFINITY = Float.MAX_VALUE/1000;    
    private boolean solve = false;
    private Comparator<OthelloMove> comparator = Collections.reverseOrder(new MoveComparator());
           
    public MinimaxOthello (OthelloBoard board, int maxDepth, CountDownLatch doneSignal, boolean solve)
    {
        this.board = board;        
        this.bestFound = new OthelloMove();
        bestFound.setPlayer(board.getCurrentPlayer());
        this.maxDepth = maxDepth; 
        this.doneSignal = doneSignal;                
        this.solve = solve;
    }
    
    public OthelloMove getBestFound()
    {       
        return this.bestFound;
    }
    public void run()
    {        
        float val = minimax(board, bestFound, -INFINITY, INFINITY, 0);
        System.out.println("calls: " + calls);
        System.out.println("eval: " + val);
        System.out.println();
        doneSignal.countDown();        
    }
    
    private float minimax(OthelloBoard board, OthelloMove best, float alpha, float beta, int depth)
    {
        calls++;             
        OthelloMove garbage = new OthelloMove();             
        int currentPlayer = board.getCurrentPlayer();
        
        if (board.checkEnd())
        {                        
            int bd = board.countDiscs(OthelloBoard.BLACK);
            int wd = board.countDiscs(OthelloBoard.WHITE);
            
            if ((bd > wd) && currentPlayer == OthelloBoard.BLACK)
            {                
                return INFINITY/10;
            }
            else if ((bd < wd) && currentPlayer == OthelloBoard.BLACK)
            {                
                return -INFINITY/10;
            }
            else if ((bd > wd) && currentPlayer == OthelloBoard.WHITE)
            {                
                return -INFINITY/10;
            }
            else if ((bd < wd) && currentPlayer == OthelloBoard.WHITE)
            {                
                return INFINITY/10;
            }
            else 
            {                
                return 0.0f;
            }
        }
        if (!solve)
        {
            if (depth == maxDepth)
                return OthelloHeuristics.eval(currentPlayer, board);
        }
        
        ArrayList<OthelloMove> moves = board.getAllMoves(currentPlayer);
        if (moves.size() > 1)
        {
            OthelloHeuristics.scoreMoves(moves);        
            Collections.sort(moves, comparator);
        }
        
        for (OthelloMove mv : moves)
        {                                    
            board.makeMove(mv);            
            float score = - minimax(board, garbage, -beta,  -alpha, depth + 1);           
            board.undoMove(mv);             
            
            if(score > alpha)
            {  
                alpha = score;                
                best.setFlipSquares(mv.getFlipSquares());
                best.setIdx(mv.getIdx());        
                best.setPlayer(mv.getPlayer());                              
            }
            
            if (alpha >= beta)
                break;                
            
        }            
        return alpha;
    }  
}

Classe OthelloHeuristics

Apesar da simplicidade, o jogo Othello é bastante rico em termos de estratégia. Alguns conceitos encontrados no xadrez também se aplicam ao jogo: mobilidade, controle do centro, aberturas e tempo, para citar alguns.

Como sou um péssimo jogador, uma breve pesquisa me esclareceu os seguintes pontos:

Apenas contar peças é uma péssima estratégia, quase sempre levando a derrota
O controle das 'quinas' (corner squares) e 'bordas' (edge squares) é fundamental
A mobilidade é fundamental

Existem outros conceitos igualmente importantes ('paridade' por exemplo) mas a lista acima já é um bom começo. As estratégias acima estão relacionadas com o conceito de 'disco estável', que são os discos que uma vez conquistados, não podem mais ser 'flipados' pelo seu adversário. Repare que ao conquistar uma quina, podemos utilizá-la como pivô para novos discos estáveis, o que geralmente garante uma boa vantagem. É claro que o seu adversário também sabe disso, e não vai te entregar as corner squares de graça, o que nos leva ao conceito de mobilidade.

Para forçar seu adversário a fazer lances ruins, devemos ter uma mobilidade superior (maior número de opções a cada jogada) e tirar proveito disso forçando o adversário a entregar uma ou mais quinas. Geralmente lances adjacentes a quina são fracos, pois levam a perda desta. As duas casas adjacentes lateralmente a uma quina são chamadas de 'C squares', e a casa diagonalmente adjacente a quina é conhecida como 'X square'.

Esses são os fatores que vamos considerar na nossa heurística, o que já garante um desempenho razoável contra um jogador mediano. Um dos fatores que não estamos levando em conta e que melhoraria muito a força da engine, são os edge patterns, ou padões da borda. Se levarmos em conta a quina, cada borda possui 3¹⁰ = 59049 configurações, que geralmente são pré-calculadas e acessadas durante a avaliação da posição.

A heurística utilizada é bem simples: dependendo da fase do jogo, atribuímos pesos diferentes para cada característica abaixo:

mobilidade

mobilidade potencial

corner squares

C squares

X squares

número de discos

Observe que na fase final do game consideramos apenas o número de discos, já que a busca ocorre até uma posição finalizadora ser alcançada. A mobilidade potencial conta o número de casas vazias com pelo menos uma peça adversária ao redor. Definimos também uma array que mapeia o valor individual de cada casa, o que nos ajudará a ordenar os lances e aumentar a performance da engine. Repare que as casas adjacentes a uma quina não são consideradas desvantagem caso a quina seja nossa, os métodos badXSquares e badCSquares levam isso em consideração. Segue a classe:

import java.util.ArrayList;

public class OthelloHeuristics
{
    private static final int[] cornerIndexes = new int[]{11, 18, 81, 88};
    private static final int[] xIndexes = new int[]{22, 27, 72, 77};
    private static final int[] cIndexes = new int[]{12, 21, 17, 28, 71, 82, 78, 87};
    private static final int[] squareValues = new int[]
    {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 120, -20, 20, 5, 5, 20, -20, 120, 0, 
        0, -20, -40, -5, -5, -5, -5, -40, -20, 0, 
        0, 20, -5, 15, 3, 3, 15, -5, 20, 0, 
        0, 5, -5, 3, 3, 3, 3, -5, 5, 0, 
        0, 5, -5, 3, 3, 3, 3, -5, 5, 0, 
        0, 20, -5, 15, 3, 3, 15, -5, 20, 0, 
        0, -20, -40, -5, -5, -5, -5, -40, -20, 0, 
        0, 120, -20, 20, 5, 5, 20, -20, 120, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    };
    
    public static int mobility (int player, OthelloBoard board)
    {
        return board.getAllMoves(player).size();      
    }
 
    public static int mobilityDiff (int player, OthelloBoard board)
    {           
        return mobility(player, board) - mobility(board.getOpponent(player), board);
    }
    
    public static int potentialMobility(int player, OthelloBoard board)
    {
        int opponent = board.getOpponent(player);
        int[] cells = board.getCells();
        int mob = 0;
        
        for (int i = 10; i < 90; i++)
        {
            int col = i % 10;
            if (col != 0 && col != 9)
            {
                if (cells[i] == opponent)
                {
                    for (Integer dir : OthelloBoard.DIRECTIONS)
                    {
                        int around = dir + i;
                        if (cells[around] == OthelloBoard.EMPTY && cells[around] != OthelloBoard.WALL)
                        {
                            mob++;
                            break;
                        }
                    }
                }
            }
        }
        return mob;
    }
    
    public static int potentialMobilityDiff(int player, OthelloBoard board)
    {
     return potentialMobility(player, board) - potentialMobility(board.getOpponent(player), board);
    }
         
    public static int cornerSquares (int player, OthelloBoard board)
    {   
        int corners = 0;  
        int[] cells = board.getCells();
        
        for(int i = 0; i < cornerIndexes.length; i++)
            if (cells[cornerIndexes[i]] == player)
                corners++;
        return corners;
    }
 
    public static int cornerSquaresDiff (int player, OthelloBoard board)
    {
        return cornerSquares(player, board) - cornerSquares(board.getOpponent(player), board);
    }
    
    public static int badXSquares (int player, OthelloBoard board)
    {
        int x = 0;
        int[] cells = board.getCells();

        for (int i = 0; i < 4; i++)
            if ((cells[cornerIndexes[i]] != player) && (cells[xIndexes[i]] == player))
                      x++;

        return x;
    }
    
    public static int badXSquaresDiff (int player, OthelloBoard board)
    {
        return badXSquares(player, board) - badXSquares(board.getOpponent(player), board);
    }
    
    public static int badCSquares (int player, OthelloBoard board)
    {  
        int c = 0;
        int[] cells = board.getCells();
        
        int corner = 0;                             
        for (int i = 0; i < cIndexes.length; i+= 2)
        {
            if (cells[cornerIndexes[corner++]] != player)
            {
                if (cells[cIndexes[i]] == player)
                    c++;
                if (cells[cIndexes[i + 1]] == player)
                    c++;               
            }
        }
                
        return c;
    }
    
    public static int badCSquaresDiff (int player, OthelloBoard board)
    {
        return badCSquares(player, board) - badCSquares(board.getOpponent(player), board);
    }
        
    public static int discsDiff (int player, OthelloBoard board)
    {
        int playerDiscs = board.countDiscs(player);
  int opponentDiscs = board.countDiscs(board.getOpponent(player));
  
  return playerDiscs - opponentDiscs;        
    }
    
    public static void scoreMoves(ArrayList<OthelloMove> moves)
    {
        for (OthelloMove mv : moves)
        {
            mv.setEval(squareValues[mv.getIdx()]);
        }
    }
    
    public static float eval(int player, OthelloBoard board)
    {
        float value = 0.0f;
        int phase = board.getPhase();
    
        if (phase == OthelloBoard.PHASE_OPENING)
        {                        
            value = 100 * mobilityDiff(player, board) + 100 * potentialMobilityDiff(player, board)
            + 800 * cornerSquaresDiff(player, board) 
            - 200 * badXSquaresDiff(player, board) - 200 * badCSquaresDiff(player, board);        
                                  
        }
        else if (phase == OthelloBoard.PHASE_MIDGAME)
        {                        
            value = 100 * mobilityDiff(player, board) + 100 * potentialMobilityDiff(player, board)
            + 900 * cornerSquaresDiff(player, board) 
            - 250 * badXSquaresDiff(player, board) - 200 * badCSquaresDiff(player, board); 
        }
        else if (phase == OthelloBoard.PHASE_ENDGAME)
        {
            value = discsDiff(player, board);
        }
        else
        {
            System.out.println("Erro na heuristica: fase indeterminada.");
        }
                        
        return value;
    }
}

Testando

Já temos as peças necessárias para criar a engine, o que nos falta é implementar um game loop e criar uma GUI simples. A classe OthelloPiece nos ajuda a desenhar as peças. Segue a classe:

import javax.swing.*;
import java.awt.*;
import java.awt.geom.*;

public class OthelloPiece
{
 private int color;
 private int size;
 
 public OthelloPiece(int color, int size)
 {
  this.color = color;
  this.size = size;
 }
 
 public void draw(Graphics g, int x, int y)
 {                                            
  Color col = null;
  if (color == OthelloBoard.WHITE)
   col = Color.white;            
  else
   col = Color.black;
   
  g.setColor(col);
  g.fillOval(x, y, size, size);          
 
 }
}

A classe OthelloPanel abaixo implementa a GUI e o game loop:

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import java.util.ArrayList;
import java.util.Random;
import java.util.concurrent.*;

public class OthelloPanel extends JPanel implements Runnable
{
    private int cellSize;
    private int pieceSize;
    private OthelloBoard board = null;
    private int humanColor;
    private int cpuColor;
    private boolean waitingInput = false;
    private boolean finished = false;
    public static final int HUMAN = 100;
    public static final int CPU = 200;
    private int lastMoveIndex;
    
    public OthelloPanel(int cellSize, int startPlayer)
    {
        board = new OthelloBoard();                  
                                
        if (startPlayer == HUMAN)
            this.humanColor = OthelloBoard.BLACK;
        else
            this.humanColor = OthelloBoard.WHITE;
        
        this.cpuColor = 3 - this.humanColor;
        
        this.cellSize = cellSize;
        this.pieceSize = (int)(cellSize * 0.75);
        this.setDoubleBuffered(true);
        int screenSize = (2 + OthelloBoard.BOARD_SIZE) * cellSize;                
        this.setPreferredSize(new Dimension(screenSize, screenSize));                                          
        this.addMouseWatcher();
        
        UIManager.put("OptionPane.noButtonText", "N\u00E3o");
        UIManager.put("OptionPane.yesButtonText", "Sim");        
     }            
    
    private boolean isFinished()
    {       
        return board.checkEnd();       
    }
    
    private boolean isHumanTurn()
    {                    
        return humanColor == board.getCurrentPlayer();                
    }
    
    private void doLogic()
    {            
        if (isFinished())
        {
            finished = true;
            return;
        }
        
        //verifica somente uma vez se existem lances validos
        //caso existam, espera por input
        //caso contrario, faz lance fantasma
        if (isHumanTurn())
        {   
            if (waitingInput)
                return;
                        
            ArrayList<OthelloMove> moves = board.getAllMoves(humanColor);
            if (moves.get(0).getFlipSquares() == null)
            {                
                board.makeMove(moves.get(0));                
                waitingInput = false;
            }            
            else    
            {
                waitingInput = true;
            }            
        }
        
        //dispacha uma nova Thread minimax, e espera a busca terminar      
        else               
        {   
            int maxDepth = 6; 
            boolean solve = board.getPhase() == OthelloBoard.PHASE_ENDGAME;
            
            CountDownLatch signal =  new CountDownLatch(1);
            MinimaxOthello AI = new MinimaxOthello(board.cloneBoard(), maxDepth, signal, solve);
                   
            new Thread(AI).start(); 
            try
            {
                signal.await();
            }
            catch(InterruptedException exception)
            {
                exception.printStackTrace();
            }
                        
            OthelloMove mv = AI.getBestFound();
            board.makeMove(mv);    
            //repaint();
            lastMoveIndex = mv.getIdx();
            System.out.println("AI move: " + mv.getIdx());
                                    
        }
        
    }
      
    private void doInput(int index)
    {                                       
        if (isHumanTurn())
        {
            int[] cells = board.getCells();
            if (cells[index] != OthelloBoard.EMPTY)
                return;
            
            ArrayList<Integer> flips = board.getFlips(index, humanColor);            
            if (flips.size() > 0)
            {                          
                OthelloMove mv = new OthelloMove();
                mv.setFlipSquares(flips);
                mv.setIdx(index);
                mv.setPlayer(humanColor);                
                board.makeMove(mv); 
                lastMoveIndex = index;
                repaint();
                waitingInput = false;                
            }
        }
    }
        
    public void startNewGame()
    {
        board = new OthelloBoard();
        finished = false;
        waitingInput = false;
        lastMoveIndex = -666;
        
        //troca cores
        int temp = cpuColor;
        cpuColor = humanColor;
        humanColor = temp;
        
        new Thread(this).start();
    }
    
    public void run()
    {
        while (! finished)
        {                              
            this.doLogic();
            this.repaint();
                                  
            try
            {
                Thread.sleep(20L);
            }
            catch (InterruptedException exception)
            {}                                                         
        }
        
        //fim de jogo
        int human = board.countDiscs(humanColor);
        int cpu = board.countDiscs(cpuColor);  
        int empty = OthelloBoard.BOARD_SIZE * OthelloBoard.BOARD_SIZE - (human + cpu);        
        String msg = null;
        
        if (human  > cpu)
        {
            human += empty;
            msg = "Voce venceu por " + human + " - " + cpu + "!";
            
        }
        else if (human < cpu)
        {
            cpu += empty;
            msg = "Voce perdeu por " + cpu + " - " + human + "!";            
        }
        else
            msg = "Empate!";
        
        msg += "\nDeseja jogar uma nova partida?";
        int x = JOptionPane.showConfirmDialog(null, msg, "Game Over", JOptionPane.YES_NO_OPTION);
        
        if (x == 0)
            startNewGame();
    }
    
    private void drawGUI(Graphics g)
    {                
        Graphics2D g2D = (Graphics2D) g;
        g2D.setStroke(new BasicStroke(1.5f));
        
        //cor de fundo
        Color bg = new Color(0, 130, 0);        
        setBackground(Color.gray);
        
        int w = OthelloBoard.BOARD_SIZE * cellSize;
        int origin = cellSize;
                
        //linhas horizontais
        g.setColor(Color.black);
                
        for (int i = 0; i < OthelloBoard.BOARD_SIZE + 1; i++)
        {
            g.drawLine(origin, i * cellSize + origin, origin + w, i * cellSize + origin);
        }
        //linhas verticais
        for (int i = 0; i < OthelloBoard.BOARD_SIZE + 1; i++)
        {
            g.drawLine(i * cellSize + origin, origin, i * cellSize + origin, w + origin);
        }
        
        g.setColor(Color.white);
    
        //headers
        String[] headers = new String[]{"A","B","C","D","E","F","G","H"};
        for (int i = 0; i < OthelloBoard.BOARD_SIZE; i++)
            g.drawString(headers[i], cellSize * i + (int)(origin * 1.5), (int)(cellSize * 0.75));
        
        //linhas
        for (int i = 0; i < OthelloBoard.BOARD_SIZE; i++)
            g.drawString("" + (i + 1), (int)(cellSize * 0.6), cellSize * i + (int)(origin * 1.5));     
                                                 
    }
    
    private void drawScore(Graphics g)
    {
        //score
        g.setColor(Color.yellow);
        g.setFont(new Font(null, Font.BOLD, 15));
        int wd = board.countDiscs(OthelloBoard.WHITE);
        int bd = board.countDiscs(OthelloBoard.BLACK);
        
        g.drawString("White: " + wd, cellSize * 1, (int)(cellSize * 9.5));
        g.drawString("Black: " + bd, cellSize * 8, (int)(cellSize * 9.5));
               
    }
    private void drawBoard(Graphics g)
    {
        int[] cells = board.getCells();                
        int gameCols = OthelloBoard.BOARD_SIZE;
        int hiddenCols = gameCols + 2;
        int mid = (cellSize - pieceSize)/2;
        
        for (int i = hiddenCols; i < cells.length - hiddenCols; i++) 
        {
            int col = i % hiddenCols;
            int row = i / hiddenCols;
            
            if ((col != 0) && (col != hiddenCols - 1))
            {
                int piece = cells[i];
                if (piece == OthelloBoard.EMPTY)
                    continue;
                OthelloPiece p = new OthelloPiece(cells[i], pieceSize);
                p.draw(g, col * cellSize + mid, row * cellSize + mid);
            }
        }
        
        //lance fantasma idx = -666
        if (lastMoveIndex > 0)
        {
            g.setColor(Color.yellow);
            int col = lastMoveIndex % hiddenCols;
            int row = lastMoveIndex / hiddenCols;
            
            g.drawRect(col * cellSize, row * cellSize, cellSize, cellSize);
        }
    }
        
    private void drawMoves(Graphics g)
    {
        ArrayList<OthelloMove> moves = board.getAllMoves(humanColor);
        if (isHumanTurn() && moves.get(0).getFlipSquares() != null)
        {
            g.setColor(new Color(0, 0, 0, 80));
            for (OthelloMove mv : moves)
            {
                int idx = mv.getIdx();
                int col = idx % 10;
                int row = idx /10;
                                
                g.fillOval(col * cellSize + cellSize/2, row * cellSize + cellSize/2, cellSize/8, cellSize/8);
            }
        }
    }
           
    public void paintComponent(Graphics g)
    {
        super.paintComponent(g);
            
        Graphics2D g2 = (Graphics2D) g;
        g2.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
        
        this.drawGUI(g);     
        this.drawBoard(g);
        this.drawScore(g);
        this.drawMoves(g);
    }
    
    //mouse
    private void addMouseWatcher()
    {
        this.addMouseListener(new MouseAdapter()
        {
            public void mousePressed(MouseEvent e) 
            {                                
                int col = e.getX()/cellSize;
                int row = e.getY()/cellSize;
                int index = row * (OthelloBoard.BOARD_SIZE  + 2) + col;
                                               
                if ((row > 0 && row < 9) && (col > 0 && col < 9))
                    doInput(index);                                                
            }          
        });
    }
                                                                               
    public static void main (String[] args) throws Exception
    {                                                                    
        OthelloPanel panel = new OthelloPanel(60, HUMAN);
        JFrame window = new JFrame("Othello");
        window.getContentPane().add(panel);
        window.setResizable(false);
        window.pack();
        window.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        window.setVisible(true);     
        new Thread(panel).start();                                                                                                      
    }    
}

Eu particularmente não consegui vencer a engine, pois como disse sou péssimo jogador. Testando contra algumas applets o desempenho foi bem convincente, perdendo apenas para os programas que utilizam inúmeros padrões pré-calculados e implementam uma heurística mais refinada. Bons jogos e até o próximo post!

Bits Randômicos

sexta-feira, 9 de março de 2012

Othello e Minimax com poda alpha beta - parte 2

Classe MinimaxOthello

Classe OthelloHeuristics

Testando

Nenhum comentário:

Postar um comentário