
HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY

SCHOOL OF INFORMATION AND COMMUNICATIONS TECHNOLOGY


------

INTRODUCTION TO ARTIFICIAL
INTELLIGENCE
Topic: Chess with AI
Students & IDs:

Pham Van Vu Hoan – 20235497

Le Hoang Nam – 20235536

Ngo Nguyen Ngoc – 20235538

Phan Hai Nguyen – 20235540

Tran Nam Tuan Vuong – 20225540

Class ID: 152630

Lecturer: Le Thanh Huong


Hanoi, 2024

1. Introduction
  1.1. Problem Statement
  1.2. Rules of Chess
  1.3. Description
2. About the problem
  2.1. Dataset
  2.2. CNN Architecture
    2.2.1. Layers of the architecture
    2.2.2. CNN Architecture
  2.3. Decision-making
  2.4. Data Generation
  2.5. Training
3. Evaluation
  3.1. Environment
  3.2. Performance Comparison
  3.3. Data Generation and Model Management
  3.4. Testing
4. Conclusion
References

1. Introduction
1.1. Problem Statement
Chess is a timeless and universally recognized board game that combines
strategy, foresight, and creativity. With its origins tracing back over a
millennium, the game has evolved into one of the most studied and competitive
activities worldwide. The enduring appeal of chess lies in its blend of simple
rules and complex decision-making, offering an unparalleled platform for
intellectual challenge.
From an artificial intelligence (AI) perspective, chess has long been considered
the ultimate "thinking" game, making it a benchmark for evaluating and
advancing AI technologies. The game's complexity arises from its vast search
space: there are approximately 10^120 possible games of chess (the Shannon number), far exceeding
the number of atoms in the observable universe. This combinatorial explosion
makes it impractical to solve chess using brute force alone, requiring
sophisticated algorithms and heuristics to guide decision-making.
Historically, chess has played a pivotal role in the development of AI. A
landmark moment was the 1997 match between IBM's Deep Blue and world
chess champion Garry Kasparov, where Deep Blue became the first computer
system to defeat a reigning champion in a full match under standard time
controls. This achievement demonstrated the power of computational
advancements and ushered in a new era of AI research.
Today, chess remains a fertile ground for innovation in AI. Modern chess
engines like Stockfish and AlphaZero utilize advanced techniques such as
neural networks and reinforcement learning, setting new standards for AI
capabilities. Beyond competitive play, chess AI serves as a valuable tool for
education, analysis, and entertainment, showcasing its multifaceted importance
in both technological and societal contexts.
1.2. Rules of Chess

A fully functional chess-playing AI, however "smart" it may be, must first be aware of the basic rules and play by them. Chess is played on a chessboard, a square board divided into a grid of 64 squares, alternating between a dark color (e.g., black, gray, green) and a light one (e.g., white).

Each player has 16 pieces to control with 6-piece types: King, Queen, Rook,
Bishop, Knight and Pawn.

Each game begins with the white side making the first move, after which the two sides take turns moving their pieces. The game ends once the king on one side is checkmated, a state in which the King is attacked by the opponent's piece(s) and has no escape. A game can be declared a draw if neither side has any possible series of moves left to claim victory, or if both players make 50 consecutive moves without capturing any piece or moving any pawn.

Figure 1: A standard chessboard

Each piece type on the chessboard has a different movement pattern:

- The King moves one square in any direction.
- The Rook moves any number of squares along a rank or file.
- The Bishop moves any number of squares along a diagonal.
- The Knight moves in an L-shaped path.
- The Queen moves any number of squares diagonally, horizontally, or vertically.
- The Pawn moves one square forward, or up to two squares on its first move.

There are also special rules such as castling (moving the King to a safer position and the Rook to a more active square), en passant (a special Pawn capture), etc.
1.3. Description

Our group aims to create a chess engine that integrates Monte Carlo Tree
Search (MCTS) with a deep neural network inspired by AlphaZero. The engine
is capable of evaluating board states, predicting optimal moves, and improving
its performance by learning from self-play. Unlike traditional rule-based
engines, this approach combines strategic search with data-driven insights,
leveraging reinforcement learning for continuous improvement. As a result, we
hope the engine will be able to play against humans and other engines at a
certain competitive level.

It is crucial to identify the characteristics of a chess-playing agent in order to have a clear foundation for what we will be creating. The PEAS components of our agent are as follows:
• Performance measure: The evaluation is based on a combination of
Monte Carlo simulations and the neural network's prediction of the board
state's value and the probability distribution of moves.
• Environment: The chessboard and rules defined by the Python-chess
library.
• Actuators: Functions that generate moves based on MCTS recommendations and execute them on the chessboard.
• Sensors: Inputs include the current board state, encoded as tensors for evaluation by the neural network.

The chess environment can be classified as:
• Fully observable: All pieces and their positions are visible at all times.
• Deterministic: There is no randomness in move outcomes; the next state is
entirely predictable.
• Sequential: Each move builds on the previous state.
• Semi-dynamic: The environment changes only with player actions (no time
progression without moves).
• Discrete: The game of chess has only a finite number of moves. The number
of moves might vary with every game, but still, it’s finite.
• Competitive multi-agent: Two opposing agents interact, aiming to achieve
mutually exclusive goals (winning or losing).

The initial state of the system includes:


• A standard 8x8 chessboard with pieces in their starting positions.
• The AI agent, which uses MCTS to search for optimal moves and a neural
network to evaluate board states.
• The player or opposing AI starts the game, and the system alternates turns
with legal moves until the game concludes.

Figure 3: Our chessboard

The AI agent is designed to:


• Encode the board state into a tensor representation using custom encoding
schemes.
• Predict the value of the board and the distribution of possible moves using a
deep neural network.
• Simulate potential outcomes using Monte Carlo Tree Search to refine its
decisions.


2. About the problem


2.1. Dataset
The dataset is a collection of tuples (s, p, v), described as follows.

Details of s (state of the board)

The game state is stored as an 8×8 2D array, where each element is a vector of 22 values:
• 0–11: One-hot encoding of the piece on the square:
  - R, N, B, Q, K, P (white rook, knight, bishop, queen, king, pawn);
  - r, n, b, q, k, p (the corresponding black pieces).
• 12: Indicates the player's turn (1 for white, 0 for black).
• 13–16: Flags indicating lost castling rights:
  - 13: cannot castle king-side for white;
  - 14: cannot castle queen-side for white;
  - 15: cannot castle king-side for black;
  - 16: cannot castle queen-side for black.
• 17: Move count (total number of half-moves).
• 18–19: Repetition count for white (18) and black (19).
• 20: No-progress count (number of moves since the last capture or pawn
move).
• 21: En passant target square (if applicable).
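
For concreteness, here is a minimal sketch of how such an encoding could be implemented, assuming the python-chess and NumPy libraries; the function name encode_board is illustrative, and the repetition planes (18–19) are omitted for brevity:

```python
import chess
import numpy as np

def encode_board(board: chess.Board) -> np.ndarray:
    """Encode a python-chess Board into the 8x8x22 representation described above."""
    planes = np.zeros((8, 8, 22), dtype=np.float32)
    piece_order = [chess.ROOK, chess.KNIGHT, chess.BISHOP,
                   chess.QUEEN, chess.KING, chess.PAWN]
    # planes 0-11: one-hot piece encoding, white first, then black
    for square, piece in board.piece_map().items():
        row, col = divmod(square, 8)
        offset = 0 if piece.color == chess.WHITE else 6
        planes[row, col, offset + piece_order.index(piece.piece_type)] = 1
    planes[:, :, 12] = 1 if board.turn == chess.WHITE else 0
    # planes 13-16: flags set when the corresponding castling right is LOST
    planes[:, :, 13] = 0 if board.has_kingside_castling_rights(chess.WHITE) else 1
    planes[:, :, 14] = 0 if board.has_queenside_castling_rights(chess.WHITE) else 1
    planes[:, :, 15] = 0 if board.has_kingside_castling_rights(chess.BLACK) else 1
    planes[:, :, 16] = 0 if board.has_queenside_castling_rights(chess.BLACK) else 1
    planes[:, :, 17] = len(board.move_stack)   # half-move count
    # planes 18-19 (repetition counts) omitted here for brevity
    planes[:, :, 20] = board.halfmove_clock    # no-progress count
    if board.ep_square is not None:
        planes[:, :, 21] = board.ep_square     # en passant target square
    return planes
```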
Details of p (policy, or action probabilities)

The policy is represented as an 8×8 2D array, where each element is a vector of 73 move planes:
• 0–13: Moves in the North-South direction.
• 14–27: Moves in the East-West direction.
• 28–41: Moves in the Northwest Southeast diagonal.
• 42–55: Moves in the Northeast Southwest diagonal.
• 56–63: Knight moves.

• 64–72: Pawn underpromotion moves (excluding promotion to queen).
To simplify processing, the policy is transformed into a 1D array of 4672
elements, where each index corresponds to a unique action on the board.
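
As a small illustration, assuming the 8×8×73 tensor is flattened square-major (an assumption; the exact ordering depends on the implementation), the index arithmetic is:

```python
def policy_index(row: int, col: int, plane: int) -> int:
    """Map a (row, col, move-plane) triple to its index in the flat 4672-vector.

    Square-major layout: each of the 64 squares contributes 73 consecutive
    entries, so 64 * 73 = 4672 total actions.
    """
    return (row * 8 + col) * 73 + plane
```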
Details of V (Value or Reward)
The value represents the expected outcome of the game from the perspective of the
current player:
• +1: Indicates a win.
• 0: Indicates a draw.
• -1: Indicates a loss.
2.2. CNN Architecture
Purpose of using a CNN: using generated game data, which contains board states, move probabilities (policy), and game results (value: -1, 0, 1), we train a CNN model that can predict move probabilities and game results from board states.
2.2.1. Layers of the architecture

Convolutional layer: Essential for learning local features from a chessboard (grid-like data). By applying filters of size 3×3, the model extracts local features that are essential for understanding the position and making decisions about potential moves.

Fully connected layer: Used in the final stages of the network to make
predictions based on the learned features.
The fully connected layer consists of neurons, each connected to every neuron in the previous layer. This means that the output of each neuron in the fully connected layer is influenced by every feature learned by the previous layer.
With this structure, the fully connected layers can take the learned local
features from previous layers and combine them to form a global representation of
the input, which is needed to make the final prediction.

Flattening layer: Converts the 2D or multi-dimensional feature map into a 1D vector, which is required by the fully connected layers.

Batch normalization: Normalizes the input to each layer to improve training stability and speed up convergence.
Let $B$ denote a mini-batch of size $m$ drawn from the training set. The empirical mean and variance of $B$ are

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2.$$

For a layer of the network with $d$-dimensional input $x = (x^{(1)}, \dots, x^{(d)})$, each dimension of the input is normalized (i.e., re-centered and re-scaled) separately:

$$\hat{x}_i^{(k)} = \frac{x_i^{(k)} - \mu_B^{(k)}}{\sqrt{\left(\sigma_B^{(k)}\right)^2 + \epsilon}},$$

where $\mu_B^{(k)}$ and $\sigma_B^{(k)}$ are the per-dimension mean and standard deviation, respectively, and $\epsilon$ is an arbitrarily small constant added to the denominator for numerical stability. The resulting normalized activations have zero mean and unit variance if $\epsilon$ is not taken into account. To restore the representational power of the network, a transformation step then follows:

$$y_i^{(k)} = \gamma^{(k)} \hat{x}_i^{(k)} + \beta^{(k)},$$

where the parameters $\gamma$ and $\beta$ are subsequently learned in the optimization process.

An activation function is a mathematical function applied to the output of a neuron. It introduces non-linearity into the model, allowing the network to learn and represent complex patterns in the data:

• ReLU (Rectified Linear Unit): Used after convolutional layers and residual blocks, as it is computationally efficient and helps reduce the risk of vanishing gradients compared to other activation functions.

• Softmax (implemented in the model as $e^{\mathrm{LogSoftmax}}$): Used in the policy head to convert the logits into probabilities by normalizing the outputs to the range between 0 and 1.


• Tanh: An activation function used in the value head that squashes input values to a range between -1 and 1.

Residual connection: A feature of deep neural networks where the output of a layer (or set of layers) is added directly to the input of those layers. Residual connections help the model efficiently learn patterns in the chessboard state without losing critical information as the network gets deeper.

Formally, denoting the desired underlying mapping as $H(x)$, we let the stacked nonlinear layers fit another mapping $F(x) := H(x) - x$. The original mapping is then recast as $F(x) + x$.
The intuition is that it is easier to optimize the residual mapping than the original, unreferenced mapping. In the extreme case, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping with a stack of nonlinear layers.
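
A minimal PyTorch sketch of such a residual block (the class name ResBlock and the channel width of 256 follow the architecture described in the next section):

```python
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """One residual block: conv-BN-ReLU, conv-BN, skip connection, ReLU."""
    def __init__(self, channels: int = 256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)   # F(x) + x: add the block's input back in
```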

2.2.2. CNN Architecture


The input of the model is a tensor with the shape:
batch_size×22×8×8
where:
• 22 represents the number of channels (features per board state),
• 8 × 8 represents the height and width of the chessboard.

Next, the tensor passes through a ConvBlock:

• Convolutional layer + batch normalization + ReLU

The convolutional layer applies 256 filters to the input tensor, transforming it into a higher-dimensional feature map of size (batch_size, 256, 8, 8), which enables the network to learn and represent a broader range of features. The batch normalization layer then normalizes the output of the convolutional layer, and the ReLU activation introduces non-linearity into the model.

Then, the tensor is processed through 19 ResBlocks:

• Convolutional layer + batch normalization + ReLU
• Convolutional layer + batch normalization + residual connection + ReLU

This is the most important part of the architecture. Each ResBlock begins with the same three layers as the ConvBlock (convolution, batch normalization, ReLU), followed by another convolutional layer and batch normalization; the input of the block is then added back before the final ReLU activation. The ResBlocks make it possible to build deeper networks without suffering from vanishing gradients.
Finally, the tensor passes through the OutBlock, which produces two separate outputs: the predicted move probabilities (policy head) and the predicted game result (value head). The structure of the policy head is:

• Convolutional layer + batch normalization + ReLU
• Flattening layer
• Fully connected layer + softmax

The convolutional layer reduces the number of channels to 128 while keeping the spatial dimensions the same. The flattening layer flattens the feature map into a 1D vector of size 8192 (128 × 8 × 8). The fully connected layer then reduces this vector to 4672 logits, which are normalized using softmax to produce probabilities over potential moves.

And here is the structure of the value head:

• Convolutional layer + batch normalization + ReLU
• Flattening layer
• Fully connected layer + ReLU
• Fully connected layer + tanh

The convolutional layer reduces the number of channels to 1, and the flattening layer flattens the feature map into a 1D vector of size 64. Next, the vector passes through a fully connected layer followed by ReLU activation, and then through a final fully connected layer with a tanh activation, which outputs a scalar value representing the game result. The output lies between -1 (loss) and 1 (win), with 0 indicating a draw.
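
A PyTorch sketch of the OutBlock as described, assuming 1×1 convolutions in both heads and a hidden size of 64 in the value head (details the text does not state explicitly):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OutBlock(nn.Module):
    """Policy and value heads on top of the 256x8x8 residual trunk."""
    def __init__(self):
        super().__init__()
        # policy head: 256 -> 128 channels, flatten to 8192, project to 4672 logits
        self.p_conv = nn.Conv2d(256, 128, kernel_size=1)
        self.p_bn = nn.BatchNorm2d(128)
        self.p_fc = nn.Linear(128 * 8 * 8, 4672)
        # value head: 256 -> 1 channel, flatten to 64, two FC layers down to a scalar
        self.v_conv = nn.Conv2d(256, 1, kernel_size=1)
        self.v_bn = nn.BatchNorm2d(1)
        self.v_fc1 = nn.Linear(64, 64)
        self.v_fc2 = nn.Linear(64, 1)

    def forward(self, x):
        p = F.relu(self.p_bn(self.p_conv(x)))
        p = F.log_softmax(self.p_fc(p.flatten(1)), dim=1)  # LogSoftmax over 4672 moves
        v = F.relu(self.v_bn(self.v_conv(x)))
        v = F.relu(self.v_fc1(v.flatten(1)))
        v = torch.tanh(self.v_fc2(v))                      # scalar in [-1, 1]
        return p, v
```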

2.3. Decision-making
We want our chess agent to decide the next move for a given input game board. To improve the performance and accuracy of the agent, we use the MCTS algorithm combined with the UCB evaluation function to search for the best move.

Each node of the tree carries these attributes: game_state, node_prior, node_value, node_visited. A node is considered expanded once its child nodes have been generated; a leaf whose children have not yet been visited is unexpanded.

The tree starts with a root node, which represents the current state, and continuously explores child nodes to expand the tree. The tree expansion process can be divided into four stages:

• Selection: Start at the root node and successively select a child node until reaching a node that is unexpanded. When selecting a child node, we apply the UCB evaluation function to balance exploring the tree against delving into promising branches. The child that maximizes this evaluation function is selected:
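
(The formula appears only as an image in the original report; the standard AlphaZero-style PUCT score, consistent with the node attributes above, is

$$\mathrm{UCB}(s,a) = Q(s,a) + c_{\text{puct}} \, P(s,a) \, \frac{\sqrt{\sum_b N(s,b)}}{1 + N(s,a)},$$

where $Q(s,a)$ is the mean node_value of the child reached by move $a$, $P(s,a)$ its node_prior, $N(s,a)$ its node_visited count, and $c_{\text{puct}}$ a constant balancing exploration against exploitation.)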


• Simulation: From the unexpanded node above, we ask our CNN model to return the policy and the estimated value of the current board state. This differs slightly from the original MCTS (which uses random rollouts) but keeps the same idea.
• Expansion: If the current node is not a terminal state, we expand it by setting node_prior on each of its valid child nodes based on the returned policy. In particular, if the expanded node is the root node, we apply Dirichlet noise to the policy before updating the node_prior of its valid children; this ensures that all moves may be tried, while the search can still overrule bad moves.
Here is how we apply Dirichlet noise to the children of the root node (a sketch follows):
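
A minimal sketch, assuming NumPy and the AlphaZero default parameters $\epsilon = 0.25$ and $\alpha = 0.3$ (the report does not state its own values):

```python
import numpy as np

def add_dirichlet_noise(priors: np.ndarray, eps: float = 0.25, alpha: float = 0.3) -> np.ndarray:
    """Blend Dirichlet noise into the root priors: P' = (1 - eps) * P + eps * eta."""
    noise = np.random.dirichlet([alpha] * len(priors))  # eta ~ Dir(alpha)
    return (1 - eps) * priors + eps * noise
```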

• Backpropagation: Backtrack through all nodes visited along the path, updating each node's node_value based on the returned value estimate and incrementing its node_visited count.

After the tree is built up, we return the most visited child of the root node as the best move for the input game state, along with its policy.

2.4. Data Generation


To ensure an adequate amount of training data, we employ self-play for data generation. The technique is implemented by letting our chess agent play both sides, like an individual playing chess alone. The game ends when one side wins or when more than 100 moves have been executed. Additionally, the game ends if a game state is repeated three times, in accordance with the rules of chess, which prevents the agent from becoming stuck in an infinite loop. Upon termination, if white wins, the value of all game states is set to 1; if white loses, it is set to -1; and in the case of a draw, it is set to 0.
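
A sketch of this labeling step (the helper name label_game is hypothetical):

```python
def label_game(history, outcome):
    """Attach the terminal value to every recorded (state, policy) pair.

    history: list of (encoded_state, mcts_policy) pairs from one self-play game.
    outcome: +1 if white won, -1 if white lost, 0 for a draw.
    """
    return [(state, policy, outcome) for state, policy in history]
```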

All game states, along with their policies and termination values, will be
stored in a file to create a dataset for training purposes. This dataset will be used to
train and improve the performance of our CNN model.

2.5. Training
a) Some basic idea
Convolutional Neural Networks (CNNs) are structured such that, at each layer, the input data is analyzed and decomposed into features through filters. These filters extract significant characteristics from the input, enabling the model to learn and recognize patterns. The filter weights, together with the other weights of the network, are the parameters that require tuning to improve the model's performance. The learning process consists of the model adjusting its weights to better fit the training dataset, thereby improving its accuracy and effectiveness.

We divide our training data into batches of 30 samples each, and we reshuffle the training set at every epoch. This ensures that the model does not merely memorize specific data but also performs well when faced with new and unseen data.

The model's parameters can be retrieved using the model.parameters() method. This method returns an iterator over all the parameters of the model, including weights and biases. It allows access to and manipulation of these parameters, which is particularly useful for tasks such as parameter initialization, optimization, and fine-tuning.

To evaluate the accuracy of the model's predictions against the actual dataset values, we use an evaluation function, also known as a loss function. Because of the structure of the model's output, we use a mean squared error loss for the value estimate and a cross-entropy loss for the policy, and combine the two into a single loss function:

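(The combined loss appears as an image in the original report; reconstructed in the standard AlphaZero form, it reads

$$\ell = (z - v)^2 - \boldsymbol{\pi}^{\top} \log \mathbf{p},$$

where $z$ is the recorded game outcome, $v$ the value head's prediction, $\boldsymbol{\pi}$ the MCTS policy target, and $\mathbf{p}$ the predicted move distribution; some implementations add an L2 regularization term.)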
To tune the model's parameters, we use the Adam optimizer to drive gradient descent. This process involves backpropagation, in which we compute the partial derivatives of the loss function with respect to each parameter; the Adam method then determines the step size for each parameter update.

The backpropagation algorithm finds the partial derivatives of the loss function with respect to each weight by applying the chain rule backwards through the network.


Adam optimizer pseudocode:
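
(The pseudocode appears as an image in the original; the standard Adam update, which it presumably reproduces, is

$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2,$$

$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon},$$

where $g_t$ is the gradient at step $t$, $\alpha$ the learning rate, and $\beta_1$, $\beta_2$, $\epsilon$ the usual Adam constants.)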

b) Workflow
For each epoch, we reshuffle the training data into batches. After every 100 epochs, the learning rate of the Adam optimizer is divided by 5.

For each batch, we run the model on all samples in the batch. We then compute the loss function and perform backpropagation to obtain the partial derivatives of the loss with respect to each parameter, resetting the optimizer's gradients between batches and letting Adam tune the model's parameters. After processing all batches, we proceed to the next epoch.

After finishing enough epochs, we save all important attributes of the current model, which can be retrieved using model.state_dict(), and store them in a file for reuse.
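
A minimal PyTorch sketch of this workflow, assuming the dataset yields (state, policy-target, outcome) triples and that the policy head returns LogSoftmax outputs; the learning rate of 1e-3 and the file name are illustrative:

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

def train(model, dataset, epochs, device="cpu"):
    """Batch size 30, reshuffled every epoch; LR divided by 5 every 100 epochs."""
    model.to(device)
    optimizer = optim.Adam(model.parameters(), lr=1e-3)        # lr is an assumption
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.2)
    mse = nn.MSELoss()
    loader = DataLoader(dataset, batch_size=30, shuffle=True)  # reshuffles each epoch
    for epoch in range(epochs):
        for states, target_pi, target_z in loader:
            states = states.to(device)
            target_pi, target_z = target_pi.to(device), target_z.to(device)
            log_p, v = model(states)                 # policy head emits LogSoftmax outputs
            policy_loss = -(target_pi * log_p).sum(dim=1).mean()  # cross-entropy, soft targets
            value_loss = mse(v.squeeze(1), target_z)              # MSE on the scalar value
            loss = policy_loss + value_loss
            optimizer.zero_grad()                    # reset gradients between batches
            loss.backward()                          # backpropagation
            optimizer.step()                         # Adam update
        scheduler.step()                             # decay LR every 100 epochs
    torch.save(model.state_dict(), "current_net.pth.tar")  # file name illustrative
```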

3. Evaluation
3.1. Environment
The file evaluator.py serves primarily as an evaluation environment for the chess bot: it compares trained bots to find the best network and exports the results of their matches.

It creates an "arena" where two versions of the chess bot (the current and
best) compete in simulated games. Outcomes of these games are recorded,
including board states, moves, and game results (e.g., wins, losses, or draws).

3.2. Performance Comparison


It assesses the relative strength of the current model versus the best model by calculating win ratios across multiple games. If the current model performs better than the best model (e.g., by winning a higher percentage of games), it can replace the best model as the new benchmark. However, the decision logic to update the best model is currently commented out and does not execute (#if current_wins/num_games > 0.55).
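
The (currently disabled) promotion gate amounts to a one-line check; a sketch, with the hypothetical helper name should_promote:

```python
def should_promote(current_wins: int, num_games: int, threshold: float = 0.55) -> bool:
    """Decide whether the current net replaces the best net, per the commented-out gate."""
    return current_wins / num_games > threshold
```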
3.3. Data Generation and Model Management
The file generates datasets with encoded board states and game results,
which can later be used for retraining or further analysis.

Furthermore, it loads pre-trained models stored in .pth.tar format and saves updated versions after evaluations.

After each evaluation, we train the model again on the evaluator data combined with the already available data in order to optimize it.

3.4. Testing
To test the performance of our chess-playing agent, we played games against human players (mostly ourselves) and against some well-known chess bots, namely Leela Chess Zero (based on the AlphaGo Zero approach) and Stockfish on lichess.org.

a) Playing against Stockfish 14

In the previous sections we only considered the data required for each component and ignored the actual playing strength of our agent. We therefore carried out a set of chess matches to see whether the agent could win against some well-known chess engines. In this case, we put our agent up against Stockfish 14 on lichess.org. Stockfish provides multiple levels of difficulty, ranging from 1 to 10, and we first tried levels 1 to 4:

Result against | Level 1 | Level 2 | Level 3 | Level 4
Wins           |    8    |    4    |    0    |    0
Losses         |    2    |    6    |   10    |   10
Draws          |    0    |    0    |    0    |    0
b) Playing against advanced engines

Results against these advanced engines showed no wins, emphasizing the need for improved evaluation functions.

Result against | Leela Chess Zero | AlphaGo Zero
Wins           |        0         |      0
Losses         |       15         |     15
Draws          |        0         |      0
c) Playing against humans

After playing 15 matches against ourselves and 15 online matches on chess.com at an Elo rating of 1200, we obtained the following results:

Result against | Students | Online players
Wins           |    9     |       5
Losses         |    6     |      10
Draws          |    0     |       0

From these results, we estimate that our chess engine stands in the range of 900 to 1200 rating on lichess.org, roughly the level of a decent casual player: not strong enough to compete professionally, but beyond a beginner.

4. Conclusion

Our chess program is certainly not flawless and would require further improvements to become a good chess engine. Throughout the making of the chess-playing agent, we observed several difficulties and noted how they might be surmounted to further increase the performance of our program:

• Chess is played with more than just simple calculations over a board state; the "game sense" of a chess player cannot be captured by a single function or search algorithm, so the best we can do to make our agent act more human-like is to improve our evaluation methods and train on larger datasets.

• Although the basic rules of chess are universal knowledge, becoming a good chess player takes a long road, and our view of how to evaluate a chessboard is therefore limited. Our model is still primitive, and there is still room for improvement.
• Despite the difficulties, this project allowed us to gain hands-on experience in creating an intelligent agent using knowledge from the course. This experience will be valuable as we proceed to subsequent courses, allowing us to approach them more easily and with deeper knowledge of the field of Artificial Intelligence.

All our work is available on GitHub: https://github.com/Hainguyen2650/Chess-player.git

22
CAPSTONE PROJECT REPORT – Chess with AI

References

1. Lecture slides provided by the lecturer, Assoc. Prof. Le Thanh Huong.
2. "How to Make a Chess Game with Pygame in Python," The Python Code.
3. "Chess," Wikipedia.
4. "Monte-Carlo Tree Search (MCTS)," Mastering Reinforcement Learning.
5. "UCT," Chessprogramming Wiki.
6. "Self-play," Wikipedia.
7. "Neural Networks: A Beginner's Guide."
8. "Introduction to Convolutional Neural Networks."
9. "Residual Networks (ResNet) in Deep Learning."
10. "Reinforcement Learning."
