Unraveling the Mysteries of the Hole and Clash (Black Hole) Problem
Welcome to the fascinating world of RoboCup Soccer Simulation 2D, where artificial intelligence meets the thrill of competitive sports. As with any complex system, the game is not without its challenges. One perplexing puzzle that teams often encounter is the hole and clash (black hole) problem. This report explores how the problem affects gameplay, why holes and clashes (H&Cs) occur, and the methods we used to test for and solve it.
Introduction
The RoboCup Soccer Server is a client-server application in which each client communicates with the server via a UDP socket. The server is responsible for executing requests from each client (i.e. agent) and updating the environment accordingly. At specific intervals, it sends sensory information (visual, auditory and body_sense) about the state of the world to each agent. The agents can then use this information to decide which action to perform next. The soccer server provides a pseudo-real-time simulation with discrete time intervals known as server cycles.
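To make the client-server exchange concrete, here is a minimal sketch of an agent sending one UDP datagram and waiting for a reply. It is an illustration only: the helper names are hypothetical, port 6000 is assumed to be the server's default player port, and the protocol version in the init message is an arbitrary assumption.

```python
import socket
from typing import Optional


def build_init_message(team_name: str, version: int = 15) -> bytes:
    """Build the init command a player sends to register with the server.

    The version number is an assumption chosen for illustration.
    """
    return f"(init {team_name} (version {version}))".encode("ascii")


def send_and_receive(message: bytes, host: str = "127.0.0.1",
                     port: int = 6000, timeout_s: float = 1.0) -> Optional[bytes]:
    """Send one UDP datagram and wait for a single reply.

    Returns None if nothing arrives in time; UDP gives no delivery guarantee,
    so a lost message simply looks like silence to the sender.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        sock.sendto(message, (host, port))
        try:
            reply, _sender = sock.recvfrom(8192)
        except OSError:  # covers timeouts and "port unreachable" errors
            return None
        return reply
```

Because UDP is connectionless, neither side learns that a datagram was lost, which is one of the ways the holes discussed in this report can arise.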
What are Holes and Blackholes (Clashes)?
In the current version of the server, each cycle lasts 100ms. During this period, clients can send requests to the server for player actions. However, the server executes the actions and updates the environment only at the end of a cycle. The server uses a discrete action model, so it executes only one primary action command per simulation cycle. When an agent sends multiple primary commands during a cycle, the server chooses the first one for execution and discards the others. This situation is known as a black hole (clash).
On the other hand, sending no request during a cycle will mean that the agent misses an opportunity to act and remains idle*. This situation will be referred to as a hole (miss) and is also undesirable since, in real-time domains, this usually leads to the opponents gaining an advantage. We refer to black holes as clashes to prevent confusion between holes and black holes.
*We do not count secondary actions in this matter.
How Do They Affect Teams and Gameplay?
Holes (missed opportunities) and clashes (unplanned actions) can negatively impact a player's game in RoboCup. Holes result in missed chances to act, giving opponents an advantage. Clashes introduce unpredictability and disrupt strategic plans, potentially leading to unintended consequences. Agents must actively avoid holes and manage actions to maximize their game performance.
How and Why Do Holes and Clashes Occur? What are the reasons?
The most straightforward cause is a team that deliberately sends no action, or sends multiple actions, during a cycle; the server's RCL logs count these as holes and clashes.
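The definitions of holes and clashes can be sketched as a per-cycle count of primary commands for one agent. This is an illustrative sketch, not the analysis tool used for this report: the input is assumed to be already parsed into (cycle, command name) pairs, and the set of primary commands is an assumption based on the server's mutually exclusive body commands.

```python
from collections import Counter

# Assumed set of primary (body) commands; secondary commands such as
# turn_neck or say are ignored, as in the footnote above.
PRIMARY_COMMANDS = {"dash", "turn", "kick", "tackle", "catch", "move"}


def classify_cycles(commands, first_cycle, last_cycle):
    """Label each cycle for one agent as 'ok', 'hole' or 'clash'.

    commands: iterable of (cycle, command_name) pairs for a single agent.
    """
    primary_per_cycle = Counter(
        cycle for cycle, name in commands if name in PRIMARY_COMMANDS
    )
    labels = {}
    for cycle in range(first_cycle, last_cycle + 1):
        count = primary_per_cycle[cycle]
        if count == 0:
            labels[cycle] = "hole"    # no primary command: missed opportunity
        elif count == 1:
            labels[cycle] = "ok"
        else:
            labels[cycle] = "clash"   # extra primary commands are discarded
    return labels


example = [(1, "dash"), (1, "turn_neck"), (3, "kick"), (3, "dash"), (4, "turn")]
labels = classify_cycles(example, first_cycle=1, last_cycle=4)
```

A real analysis would also have to skip cycles in which play is stopped; that detail is left out of this sketch.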
In an Ideal System, every message will be sent instantly without losses (see the above image), and the agent only sends one action each cycle.
However, the ideal system does not exist, and each message takes some time to send or receive. We assume an average of 5ms per message. Agents begin each cycle upon receiving the sense_body message, with the Rcssserver acting as the primary timer. Even if the server's processing takes longer than 100ms, the server starts its new cycle as soon as the previous cycle's calculation is complete.
A hole occurs if a message is lost in the network and never reaches the server or the agent. For example, in cycle T, the agent's message never reaches the server; we count this as a hole. In cycle T+2, on the other hand, the server's message does not reach the agent, so the agent does not respond, and another hole occurs.
If the agent's calculation takes too long, or its message takes too long to reach the server, the message arrives during the next cycle, creating a hole in the current one. A clash then occurs if the agent sends another action during that next cycle. If the agent sends its actions late in every cycle, clashes will not occur until some later cycle.
The issue of holes and clashes can be rather complex. The exact arrival time of messages (either from or to the server) is influenced by several factors, such as the available resources (CPU time, memory, storage I/O), the speed and reliability of the network, and the agent's calculations. Furthermore, determining the right moment to send an action is complicated because the agent only has explicit information about the duration of a cycle and not about its starting time. An additional problem related to holes and clashes is that sensing and acting in the soccer server are asynchronous.
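The timing argument can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: cycles are exactly 100ms, every message takes the 5ms assumed above, the agent starts computing when the sense_body message arrives, and no message is lost. The function names are hypothetical.

```python
CYCLE_MS = 100    # cycle length stated in the text
LATENCY_MS = 5    # one-way message delay assumed in the text


def arrival_cycle(cycle, compute_ms, cycle_ms=CYCLE_MS, latency_ms=LATENCY_MS):
    """Return the server cycle during which the agent's command arrives."""
    sense_arrives = cycle * cycle_ms + latency_ms          # server -> agent
    command_arrives = sense_arrives + compute_ms + latency_ms  # agent -> server
    return int(command_arrives // cycle_ms)


def simulate(compute_times_ms):
    """Label each server cycle as 'ok', 'hole' or 'clash' for one agent.

    compute_times_ms[i] is the agent's thinking time in cycle i.
    """
    arrivals = [arrival_cycle(c, t) for c, t in enumerate(compute_times_ms)]
    labels = {}
    for cycle in range(len(compute_times_ms)):
        n = arrivals.count(cycle)
        labels[cycle] = "hole" if n == 0 else "ok" if n == 1 else "clash"
    return labels


# One slow cycle (95ms of thinking in cycle 1) among fast ones.
one_late = simulate([30, 95, 30, 30])
# Every cycle is late by the same amount.
always_late = simulate([95, 95, 95, 95])
```

In the first run, the late command leaves cycle 1 empty (a hole) and lands in cycle 2 together with that cycle's own command (a clash). In the second run, every command slips by exactly one cycle, so only cycle 0 is a hole and no clash appears, matching the remark that consistently late agents do not clash until their timing changes.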
Why are we examining this topic?
During the last year of competitions, we used a new environment for the first time. This system used a website to let teams upload and validate their binaries and gave teams the flexibility to update or change their binaries at any time of the day. The central point of this system was the use of Docker.
Why are we using Docker?
There was a need to slightly update how the competitions were held, considering there has not been much change in twenty years.
Using Docker in the Soccer Simulation 2D environment provides several advantages.
Easy setup and deployment: The utilization of Docker provides a convenient means of packaging and distributing the entire simulation environment, including all associated dependencies, libraries, and configurations. This approach streamlines the process of setting up and deploying the simulation on multiple machines or platforms while ensuring that the environments are consistent and reproducible. This streamlined process can run on various platforms, including Amazon AWS, Google Cloud Platform, VirtualBox, Rackspace server, or any other viable platform. It is essential to note that the host operating system must support Docker.
Isolation and consistency: Docker containers provide an isolated and consistent environment for running the soccer simulation. They encapsulate all the necessary components and dependencies, eliminating conflicts or compatibility issues that may arise when simulating teams on different systems. This isolation and consistency ensure the simulation runs reliably across any platform. The isolation also gives us the perfect environment to block any unwanted connection among the agents themselves or to the internet, making the competition more realistic and compliant with the rules.
Scalability and resource optimization: We can run several simulation instances simultaneously in separate containers, optimizing resource allocation and allowing for scalability. It can run multiple simulations simultaneously on a single machine or distribute them across a cluster of machines efficiently using computational resources.
Version control and reproducibility: Docker containers are based on images, which can be version controlled. Different versions of the soccer simulation teams can be stored, managed, and easily reproduced. It ensures that specific teams and simulation versions can be accessed and utilized when needed, maintaining consistency and reproducibility in research, development, and competitions.
Collaboration and sharing: Docker simplifies collaboration by providing a standardized, portable environment. Developers, researchers, and participants can share their simulation setups and team binaries by sharing Docker images. This enables seamless collaboration, sharing of ideas, and easy replication of experiments, fostering a more efficient and collaborative research community.
Overall, Docker in the Soccer Simulation 2D environment streamlines setup, deployment, consistency, scalability, reproducibility, and collaboration, enhancing the overall efficiency and effectiveness of the simulation for research, development, and competition.
Is there a problem with the Docker system?
Docker and the new system should have improved the competition. However, the past few tournaments suffered from holes and clashes, which is why we are examining this problem.
Analysis of Recent Tournaments
To better understand the situation, we reviewed and analyzed the logs of the previous competitions that we could access:
2017 RoboCup
2018 RoboCupAP
2018 RoboCup
2019 RoboCup
2020 JapanOpen
2020 RemoteCup 1
2020 RemoteCup 2
2021 IranOpen (Docker Alpha)
2021 RoboCup
2022 IranOpen (Docker Beta)
2022 RoboCup (Docker)
2023 IranOpen (Docker)
This chart shows holes and clashes (H&C) in past competitions, including the H&C of teams that send multiple commands or have their agents killed. If we remove those teams and do the analysis again, we will see the following chart.
We removed RI-One from RC19, OBG from IO21, AEteam and CyrusGirls from IO22, and MRL from IO23. The new chart indicates hole problems in RCAP18, IO22, and IO23. The corresponding committees held a starter league in those competitions, so the jump in holes is due to the novice competitors. Now, let us focus on clashes (black holes) in recent years.
Teams clashes (log) box plot
In most of the years, there were at least two outliers with many clashes.
The plots show a problem with RoboCup 2022, although there are still some problems with RC17, RCAP18, IO21, IO22, and IO23; the RoboCup 2022 problem is different.
Teams percentile on average number of clashes
In most years, 80% of teams averaged between 0 and 3 clashes.
The charts show that almost all teams had at least one clash within the system in RoboCup 2017. Around 50 percent of the teams had 1-2 clashes, and 30 percent of teams had 0-1 clash on average in RC22. The same things happened in IO23 and RCAP18 with less intensity.
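A statement such as "80% of teams averaged between 0 and 3 clashes" can be computed from per-team averages as sketched below. The team names and numbers in the example are made up purely for illustration; they are not tournament data, and the function name is hypothetical.

```python
def share_within(avg_clashes_by_team, low, high):
    """Fraction of teams whose average clashes per game lies in [low, high]."""
    if not avg_clashes_by_team:
        raise ValueError("no teams given")
    inside = sum(1 for avg in avg_clashes_by_team.values() if low <= avg <= high)
    return inside / len(avg_clashes_by_team)


# Made-up averages for five hypothetical teams (NOT real results).
made_up = {"TeamA": 0.4, "TeamB": 1.5, "TeamC": 2.9, "TeamD": 0.0, "TeamE": 41.0}
share = share_within(made_up, 0, 3)
```

With one heavy outlier among five teams, four of the five fall inside the 0 to 3 range, which is the kind of distribution the box plots describe.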
How did we test and find the underlying reason?
Holes and clashes (H&C) are influenced by several factors, such as the available resources (CPU, memory, storage), the network's speed and reliability, and the agents' calculations. We conducted multiple tests to find which one was the main culprit in RC22.
Test servers
We no longer have access to the previous competition servers, so we tried our best to recreate the same problem in a new host.
We rented different hosts and servers and ran our system with different settings to find which underlying factor had the most impact. To ensure that nothing else changed between tests, we used Ansible. The benefits of Ansible, and why it is a good choice, are worth a closer look.
Ansible
Ansible is an automation tool; with Ansible, we can automate a wide range of IT tasks. Ansible is agentless, so we do not need to install anything on the remote machine; all we need is to have ssh access and Python installed on the remote machine.
Ansible is written in Python and uses YAML for writing playbooks. For a quick introduction, you can watch the Ansible in 100 Seconds video from fireship.io; for more depth, Jeff Geerling has a great Ansible tutorial on his YouTube channel.
Some of the benefits of using Ansible in the competition are
Reproducibility: Ansible allows for a reproducible competition setup, enabling the easy creation of identical setups for testing and development. Integration of cloud services such as Linode can further simplify the process by enabling the creation of consistent hosts and servers for each test.
Collaboration and Sharing: By using Ansible, we can work together on setting up the competition through a git repository where we store the Ansible playbooks. Collaborating on the playbooks is effortless with this method.
Resource optimization: One way to significantly decrease the cost of the competition is to delete the servers at the end of each competition day and create new ones before the next day begins. Since a competition day usually lasts 8 hours on average, this saves 16 of every 24 hours of server cost, a reduction of over 66%.
Time efficiency: With Ansible, we can reduce the time consumed for the competition setup by creating the servers and deploying the application in less than 10 minutes. This efficiency allows the technical committee to focus more on new features and improvements.
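The cost figure in the resource optimization item can be checked with a short calculation. The sketch assumes servers are billed per hour at a flat rate, which is an assumption and not something stated in this report; the function name is hypothetical.

```python
def cost_reduction(hours_used_per_day, hours_per_day=24):
    """Fraction of daily server cost saved by running servers only when needed.

    Assumes flat hourly billing with no setup or minimum-usage fees.
    """
    if not 0 <= hours_used_per_day <= hours_per_day:
        raise ValueError("hours used must be between 0 and hours per day")
    return 1 - hours_used_per_day / hours_per_day


saving = cost_reduction(8)  # 8-hour competition day, 16 hours saved
```

For an 8-hour competition day this gives 16/24, or two thirds of the daily cost, in line with the "over 66%" stated in the list.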
In the Iranopen2023 competition, we developed an Ansible playbook for deploying the application on the competition servers. This automation helped us to deploy the application on the servers in less than 10 minutes.
In IO23 Ansible playbook we have:
Configure WireGuard network
Install required packages
Configure web server
Configure database
Install & configure the DockerRunner agent
Deploy the application on the servers in less than 10 minutes
Deploy the application on the servers with one command
Remove the server after the competition with one command & create & configure a new server after.
Download logs from the servers
Create new test servers for testing
Create new servers for the competition
We are using the results of this report to improve our Ansible playbook for RoboCup 2023.
CPUs
Processing power is one of the most critical aspects.
(1) High-End CPUs vs Low-End CPUs
The better the CPUs, the fewer holes and clashes occur. Competitions with higher-quality CPUs experience fewer issues, while those with lower-end CPUs may experience more performance issues, especially for teams with complex calculations.
(2) Shared vCPUs vs Dedicated vCPUs
We tested both shared and dedicated CPUs; the results showed that dedicated CPUs are a must for a more reliable and stable server. KVM shared CPUs are also used by other applications on the same machine, which can load the CPU during the competition, making team processes take longer and creating H&C problems. During our tests, we saw some interruptions on shared vCPUs. However, as we have no access to the other applications on the same machine, we cannot draw a firm conclusion.
(3) Teams performance and CPU usage
The amount of CPU each team uses differs; some teams use less, and others use more. Consider two teams, X and Z. X agents use at most four CPUs at the highest calculation, while team Z uses 12 CPUs to perform well.
In a typical 16-core architecture, the match X vs Z will not raise any H&Cs (although team Z may use some of team X's resources).
In the Docker 16-core architecture, each team is given a fixed set of CPU cores and can use only a limited amount of CPU power. The same match X vs Z in Docker will raise H&Cs for team Z, while team X has no problems, because each team's resources are separate and inaccessible to the other team.
If a team uses a large amount of CPU and reaches the limit, Docker will not allow the team agents to use more CPU, which leads to H&Cs.
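The X vs Z example can be expressed as a comparison between a shared pool of cores and fixed per-team quotas. This is a simplified sketch: it assumes the 16 cores are split evenly into 8 per team, ignores the CPU used by the Rcssserver itself, and treats "cores needed" as a single peak number. The function names are hypothetical.

```python
def shortfall_shared(demands, total_cores):
    """Cores missing when all teams draw from one shared pool."""
    return max(0, sum(demands.values()) - total_cores)


def shortfall_partitioned(demands, cores_per_team):
    """Cores missing per team when each team is capped at a fixed quota."""
    return {team: max(0, need - cores_per_team) for team, need in demands.items()}


# Peak demands taken from the example in the text.
peak_demand = {"X": 4, "Z": 12}

shared = shortfall_shared(peak_demand, total_cores=16)
partitioned = shortfall_partitioned(peak_demand, cores_per_team=8)  # assumed even split
```

In the shared case the combined peak of 16 cores just fits, so nothing is missing. With fixed quotas, team X leaves 4 cores idle while team Z is 4 cores short, and that shortfall is what shows up as late commands and therefore H&Cs.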
Past tournaments
IranOpen 2021 ran four simulation games simultaneously on 104-core dedicated vCPUs, limiting each game to 18 cores for both teams and the Rcssserver.
RoboCup 2022 ran two simulation games simultaneously on 32-core shared CPUs, limiting each game to 16 cores for both teams (S1: cores 0-7 / 8-15; S2: cores 16-23 / 24-31) and the Rcssserver.
IranOpen 2023 ran four simulation games simultaneously on 50-core shared CPUs, limiting each game to 12 cores for both teams and the Rcssserver.
Memory
Memory availability directly affects the performance and responsiveness of the game. Insufficient memory can lead to slow response times. Adequate memory resources are essential for smooth simulation, quick decision-making, and timely execution of actions.
Memory limitations impact resource allocation and scalability in distributed Soccer Simulation setups. In scenarios where multiple simulation instances are run concurrently, memory requirements increase. If memory is constrained, it may limit the number of simultaneous simulations that can be executed, affecting scalability and the ability to utilize available computational resources efficiently.
Fast memory speed is crucial for Soccer Simulation 2D. It enables quick data access, faster agent decision-making, and smoother gameplay. Slow memory speed can introduce delays in processing data, prolong simulation cycles, impact learning performance, and affect the coordination of agents.
Memory in the Docker system is limited to 2GB for each agent (each team has access to 25GB of RAM). We lowered this amount to test how it affects gameplay, and the number of H&Cs rose as expected.
Past tournaments
Unfortunately, there is no definite information on the RC22 memory; IO23 had 96GB, and IO22 had 128GB.
Network
The network can significantly impact Soccer Simulation 2D games, particularly concerning the occurrence of H&Cs. Here is how the network can affect holes in the gameplay:
Network Latency: Network latency is the time delay when data travels from one point to another on a network. In 2D Soccer Simulation games, network latency can cause delays in communication between the server and clients (agents). When there is high latency, it takes longer for the server to receive client requests and send game state updates back. This delay can cause H&C.
Packet Loss: When data packets do not reach their destination, it can interrupt communication between the server and clients. If a client's request packets are lost, the server will not receive the necessary actions for that simulation cycle. Packet loss can be especially problematic in real-time games like Soccer Simulation 2D, as it can immediately affect gameplay.
Network Congestion: When there is too much data traffic on a network, it can cause delays and increased latency, known as network congestion. Network congestion can make H&C happen more often because as more clients use the network, it takes longer for their requests to reach the server and receive updates back.
Unstable Connections: An unstable or unreliable network connection can cause inconsistencies and irregularities in gameplay. Frequent disconnections or fluctuations in network stability can lead to unreliable communication between the server and clients, resulting in agents experiencing frequent issues such as failed or delayed requests.
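The effect of packet loss on holes can be estimated with a simple probability model. This is a rough sketch under strong assumptions: each message is lost independently with the same probability in both directions, a hole occurs whenever either the server's sensory message or the agent's command is lost, and a match is taken to be 6000 cycles. None of these numbers come from measurements in this report, and the function names are hypothetical.

```python
def hole_probability(loss_rate):
    """Probability that a cycle becomes a hole because of packet loss alone.

    A cycle succeeds only if both the server-to-agent and the agent-to-server
    messages get through, assuming independent losses.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    return 1.0 - (1.0 - loss_rate) ** 2


def expected_holes(loss_rate, cycles=6000):
    """Expected number of loss-induced holes for one agent over a match."""
    return cycles * hole_probability(loss_rate)


holes_at_one_percent = expected_holes(0.01)
```

Under these assumptions, even a 1% loss rate yields on the order of a hundred holes per agent per match, which is why a reliable local network matters so much.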
In the Docker setup, each simulation server's network is handled separately. We did not encounter any network issues, and the Rcssservers are in one location or on one host, which indicates that network speed and stability are secured. However, we suspected that the use of WireGuard might be problematic. To ensure a secure and stable connection between our streaming system (rcssmonitor) and the host, we utilized WireGuard to establish a private network. WireGuard was necessary as many hosts do not permit repeated bidirectional UDP connections. The private network allowed the rcssmonitor to function properly.
To see the impact of WireGuard, we changed the number of WireGuard connections, increased the number of clients, limited the transmission speed, and changed server locations; none of these changed the number of H&Cs.
Fact: We measured that one rcssmonitor transmits an average of 207.8 Kb of data per second. Two rcssmonitors running simultaneously reach almost 591.8 Kb/s, and three monitors use 761.9 Kb/s, so the transmission rate grows roughly linearly with the number of monitors. On average, monitoring a game takes about 250 Kb/s.
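The per-monitor figure can be checked directly from the three measurements above. The sketch simply divides each total by the number of monitors and averages the results; the variable and function names are hypothetical.

```python
# Total transmission rate (Kb/s) by number of simultaneous rcssmonitors,
# as reported in the text.
MEASURED_KB_PER_S = {1: 207.8, 2: 591.8, 3: 761.9}


def per_monitor_rates(measurements):
    """Rate per monitor for each measured configuration."""
    return {n: total / n for n, total in measurements.items()}


def mean_per_monitor(measurements):
    """Average of the per-monitor rates across all configurations."""
    rates = per_monitor_rates(measurements)
    return sum(rates.values()) / len(rates)


rates = per_monitor_rates(MEASURED_KB_PER_S)
average = mean_per_monitor(MEASURED_KB_PER_S)
```

The per-monitor rates vary between roughly 208 and 296 Kb/s, so the growth is only approximately linear, but their mean is close to the 250 Kb/s quoted above.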
Past tournaments
The network setups stayed the same on the Docker side of IO22, RC22 and IO23. We aim to remove WireGuard from the upcoming competitions.
Storage
At first glance, storage does not look like it will impact the H&C situation, but let us dive deeper into it.
Data logging and analysis: Soccer Simulation 2D games may generate a significant amount of data during gameplay, such as player actions, game logs, and server information. Storage speed influences the efficiency of logging and storing this data. Slow storage may introduce delays in data recording, affecting real-time analysis, post-game analysis, and the loading of the next team.
Scalability and resource utilization: In scenarios where multiple Soccer Simulation 2D games run simultaneously or in a distributed environment, storage speed becomes critical for efficient resource utilization. If storage is slow or the storage system has limitations, it can bottleneck the system's scalability, affecting the number of games executed concurrently or the speed at which data can be exchanged between different components.
Loading and caching team's assets: Soccer Simulation 2D teams often require loading various assets, such as neural network weights, formations, and some logic files, from storage. Slow storage can result in longer loading times, reducing the team's responsiveness and resulting in H&C problems. Additionally, limited storage capacity may restrict the number or size of assets that can be stored.
Teams are prohibited from writing to storage, and the Docker isolation also prevents teams from communicating by writing and reading files. Storage performance will nevertheless still impact the simulation.
(4) HDD vs SSD
In our reviews, we ran two simulations on 32-core dedicated vCPUs, the only difference being the storage type. The HDD performed so poorly compared to the SSD that it stands out as the main cause, which needs further investigation (running a single server on HDD and on SSD does not show much of a difference).
Past tournaments
Unfortunately, there is no definite information on the storage used in RC22 and IO23. IO22 used hybrid storage, mixing SSD and HDD.
Experiment Results
Summary
In brief, our experiments showed the following:
CPUs:
Higher-quality CPUs result in fewer performance issues compared to lower-end CPUs.
Dedicated CPUs are more reliable and stable for server usage than shared ones.
Different teams utilize varying amounts of CPU power, and exceeding CPU limits can lead to H&C (Holes and Clashes) issues.
Memory:
Insufficient memory can lead to slow response times and affect the game's performance.
Memory limitations impact resource allocation and scalability.
Fast memory speed is crucial for quick decision-making and smooth gameplay.
Network:
Network latency, packet loss, congestion, and unstable connections can all impact gameplay and lead to H&C issues.
Network setups were handled separately in the Docker system, and WireGuard was used for a secure connection between the streaming system and the host.
Storage:
Storage speed and efficiency affect data logging, scalability, loading times, and team assets.
Slow storage speed and limited capacity can lead to H&C problems.
HDD storage performed much worse than SSD storage.
The difference in performance between HDD and SSD storage types requires further investigation.
What were the most impactful factors in RC22 and IO23?
We cannot say that a single cause was the main culprit for the RC22 H&Cs. As discussed last year, the systems provided by the LOC of RoboCup 2022 were not even close to what we requested, so the tournament ran on inefficient shared CPUs; the same goes for IO23. HDD storage may also have been used, which would cause more H&Cs due to storage performance (although we have no factual information on which storage was used).
In addition, matches like X vs Z carry some H&Cs by default.
Conclusion
Holes and clashes (H&C) are influenced by several factors, such as the available resources (CPU, memory, storage), the network's speed and reliability, and each agent's calculations. Usually, the problems occur when CPU resources are insufficient or when the storage shared by multiple simulations is an HDD.
What can we do to prevent H&Cs?
Increasing the number of CPU cores, separating the game runners from each other, and using SSDs can significantly reduce the number of H&Cs.
For RoboCup 2023, we used the provided LaBRI servers and ran some of the tests on LOC systems to check the integrity of the Docker Runner thoroughly. For these tests, we used the top 12 teams of IO23 with the highest H&Cs.
HDD Limited-32-Core 2-Server*
HDD No-Limit-32-Core 2-Server*
SSD Limited-32-Core 2-Server*
SSD No-Limit-32-Core 2-Server*
The Simplified Architecture of the RoboCup 2023
The components used for the system are Cloudflare, Apache, Redis, our Ansible playbooks, our test load balancer, and the Docker runner. For the streaming systems, we used WireGuard to ensure the security and reliability of the virtual network. We upload each day's logs and binaries to Google Drive (this is temporary; they will move to the RoboCup archive). Furthermore, we will use a Discord bot to post updates to the corresponding channels.
Acknowledgement
We want to express our sincere gratitude and appreciation to all individuals and organizations who have contributed to completing this report on holes and clashes in SS2D (Soccer Simulation 2D). Their support, guidance, and assistance have been invaluable throughout this report.
First and foremost, we sincerely thank Prof. Dr. Thomas Gabel, Prof. Dr. Eicke Godehardt and the FRA-UNIted team for bringing this problem into the spotlight and igniting the first flames to solve this problem with their support, expertise, and valuable insights as the report's starting point.
We would also like to express our gratitude to the RC22 and IO23 Committee members who dedicated their time and efforts to collecting and analyzing the data necessary for this study. Their commitment, collaboration, and contributions have significantly enriched the findings and discussions presented in this report.
Furthermore, we appreciate the members of the Cyrus Team who participated in the experiments and provided valuable feedback and insights. Their willingness to participate was crucial in obtaining meaningful results and drawing important conclusions.
We would like to acknowledge the contributions of previous researchers and scholars in the field of SS2D, whose pioneering work and published studies have laid the foundation for this system and analysis. We can mention a few from many: Helios team, AT-Humboldt, UvA Trilearn, and others. Their dedication to advancing knowledge in this domain has been instrumental in shaping the current understanding of holes and clashes in SS2D.
We would also like to thank the local organizing committee and the authorities responsible for providing the necessary resources and infrastructure to conduct the experiments and gather the data, as well as for their guidance and feedback throughout the project: LaBRI (Dr. Julien Allali, Frédéric Lalanne), Linode Akamai, and Dalhousie University (Prof. Dr. Malcolm Heywood, Prof. Dr. Stan Matwin, Prof. Dr. Amilcar Soares). Furthermore, we would like to thank Nader Zare for contributing many bright ideas. Their support and cooperation have been crucial in the smooth execution of this research project and in validating the integrity of the upcoming competitions.
Once again, we extend our sincere gratitude to all individuals and organizations mentioned above. Without their contributions, this report would not have been possible. Thank you for being an integral part of this report and of a bright future for Soccer Simulation 2D.
If you have any questions or additional information regarding this matter, we will gladly hear from you.
This report and experiment were conducted by Omid 'MROA' Amini and Alireza Saddraii Rad.
Best Regards,
Members of Soccer Simulation 2D community
See you soon at the next competition!
Links
GitHub:
https://github.com/RCSS-IR/SS2D-Docker-Tournament-Runner
https://github.com/RCSS-IR/HoleAnalyzer
Diagrams:
https://drive.google.com/file/d/10slnhA-oKQqNjNF5zv0mHTr9Tdo5V6es/view?usp=sharing
GDrive of some of the results:
https://drive.google.com/drive/folders/1qa1HgRTR2GcB9Zzh4QY1bmOdTRGzZt9G?usp=sharing
https://drive.google.com/drive/folders/1odcltKYD3KG4gK1CSQ-4WMmL1lU7d0c_?usp=sharing
NOTE: The rest of the results and links will be added later.