TECHNOLOGY

Overview

The groundbreaking superhuman racing agent was developed through close technical collaboration between Sony AI, Polyphony Digital (PDI), and Sony Interactive Entertainment (SIE).

Below are the technical components of Gran Turismo Sophy (GT Sophy) and the contributions from each organization.

1. Hyper-realistic Simulator

Gran Turismo® Sport (GT Sport) is a driving simulator for PlayStation® 4 created by Polyphony Digital. GT Sport recreates the real-world racing environment as faithfully as possible, including its racing cars, tracks, and even physical phenomena such as air resistance and tire friction. PDI provided access to the APIs needed to train GT Sophy in this highly realistic simulation environment.

What makes GT Sport so real?
GT Sport is built on state-of-the-art vehicle dynamics simulation, incorporating knowledge gained from technical support work on real-life racing cars.
The performance of each car is recreated true to life in many respects, modeling phenomena such as air resistance, tire friction, and orientation changes due to suspension movement.
Under guidance from automobile manufacturers, the details of the cars are recreated accurately, from the curves of the vehicle bodies down to the width of the gaps between body panels and the shapes of the turn signals and headlights.
GT Sport is designed in partnership with the FIA (Fédération Internationale de l’Automobile) and has an eSports community of over 400,000 people around the world. It brings to life a fair racing environment with clear rules and judgement criteria.
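
To make the physics mentioned above concrete, here is a toy calculation of two of those effects, aerodynamic drag and friction-limited cornering force, using textbook formulas. The numeric values are arbitrary examples, not parameters taken from GT Sport's simulation.

    # Toy illustration of two physical effects a driving simulator must model.
    # The formulas are textbook physics; the numbers are arbitrary examples,
    # not parameters from GT Sport's simulation.

    RHO = 1.225  # air density (kg/m^3)

    def drag_force(cd, frontal_area_m2, speed_ms):
        # Aerodynamic drag: F = 0.5 * rho * Cd * A * v^2
        return 0.5 * RHO * cd * frontal_area_m2 * speed_ms ** 2

    def max_cornering_force(mu, mass_kg, g=9.81):
        # Friction-limited lateral force from the tires: F = mu * m * g
        return mu * mass_kg * g

    print(drag_force(cd=0.35, frontal_area_m2=2.0, speed_ms=60.0))  # ~1543 N at 216 km/h
    print(max_cornering_force(mu=1.3, mass_kg=1300.0))              # ~16579 N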

2. Novel Reinforcement Learning Techniques

Reinforcement learning (RL) is a type of machine learning used to train AI agents that take actions in an environment, rewarding or penalizing those actions based on the outcomes they lead to. The diagram below shows how an agent interacts with its environment: the agent takes an action in the world, receives a reward (or penalty), and gets an updated description of the world state, which it uses to determine its next action.

[Diagram: the AI agent acts on the environment and receives a reward and an updated state in return]
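
As a rough illustration of this loop, the sketch below pairs a placeholder environment with a random agent. Both are illustrative stand-ins and do not reflect GT Sophy's actual interfaces to GT Sport.

    import random

    # Minimal sketch of the agent-environment loop described above.
    # DummyEnv and RandomAgent are placeholders, not GT Sophy's actual interfaces.

    class DummyEnv:
        def reset(self):
            self.t = 0
            return 0.0                        # initial state (a single number here)

        def step(self, action):
            self.t += 1
            next_state = random.random()      # updated description of the world state
            reward = -abs(action)             # reward (or penalty) for the action taken
            done = self.t >= 10               # episode ends after 10 steps
            return next_state, reward, done

    class RandomAgent:
        def act(self, state):
            return random.uniform(-1.0, 1.0)  # e.g., a steering command in [-1, 1]

    env, agent = DummyEnv(), RandomAgent()
    state, done = env.reset(), False
    while not done:
        action = agent.act(state)               # the agent acts on the environment
        state, reward, done = env.step(action)  # and receives a reward and a new state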

Sony AI researchers and engineers developed innovative reinforcement learning techniques including a new training algorithm called Quantile-Regression Soft Actor-Critic (QR-SAC), agent-understandable encodings of the rules of racing, and a training regimen that promoted the acquisition of nuanced racing skills.

Deep reinforcement learning (deep RL) has been a key component of impressive recent artificial intelligence milestones in arcade games, in complex strategy games such as chess, shogi, and Go, and in real-time multiplayer strategy games. RL is particularly well suited to developing game AI agents because RL agents consider the long-term repercussions of their actions and can independently collect their own data during learning, avoiding the need for complex, hand-coded behavior rules. However, dealing with a domain as complex as Gran Turismo requires equally complex and nuanced algorithms, rewards, and training scenarios.

Skills GT Sophy Mastered Through RL Techniques

Through key innovations in these RL techniques, GT Sophy mastered the skills of Race Car Control, Racing Tactics, and Racing Etiquette.

  • RACE CAR CONTROL: A new algorithm, QR-SAC, explicitly reasoned about the various possible outcomes of GT Sophy’s high-speed actions. Accounting for the consequences of driving actions, and the uncertainty in those consequences, helped GT Sophy take corners at their physical limit and consider complex possibilities when racing against different kinds of opponents. (A minimal sketch of the quantile-regression idea behind QR-SAC appears after this list.)

    Driving along the walls

    You can see GT Sophy’s driving skill here: the agent drives through a series of tight curves, right up against the walls, without making contact.

  • RACING TACTICS: While RL agents can collect their own data, training particular skills such as slipstream passing requires the opponent to be in a particular position. To address this, GT Sophy’s schooling included mixed-scenario training using hand-crafted race situations likely to be pivotal on each track, as well as specialized sparring opponents that helped the agent learn these skills. These skill-building scenarios helped GT Sophy acquire expert racing techniques, including handling crowded starts, making slingshot passes out of the slipstream, and even defensive maneuvers.

    Overtaking on a sharp corner

    GT Sophy successfully overtakes the human driver by taking advantage of the sharp corner. Pay attention to the agent’s steering in particular.

  • RACING ETIQUETTE: To help GT Sophy learn racing etiquette, Sony AI researchers found ways to encode the written and unwritten rules of racing into a complex reward function (a toy example of such a composite reward also follows this list). The research team also found it necessary to balance the population of opponents to make sure GT Sophy had competitive training races while not becoming too aggressive or too timid for human competition.

    Fair overtaking

    GT Sophy overtakes the human driver without blocking their driving line, instead leaving them enough space to maneuver. It is a fierce battle in which GT Sophy demonstrates fairness and sportsmanship.

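As referenced under Race Car Control above, QR-SAC’s critic models a distribution over future returns rather than a single expected value, typically by regressing a set of quantiles with an asymmetric Huber loss. The sketch below shows that loss in isolation; the shapes, sample values, and hyperparameters are illustrative assumptions, not GT Sophy’s implementation.

    import numpy as np

    # Sketch of the quantile Huber loss used by quantile-regression critics (as in QR-SAC).
    # Shapes and values are illustrative assumptions, not GT Sophy's actual code.

    def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
        # pred_quantiles: (N,) predicted return quantiles for one state-action pair
        # target_samples: (M,) sampled target returns (reward + discounted next-state values)
        n = len(pred_quantiles)
        taus = (np.arange(n) + 0.5) / n                         # quantile midpoints tau_i
        u = target_samples[:, None] - pred_quantiles[None, :]   # pairwise TD errors, shape (M, N)
        huber = np.where(np.abs(u) <= kappa,
                         0.5 * u ** 2,
                         kappa * (np.abs(u) - 0.5 * kappa))
        weight = np.abs(taus[None, :] - (u < 0.0).astype(float))  # asymmetric weighting by tau
        return float(np.mean(weight * huber / kappa))

    # Example: a critic predicting 8 return quantiles against 8 bootstrapped targets.
    pred = np.linspace(-1.0, 1.0, 8)
    target = np.random.normal(loc=0.2, scale=0.5, size=8)
    print(quantile_huber_loss(pred, target))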
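
And as referenced under Racing Etiquette, the rules of the road can be folded into the reward signal as weighted penalty terms. The sketch below shows the general shape of such a composite reward; the specific terms and weights are hypothetical and are not the reward function used for GT Sophy.

    # Hypothetical sketch of a composite reward mixing course progress with etiquette penalties.
    # The terms and weights are illustrative only, not GT Sophy's actual reward function.

    def race_reward(progress_m, off_course, wall_contact, caused_collision, blocked_opponent):
        reward = 1.0 * progress_m       # progress along the track since the last step (meters)
        if off_course:
            reward -= 5.0               # penalty for leaving the track
        if wall_contact:
            reward -= 2.0               # penalty for hitting a wall
        if caused_collision:
            reward -= 10.0              # penalty for at-fault contact with another car
        if blocked_opponent:
            reward -= 3.0               # penalty for unsporting blocking
        return reward

    print(race_reward(progress_m=12.5, off_course=False, wall_contact=False,
                      caused_collision=False, blocked_opponent=True))  # 9.5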

3. Distributed Training Platform

Distributed, Asynchronous Rollouts and Training (DART) is a custom web-based platform developed by Sony AI that enables its researchers to train GT Sophy on PlayStation 4 consoles in SIE’s cloud gaming platform.

DART allows researchers to easily specify an experiment, have it run automatically when cloud resources become available, and collect data that can be viewed in the browser. It also manages PlayStation 4 consoles, agent compute resources, and GPUs for training across data centers. The system allows Sony AI’s research team to seamlessly run hundreds of simultaneous experiments as they explore techniques to take GT Sophy to the next level.

[Screenshot: DART’s web-based interface]
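
As a rough picture of what specifying an experiment on such a platform could look like, the sketch below builds a small experiment description and prints it as JSON. The field names and values are assumptions made for illustration; DART’s actual schema and API are not described here.

    from dataclasses import dataclass, asdict
    import json

    # Hypothetical experiment specification for a DART-like platform.
    # Field names and values are assumptions, not DART's real schema.

    @dataclass
    class ExperimentSpec:
        name: str
        track: str
        car: str
        num_rollout_consoles: int    # PS4s collecting experience in parallel
        num_gpus: int                # GPUs used by the trainer
        max_training_hours: float

    spec = ExperimentSpec(
        name="qr-sac-baseline",
        track="example-circuit",
        car="example-gr3-car",
        num_rollout_consoles=20,
        num_gpus=1,
        max_training_hours=72.0,
    )

    # The platform would queue this to run automatically when cloud resources become available.
    print(json.dumps(asdict(spec), indent=2))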

4. Mass Scale Training Infrastructure

The DART platform had access to over 1,000 PlayStation 4 (PS4) consoles, each used either to collect data for training GT Sophy or to evaluate a trained version. The platform included the computing components (GPUs and CPUs) needed to interact with this large number of PS4s and to support large-scale training over extended periods.
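
At a high level, this kind of setup pairs many rollout workers, one per console, with a central trainer that consumes their experience asynchronously. The toy sketch below mimics that pattern with threads and an in-memory queue standing in for consoles and GPUs; it is an assumption about the general architecture, not the actual DART implementation.

    import queue
    import random
    import threading

    # Toy sketch of asynchronous data collection feeding a central trainer.
    # Threads and random numbers stand in for PS4 consoles and GPU training.

    experience = queue.Queue()

    def rollout_worker(console_id, steps=100):
        for _ in range(steps):
            transition = (console_id, random.random())  # placeholder for (state, action, reward, ...)
            experience.put(transition)

    def trainer(total_updates=50, batch_size=8):
        for _ in range(total_updates):
            batch = [experience.get() for _ in range(batch_size)]  # consume experience as it arrives
            _ = sum(x for _, x in batch)                           # placeholder for a gradient update

    workers = [threading.Thread(target=rollout_worker, args=(i,)) for i in range(4)]
    train_thread = threading.Thread(target=trainer)
    for t in workers + [train_thread]:
        t.start()
    for t in workers + [train_thread]:
        t.join()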


Nature

Get more in-depth technical information about GT Sophy by reading the Nature paper here.


Future Applications

While GT Sophy has achieved a major milestone, there is still room for further research and development. In partnership with PDI and SIE, Sony AI will continue to upgrade GT Sophy’s capabilities and explore ways in which the agent can potentially be integrated into the Gran Turismo series going forward.

In addition to Gran Turismo, Sony AI is also eager to explore new partnerships to enhance the gaming experience for players through AI.
