View topic - Feature request: Remove unit cheat and/or Reset/restart game


All times are UTC + 1 hour


PostPosted: 03 Nov 2009, 13:03 
User avatar

Joined: 07 Sep 2009, 13:29
Location: Aalborg, Denmark
We are developing an AI using RL (Reinforcement Learning). Training with RL requires running the AI through several thousand games.

For this purpose we use the headless Spring build, which works very well. However, it is a problem that you can't directly reset a game from the AI. This means you need a complete restart and reload of the game for every match. This loading takes ~13 seconds, which is a lot compared to the actual game time of 5-10 seconds at speed 120. Waiting ~13 seconds for every single match is a huge waste of time.

As an attempt to solve this, we self-destruct all our units and add a new commander (using a cheat) at the start position. This way we can somewhat start a new game, but it is an ugly approach. It also has the problem that each self-destruct explosion deforms the ground underneath, slowly revealing water and making the area unusable.

So now to our actual feature request :-) Why not either add a remove-unit cheat (without explosion and countdown), or, even better, add the possibility to reset/restart the game without having to reload all the Spring resources again? Such a reset/restart would simply reset the map and reload the AIs.
PostPosted: 03 Nov 2009, 13:15 
Moderator

Joined: 22 Aug 2006, 15:19
Add a LuaRules gadget to the mod you're doing your training on - this will be the simplest solution for now.
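An untested sketch of what such a gadget could look like. The commander unitDefName and the "reset" message trigger are placeholders you'd adapt to your mod; the idea is that `Spring.DestroyUnit` with `selfd=false, reclaimed=true` removes a unit with no explosion and no wreck:

```lua
-- Hypothetical "game reset" gadget sketch (untested).
function gadget:GetInfo()
  return {
    name    = "Game Reset",
    desc    = "Clears units and wrecks between training matches",
    author  = "sketch",
    layer   = 0,
    enabled = true,
  }
end

if not gadgetHandler:IsSyncedCode() then
  return
end

local COMMANDER = "armcom"  -- placeholder: your mod's commander unitDefName

local function ResetGame()
  -- remove all live units: selfd=false, reclaimed=true => no explosion, no wreck
  for _, unitID in ipairs(Spring.GetAllUnits()) do
    Spring.DestroyUnit(unitID, false, true)
  end
  -- remove leftover features (wrecks from earlier matches, trees, rocks)
  for _, featureID in ipairs(Spring.GetAllFeatures()) do
    Spring.DestroyFeature(featureID)
  end
  -- respawn one commander per team at its start position
  for _, teamID in ipairs(Spring.GetTeamList()) do
    local x, y, z = Spring.GetTeamStartPosition(teamID)
    if x and x >= 0 then
      Spring.CreateUnit(COMMANDER, x, y, z, 0, teamID)
    end
  end
end

-- one possible trigger: a Lua message sent into the synced state
function gadget:RecvLuaMsg(msg, playerID)
  if msg == "reset" then
    ResetGame()
  end
end
```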
PostPosted: 03 Nov 2009, 13:33 
Spring Developer

Joined: 01 Jun 2005, 10:36
Location: The Netherlands
Also, with regards to the map getting damaged, you can use the engine option NoMapDamage to disable map damage altogether.
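For example, assuming it goes as a tag in the [GAME] section of the start script (script.txt) - check the engine docs for the exact placement:

```
[GAME]
{
	NoMapDamage=1;
}
```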
PostPosted: 03 Nov 2009, 13:45 
User avatar

Joined: 07 Sep 2009, 13:29
Location: Aalborg, Denmark
Thanks for the tips!

I am not familiar with the possibilities of LuaRules in Spring, so what exactly would be possible with such a script?
PostPosted: 03 Nov 2009, 14:03 
User avatar

Joined: 27 Nov 2006, 12:57
You have to edit the mod file itself (unzip it with 7z, and zip it back with 7z using normal compression mode). Look at luarules\gadgets inside the mod; those are Lua scripts that are, how do I put this, unlimited in their power to change game state.
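The unpack/repack round-trip might look something like this (the archive name is just an example; Spring reads .sdz archives as plain zip, hence repacking with -tzip):

```
7z x SomeMod.sdz -oSomeMod_work          # unpack the mod archive
# ... edit SomeMod_work/luarules/gadgets/*.lua ...
cd SomeMod_work
7z a -tzip -mx=5 ../SomeMod_edited.sdz .  # repack as zip, normal compression
```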
PostPosted: 03 Nov 2009, 14:13 
AI Coder
User avatar

Joined: 28 Nov 2006, 16:46
Location: Netherlands
Hi allanmc,

As I am an A.I. student, I would really like to know some of the details of the RL approach, such as:
  • How did you define the states?
  • What actions have you got?
  • What is the reward function?
  • What RL algorithm are you using specifically? (Q-Learning, Sarsa,...)
I have been thinking about an RL approach for quite some time, but so far I'm quite skeptical about it, as the state-action space is too large or too abstract for guaranteed convergence.

Very curious!
PostPosted: 03 Nov 2009, 14:46 
User avatar

Joined: 07 Sep 2009, 13:29
Location: Aalborg, Denmark
Error323 wrote:
  • How did you define the states?
  • What actions have you got?
  • What is the reward function?
  • What RL algorithm are you using specifically? (Q-Learning, Sarsa,...)


Hi Error,

This AI is part of a Master's thesis in Software Engineering, which I am doing along with three other people (on this forum: allanmc, initram, jepperc and shredguitar).

Currently we are trying to use a BN (Bayesian network) to classify the opponent, and RL only to build the base (reach many labs as quickly as possible). In this simple RL we have just 3 build actions: Lab, Solar, Mex. The counts of these are used as state variables - Solar and Mex can be in the range 0-19, and Lab in the range 0-4. The time used to build each building is used as a negative reward, which ensures that the AI eventually figures out that it needs resources to build quickly. The goal state is reached when 4 labs have been built, and this yields a high positive reward. This has been implemented with standard Q-learning, and works very well.
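A stripped-down, self-contained sketch of this kind of Q-learning (the build times, caps and goal of 2 labs below are made-up toy numbers for illustration, not our actual values):

```python
import random

# Toy version of the setup above: three build actions, state is the count of
# each building, reward is the (negative) build time, goal is GOAL_LABS labs.
ACTIONS = ("lab", "solar", "mex")
MAX_ECO, GOAL_LABS = 3, 2

def step(state, action):
    """Apply a build action; return (next_state, reward, done)."""
    solar, mex, lab = state
    if action in ("solar", "mex"):
        if action == "solar" and solar < MAX_ECO:
            solar += 1
        elif action == "mex" and mex < MAX_ECO:
            mex += 1
        return (solar, mex, lab), -5.0, False       # cheap fixed build time
    reward = -100.0 / (1 + solar + mex)             # labs build faster with economy
    lab += 1
    done = lab >= GOAL_LABS
    if done:
        reward += 100.0                             # high reward at the goal state
    return (solar, mex, lab), reward, done

def train(episodes=5000, alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Standard tabular Q-learning with epsilon-greedy exploration."""
    rng, Q = random.Random(seed), {}
    for _ in range(episodes):
        state, done = (0, 0, 0), False
        while not done:
            q = Q.setdefault(state, {a: 0.0 for a in ACTIONS})
            action = rng.choice(ACTIONS) if rng.random() < eps else max(q, key=q.get)
            nxt, r, done = step(state, action)
            nxt_q = Q.setdefault(nxt, {a: 0.0 for a in ACTIONS})
            target = r + (0.0 if done else gamma * max(nxt_q.values()))
            q[action] += alpha * (target - q[action])   # Q-learning update
            state = nxt
    return Q

def rollout(Q):
    """Follow the learned greedy policy; return the build order and total reward."""
    state, done, total, plan = (0, 0, 0), False, 0.0, []
    while not done and len(plan) < 20:
        q = Q.get(state)
        if q is None:
            break
        action = max(q, key=q.get)
        plan.append(action)
        state, r, done = step(state, action)
        total += r
    return plan, total
```

After `Q = train()`, `rollout(Q)` shows the learned build order; the agent figures out it should build some economy before the labs, exactly the effect we see in our experiments.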

Right now we are in the middle of expanding the use of RL to more useful cases, such as complete base building, attacking, scouting, etc. You are right that with these bigger problems the state space, among other things, can become a problem. Therefore we are currently looking into the possibilities of hierarchical RL: http://www.ijcai.org/papers/1552.pdf
PostPosted: 03 Nov 2009, 15:21 
User avatar

Joined: 23 Oct 2004, 00:43
You also should probably be culling the wrecks between restarts. All the unreclaimed stuff that existed before the big Self-D is still going to be sitting there.

Got a screenshot of the 100th game? I want to see what that map looks like after that much mayhem.
PostPosted: 03 Nov 2009, 15:45 
AI Coder
User avatar

Joined: 14 Sep 2004, 10:32
Location: Cookieland
Features too - you want a map with little in the way of trees and metal rocks.