• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

help with implementing a Genetic Algorithm

Shalmanese

Platinum Member
Okay, I know the basic theory behind GA's but I havent dealt much in the nitty gritty.

For our final assignment, we are to create a program that can play a game similar to the Settlers of Catan (L=Java Version]http://settlers.cs.northwestern.edu/[/L]). Now I was thinking that I could use a GA to implement this by defining the value of serveral things a variables (eg, value of wood, sheep, corner squares etc) and then get the GA to figure out just what those variables should be.

Now, I need help in the implementation. For example, I noticed that there were several different strategies in playing the game which all required widely different vaule sets. Would it be better to have 5 or 6 seperate GA's running (ie they play against each other but no cross breeding) so I could get a number of sperate strategies?

Also, the game is limited to 4 people playing at a time so will this limit how big the population size will be?

Most improtantly of all: how should the breeding be done?

I was thinking, if I had a population of 16 GA's, I could ge the most succesful 4 from each round, breed 8 children from them and then add another 4 random people to make sure there is sufficient variation.

so 4 from older generation, 8 children from the older generation and 4 newbies.

What would be the best way to generate children?

I was just thinking the average from two of them plus maybe a little randomness.

Also, The game is a little variable in who wins so would it be better if it was a best 2 out of 3 or best 3 out of 5?

I don't know how long each game will take, I haven't coded the game yet but I have access to a P3-933 laptop and possibly a P4 1.8(SDRAM only but memory bandwidth shouldnt be a problem right?) if needed. Would this be enough to get a decent GA?

Thanks!
 
You're thinking about GA's all wrong! It sounds like you're wanting to do a GA run on the results of GA runs, which might be interesting but not appropriate for your game I'm sure. You will only need one GA, which will run with several different inputs which will be appropriate to the game sanario. That GA will take care of all the breeding for you, through the use of turnament select (usually) and its cross over and mutation functions (use very little mutation!) If you would like to play with some basic GA code look up SGA-C on google and download your self a copy of the code. You may want to see if it's ok with your teacher/professor/whatever if you just can make a shell function around SGA-C and use that code as your GA. The reason is even a simple GA will take some time to implament, and I don't know what the time limits on getting the game done as a whole will be, but I'm sure you don't want to waist the majority of that time on the GA. At any rate, you'll want a copy of SGA-C to take a look at how a basic GA works, and for you to be able to play with one so you can really start stratigizing on how you're going to use it in your game.

Originally posted by: Shalmanese
Okay, I know the basic theory behind GA's but I havent dealt much in the nitty gritty.

For our final assignment, we are to create a program that can play a game similar to the Settlers of Catan (L=Java Version]http://settlers.cs.northwestern.edu/[/L]). Now I was thinking that I could use a GA to implement this by defining the value of serveral things a variables (eg, value of wood, sheep, corner squares etc) and then get the GA to figure out just what those variables should be.

Now, I need help in the implementation. For example, I noticed that there were several different strategies in playing the game which all required widely different vaule sets. Would it be better to have 5 or 6 seperate GA's running (ie they play against each other but no cross breeding) so I could get a number of sperate strategies?

Also, the game is limited to 4 people playing at a time so will this limit how big the population size will be?

Most improtantly of all: how should the breeding be done?

I was thinking, if I had a population of 16 GA's, I could ge the most succesful 4 from each round, breed 8 children from them and then add another 4 random people to make sure there is sufficient variation.

so 4 from older generation, 8 children from the older generation and 4 newbies.

What would be the best way to generate children?

I was just thinking the average from two of them plus maybe a little randomness.

Also, The game is a little variable in who wins so would it be better if it was a best 2 out of 3 or best 3 out of 5?

I don't know how long each game will take, I haven't coded the game yet but I have access to a P3-933 laptop and possibly a P4 1.8(SDRAM only but memory bandwidth shouldnt be a problem right?) if needed. Would this be enough to get a decent GA?

Thanks!

 
Locutus,
I think he understands GA, the question is how to use a GA to play this game. It's not a simple problem. It's been itching the back of my brain all morning.

The problem (among many) is that, at the start of each turn you have a certain problem space to choose from. What does your population represent at this point? All possible moves for this turn? That's a relatively small set. Some set of future moves from this point forward? But how do you do crossover? Move histories starting from different points will not neccesarily be compatible. How do you evaluate the population? Play each move history against a simulated player? Or maybe against another GA?

Another issue is that after every turn, the state of the board is fixed. This will likely invalidate a large fraction of your existing population. So, do you run a single GA for every move, or try and run a single GA for the whole game?

You really need some way to represent a strategy, not just a series of moves. I didn't look at the game rules to much, but it doesn't seem easy.

As far as some of Shal's specific questions...

Would it be better to have 5 or 6 seperate GA's running (ie they play against each other but no cross breeding) so I could get a number of sperate strategies?

You might look at niching operators for this. Niching is a mechanism for preserving disparate "species" within a single population. I haven't used it myself ... I'm moving more toward an island model, but that works better with distributed systems.

Most improtantly of all: how should the breeding be done?

Good question ... but the better and preceeding question is how will you represent your solution space? IMHO, problem representation is the key to success in GA.

I was thinking, if I had a population of 16 GA's, I could ge the most succesful 4 from each round, breed 8 children from them and then add another 4 random people to make sure there is sufficient variation.

A population of 16 GA's? A Meta-GA? Locutus may have a point here, that doesn't make much sense, at least the way you explain it.
Do you mean a GA with a population size of 16? That's almost certainly to small. Also, adding random population members is ussually not neccesary, and counterproductive if you populations size is large enough, you have enough mutation, and your crossover operator is sufficiently disruptive.

Frankly, I think this assignment is better suited for a NN or GP approach.

 
Actually, 16 isn't that far down. Usually, it's recommended somewhere between 20 to 30.

Start off by concentrating on seeing if a GA can even produce a single worthwhile strategy... since your fitness function is effectively a measure of how the player did at the end of the game (for example, let them play 5 mins, and give them points, or allow them to complete a game and use the game's ranking system). Use a single GA, have several dfiferent players. DON"T just do averages!!!!!!!!! Averages make sense in many situations but not all. Averages (with some randomness) should be one method that you use, but generating the child using crossovers is considered to be better. You can also randomly choose the chromosome each time from one of the parents if you want. You might have a problem with generations taking too long, you'll have to go through at least 70 gens if not a few hundred to get at anything good. What you might want to do, to limit this, is to concentrate on certain scenarios: have them play the game for 2 mins, and breed a good starting strategy. Then set up a scenario for selecting good offense and defense (give one side some medium advantage on the other. You can breed good attackers and defenders silmultaneously). Apply a fitness function which rewards resource gathering. You can then take these strategies and use a GA to find a way to incorporate them into a good overall player (i.e. based on the current situation it chooses the appropriate strategy).

The options are rather limitless.
 
Originally posted by: m0ti
Actually, 16 isn't that far down. Usually, it's recommended somewhere between 20 to 30.

Maybe for relatively trivial problems. I regularly use a population size of several hundred . For my current project I did a crude population size study that showed a population size of less then about 100 didn't represent the problem space well enough to avoid premature convergence. Right now I'm using 300 as a reasonable trade between speed and performance. But I get better final convergence up to about a population size of about 600 if I'm willing to wait that long.

Start off by concentrating on seeing if a GA can even produce a single worthwhile strategy... since your fitness function is effectively a measure of how the player did at the end of the game (for example, let them play 5 mins, and give them points, or allow them to complete a game and use the game's ranking system). Use a single GA, have several dfiferent players. DON"T just do averages!!!!!!!!! Averages make sense in many situations but not all. Averages (with some randomness) should be one method that you use, but generating the child using crossovers is considered to be better. You can also randomly choose the chromosome each time from one of the parents if you want. You might have a problem with generations taking too long, you'll have to go through at least 70 gens if not a few hundred to get at anything good. What you might want to do, to limit this, is to concentrate on certain scenarios: have them play the game for 2 mins, and breed a good starting strategy. Then set up a scenario for selecting good offense and defense (give one side some medium advantage on the other. You can breed good attackers and defenders silmultaneously). Apply a fitness function which rewards resource gathering. You can then take these strategies and use a GA to find a way to incorporate them into a good overall player (i.e. based on the current situation it chooses the appropriate strategy).

The options are rather limitless.

But again, what do you represent in your GA? Your high-level strategy sounds fine. I'm just stuck on how/what you represent to allow the GA to actually play the game.
 
Okay, maybe I didn't quite specify clearly enough in the problem. What I am thinking of doing is to make an overarching stratergy module which determines the relationship between several variables. Ie. I determine the value of Wood by How much wood there is on the board, how much wood I am collecting and the intrinsic value of wood. I define the value of a certain location by the value of the 3 resources it is collecting etc. Now the problem is, how do I find the weightings of each factor and how to find stuff like the intrinsic value of wood. So I have a basic stratergy and I can just alter variables to make it more agressive/passive/wood loving/expansionist etc. I will then use the GA to find the optimal variables for this particular game. I was a bit confused with the terminology before but, yes, I was wondering if a GA with a population of 16 would be enough?

I was thinking some more on this today and:

1. There seems to be several strategies that work well, would niching be more likely to find more than one stretgy so I can be more versatile?

2. The crossbreeding/mutation will run on a per game basis.

3. I have heard of crossovers bandied around but I fail to see much difference between these and just plain Averaes + randomness, what is the advantage of crossovers.

4. I am guessing the number of variables will be around 50 - 100. Is this WAY too many for a GA to handle? From naive impressions, I would guess a GA would do better if there were only very few variables since eahc variable would have a more significant effect and not get lost in noise.

5. I realised that writing my own software, while educational, would be pretty tedious work, is there any free, good software availible that is easy to use for beginners (point and click windows user here 🙂).

I can get access to the Uni lab computers which are running Linux but all of my personal machines are runing Win32 so it would be better if it was windows based.

Thanks for all your help so far.
 
Hmmm, it sounded to me quite like he was mixing up what the GA and the populations were, at least the way I read it... I was thinking about it whilst showering this morning, and the best I could think of was wrap a function around a pre-written publicly available GA such as SGA, and add a paramater to that function with tells it what game mode you're trying to do your evaluations on. Then you could have SGA pass that to the black box that you're going to have to write, and the black box could change what it considers a "better fit" individual based on what paramater you passed it. As long as you kept your population sizes to 1K or lower you should be able to just run it in a loop with out too much of a noticable slow down. I still think you're going to want to find something pre-written, it sounds like your black box may end up being enough code in its self, let alone writing the game and incorperating the GA. You'll probably want to play with your paramaters some, I don't know how computationally intensive your black box will be. I would think that a pop size of 1K, possibly up to 2 or 3K for 10 generations might not be that bad... Even with the NYC Pipes black box I was using, runs like that practically ran faster than you could snap your finger... That was on an 800MHz Athlon BTW.

Originally posted by: ergeorge
Locutus,
I think he understands GA, the question is how to use a GA to play this game. It's not a simple problem. It's been itching the back of my brain all morning.

The problem (among many) is that, at the start of each turn you have a certain problem space to choose from. What does your population represent at this point? All possible moves for this turn? That's a relatively small set. Some set of future moves from this point forward? But how do you do crossover? Move histories starting from different points will not neccesarily be compatible. How do you evaluate the population? Play each move history against a simulated player? Or maybe against another GA?

Another issue is that after every turn, the state of the board is fixed. This will likely invalidate a large fraction of your existing population. So, do you run a single GA for every move, or try and run a single GA for the whole game?

You really need some way to represent a strategy, not just a series of moves. I didn't look at the game rules to much, but it doesn't seem easy.

As far as some of Shal's specific questions...

Would it be better to have 5 or 6 seperate GA's running (ie they play against each other but no cross breeding) so I could get a number of sperate strategies?

You might look at niching operators for this. Niching is a mechanism for preserving disparate "species" within a single population. I haven't used it myself ... I'm moving more toward an island model, but that works better with distributed systems.

Most improtantly of all: how should the breeding be done?

Good question ... but the better and preceeding question is how will you represent your solution space? IMHO, problem representation is the key to success in GA.

I was thinking, if I had a population of 16 GA's, I could ge the most succesful 4 from each round, breed 8 children from them and then add another 4 random people to make sure there is sufficient variation.

A population of 16 GA's? A Meta-GA? Locutus may have a point here, that doesn't make much sense, at least the way you explain it.
Do you mean a GA with a population size of 16? That's almost certainly to small. Also, adding random population members is ussually not neccesary, and counterproductive if you populations size is large enough, you have enough mutation, and your crossover operator is sufficiently disruptive.

Frankly, I think this assignment is better suited for a NN or GP approach.

 
Why you shouldn't do just averages:

You've got a very complex problem space, and you don't know what effects each value has on the effectiveness of a strategy, particularly since the variables aren't independant at all. Crossovers allow you to maintain the same chromosomes in the pool working with them longer. Overall, you're correct; if you add in random members to each generation you will still achieve convergance, it will just take tend to take longer to get to something worthwhile if you're ONLY using averages.

I agree taht it will take quite a while to reach convergance, and that for a problem of this scope it would be better to use a greater sized population. I think though, that the time constraints are quite high... despite what he thinks about his hardware, running games of several minutes for many generations over even a small population takes a long time. Especially when taking into consideration the number of variables he's trying to converge.

I think that the time factor may be a problem, and that he may need a way to look at speeding things up.

Best of luck.
 
Well, I'll leave the objective function to you ... I don't understand the game well enough.

Originally posted by: Shalmanese
I was a bit confused with the terminology before but, yes, I was wondering if a GA with a population of 16 would be enough?

If you look in some of the GA FAQs you can probably find some rules-of-thumb for population size, but in general, it's something of a trial & error thing. For the number of variables you're considering, I'd start at least an order of magnitude higher. The good news is that I don't think your objective function will be very expensive to evaluate at all, so you can afford to go big on the population size.

I was thinking some more on this today and:

1. There seems to be several strategies that work well, would niching be more likely to find more than one stretgy so I can be more versatile?

That's my understanding of it, but I haven't used niching myself, except the more extreme case of an Island model GA, and my island model isn't working properly yet. I would get the basics working before I worried about niching though.

3. I have heard of crossovers bandied around but I fail to see much difference between these and just plain Averaes + randomness, what is the advantage of crossovers.

Uhmm, crossover is a primary key to GA along with fitness ranking! It's how you combine information from the parent entities. I could get into it here, but it sounds like you have some reading to do. I'd take a look at Goldberg. That's still the best intro GA book IMHO. If you want some insight into the math behind schemata theory, etc., take a look at Holland.

4. I am guessing the number of variables will be around 50 - 100. Is this WAY too many for a GA to handle? From naive impressions, I would guess a GA would do better if there were only very few variables since eahc variable would have a more significant effect and not get lost in noise.

Nope, the model I'm working with now can have several thousand variables. As you get into larger variable sets, you need to get creativewith your representation though, or your schemata distances get to big. For the problem I'm describing, I have, in effect, multiple chromosomes and a dominant/recessivemodel.

5. I realised that writing my own software, while educational, would be pretty tedious work, is there any free, good software availible that is easy to use for beginners (point and click windows user here 🙂).

If you have to knock the assignment out quickly, there is plenty of freeware out there. Just put "Genetic Algorithm" into google and you'll get tons of hits. I really think you gain alot of insight into the GA process by writing your own though.

I can get access to the Uni lab computers which are running Linux but all of my personal machines are runing Win32 so it would be better if it was windows based.

Thanks for all your help so far.

Good luck!
 
Okay, lets go over the corssover thing again, it was to my understanding that with crossover, you get a bitstring, break it at a random point and then combine one fragment from one bitstring with a fragment from the other. Now, my major beef with this is that each successive bit in a bitstring is "worth" more than the bit before it so if you split it 4/4, that means 15/16ths of the final value is coming from 1 bitstring and only 1/16th i coming from the other. Wouldn't it be simpler just to choose a random genome and add a 1/16th random factor?

Also, after playing around with the game a bit more, I have come to the conclusion that GA's would not be terribly useful with this game, the chance factor is too high for even a highly successful strategy to win significantly more times than a not so good strategy. It really all depends on how the board is laid out at the start of the game so I think I will focus more on that area rather than general strategy.

Thanks for all your help anyway.
 
Originally posted by: Shalmanese
Okay, lets go over the corssover thing again, it was to my understanding that with crossover, you get a bitstring, break it at a random point and then combine one fragment from one bitstring with a fragment from the other. Now, my major beef with this is that each successive bit in a bitstring is "worth" more than the bit before it so if you split it 4/4, that means 15/16ths of the final value is coming from 1 bitstring and only 1/16th i coming from the other. Wouldn't it be simpler just to choose a random genome and add a 1/16th random factor?

As I suggested, you should read up some on schemata theory for a full explenation, but I'll give it a go.

I'm not entirely clear what you mean by worth, but it seems that your suggesting that you have variable represented by a 16 bit bitstring. You're concern is that bits on the left end have much more significance in the value then those at the right end. But remeber that having a single variable will never happen in practice. In general, you have many variables concatenated in that bit string. Most of the time single-point crossover moves whole variables intact from one generation to the next. It can only potentially split one variable per crossover.

A second point on your suggestion that you just add 1/16 randomeness. When you are doing crossover, you're not adding randomness. The two entities you're doing crossover on came through the selection process. That means that the fitness of these entities is generally higher then average for the population. So, that 1/16th you are replacing comes from a pool of above-average values, not random values. The idea is that individual bits from a good entity are generally more valuable/optimum then bits from a bad entity. If you bang good entities together long enough, the bad bits will gradually get replaced by good ones to improve the overall fitness of the population.

Also, after playing around with the game a bit more, I have come to the conclusion that GA's would not be terribly useful with this game

I agree on this point!

Thanks for all your help anyway.

Good luck!

 
Back
Top