Trying to develop a complex weighting algorithm

Train

Lifer
Jun 22, 2000
So, for lack of a better example, let's say we are ranking the difficulty of video games.

Say I have 10 games, each of which is different. I am attempting to rank the players, but it's hard to compare Player 1's score in Game A with Player 2's score in Game B.

So what I need is a way to have a "weight" dynamically configured when the same player plays more than one game. Not all players will play all 10 games; some will only play two, some three, etc.

Say Player 1 plays games A, B, C; Player 2 plays games B, C, D; and Player 3 plays games A, C, D.

How can I go about ranking these players, or more specifically, ranking the games based on the players' scores across different games?
 

Matthias99

Diamond Member
Oct 7, 2003
Are you trying to rank the players' skill, or the games' difficulty (or both)? Doing both at once is difficult, since in isolation you can't tell if a player is bad at a particular game because they suck or because the game is very difficult. You would have to take a large sample of players and estimate the game's difficulty based on some sort of average measure of performance, and then you could measure each player against the same average to see how well they played that particular game compared with everyone else.
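
To make that concrete, here's a rough Python sketch (every number made up) of measuring a player against the per-game average:

Code:
from statistics import mean, stdev

# Made-up raw scores for everyone who played each game
game_scores = {'A': [90, 70, 80, 85], 'B': [50, 60, 55, 65]}
player = {'A': 85, 'B': 55}  # one player's score in each game he played

# How far the player sits from the field, in standard deviations per game
relative = {g: (player[g] - mean(s)) / stdev(s)
            for g, s in game_scores.items()}
print(relative)  # positive = better than the average player of that game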

Also, it's very difficult to provide an absolute rating as to how 'difficult' a video game is (unless you are confining this to a particular genre of games). Someone who finds puzzle games 'easy' might have a great deal of difficulty with a fast-paced FPS (and vice versa).
 

CycloWizard

Lifer
Sep 10, 2001
Matthias is right - there's not necessarily any correct or easy way to do this. So, the best you can do is try to come up with an approximation. Here's how I would go about it.

The easiest way to allow comparison between games is to normalize the 'skill' variable for each player. That just means expressing it as a fraction (like a batting average) so that it always falls between 0 and 1.

Now, you need to determine the individual's skill in each game. Simplest way: winning percentage (# of wins/# matches played). This will give you a normalized value for his skill in that game. Of course, you could make this more complex by considering score margins and so on, but I think you'll get the point.

When you want to weight the various games, then, you can simply multiply the winning percentage for each game by its corresponding weight value. For example, if Quake 3 is the toughest game, multiply its score by 10. For Counter-Strike (a game demanding less skill), multiply the score by 2. Then add the resulting values for Q3 and CS, and divide the sum by the sum of the multipliers - but only the multipliers corresponding to the games the player competes in.

Example (player record in parentheses):
Player A competes in games a (10-1), b (5-9), c (6-4).
Player B competes in games b (5-2), c (10-0), d (3-8).
Player C competes in games a (3-1), b (4-2), c (1-5), d (8-1).

Now, you need to assign some weighting factors for each game. Arbitrarily, I'll say game a = 3, b = 5, c = 2, d = 4.

Calculating the score of each player is then done as follows:
A = {[10/(10+1)]*3+[5/(5+9)]*5+[6/(6+4)]*2}/(3+5+2)

Generally, the score is calculated as:
player's skill = {[(wins in game a)/(wins + losses in game a)]*(weighting for game a) + [(wins in game b)/(wins + losses in game b)]*(weighting for game b) + ... + [(wins in game n)/(wins + losses in game n)]*(weighting for game n)} / (weighting for game a + weighting for game b + ... + weighting for game n)
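
In Python, the whole calculation looks something like this (records and weights as in the example):

Code:
def player_skill(records, weights):
    # records: game -> (wins, losses); weights: game -> difficulty multiplier.
    # Only games the player actually competed in count, in both the
    # numerator and the normalizing denominator.
    total = sum(weights[g] * wins / (wins + losses)
                for g, (wins, losses) in records.items())
    return total / sum(weights[g] for g in records)

weights = {'a': 3, 'b': 5, 'c': 2, 'd': 4}
players = {
    'A': {'a': (10, 1), 'b': (5, 9), 'c': (6, 4)},
    'B': {'b': (5, 2), 'c': (10, 0), 'd': (3, 8)},
    'C': {'a': (3, 1), 'b': (4, 2), 'c': (1, 5), 'd': (8, 1)},
}
for name, rec in players.items():
    print(name, round(1000 * player_skill(rec, weights)))  # scaled to look nice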

Then you can multiply the normalized result by 1000 or whatever you'd like to get an attractive number. Sorry for the truncated example, but I sort of feel like dying today so I leave the rest as an exercise. :(
 

Train

Lifer
Jun 22, 2000
Originally posted by: Matthias99
Are you trying to rank the players' skill, or the games' difficulty (or both)?
Both, though the primary goal is to rank the players; since the games will vary in difficulty, they have to be weighted.
Originally posted by: Matthias99
Doing both at once is difficult, since in isolation you can't tell if a player is bad at a particular game because they suck or because the game is very difficult. You would have to take a large sample of players and estimate the game's difficulty based on some sort of average measure of performance, and then you could measure each player against the same average to see how well they played that particular game compared with everyone else.
I was thinking of doing a percentile score for each player in each game, then adding up each player's percentile scores for a player ranking, which I think may actually give the same result as a complex weighting, but I'm not sure.
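
Roughly what I have in mind, as a quick Python sketch (the scores are made up):

Code:
def percentile(score, field):
    # fraction of the field scoring at or below this score in one game
    return sum(s <= score for s in field) / len(field)

game_scores = {'A': [90, 70, 80, 85], 'B': [50, 60, 55]}  # invented scores
player = {'A': 80, 'B': 60}  # the games this player actually played
ranking_score = sum(percentile(player[g], game_scores[g]) for g in player)
print(ranking_score)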
Originally posted by: Matthias99
Also, it's very difficult to provide an absolute rating as to how 'difficult' a video game is (unless you are confining this to a particular genre of games). Someone who finds puzzle games 'easy' might have a great deal of difficulty with a fast-paced FPS (and vice versa).
I'm only using video games as an example, since I didn't want to explain the complexity of the real-world problem I am facing. A more accurate analogy is that the games are all just different versions of the same game. They are very similar, but there are a few factors we can't control that make some games harder than others, hence the need to weight them to allow an accurate comparison of players.

 

Train

Lifer
Jun 22, 2000
Here's what I'm thinking so far:

1) Determine the average score for each game
2) Determine the average of the game averages (aka the "normalized score")
3) Record each game's deviation from the normalized score (game deviance = normalized score - game average score)
4) Each player's normalized score for each game = score + deviance
5) Each player's overall score = the sum of their 3 best normalized scores

Seems so simple when I put it that way.
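
In code form (Python, with made-up scores), the whole recipe is just:

Code:
from statistics import mean

scores = {  # game -> {player: raw score}, all invented
    'A': {'p1': 90, 'p2': 70, 'p3': 80},
    'B': {'p1': 50, 'p3': 60},
    'C': {'p2': 85, 'p3': 75},
    'D': {'p1': 65, 'p2': 72, 'p3': 88},
}

game_avg = {g: mean(s.values()) for g, s in scores.items()}      # step 1
normalized = mean(game_avg.values())                             # step 2
deviance = {g: normalized - avg for g, avg in game_avg.items()}  # step 3

adjusted = {}                                                    # step 4
for g, s in scores.items():
    for p, raw in s.items():
        adjusted.setdefault(p, []).append(raw + deviance[g])

# step 5: each player's overall score = sum of their 3 best adjusted scores
overall = {p: sum(sorted(v, reverse=True)[:3]) for p, v in adjusted.items()}
print(overall)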

I guess the part that I'm missing is that this way each game is assigned a deviance pretty much independently. Say I have 100 scores for Game A and 100 scores for Game B, but only 50 players have played both; I really want to use those 50 to determine the comparison between Games A and B, instead of raw averages. I think I may have introduced too many variables, though. Perhaps for now I will use the above method (since I'm sort of in a rush) and try to develop a more thorough method later.
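
If I get to it later, that head-to-head comparison might look something like this (hypothetical sketch):

Code:
def pairwise_offset(scores_a, scores_b):
    # scores_a, scores_b: dicts of player -> score for two games.
    # Average the score gap over only the players who played both games;
    # a positive result means Game A plays "easier" than Game B.
    shared = scores_a.keys() & scores_b.keys()
    if not shared:
        return None  # no overlap - fall back to raw averages
    return sum(scores_a[p] - scores_b[p] for p in shared) / len(shared)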
 

Future Shock

Senior member
Aug 28, 2005
You don't want to ask this question here. Instead, go research predictive modeling done by bookies/bettors for horse races. I am NOT kidding - this is basically the problem you are presenting us - how to rank various players (horses) on past performance on varying games (tracks) under varying conditions.

I seem to remember that there are serious betting groups in Hong Kong using complex computer simulations (neural nets and the like) to solve this problem, and they are literally cleaning up at the track doing so. According to the article I read, the conditions in Hong Kong (a limited number of horses and tracks, unlimited stakes, etc.) make this scheme work especially well.

If you don't want to use a neural net, you could emulate a D&D-style approach: ad hoc decompose each game into what you think its component skills are (logic, problem solving, reaction times, mental recall, etc.), and then derive a skills-factor matrix for each player based upon how they perform in each game, assigning weights to each game by skill factor. But then again, I advise you to visit your local betting parlor, because this is precisely what those guys reading "The Sporting News" do all day...
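
A rough sketch of that decomposition (every number here is invented):

Code:
import numpy as np

skills = ['logic', 'problem solving', 'reaction time', 'recall']
# Rows = games a-d; columns = how heavily each game taxes each skill (guesses)
G = np.array([[0.7, 0.2, 0.9, 0.1],
              [0.3, 0.8, 0.2, 0.5],
              [0.5, 0.5, 0.5, 0.5],
              [0.1, 0.3, 0.8, 0.6]])
perf = np.array([0.91, 0.36, 0.60, 0.45])  # one player's normalized game scores

# Least-squares solve for the player's per-skill factors: G @ x ~= perf
x, *_ = np.linalg.lstsq(G, perf, rcond=None)
print(dict(zip(skills, np.round(x, 2))))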

If you don't want to get this complex, then I think your idea of simply summing percentile rankings is a fine one.

FS
 

CycloWizard

Lifer
Sep 10, 2001
Originally posted by: Future Shock
You don't want to ask this question here. Instead, go research predictive modeling done by bookies/bettors for horse races. I am NOT kidding - this is basically the problem you are presenting us - how to rank various players (horses) on past performance on varying games (tracks) under varying conditions.
He's not gambling... It sounds like he's trying to set up a ladder system to recognize the most skilled overall players across a variety of games. Thus, predictive modeling won't tell him much, since he's only concerned with previous performances (though predictive modeling relies on previous performances for its guesses).
I seem to remember that there are serious betting groups in Hong Kong using complex computer simulations (neural nets and the like) to solve this problem, and are literally cleaning up at the track doing so. The conditions in Hong Kong (limited number of horses and tracks, unlimited stakes, etc.) seem to benefit especially well from this scheme according to the article I read.
You could do this with a simple ANOVA - no neural nets needed.
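
For instance (made-up scores, using scipy's one-way ANOVA):

Code:
from scipy.stats import f_oneway

# Made-up raw scores grouped by game
game_a = [90, 70, 80, 85]
game_b = [50, 60, 55, 65]
game_c = [85, 75, 70, 80]

f_stat, p_value = f_oneway(game_a, game_b, game_c)
print(f_stat, p_value)  # a small p-value says the games' means really differ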
 

DrPizza

Administrator Elite Member Goat Whisperer
Mar 5, 2001
IIRC, there was a relatively simple method for doing something like this using matrices... I seem to remember squaring a matrix to find some sort of rank matrix back in linear algebra.

Speaking of predictive simulations: my prof did a presentation on a really complicated use of matrices (he was a linear algebra guru) to predict NBA scores. IIRC, his model was very good at predicting wins, but not quite accurate enough for betting.


edit: but if you consult a linear algebra book, there might be something in there.
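
If it's the trick I'm half-remembering, it's the dominance-matrix method: put head-to-head results in a matrix W and rank by the row sums of W + W^2, so that beating strong players counts extra. A sketch with invented results:

Code:
import numpy as np

# W[i][j] = 1 if player i beat player j (made-up round-robin results)
W = np.array([[0, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])

# First-order wins plus second-order wins (the wins of the players you beat)
power = (W + W @ W).sum(axis=1)
print(power)  # higher = stronger; sort players by this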
 

Future Shock

Senior member
Aug 28, 2005
Originally posted by: CycloWizard
He's not gambling... It sounds like he's trying to set up a ladder system to recognize the most skilled overall players across a variety of games. Thus, predictive modeling won't tell him much, since he's only concerned with previous performances (though predictive modeling relies on previous performances for its guesses).
My initial answer to your post was that they are two sides of the same coin - if I can rate people's past performances for recognition, then I can predict future outcomes statistically. Then I realized that that assumes a causal effect of the prediction variables, which you don't need for ranking but do need for prediction. Point well taken.

You could do this using simple ANOVA analysis - no neural nets needed.
Yes, any regression will get you answers (and what is a neural net but a non-linear regression?), but the article I was referring to made specific reference to the sophistication of the analysis being done in Hong Kong. Nets will let you analyze an order of magnitude more input variables than ANOVA, and if I remember correctly, that was a key part of how the winning bettors were succeeding - analyzing tiny differences across many variables. So I was using the term "neural net" to mean "a highly sophisticated regression, as described in the article", and should have been more explicit.

Thanks for helping me re-think it.

FS
 

CycloWizard

Lifer
Sep 10, 2001
Originally posted by: Future Shock
Yes, any regression will get you answers (and what is a neural net but a non-linear regression?), but the article I was referring to made specific reference to the sophistication of the analysis being done in Hong Kong.
Well, as one famous chemical engineering professor often says, with a five-parameter equation he can fit an elephant. Heaping on additional layers of sophistication is no guarantee of increased accuracy. Indeed, for most systems (especially stochastic ones, which anything involving human performance surely is), the most robust fits are achieved with the fewest variables.
Nets will let you analyze an order of magnitude more input variables than ANOVA, and if I remember correctly, that was a key part of how the winning bettors were succeeding - analyzing tiny differences across many variables. So I was using the term "neural net" to mean "a highly sophisticated regression, as described in the article", and should have been more explicit.
There's not really a limit to the number of variables you can analyze using ANOVA, unless the amount of data you have is limited. If the number of data points is much greater than the number of independent variables, then main effects and higher-order interactions can both be examined. If the numbers of independent variables and data points are similar, then the higher-order interactions will be confounded - this is where I'm guessing neural nets would come into play, though it's been too long since I read about them to be sure.