Wednesday 22 July 2015

How To Fix a Tennis Match (by Andrey Golubev and Aleksandr Nedovyesov)

Scheveningen is not a name that is likely to be familiar to the majority of people. A brief search suggests that it is a subdistrict of The Hague, which advertises itself as 'Holland's most famous seaside resort' with a 'fascinating marine world at Sea Life'. More pertinent to this article is that it also hosts The Hague Open each year, a clay court Challenger event. The Challenger Tour is far from the glamour of the likes of Wimbledon and generally consists of a collection of up-and-coming youngsters and journeymen players.

It was in this tournament this lunchtime that two of the members of Kazakhstan's overachieving Davis Cup team met in the first round, fresh off the plane from their agonising 3-2 defeat in Darwin. Fourth seed, Aleksandr Nedovyesov, was drawn to play Andrey Golubev, who he had played doubles with in Australia mere days ago. As the odds below show, the bookmakers were unable to separate them and had it priced as a perfect 50-50 match.
The majority of bookmakers priced this as an each of two match
Andrey Golubev came out of the blocks quickly and dominated the first set, winning it 6-2. So far, there appeared to be nothing suspicious about this match. At the end of the first set, Golubev was around 1.23 and Nedovyesov was around 4.0 (thanks to @ahunnbet and @dougalltennis for these prices). This is more or less perfectly in line with where we would expect the prices to be at this stage.

The second set began with a pair of holds, but already there were a few flags being raised as the price on Nedovyesov began to shorten with Golubev's price drifting outwards. At 3-3 in the second set, Andrey Golubev was still to drop a point on serve in the second set, yet there had been an astonishing move in the prices. The image below (thanks to @ahunnbet again) shows the Betfair graph at 3-3, 30-15 on Nedovyesov's serve:
Bizarrely, not only has Nedovyesov shortened, he is actually now the favourite for the match. We can see that the most recent price matched on Nedovyesov was 1.69, which signifies an implied probability of 59.2% to win the match. Now, given that his starting price of 1.91 with Pinnacle gave him a 52.4% probability of winning, it seems rather peculiar that he was apparently more likely to win the match at 2-6, 3-3 down than he was at the start.
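For anyone wanting to check the conversions used throughout this article, the implied probability of a decimal price is simply its reciprocal. The snippet below is a minimal sketch of that arithmetic; the function name is my own and it ignores any bookmaker margin or exchange commission, so treat the figures as approximate.

```python
def implied_probability(decimal_odds: float) -> float:
    """Win probability implied by decimal odds, ignoring margin and commission."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


# Prices quoted in the article for Nedovyesov.
prices = [
    ("Pinnacle starting price", 1.91),
    ("Betfair at 2-6, 3-3", 1.69),
]

for label, odds in prices:
    print(f"{label}: {odds} -> {implied_probability(odds):.1%}")
```

The same function can be applied to any of the other prices mentioned in this piece.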

Now, you do sometimes see markets like this if there are injury doubts about a player and a retirement is expected. Given that Betfair pays out on the winner via retirement if the first set has been completed, you can often see injured players drift right out, but it is difficult to see any indication of this. There was no MTO and there was no indication of any injury according to those watching the match. Indeed, given that Golubev had held to love in four consecutive service games at this stage, it seemed as though he was not struggling at all.

However, that just seems to leave the conclusion that someone betting on the match on Betfair knew what was going to happen in the match.

Going into the second set TB, Andrey Golubev had still only dropped one point on serve in the entire second set. Despite that though, he was priced at 3.0 to win the tiebreak with Bet365 and 1.61 to win the match. Three sloppy unforced errors and a double fault later, he was 0-4 down in the TB. He pulled it back to 2-4, but the market had no faith in him by this stage. Nedovyesov had fallen right down to 1.27 to win the match or 78.7%. Quite impressive for a player that was only around 50% at the start of the match to be almost 30 percentage points more likely to win the match when down a set and up just one mini-break in the TB.
As scripted, Aleksandr Nedovyesov won the second set TB 7-2 and by this stage, the market knew precisely where this match was headed.  Priced at just 1.17 for the match or 85.5%, there seemed little doubt who was winning this match. Still, there was no sign of any injury concern that might be about to cause an imminent retirement and no momentum effect would cause that sort of price move. If a player is rated at a 50% chance to win a match, it follows that he must be 50% to win a single set. There could be an argument that with the momentum of having won the second set, Nedovyesov might have been a slight favourite in the deciding set, but certainly not an 85.5% favourite.
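To put a rough number on what the prices "should" have looked like, here is a small sketch of a best-of-three match model. It assumes that each set is won independently with the same fixed probability, which is a simplification of my own for illustration and not how any bookmaker actually prices a match. Under that assumption, a player rated 50% per set is only 25% to win the match from a set down and 50% at one set all, a long way from the 85.5% implied by the market going into the decider.

```python
def match_win_probability(p_set: float, sets_won: int, sets_lost: int,
                          sets_needed: int = 2) -> float:
    """Probability of winning the match from the current set score.

    Assumes every remaining set is won independently with probability p_set.
    """
    if sets_won >= sets_needed:
        return 1.0
    if sets_lost >= sets_needed:
        return 0.0
    win_next = match_win_probability(p_set, sets_won + 1, sets_lost, sets_needed)
    lose_next = match_win_probability(p_set, sets_won, sets_lost + 1, sets_needed)
    return p_set * win_next + (1.0 - p_set) * lose_next


# An evenly matched player at the three set scores Nedovyesov passed through.
for won, lost in [(0, 0), (0, 1), (1, 1)]:
    prob = match_win_probability(0.5, won, lost)
    print(f"sets {won}-{lost}: {prob:.1%}")
```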
To the surprise of nobody that was following the match, Aleksandr Nedovyesov broke early in the third set, dropping to 1.07 immediately following the break, and then it was just a steady descent to 1.01.

Now, there will be the usual arguments that there was an injury or that it is simply the complaints of bitter gamblers on the internet that lost money. However, the reality is that virtually all of the major bookmakers stopped taking bets on this match during the latter part of the second set and a number of them have reported the match to the TIU, showing that it is not only the gamblers that are suspicious of this match.

Interestingly, these two players also met in the first round of a Challenger event just over two months ago in Aix en Provence. In that match, Andrey Golubev was victorious, winning 4-6, 6-4, 6-3 in the reverse situation of this match. What makes it even more interesting, particularly in the light of the odds movements on this match, is this observation from @tennispurist on Twitter:
Whatever argument you might want to make to explain today's match as a one-off, it becomes far more difficult to make that argument twice for the same pair of players. If two matches between the same two players show almost identical suspicious odds movements, surely this must be a huge red flag?

The link below shows the full match. Now that we have seen the suspicious odds movements, you can watch the match and see whether you can see anything to indicate why the odds should have moved as they did.

Saying this is getting repetitive, but as with both the Meerbusch and the Dallas matches in the past twelve months, it is likely that nothing more will be heard of this match. The notoriously secretive TIU does not release any information beyond ban announcements, so we should not expect to hear anything. All we can do is flag up these matches and hope that eventually the weight of evidence will force action to be taken.

Once again, I will finish with a quote from my last article on this subject. It is as relevant now as it was six months ago. There have been some good articles written by the mainstream tennis media recently addressing the match fixing issue, but this start cannot be allowed to just peter out.
The authorities need to target those players and associates that are involved in match fixing. And if the authorities are not going to do it, it needs journalists to question those in authority. We saw in cycling how it was journalists that eventually blew the lid on the Lance Armstrong doping situation. Is there a serious respected tennis journalist that is willing to ask the right questions to the right people?

Tuesday 7 July 2015

Tennis Age Curves: An Introduction

Over the years, we have seen teenage phenomenons come through and win major titles at incredibly young ages, while we are also now seeing a rising number of players aged over 30 remaining at the top level. It raises the question: what exactly is the peak age for a tennis player?
Roger Federer is one of the greatest players of all time, but when was his peak?

Using data for every ATP match stretching back to 1991, we can look at answering this question. As a basic starting point, we shall look at a player's combined score as a representation of player quality. The combined score is very simple - it is the percentage of service games held added to the percentage of return games that a player breaks in. Clearly, the higher the score, the better. It is basic for now in that it is aggregated over all the surfaces and it does not take into account the quality of opposition, but it is a decent starting point.
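As a quick illustration of the combined score described above, here is a minimal sketch of the calculation. The function and argument names are my own and the numbers in the example are made up purely to show the arithmetic; they are not taken from the database.

```python
def combined_score(service_games_held: int, service_games_played: int,
                   return_games_won: int, return_games_played: int) -> float:
    """Hold percentage plus break percentage, both on a 0-100 scale."""
    if service_games_played <= 0 or return_games_played <= 0:
        raise ValueError("need at least one service game and one return game")
    hold_pct = 100.0 * service_games_held / service_games_played
    break_pct = 100.0 * return_games_won / return_games_played
    return hold_pct + break_pct


# Hypothetical player: holds 850 of 1000 service games, breaks in 250 of 1000.
print(combined_score(850, 1000, 250, 1000))
```

A player who holds and breaks at exactly the same rate as his opponents would score 100, so anything comfortably above that suggests a player winning more games than he loses.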

By calculating a player's combined score on each of his birthdays throughout his career, we can begin to draw up an age curve for that player. By ensuring that we calculate the score on the same day each year, it should mean that a player has competed in roughly the same number of matches on each surface, ensuring that the surface mix does not affect the data in a noticeable manner.
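The birthday-to-birthday calculation could be sketched along the following lines. This is only an outline under my own assumptions: each age is scored over the matches played while the player was that age, the per-match record and its field names are hypothetical, and the sample data is invented to show the grouping rather than drawn from the real database.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class MatchStats:
    """Hypothetical per-match record for one player."""
    played_on: date
    service_games: int
    service_holds: int
    return_games: int
    breaks: int


def age_on(day: date, birthday: date) -> int:
    """Age in whole years on a given day."""
    years = day.year - birthday.year
    if (day.month, day.day) < (birthday.month, birthday.day):
        years -= 1
    return years


def age_curve(matches: list[MatchStats], birthday: date) -> dict[int, float]:
    """Combined score for each year of age, aggregated birthday to birthday."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for m in matches:
        t = totals[age_on(m.played_on, birthday)]
        t[0] += m.service_games
        t[1] += m.service_holds
        t[2] += m.return_games
        t[3] += m.breaks
    curve = {}
    for age in sorted(totals):
        served, held, returned, broke = totals[age]
        if served == 0 or returned == 0:
            continue  # not enough data to score this age
        curve[age] = 100.0 * held / served + 100.0 * broke / returned
    return curve


# Invented data: one match the day before a 20th birthday, two after it.
birthday = date(1990, 6, 1)
matches = [
    MatchStats(date(2010, 5, 31), 10, 8, 10, 2),
    MatchStats(date(2010, 6, 1), 10, 9, 10, 3),
    MatchStats(date(2011, 3, 15), 10, 9, 10, 1),
]
curve = age_curve(matches, birthday)
print(curve)
```

Aggregating the raw game counts before taking percentages, rather than averaging per-match percentages, stops short matches from carrying the same weight as long ones.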

So, let us take a look at the age curve for possibly the greatest tennis player of all time, Roger Federer:

We can see the rapid improvement from his 18th birthday, reaching a peak around the age of 24. That would have consisted of the second half of 2005 and first half of 2006, where he won Wimbledon, the US Open, the Australian Open and lost in the final of the French Open. He also reached the final of the Year End Championships, won Masters 1000 titles in Cincinnati, Indian Wells and Miami, reached finals in Rome and Monte Carlo, and took titles in Thailand and Halle. It seems pretty reasonable to accept that this could well have represented his peak.

After that, we see a very gradual decline through until around his 30th birthday where, for whatever reason, he seemed to manage to lift his game again briefly before settling at a relatively constant level.

Now let us take a look at Federer's biggest rival, Rafael Nadal:

The first thing to note is that Rafael Nadal, at an incredibly young age of 17, immediately appeared on the ATP Tour at an incredibly high level. Consider that it took Federer around four years from his debut on tour to reach a combined score of around 106, where Nadal started.

We can see a very rapid rise during his teenage years, followed by a slight fall, before another rise to reach a combined score of over 120 at the age of 23. There was another slight dip before he continued his rise, reaching his career peak of over 122 at the age of 27. This would have been from mid-French Open 2013 to mid-French Open 2014, during which period he won the French Open and US Open, as well as reaching the Australian Open final. There were also Masters 1000 titles in Canada, Cincinnati and Madrid, plus further titles in Doha and Rio de Janeiro.

Since that peak, we can see the period where injuries have really started to hamper the Spaniard and his current level is as low as it has been since he left his teenage years.

The next player to take a look at is the 14-time Grand Slam champion, Pete Sampras:

The first thing to note is that there would have been matches that he played during his teenage years, but, as the data only stretches back to 1991, we cannot model that for now. I will not go into too much detail here, but it is interesting to see the sudden revival during the final year of his career, strongly driven by his US Open success.

The final player to look at is Andrei Medvedev:

Medvedev is a curious case as he appears to have reached his peak at the age of just 18. He had one of the highest ratings of any player in the database at 18, but it appears as though that was as good as it was going to get. He dropped rapidly from that peak, managed to stabilise at around 105 for his early twenties, but dropped rapidly again as he passed 26 and eventually retired at the age of 29.

As an interesting comparison, here are those four players all on the same graph:

We can see Medvedev's remarkable early career spike and just how good Rafael Nadal was at a very young age. We can also observe that past the age of 22, both Rafael Nadal and Roger Federer were better than Pete Sampras. One final interesting observation is that it would appear that Rafael Nadal's peak is very fractionally higher than Roger Federer's peak.

This has been a very basic introduction to the age curve work. However, it will be interesting to see whether we can find any general trends as to where players tend to peak. One thought might be that certain groups of players improve and decline in different ways. For example, we might expect players that rely on a big serve (think Ivo Karlovic and John Isner) to decline far more slowly, given that a slight loss of speed and stamina may not affect them, while players that rely more on their speed around the court and dogged returning (think David Ferrer and Juan Carlos Ferrero) may suddenly drop off a cliff as their physical attributes fall.

These are all ideas for future work. Any further ideas or suggestions around this area would be very welcome, so feel free to mention them either on Twitter or in the comments section below.