Comment: How well do wine scores work? - Andrew Jefford

Andrew Jefford goes running with the numbers.

Now that the 100-point scale approaches ubiquity, and now that we are fully into the post-Parker era in which a multiplicity of scores and scorers joust and jostle for the attention of drinkers, it’s time to review the scoring scene.

1. The scale doesn’t matter

In practice, the 100-point scale and the 20-point scale are the same thing. If you use the former, no wine scoring below 80 is worth reviewing; if you use the latter, no wine scoring below 10 is worth reviewing. So both scales have twenty points of graduation (since practitioners of the 20-point scale use half-points). Indeed most reviewed wines lie within fifteen points of graduation: anything lower is pure punishment, and critics are reluctant to waste time punishing, since it seems vindictive and there is in any case so much good wine to be enthusiastic about.

Whether a critic uses one system or the other is a matter of critical dress code: 20 points is old-fashioned formal attire and ‘European’, conveying respect and cautious sobriety; 100 points is casual, open-necked and globalist, implying unstuffiness and easy-going enthusiasm.

2. Scores are not universal

A universal scoring system does not exist. Critics sometimes protest otherwise, but all scores are relative, relating to the peer-group within which the reviewed wines lie. It has to be so, since the differences which exist between wine genres are so great as to make these genres quite literally incomparable. All this is right and proper, permitting the untrammelled assessment of quality within any particular peer group: of most use for both drinkers and producers. It must be possible to create (and to acclaim) a perfect Muscadet, a perfect Gewurztraminer or a perfect rosé wine.

Misunderstandings persist, though, for two reasons. One is that critics fear being thought foolish, so are reluctant to award high scores to the ‘lesser genres of wine’, though in relative terms these high scores may be merited.

The other reason is that the alluring simplicity of scores means that drinkers assume that the scoring system is indeed universal rather than relative. They would therefore assume that any 100-point Muscadet must be “as good as” Latour 2010 (the incorrect conclusion), rather than being “different to Latour 2010 but as good as Muscadet can ever be” (the correct conclusion).

We’re left with a mash-up of faux-universalism and sensible peer-group scoring, with both muddied further by cap-doffing to fashion biases among somms, bloggers and social-media chatterers, and by quite natural preferences on the part of critics for certain styles of wine. All very human, in sum. Treat scores with tender care.

3. Scoring is inflationary

How do scores make scorers famous in a world where many are jostling for influence? By a score achieving some kind of sales traction. Low scores, though they may be well-judged, don’t achieve sales traction; high scores do. This effect is amplified when producers start to market and promote their wine based on scores: they will obviously cite the highest, thereby increasing the fame of the most lavish scorers. Hence the inherent inflation in the scoring process. Yes, experienced users of scores learn to ‘discount’ the scores of certain critics while taking others at face value, but they are in a minority among those who buy wines based on scores, and by then the damage is done. This in turn leads to…

4. The tragedy of 89

Ask any Californian: a score of 89 is a disaster. It’s damnation by faint praise. Much the same holds sway in Australia, and increasingly in Europe, too: 89 is a tombstone score and a snub to ambition.

Yet with truly large cohorts of ‘assessable’ wines, like the annual Bordeaux or Burgundy harvests, excellent wines must be squeezed down to 89 or less by the mathematical jostling consequent on the region’s best wines topping out at, say, 96 or 97 for any vintage regarded (like 2017 in Bordeaux) as being good but not great. This is no less true, indeed, for great vintages topping out with 100-point scores, since in such vintages there are even more outstanding wines to nuance. In both scenarios, a score of 89 is very respectable indeed.

In Bordeaux, 89 is just about the maximum any ‘normal’ cru bourgeois — i.e. one which hasn’t yet been purchased by a classed growth or acquired the services of a celebrated ‘name’ consultant — can hope to be awarded. For this reason, it’s the score I always look for in any Bordeaux I buy, especially in a great vintage, since the price-quality ratio is always likely to be better (often much better) than for higher scoring wines. Indeed I’d suggest that a well-sited 89-point Bordeaux from a good or great vintage will, after half a decade’s storage, seem to most palates (if served blind) a better wine than most 93-point or 94-point reds from other regions: more proof, were more needed, that universal scores cannot and do not exist.

So what are we going to do about the tragedy of 89? How can we set about restoring the reputation of this maligned integer, and thereby render justice to 88 and 87, which should also be regarded in large-cohort regions, of which there are now many, as indubitably good scores? I don’t know, especially since in aspiring small-cohort regions any wine scoring 89 might truly be puffing and blowing a bit to keep up with the best (yes, scores are also relative to cohort size).

You’d think the problem would be less acute with the 20-point scale, since the symbolism of the first digit plays a less crucial role, yet somehow 14.5 sounds even dustier and more dismissive than does 89.

5. Score overload

More and more wine critics, more and more scores: drinkers (I suspect) are beginning to feel nauseous with score overload. At the same time, a lot of sighted fine-wine scoring now seems as if it is generated by artificial intelligence, based on pedigree and reputation, with the only interest accruing to wines which actually break their habitual scoring trajectory in some way or other.

(It’s an appallingly boring prospect, I know, but a lot of time and effort could be saved by giving each new regional vintage a single score as a vintage, to set an overall benchmark, then using the ‘underperform’, ‘neutral’ or ‘outperform’ terminology familiar from financial broking analysis for every individual wine in that vintage rather than playing around with numbers themselves.)

Perhaps there is a positive side to all of this, which is that the words written to accompany scores may become more scrutinised than of late, and the scores a little less. You should certainly use the written note to gauge how carefully a critic may have tasted a wine, and to reach an assessment of how credible or reliable that note might be. Notes, indeed, can indicate tasting prowess itself (or its strenuously camouflaged absence).

Look out, too, for ‘the authentic voice’ coming out from inside the AI tasting-note babble – and especially a sense of personal commitment and enthusiasm about a wine. You may, as I often do, prefer a wine with a lower score to one with a higher score based on what the critic has actually written about the wine, and the way she or he has described it. Then (assuming you’re not a label drinker) enjoy more pleasure for less money.