Goodhart’s Law and Citation Metrics

According to Wikipedia, Goodhart’s law colloquially states that:

“When a measure becomes a target, it ceases to be a good measure.”

It was originally formulated as an economics principle, but has been found to be applicable in a much wider variety of circumstances. Let’s take a look at a few examples to understand what this principle means.

Police departments are often graded using crime statistics. In the US in particular, a combined index of eight crime categories constitutes the “crime index”. In 2014, Chicago magazine reported that the huge apparent crime reduction in Chicago was largely due to the reclassification of certain crimes. Here is the summary plot they showed:

[Figure: summary plot of Chicago crime statistics, reprinted from Chicago magazine]

In effect, some felonies were labeled misdemeanors, etc. The manipulation of the “crime index” corrupted the way the police did their jobs.

Another famous example of Goodhart’s law involves PageRank, the ranking algorithm behind Google search. Crudely, PageRank works in the following way, as described by Wikipedia:

“PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.”
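To make the idea concrete, here is a minimal sketch of PageRank as a power iteration. This is only an illustration of the principle described above, not Google’s actual implementation; the toy link graph, damping factor, and iteration count are all invented for the example:

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=100):
    """Toy PageRank: adj[i][j] = 1 if page i links to page j."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Each page splits its "vote" evenly among its outlinks;
    # a page with no outlinks votes uniformly for everyone.
    out = adj.sum(axis=1, keepdims=True)
    transition = np.where(out > 0, adj / np.where(out == 0, 1, out), 1.0 / n)
    rank = np.full(n, 1.0 / n)  # start with a uniform ranking
    for _ in range(iters):
        # Mix a uniform "teleport" term with votes flowing along links.
        rank = (1 - damping) / n + damping * rank @ transition
    return rank

# Three pages: 0 and 1 both link to 2; 2 links back to 0.
adj = [[0, 0, 1],
       [0, 0, 1],
       [1, 0, 0]]
ranks = pagerank(adj)
print(ranks)  # page 2, with two inbound links, ranks highest
```

The comment-spam scheme mentioned below amounts to adding rows of inbound links to one’s own page in this matrix, which is exactly why the measure degrades once people target it.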

Knowing how PageRank works has naturally led to its manipulation. People seeking greater visibility and a higher ranking in Google searches have used several schemes to inflate their rating. One of the most popular is to post links to one’s own website in the comments sections of highly ranked websites. You can read a little more about this and other schemes here (pdf!).

With the increased use of citation metrics in the academic community, it should come as no surprise that they, too, can be gamed. Papers increasingly carry many authors, since every co-author can take full credit for a paper when the h-index is used as the scale. Many scientists also spend time emailing colleagues to urge them to cite their papers (though I only know of this happening anecdotally).
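For readers unfamiliar with the metric, the h-index is the largest h such that an author has h papers each cited at least h times. The toy calculation below (with made-up citation counts) also shows why co-authorship inflates it: two authors who co-author everything each claim the full paper list, while splitting the same papers halves each author’s index:

```python
def h_index(citations):
    """Largest h such that h papers each have >= h citations."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # -> 4

# Ten papers with 12 citations each (invented numbers):
papers = [12] * 10
print(h_index(papers))      # co-authoring all ten: each author gets h = 10
print(h_index(papers[:5]))  # splitting them 5/5: each author gets h = 5
```

Since every co-author counts the whole paper, the metric rewards adding names rather than adding science.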

Since the academic example hits home for most of the readers of this blog, let me try to formulate a list of the beneficial and detrimental consequences of bean-counting:

Advantages:

  1. One learns how to write a technical paper early in one’s career.
  2. It can motivate some people to be more efficient with their time.
  3. It provides some sort of metric by which to measure scientific competence (though it can be argued that any currently existing index is wholly inadequate, and will always be inadequate in light of Goodhart’s law!).
  4. Please feel free to share any ideas in the comments section, because I honestly cannot think of any more!

Disadvantages:

  1. It makes researchers focus on short-term problems instead of long-term, moon-shot kinds of problems.
  2. The community loses good scientists because they are deemed not productive enough. A handful of the best students I came across in graduate school left physics because they didn’t want to “play the game”.
  3. It rewards those who are more career-oriented and focused on short-term science, leading to an overpopulation of such people in the scientific community.
  4. It may lead scientists to cut corners and even go as far as to falsify data. I have addressed some of these concerns before in the context of psychology departments.
  5. It provides an incentive to flood the literature with papers of low quality. It is no secret that the number of publications has ballooned in the last couple of decades. Though quality is hard to quantify, I cannot imagine that scientists have been able to publish more without sacrificing quality in some way.
  6. It takes the focus of scientists’ jobs away from science, and makes scientists concerned with an almost meaningless number.
  7. It leads authors to overstate the importance of their results in an effort to publish in higher-profile journals.
  8. It does not value potential. Researchers who would have excelled in their later years, but not their earlier ones, are undervalued; late bloomers therefore go unappreciated.

Just by examining my own behavior against the above lists, I can say that my actions have been altered by the existence of citation and publication metrics. Especially towards the end of graduate school, I started pursuing shorter-term problems so that they would result in publications. Obviously, I am not the only one who suffers from this syndrome. The best one can do in this scenario is to work on longer-term problems on the side, while producing a steady stream of papers on shorter-term projects.

In light of the double-slit experiment, it seems ironic that physicists alter their behavior precisely because they are being measured.