
Data Representation and Trust

Though popular media often portrays science as purely objective, there are many subjective sides to it as well. One of these is the trust we place in our peers to tell the truth.

For instance, in most experimental papers, one can only present an illustrative portion of all the data taken because of the sheer volume of data usually acquired. What is presented is supposed to be a representative sample. However, as readers, we are never sure this is actually the case. We trust that our experimental colleagues have presented the data in a way that is honest, illustrative of all the data taken, and reproducible under similar conditions. It is increasingly common to publish the remaining data in the supplemental section, but the sheer amount of data taken can easily overwhelm this section as well.

When writing a paper, an experimentalist also has to make certain choices about how to represent the data. Increasingly, the amount of data at the experimentalist’s disposal means that they often choose to show the data using some sort of color scheme in a contour or color density plot. Just take a flip through Nature Physics, for example, to see how popular this style of data representation has become. Almost every cover of Nature Physics is supplied by this kind of data.

However, there are some dangers that come with color schemes if the colors are not chosen appropriately. There is a great post at medvis.org on the ills of using, for example, the rainbow color scheme, and how misleading it can be in certain circumstances. Make sure to also take a look at the articles cited therein to get a flavor of what these schemes can do. In particular, there is a paper called “Rainbow Color Map (Still) Considered Harmful”, which has several noteworthy comparisons of different color schemes, including ones that are and are not perceptually linear. Take a look at the plots below and compare the different color schemes chosen to represent the same data set (taken from that paper):

[Figure: the same data set rendered in several color schemes, including the rainbow map]

The rainbow scheme appears to show more drastic gradients in comparison to the other color schemes. My point, though, is that by choosing certain color schemes, an experimentalist can artificially enhance an effect or obscure one they do not want the reader to notice.
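One rough way to see where those exaggerated gradients come from is to track how bright each color is along the scale. The following is a minimal sketch under stated assumptions, not the method of the cited paper: `rainbow` is a hypothetical hue sweep from blue to red built with Python's `colorsys`, and `luma` uses the Rec. 709 weights as a crude stand-in for perceived brightness.

```python
import colorsys


def rainbow(t):
    """Hypothetical rainbow map: hue runs from blue (t=0) to red (t=1)."""
    hue = (1.0 - t) * 2.0 / 3.0  # hue 2/3 is blue, hue 0 is red
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


def gray(t):
    """Grayscale map: brightness rises in step with the data value."""
    return (t, t, t)


def luma(rgb):
    """Rec. 709 weighted sum, used here as a rough brightness proxy (assumption)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_monotonic(values):
    """True if the sequence never decreases."""
    return all(b >= a for a, b in zip(values, values[1:]))


# Sample both maps at 21 evenly spaced data values between 0 and 1.
samples = [i / 20 for i in range(21)]
rainbow_luma = [luma(rainbow(t)) for t in samples]
gray_luma = [luma(gray(t)) for t in samples]

print("rainbow brightness monotonic:", is_monotonic(rainbow_luma))
print("gray brightness monotonic:   ", is_monotonic(gray_luma))
```

In this sketch the brightness of the rainbow map rises and falls several times as the data value increases steadily, which the eye can read as edges or bands that are not in the data. The grayscale map has no such reversals.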

In fact, the experimentalist makes many choices when publishing a paper (the size of an image, the bounds of the axes, the scale of the axes (e.g. linear vs. log), the outliers omitted, etc.), all of which can have profound effects on the message of the paper. This is why there is an underlying issue of trust that lurks within the community. We trust that experimentalists choose to exhibit data in an attempt to be as honest as they can be. Of course, there are always subconscious biases lurking when these choices are made. But my hope is that experimentalists are mindful and introspective when representing data, doubting themselves to a healthy extent before publishing results.

To be a part of the scientific community means that, among other things, you are accepted for your honesty and that your work is (hopefully) trustworthy. A breach of this implicit contract is seen as a grave offence and is why cases of misconduct are taken so seriously.

Misconduct and The Wire

Season five of the critically acclaimed TV show The Wire tackles the issue of journalistic fraud and misconduct. In particular, Scott Templeton, a young ambitious journalist at the Baltimore Sun, writes a series of articles where he embellishes details, conjures up quotes out of thin air and ultimately fabricates events. His articles win him wide praise among those in the journalism community. He also garners the Pulitzer Prize, one of the highest accolades one can earn in the field. Even though flags are raised by some of his peers at the Baltimore Sun, at the upper management level, Scott Templeton’s stories are celebrated with enthusiasm.

Of course, The Wire is fictional, but at the time it was written, there was precedent for such journalistic falsification. Stephen Glass at the New Republic, Janet Cooke at the Washington Post and Jayson Blair at the New York Times had all been found to have committed journalistic misconduct involving plagiarism or fabrication in an effort to advance their careers. Cooke was even awarded a Pulitzer Prize for her stories, which she eventually returned.

The reason I bring this all up is that I saw a very strong parallel between the fictional events in The Wire surrounding Scott Templeton and the actual events surrounding Jan Hendrik Schon. In both cases, their notebooks were empty, both claimed that their information (e.g. data and notes) had somehow been corrupted, and their sources were a closely guarded secret. While working at Bell Labs, Schon famously claimed to use the evaporator in Konstanz, Germany, so that he could “work” in isolation, making it more difficult for others to reproduce his methods.

The question as to why this kind of misconduct takes place is an interesting one. In the case of Jayson Blair, Wikipedia says:

On the NPR radio show Talk of the Nation, Blair explained that his fabrications started with what he thought was a relatively innocent infraction: using a quote from a press conference which he had missed. He described a gradual process whereby his ethical violations became worse and contended that his main motivation was a fear of not living up to the expectations that he and others had for his career.

As can be gleaned from the quote above, there is little doubt that a certain amount of careerism and elevated expectation is tied in with these instances of misconduct. That these and similar cases occur with relative frequency and happen in different fields suggests that the root cause is societal: an emphasis on perceived career success rather than on honesty and hard work. Because this is a sociological problem, all of us have a role to play in correcting it. The solution to the problem may require us to emphasize different values: integrity, meaningfulness of labor and honest motivations. Often these are not the qualities that advance one’s career, largely because they are not emphasized. Perhaps they should be.

While The Wire is a fictional show and some readers are no doubt a little fed up with my frequent references to it, I do think that one can learn a lot from its main themes. As Tim O’Brien, author of The Things They Carried, said:

That’s what fiction is for. It’s for getting at the truth when the truth isn’t sufficient for the truth.

Plastic Fantastic

In the past year, I got the chance to read Plastic Fantastic by Eugenie Samuel Reich, a nonfiction work following the short career of Jan Hendrik Schon. Just in case you haven’t heard of him, Schon was one of the biggest fraudsters in scientific history. In a short period between 2000 and 2001, Schon published a series of subfield-creating results ranging from superconductivity at 117 K in intercalated buckyballs to light-emitting field effect transistors. Most notably, he also announced the discovery of self-assembled molecular field effect transistors (SAMFETs), which would have had the potential to revolutionize the processors in one’s computer and thereby the economy. Most of his results, including the ones mentioned, were found to have been fabricated.

It is quite remarkable that Schon was able to publish 15 first-author papers in either Nature or Science between 2000 and 2001, while also publishing a whole slew of papers in other journals. Is this the absurd length one must go to in order to get caught? While physicists tend to be quite rigorous when trying to explain data, they tend to be much more trusting of the colleagues who produce the data.

Although the book can be quite gossipy at times, it achieves the goal of imparting to the reader a sense of skepticism about published data. While he may have been the most egregious of the lot, Schon is not alone in perpetrating scientific dishonesty (the recent case of STAP cells comes to mind). It is pretty clear that many cases of “fudging” and/or fabrication occur that go unpunished and are never brought to light.

One aspect of the book that I found particularly disturbing is the effect that Schon’s results had on the careers of some young scientists. Many graduate students spent years attempting to replicate his results without success, in what are considered the most important years of one’s scientific development. Some young scientific careers were no doubt destroyed because of Schon’s outlandish claims.

One cannot stress enough the importance of scientific integrity and reporting accurate, reproducible data. This book may not be the best-written, but it serves an important purpose in opening one’s eyes to the ridiculous lengths to which one must go before being found out as a fraudster. This book has left no doubt in my mind that I have read papers containing “fudged” data and also that I will do so in the future. I just hope that I don’t spend years attempting to reproduce such a result.