
Whence we know the photon exists

In my previous post, I laid out the argument for why the photoelectric effect does not imply the existence of photons. In this post, I want to outline, not the first, but probably the conceptually simplest experiment that showed that photons do indeed exist. It was performed by Grangier, Roger and Aspect in 1986, and the paper can be found at this link (PDF!).

The idea can be described by considering the following simple experiment. Imagine I have light impinging on a 50/50 beamsplitter and detectors at both of the output ports, as pictured below. In this configuration, 50% of the light will be transmitted, labeled t below, and 50% of the light will be reflected, labeled r below.

BeamSplitter

Now, if a discrete and indivisible packet of light, i.e. a photon, is shone on the beam splitter, then it must either be reflected (and hit D1) or be transmitted (and hit D2). The detectors are forbidden from clicking in coincidence. However, there is one particularly tricky thing about this experiment. How do I ensure that I only fire a single photon at the beam splitter?

This is where Aspect, Roger and Grangier provide us with a rather ingenious solution. They used a two-photon cascade from a calcium atom to solve the issue. For the purpose of this post, one only needs to know that when a photon excites the calcium atom to an excited state, it emits two photons as it relaxes back down to the ground state. This is because it relaxes first to an intermediate state and then to the ground state. This process is so fast that the photons are essentially emitted simultaneously on experimental timescales.

Now, because the calcium atom relaxes in this way, the first photon can be used to trigger the detectors to turn them on, and the second photon can impinge on the beam splitter to determine whether there are coincidences among the detectors. A schematic of the experimental arrangement is shown below (image taken from here; careful, it’s a large PDF file!):

GRA experiment

Famously, they were essentially able to extrapolate their results and show that the photons are perfectly anti-correlated, i.e. that when a photon reflects off of the beam splitter, there is no transmitted photon and vice versa. Alas the photon!
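To make the anti-correlation argument concrete, here is a minimal Monte Carlo sketch in Python. It estimates the quantity \alpha = P_c/(P_r P_t), where P_r, P_t and P_c are the per-gate probabilities of a click on the reflected detector, on the transmitted detector, and on both in coincidence. For a classical wave one expects \alpha \geq 1, while an ideal single photon gives \alpha = 0. The function names are my own, the detectors are idealized (perfect efficiency, no dark counts), and modeling the classical pulse as a Poissonian number of independent quanta is an assumption of this sketch, not a description of the actual experiment.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def anticorrelation(n_gates, photons_per_gate, seed=0):
    """Estimate alpha = P_c / (P_r * P_t) at an ideal 50/50 beam splitter.

    photons_per_gate is a callable taking the random generator and returning
    how many quanta arrive during one detection gate (hypothetical model).
    """
    rng = random.Random(seed)
    n_r = n_t = n_c = 0
    for _ in range(n_gates):
        n = photons_per_gate(rng)
        # Each quantum independently goes to the reflected or transmitted port.
        reflected = sum(rng.random() < 0.5 for _ in range(n))
        transmitted = n - reflected
        hit_r = reflected > 0
        hit_t = transmitted > 0
        n_r += hit_r
        n_t += hit_t
        n_c += hit_r and hit_t
    p_r, p_t, p_c = n_r / n_gates, n_t / n_gates, n_c / n_gates
    return p_c / (p_r * p_t)


# Exactly one photon per gate: the two detectors never click together.
alpha_single = anticorrelation(20000, lambda rng: 1)

# Poissonian pulses with a mean of 0.5 quanta per gate (illustrative value).
alpha_poisson = anticorrelation(100000, lambda rng: poisson(rng, 0.5))
```

In this toy model, alpha_single is exactly zero, while alpha_poisson should come out close to one, which is the boundary of the classical regime.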

However, they did not stop there. To show that quantum mechanical superposition applies to single photons, they sent these single photons through a Mach-Zehnder interferometer (depicted schematically below, image taken from here).

mach-zehnder

They were able to show that single photons do indeed interfere. The fringes were observed with visibility of about 98%. A truly triumphant experiment that showed not only the existence of photons cleanly, but that their properties are non-classical and can be described by quantum mechanics!
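For readers who want to see where the fringes come from, the following is a small sketch of an ideal, lossless Mach-Zehnder interferometer acting on a single-photon amplitude. The beam splitter convention (a factor of i on reflection) and the function name are assumptions of mine, and the real experiment of course has imperfections that this ignores.

```python
import cmath
import math


def mach_zehnder_probs(phi):
    """Return the detection probabilities at the two output ports.

    phi is the relative phase between the two arms. Each beam splitter is
    modeled by the symmetric 50/50 matrix (1/sqrt(2)) * [[1, i], [i, 1]].
    """
    s = 1 / math.sqrt(2)
    a0, a1 = 1 + 0j, 0j  # the photon enters through port 0 only

    # First beam splitter puts the photon in a superposition of both arms.
    a0, a1 = s * (a0 + 1j * a1), s * (1j * a0 + a1)

    # One arm picks up an extra phase phi.
    a0 *= cmath.exp(1j * phi)

    # Second beam splitter recombines the two arms.
    a0, a1 = s * (a0 + 1j * a1), s * (1j * a0 + a1)

    return abs(a0) ** 2, abs(a1) ** 2


# Scan the phase and compute the fringe visibility at output port 0.
scan = [mach_zehnder_probs(2 * math.pi * k / 100)[0] for k in range(101)]
visibility = (max(scan) - min(scan)) / (max(scan) + min(scan))
```

With this convention the two outputs go as sin^2(\phi/2) and cos^2(\phi/2), so the ideal visibility is one. A measured visibility slightly below this would reflect imperfections that the sketch leaves out.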

The photoelectric effect does not imply photons

When I first learned quantum mechanics, I was told that we knew that the photon existed because of Einstein’s explanation of the photoelectric effect in 1905. As the frequency of light impinging on the cathode material was increased, electrons came out with higher kinetic energies. This led to Einstein’s famous formula:

K.E. = \hbar\omega - W.F.

where K.E. is the kinetic energy of the outgoing electron, \hbar\omega is the photon energy and W.F. is the material-dependent work function.

Since the mid-1960s, however, we have known that the photoelectric effect does not definitively imply the existence of photons. From the photoelectric effect alone, it is actually ambiguous whether it is the electronic levels or the impinging radiation that should be quantized!

So, why do we still give the photon explanation to undergraduates? To be perfectly honest, I’m not sure whether we do this because of some sort of intellectual inertia or because many physicists don’t actually know that the photoelectric effect can be explained without invoking photons. It is worth noting that Willis E. Lamb, who played a large role in the development of quantum electrodynamics, implored other physicists to be more cautious when using the word photon (see for instance his 1995 article entitled Anti-Photon, which gives an interesting history of the photon nomenclature and his thoughts as to why we should be wary of its usage).

Let’s return to 1905, when Einstein came up with his explanation of the photoelectric effect. Just five years prior, Planck had heuristically explained the blackbody radiation spectrum and, in the process, evaded the ultraviolet catastrophe that plagued explanations based on the classical equipartition theorem. Planck’s distribution consequently provided the first evidence of “packets of light”, with energy quantized in units of \hbar\omega. At the time, Bohr had yet to come up with his atomic model that suggested that electron levels were quantized, which had to wait until 1913. Thus, from his vantage point in 1905, Einstein made the most reasonable assumption at the time: that it was the radiation that was quantized and not the electronic levels.

Today, however, we have the benefit of hindsight.

According to Lamb’s aforementioned Anti-Photon article, in 1926, G. Wentzel and G. Beck showed that one could use a semi-classical theory (i.e. electronic energy levels are quantized, but light is treated classically) to reproduce Einstein’s result. In the mid- to late 1960s, Lamb and Scully extended the original treatment and made a point of emphasizing that one could get out the photoelectric effect without invoking photons. The main idea can be sketched if you’re familiar with the Fermi golden rule treatment of a harmonic electric field perturbation of the form \textbf{E}(t) = \textbf{E}_0 e^{-i \omega t}, where \omega is the frequency of the incoming light. In the dipole approximation, we can write the potential as V(t) = -e\textbf{x}(t)\cdot\textbf{E}(t) and we get that the transition rate is:

R_{i \rightarrow f} = \frac{1}{t} \frac{1}{\hbar^2}|\langle{f}|e\textbf{x}(t)\cdot\textbf{E}_0|i \rangle|^2 [\frac{\textrm{sin}((\omega_{fi}-\omega)t/2)}{(\omega_{fi}-\omega)/2}]^2

Here, \hbar\omega_{fi} = (E_f - E_i) is the difference in energies between the initial and final states. Now, there are a couple of things to note about the above expression. First, the term in brackets (containing the sinusoidal function) peaks when \omega_{fi} \approx \omega. This means that when the incoming light is resonant between the ground state and a higher energy level, the transition rate sharply increases.
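As a quick numerical illustration of how sharply the bracketed term peaks, here is a short sketch that evaluates it as a function of the detuning \omega_{fi} - \omega. The function name and the sample values of the interaction time are my own choices for illustration.

```python
import math


def bracket_term(detuning, t):
    """Evaluate [sin(detuning * t / 2) / (detuning / 2)]**2.

    detuning stands for (omega_fi - omega). At zero detuning the expression
    is replaced by its limiting value t**2.
    """
    if detuning == 0.0:
        return t ** 2
    return (math.sin(detuning * t / 2) / (detuning / 2)) ** 2


t = 50.0  # interaction time in arbitrary units (illustrative value)
on_resonance = bracket_term(0.0, t)
first_zero = bracket_term(2 * math.pi / t, t)
off_resonance = bracket_term(1.0, t)
```

The peak height grows as t^2 while the first zero sits at a detuning of 2\pi/t, so the longer the light is on, the more strictly the condition \omega_{fi} \approx \omega is enforced.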

Let us now interpret this expression with regard to the photoelectric effect. In this case, there exists a continuum of final states which are of the form \langle \textbf{r}|f\rangle \sim e^{i\textbf{k}\cdot\textbf{r}}, and as long as \hbar\omega > W.F., where W.F. is the work function of the material, we recover \hbar\omega = W.F. + K.E., where K.E. represents the energy given to the electron in excess of the work function. Thus, we recover Einstein’s formula from above!
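To put some numbers on the formula, here is a minimal sketch that computes the kinetic energy of the outgoing electron from the wavelength of the light. The work function of 2.3 eV is an illustrative value I picked and does not refer to any particular material discussed in this post. The function name is likewise hypothetical.

```python
HC_EV_NM = 1239.841984  # h*c expressed in eV * nm


def kinetic_energy_ev(wavelength_nm, work_function_ev):
    """Return K.E. = hbar*omega - W.F. in eV, or None below threshold."""
    light_energy_ev = HC_EV_NM / wavelength_nm  # hbar * omega
    if light_energy_ev <= work_function_ev:
        return None  # no final states are accessible, so no emission
    return light_energy_ev - work_function_ev


# Illustrative work function of 2.3 eV (an assumption, not from the post).
ke_violet = kinetic_energy_ev(400.0, 2.3)
ke_red = kinetic_energy_ev(700.0, 2.3)
```

In this example the 400 nm light ejects electrons with roughly 0.8 eV of kinetic energy, while the 700 nm light is below threshold and gives nothing, regardless of whether one attributes the \hbar\omega to a photon or to the resonance condition in the transition rate.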

In addition to this, however, we also see from the above expression that the current on the photodetector is proportional to \textbf{E}^2_0, i.e. the intensity of light impinging on the cathode. Therefore, this semi-classical treatment improves upon Einstein’s treatment in the sense that the relation between the intensity and current also naturally falls out.

From this reasoning, we see that the photoelectric effect does not logically imply the existence of photons.

We do have many examples that non-classical light does exist and that quantum fluctuations of light play a significant role in experimental observations. Some examples are photon anti-bunching, spontaneous emission, the Lamb shift, etc. However, I do agree with Lamb and Scully that the notion of a photon is indeed a challenging one and that caution is needed!

A couple further interesting reads on this subject at a non-technical level can be found here: The Concept of the Photon in Physics Today by Scully and Sargent and The Concept of the Photon – Revisited in OPN Trends by Muthukrishnan, Scully and Zubairy (pdf!)

An Undergraduate Optics Problem – The Brewster Angle

Recently, a lab-mate of mine asked me if there was an intuitive way to understand Brewster’s angle. After trying to remember how Brewster’s angle was explained to me from Griffiths’ E&M book, I realized that I did not have a simple picture in my mind at all! Griffiths’ E&M book uses the rather opaque Fresnel equations to obtain the Brewster angle. So I did a little bit of thinking and came up with a picture I think is quite easy to grasp.

First, let me briefly remind you what Brewster’s angle is, since many of you have probably not thought of the concept for a long time! Suppose my incident light beam has both components, s- and p-polarization. (In case you don’t remember, p-polarization is parallel to the plane of incidence, while s-polarization is perpendicular to the plane of incidence, as shown below.) If unpolarized light is incident on a medium, say water or glass, there is an angle of incidence, the Brewster angle, at which the reflected light comes out perfectly s-polarized.

An addendum to this statement is that if the incident beam was perfectly p-polarized to begin with, there is no reflection at the Brewster angle at all! A quick example of this is shown in this YouTube video:

So after that little introduction, let me give you the “intuitive explanation” as to why these weird polarization effects happen at the Brewster angle. First of all, note one important fact: at the Brewster angle, the refracted beam and the reflected beam are at 90 degrees with respect to each other. This is shown in the image below:

Why is this important? Well, you can think of the reflected beam as light arising from the electrons jiggling in the medium (i.e. the incident light comes in, strikes the electrons in the medium and these electrons re-radiate the light).

However, an oscillating charge does not radiate along its axis of motion; the emission is strongest in directions perpendicular to that axis. Therefore, when the light is purely p-polarized, there is no light to reflect when the reflected and refracted rays are orthogonal: the electrons in the medium oscillate along the direction the reflected ray would take, and the reflected beam can’t have its polarization along its own direction of propagation! This is shown in the right image above and is what gives rise to the reflectionless beam in the YouTube video.

This picture lets one derive the celebrated Brewster angle equation by combining Snell’s law:

n_1 \textrm{sin}(\theta_B) = n_2 \textrm{sin}(\theta_2)

and

\theta_B + \theta_2 = 90^\circ

to obtain:

\textrm{tan}(\theta_B) = n_2/n_1.
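For completeness, the intermediate step can be written out as follows. It uses nothing beyond the two relations above and the identity \sin(90^\circ - \theta) = \cos\theta.

```latex
\begin{align}
\theta_2 &= 90^\circ - \theta_B \\
n_1 \sin\theta_B &= n_2 \sin(90^\circ - \theta_B) = n_2 \cos\theta_B \\
\tan\theta_B &= \frac{n_2}{n_1}
\end{align}
```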

The picture also suggests one more thing: when the incident light has an s-polarization component, the reflected beam must come out perfectly s-polarized at the Brewster angle. This is because only the s-polarized light jiggles the electrons in a way that they can re-radiate in the direction of the outgoing beam. The image below shows the effect a polarizing filter can therefore have when looking at water near the Brewster angle, which is around 53 degrees for water.
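As a sanity check on the 53 degree figure, and on the claim that only the s-polarized component survives reflection, here is a short sketch that evaluates the standard Fresnel amplitude reflection coefficients at the Brewster angle. The refractive index of 1.33 for water and the function names are assumptions of mine.

```python
import math


def brewster_angle(n1, n2):
    """Brewster angle in radians from tan(theta_B) = n2 / n1."""
    return math.atan2(n2, n1)


def fresnel_amplitudes(n1, n2, theta_i):
    """Return the amplitude reflection coefficients (r_s, r_p).

    theta_i is the angle of incidence in radians. The refraction angle
    comes from Snell's law, assuming no total internal reflection.
    """
    theta_t = math.asin(n1 * math.sin(theta_i) / n2)
    cos_i, cos_t = math.cos(theta_i), math.cos(theta_t)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)
    return r_s, r_p


n_air, n_water = 1.0, 1.33  # assumed refractive indices
theta_b = brewster_angle(n_air, n_water)
theta_b_deg = math.degrees(theta_b)
r_s, r_p = fresnel_amplitudes(n_air, n_water, theta_b)
```

Here theta_b_deg comes out at about 53 degrees, r_p vanishes to within rounding error, and r_s stays finite, so the reflected beam is purely s-polarized. The Fresnel equations agree with the dipole picture; they are just a less intuitive way of getting there.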

To me, this is a much simpler way to think about the Brewster angle than dealing with the Fresnel equations.

Nonlinear Response and Harmonics

Because we are so often solving problems in quantum mechanics, it is sometimes easy to forget that certain effects also show up in classical physics and are not “mysterious quantum phenomena”. One of these is the problem of avoided crossings or level repulsion, which can be much more easily understood in the classical realm. I would argue that the Fano resonance also represents a case where a classical model is more helpful in grasping the main idea. Perhaps not too surprisingly, a variant of the classical harmonic oscillator problem is used to illustrate the respective concepts in both cases.

There is also another cute example that illustrates why overtones of the natural frequency appear when an oscillator is subject to a slightly nonlinear restoring force. The solution to this problem shows why harmonic distortions often affect speakers; sometimes speakers emit frequencies not present in the original electrical signal. Furthermore, it shows why second harmonic generation can result when intense light is incident on a material.

First, imagine a perfectly harmonic oscillator with a potential of the form V(x) = \frac{1}{2} kx^2. We know that such an oscillator, if displaced from its equilibrium position, will oscillate at its natural frequency \omega_0 = \sqrt{k/m}, with the position varying as x(t) = A \textrm{cos}(\omega_0 t + \phi). The potential and the position of the oscillator as a function of time are shown below:

harmpotentialrepsonse

(Left) Harmonic potential as a function of position. (Right) Variation of the position of the oscillator with time

Now imagine that in addition to the harmonic part of the potential, we also have a small additional component such that V(x) = \frac{1}{2} kx^2 + \frac{1}{3}\epsilon x^3, so that the potential now looks like so:

nonlinearharm

The equation of motion is now nonlinear:

\ddot{x} = -c_1x - c_2x^2

where c_1 = k/m and c_2 = \epsilon/m are constants. It is easy to see that if the amplitude of oscillations is small enough, there will be very little difference between this case and the case of the perfectly harmonic potential.

However, if the amplitude of the oscillations gets a little larger, there will clearly be deviations from the pure sinusoid. So then what does the position of the oscillator look like as a function of time? Perhaps not too surprisingly, considering the title, not only are there oscillations at \omega_0, but a harmonic component at 2\omega_0 is introduced as well.

While the differential equation can’t be solved exactly without resorting to numerical methods, that the harmonic component is introduced can be seen within the framework of perturbation theory. In this context, all we need to do is plug the solution to the simple harmonic oscillator, x(t) = A\textrm{cos}(\omega_0t +\phi) into the nonlinear equation above. If we do this, the last term becomes:

-c_2A^2\textrm{cos}^2(\omega_0t+\phi) = -c_2 \frac{A^2}{2}(1 + \textrm{cos}(2\omega_0t+2\phi)),

showing that we get oscillatory components at twice the natural frequency. Although this explanation is a little crude, one can already start to see why nonlinearity often leads to higher frequency harmonics.
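Since the perturbative argument is admittedly crude, it is reassuring to check it numerically. The sketch below integrates the nonlinear equation of motion with a standard fourth-order Runge-Kutta scheme and then picks out the spectral weight at chosen frequencies using a Hann-windowed Fourier sum. The parameter values (c_1 = 1, c_2 = 0.1, A = 0.5) and the function names are illustrative choices of mine.

```python
import cmath
import math


def simulate(c1, c2, amplitude, dt=0.01, t_max=200.0):
    """Integrate x'' = -c1*x - c2*x**2 with RK4, starting at rest at x = amplitude."""
    def accel(x):
        return -c1 * x - c2 * x * x

    x, v = amplitude, 0.0
    xs = [x]
    for _ in range(int(round(t_max / dt))):
        k1x, k1v = v, accel(x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        xs.append(x)
    return xs


def spectral_amplitude(xs, dt, omega):
    """Hann-windowed estimate of the amplitude of xs at angular frequency omega."""
    n = len(xs)
    total, weight = 0j, 0.0
    for k, x in enumerate(xs):
        w = 0.5 * (1 - math.cos(2 * math.pi * k / (n - 1)))
        total += w * x * cmath.exp(-1j * omega * k * dt)
        weight += w
    return 2 * abs(total) / weight


dt = 0.01
linear = simulate(1.0, 0.0, 0.5, dt)      # purely harmonic, omega_0 = 1
nonlinear = simulate(1.0, 0.1, 0.5, dt)   # small quadratic force term

fundamental = spectral_amplitude(nonlinear, dt, 1.0)
second_linear = spectral_amplitude(linear, dt, 2.0)
second_nonlinear = spectral_amplitude(nonlinear, dt, 2.0)
```

In the harmonic case there is essentially nothing at 2\omega_0, while the nonlinear run shows a small but clear component there. Its size should be of order c_2 A^2/(6\omega_0^2), which is what one gets by solving the driven linear equation with the \textrm{cos}(2\omega_0 t + 2\phi) term as the source.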

With respect to optical second harmonic generation, there is also one important ingredient that should not be overlooked in this simplified model. This is the fact that frequency doubling is possible only when there is an x^3 component in the potential. This means that the potential needs to be inversion asymmetric. Indeed, second harmonic generation is only possible in inversion asymmetric materials (which is why ferroelectric materials are often used to produce second harmonic optical signals).
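The role of inversion symmetry can be seen with the same trick. Suppose the anharmonic term is instead symmetric, say a quartic term in the potential, so that the extra force is cubic with some constant c_3 (a symbol I am introducing only for this illustration). Plugging in the harmonic solution then gives:

```latex
\begin{align}
V(x) &= \frac{1}{2} k x^2 + \frac{1}{4} \epsilon' x^4, \qquad
\ddot{x} = -c_1 x - c_3 x^3 \\
-c_3 A^3 \cos^3(\omega_0 t + \phi) &= -c_3 \frac{A^3}{4}
\left[ 3\cos(\omega_0 t + \phi) + \cos(3\omega_0 t + 3\phi) \right]
\end{align}
```

At this order only \omega_0 and 3\omega_0 appear, with no component at 2\omega_0, which is consistent with the statement that an inversion symmetric potential does not produce a second harmonic.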

Because of its conceptual simplicity, it is often helpful to think about physical problems in terms of the classical harmonic oscillator. It would be interesting to count how many Nobel Prizes have been given out for problems that have been reduced to some variant of the harmonic oscillator!

An Interesting Research Avenue, an Update, and a Joke

An Interesting Research Avenue: A couple months ago, Stephane Mangin of the Institut Jean Lamour gave a talk on all-optical helicity-dependent magnetic switching (what a mouthful!) at Argonne, which was fascinating. I was reminded of the talk yesterday when a review article on the topic appeared on the arXiv. The basic phenomenon is that in certain magnetic materials, one is able to send in a femtosecond laser pulse and switch the direction of magnetization using circularly polarized light. This effect is reversible (in the sense that circularly polarized light of the opposite helicity will result in a magnetization in the opposite direction) and is reproducible. During the talk, Mangin was able to show us some remarkable videos of the phenomenon, which unfortunately, I wasn’t able to find online.

The initial study that sparked a lot of this work was this paper by Beaurepaire et al., which showed ultrafast demagnetization in nickel films in 1996, a whole 20 years ago! The more recent study that triggered most of the current work was this paper by Stanciu et al. in which it was shown that the magnetization direction could be switched with a circularly polarized 40-femtosecond laser pulse on ferrimagnetic film alloys of GdFeCo. For a while, it was thought that this effect was specific to the GdFeCo material class, but it has since been shown that all-optical helicity-dependent magnetic switching is actually a more general phenomenon and has been observed now in many materials (see this paper by Mangin and co-workers for example). It will be interesting to see how this research plays out with respect to the magnetic storage industry. The ability to read and write on the femtosecond to picosecond timescale is definitely something to watch out for.

Update: After my post on the Gibbs paradox last week, a few readers pointed out that there exists some controversy over the textbook explanation that I presented. I am grateful that they provided links to some articles discussing the subtleties involved in the paradox. Although one commenter suggested Appendix D of E. Atlee Jackson’s textbook, I was not able to get a hold of this. It looks like a promising textbook, so I may end up just buying it, however!

The links that I found helpful about the Gibbs paradox were Jaynes’ article (pdf!) and this article by R. Swendsen. In particular, I found Jaynes’ discussion of Whifnium and Whoofnium interesting for what it says about the role that ignorance and knowledge play in our ability to extract work from partitioned gases. Swendsen tries to redefine entropy classically (what he calls Boltzmann’s definition of entropy), which I have to think about a little more. But at the moment, I don’t think I buy his argument that this resolves the Gibbs paradox completely.

A Joke: 

Q: What did Mrs. Cow say to Mr. Cow?

A: Hubby, could you please mooo the lawn?

Q: What did Mr. Cow say back to Mrs. Cow?

A: But, sweetheart, then what am I going to eat?