Tag Archives: optics

An Undergraduate Optics Problem – The Brewster Angle

Recently, a lab-mate of mine asked me if there was an intuitive way to understand Brewster’s angle. After trying to remember how Brewster’s angle was explained in Griffiths’ E&M book, I realized that I did not have a simple picture in my mind at all! Griffiths obtains the Brewster angle from the rather opaque Fresnel equations. So I did a little bit of thinking and came up with a picture I think is quite easy to grasp.

First, let me briefly remind you what Brewster’s angle is, since many of you have probably not thought about the concept for a long time! Suppose my incident light beam has both components, s– and p-polarization. (In case you don’t remember, p-polarization is parallel to the plane of incidence, while s-polarization is perpendicular to the plane of incidence, as shown below.) If unpolarized light is incident on a medium, say water or glass, there is an angle, the Brewster angle, at which the reflected light comes out perfectly s-polarized.

An addendum to this statement is that if the incident beam is perfectly p-polarized to begin with, there is no reflection at the Brewster angle at all! A quick example of this is shown in this YouTube video:

So after that little introduction, let me give you the “intuitive explanation” as to why these weird polarization effects happen at the Brewster angle. First of all, it is important to note one important fact: at the Brewster angle, the refracted beam and the reflected beam are at 90 degrees with respect to each other. This is shown in the image below:

Why is this important? Well, you can think of the reflected beam as light arising from the electrons jiggling in the medium (i.e. the incident light comes in, strikes the electrons in the medium and these electrons re-radiate the light).

However, radiation from an oscillating charge is not emitted along the axis of its motion. Therefore, when the incident light is purely p-polarized, the electrons in the medium jiggle along the direction of the would-be reflected ray, and no light can be radiated in that direction: the reflected beam cannot be polarized along its own direction of propagation! This is shown in the right image above and is what gives rise to the reflectionless beam in the YouTube video.

This visual aid enables one to use Snell’s law to obtain the celebrated Brewster angle equation:

$n_1 \textrm{sin}(\theta_B) = n_2 \textrm{sin}(\theta_2)$

and

$\theta_B + \theta_2 = 90^\circ$

to obtain:

$\textrm{tan}(\theta_B) = n_2/n_1$.

The equations also suggest one more thing: when the incident light has an s-polarization component, the reflected beam must come out perfectly s-polarized at the Brewster angle. This is because only the s-polarized light jiggles the electrons in a way that they can re-radiate in the direction of the outgoing beam. The image below shows the effect a polarizing filter can therefore have when looking at water near the Brewster angle, which is around 53 degrees for an air-water interface.
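Since everything here follows from $\textrm{tan}(\theta_B) = n_2/n_1$, the numbers are easy to check yourself. Here is a quick numerical sketch (my own, with typical indices of refraction assumed for water and glass):

```python
import math

def brewster_angle(n1, n2):
    """Angle of incidence (degrees) at which p-polarized light is not reflected."""
    return math.degrees(math.atan2(n2, n1))

theta_water = brewster_angle(1.0, 1.33)  # air-to-water, ~53 degrees
theta_glass = brewster_angle(1.0, 1.5)   # air-to-glass, ~56 degrees

# Check the geometric ingredient of the argument: at the Brewster angle,
# the refraction angle from Snell's law is the complement of the incidence angle.
theta_refracted = math.degrees(math.asin(math.sin(math.radians(theta_water)) / 1.33))
# theta_water + theta_refracted = 90
```

The last two lines confirm that the reflected and refracted rays are indeed perpendicular at $\theta_B$, which is the heart of the picture above.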

To me, this is a much simpler way to think about the Brewster angle than dealing with the Fresnel equations.

Nonlinear Response and Harmonics

Because we are so often solving problems in quantum mechanics, it is sometimes easy to forget that certain effects also show up in classical physics and are not “mysterious quantum phenomena”. One of these is the problem of avoided crossings or level repulsion, which can be much more easily understood in the classical realm. I would argue that the Fano resonance also represents a case where a classical model is more helpful in grasping the main idea. Perhaps not too surprisingly, a variant of the classical harmonic oscillator problem is used to illustrate the respective concepts in both cases.

There is another cute example that illustrates why overtones of the natural frequency arise when an oscillator becomes slightly nonlinear. The solution to this problem shows why harmonic distortion often affects speakers: speakers sometimes emit frequencies that were not present in the original electrical signal. Furthermore, it shows why second harmonic generation can result when intense light is incident on a material.

First, imagine a perfectly harmonic oscillator with a potential of the form $V(x) = \frac{1}{2} kx^2$. We know that such an oscillator, if displaced from its original position, will result in oscillations at the natural frequency of the oscillator $\omega_o = \sqrt{k/m}$ with the position varying as $x(t) = A \textrm{cos}(\omega_o t + \phi)$. The potential and the position of the oscillator as a function of time are shown below:

(Left) Harmonic potential as a function of position. (Right) Variation of the position of the oscillator with time

Now imagine that in addition to the harmonic part of the potential, we also have a small additional component such that $V(x) = \frac{1}{2} kx^2 + \frac{1}{3}\epsilon x^3$, so that the potential now looks like so:

The equation of motion is now nonlinear:

$\ddot{x} = -c_1x - c_2x^2$

where $c_1 = k/m$ and $c_2 = \epsilon/m$. It is easy to see that if the amplitude of the oscillations is small enough, there will be very little difference between this case and the case of the perfectly harmonic potential.

However, if the amplitude of the oscillations gets a little larger, there will clearly be deviations from the pure sinusoid. So what does the position of the oscillator look like as a function of time? Perhaps not too surprisingly, considering the title, not only are there oscillations at $\omega_o$, but a harmonic component at $2\omega_o$ is also introduced.

While the differential equation can’t be solved exactly without resorting to numerical methods, the appearance of the harmonic component can be seen within the framework of perturbation theory. In this context, all we need to do is plug the solution to the simple harmonic oscillator, $x(t) = A\textrm{cos}(\omega_0t +\phi)$, into the nonlinear equation above. If we do this, the last term becomes:

$-c_2A^2\textrm{cos}^2(\omega_0t+\phi) = -c_2 \frac{A^2}{2}(1 + \textrm{cos}(2\omega_0t+2\phi))$,

showing that we get oscillatory components at twice the natural frequency. Although this explanation is a little crude, one can already start to see why nonlinearity often leads to higher-frequency harmonics.
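One can also check this directly by integrating the nonlinear equation of motion numerically. Below is a minimal sketch (my own, not from the original post) that sets $c_1 = 1$, so that $\omega_o = 1$, and projects the trajectory onto $\textrm{cos}(2t)$; the overtone appears only when the nonlinearity is switched on:

```python
import math

def second_harmonic_amplitude(eps, x0=0.4, dt=1e-3, n_periods=20):
    """Integrate x'' = -x - eps*x**2 (so omega_0 = 1) with velocity Verlet,
    then project x(t) onto cos(2t) to extract the 2*omega_0 component."""
    x, v = x0, 0.0
    total_time = 2 * math.pi * n_periods
    n_steps = int(round(total_time / dt))
    acc = -x - eps * x * x
    proj = 0.0
    for i in range(n_steps):
        proj += x * math.cos(2 * i * dt) * dt  # running projection onto cos(2t)
        v += 0.5 * acc * dt
        x += v * dt
        acc = -x - eps * x * x
        v += 0.5 * acc * dt
    return 2 * proj / total_time

harmonic = second_harmonic_amplitude(0.0)    # pure harmonic oscillator: no overtone
anharmonic = second_harmonic_amplitude(0.1)  # weak nonlinearity: overtone appears
```

The perturbative estimate above predicts a $\textrm{cos}(2t)$ amplitude of order $\epsilon A^2$, and the simulation agrees: the projection is essentially zero for the harmonic case and finite once the $x^2$ force term is turned on.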

With respect to optical second harmonic generation, there is also one important ingredient that should not be overlooked in this simplified model. This is the fact that frequency doubling is possible only when there is an $x^3$ component in the potential. This means that the potential needs to be inversion asymmetric. Indeed, second harmonic generation is only possible in inversion asymmetric materials (which is why ferroelectric materials are often used to produce second harmonic optical signals).

Because of its conceptual simplicity, it is often helpful to think about physical problems in terms of the classical harmonic oscillator. It would be interesting to count how many Nobel Prizes have been given out for problems that have been reduced to some variant of the harmonic oscillator!

An Interesting Research Avenue, an Update, and a Joke

An Interesting Research Avenue: A couple of months ago, Stephane Mangin of the Institut Jean Lamour gave a fascinating talk on all-optical helicity-dependent magnetic switching (what a mouthful!) at Argonne. I was reminded of the talk yesterday when a review article on the topic appeared on the arXiv. The basic phenomenon is that in certain materials, one is able to send a femtosecond laser pulse onto a magnetic material and switch the direction of magnetization using circularly polarized light. This effect is reversible (in the sense that circularly polarized light of the opposite handedness will result in a magnetization in the opposite direction) and is reproducible. During the talk, Mangin was able to show us some remarkable videos of the phenomenon, which, unfortunately, I wasn’t able to find online.

The initial study that sparked a lot of this work was this paper by Beaurepaire et al., which showed ultrafast demagnetization in nickel films in 1996, a whole 20 years ago! The more recent study that triggered most of the current work was this paper by Stanciu et al. in which it was shown that the magnetization direction could be switched with a circularly polarized 40-femtosecond laser pulse in ferrimagnetic alloy films of GdFeCo. For a while, it was thought that this effect was specific to the GdFeCo material class, but it has since been shown that all-optical helicity-dependent magnetic switching is actually a more general phenomenon and has now been observed in many materials (see this paper by Mangin and co-workers for example). It will be interesting to see how this research plays out with respect to the magnetic storage industry. The ability to read and write on the femtosecond to picosecond timescale is definitely something to watch out for.

Update: After my post on the Gibbs paradox last week, a few readers pointed out that there exists some controversy over the textbook explanation that I presented. I am grateful that they provided links to some articles discussing the subtleties involved in the paradox. Although one commenter suggested Appendix D of E. Atlee Jackson’s textbook, I was not able to get a hold of it. It looks like a promising textbook, however, so I may end up just buying it!

The links that I found helpful about the Gibbs paradox were Jaynes’ article (pdf!) and this article by R. Swendsen. In particular, I found Jaynes’ discussion of Whifnium and Whoofnium interesting for the role that ignorance and knowledge play in our ability to extract work from partitioned gases. Swendsen tries to redefine entropy classically (what he calls Boltzmann’s definition of entropy), which I have to think about a little more. At the moment, though, I don’t think I buy his argument that this resolves the Gibbs paradox completely.

A Joke:

Q: What did Mrs. Cow say to Mr. Cow?

A: Hubby, could you please mooo the lawn?

Q: What did Mr. Cow say back to Mrs. Cow?

A: But, sweetheart, then what am I going to eat?

Lunar Eclipse and the 22 Degree Halo

The beautiful thing about atmospheric optics is that (almost) everyone can look up at the sky and see stunning optical phenomena from the sun, moon or some other celestial object. In this post I’ll focus on two particularly striking phenomena where the physical essence can be captured with relatively simple explanations.

The 22 degree halo is a ring around the sun or moon, which is often observed on cold days. Here are a couple images of the 22 degree halo around the sun and moon respectively:

22 degree halo around the sun

22 degree halo around the moon

Note that the 22 degree halo is distinct from coronae, which arise for a different reason. While coronae arise due to the presence of water droplets, the 22 degree halo arises specifically due to the presence of hexagonal ice crystals in the earth’s atmosphere. So why 22 degrees? It turns out that one can answer the question using rather simple undergraduate-level physics. One of the most famous problems in undergraduate optics is that of light refraction through a prism, illustrated below:

Fig. 1: The Snell’s Law Prism Problem

If there are hexagonal ice crystals in the atmosphere, the problem is exactly the same, as one can see below. This is because a hexagon is just an equilateral triangle with its ends chopped off. So as long as the light enters and exits on two sides of the hexagon that are spaced one side apart, the analysis is the same as for the triangle.

Equilateral triangle with ends chopped off, making a hexagon

It turns out that $\theta_4$ in Fig. 1 can be solved as a function of $\theta_1$ with Snell’s law and some simple trigonometry to yield (under the assumption that $n_1 =1$):

$\theta_4 = \textrm{sin}^{-1}(n_2 \textrm{sin}(60^\circ-\textrm{sin}^{-1}(\textrm{sin}(\theta_1)/n_2)))$

It is then pretty straightforward to obtain $\delta$, the total deviation angle between the incident and outgoing beams, as a function of $\theta_1$. I have plotted this below for the index of refraction of ice for three different colors of light, red, green and blue ($n_2 =$ 1.306, 1.311 and 1.317 respectively):

The important thing to note in the plot above is that there is a minimum deviation angle for each color, and these minima occur at 21.54, 21.92 and 22.37 degrees for red, green and blue light respectively. Because no light is deviated by less than about 22 degrees, the sky just inside the halo appears darker, and the refracted light piles up near the minimum deviation angle. This is what gives rise to the 22 degree halo and also to the reddish hue on its inside rim.
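The minimum deviation angles quoted above are easy to reproduce numerically from the formula for $\theta_4$. Here is a small sketch (assuming $n_1 = 1$ and a 60 degree apex angle, as in the hexagonal ice crystal geometry):

```python
import math

def deviation(theta1_deg, n2, apex_deg=60.0):
    """Total angular deviation of a ray through a prism (n1 = 1 assumed)."""
    theta1 = math.radians(theta1_deg)
    theta2 = math.asin(math.sin(theta1) / n2)   # refraction entering the crystal
    theta3 = math.radians(apex_deg) - theta2    # geometry inside the prism
    theta4 = math.asin(n2 * math.sin(theta3))   # refraction exiting
    return theta1_deg + math.degrees(theta4) - apex_deg

def min_deviation(n2):
    # scan incidence angles from 30 to 90 degrees; smaller angles lead to
    # total internal reflection at the exit face for these indices
    return min(deviation(t / 100.0, n2) for t in range(3000, 9000))
```

Evaluating `min_deviation` at $n_2 =$ 1.306, 1.311 and 1.317 lands right on the ~21.5 to ~22.4 degree window quoted above.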

Another rather spectacular celestial occurrence is the lunar eclipse, where the earth completely obscures the moon from direct sunlight. This is the geometry for the lunar eclipse:

Geometry of the lunar eclipse

The question I wanted to address is the reddish hue of the moon, despite it lying in the earth’s shadow. It would naively seem like the moon should not be observable at all. However, there is a similar effect occurring here as with the halo. In this case, the earth’s atmosphere is the refracting medium. So just as light incident on the prism was going upward and then exited going downward, the sun’s rays similarly enter the atmosphere on a trajectory that would miss the moon, but then are bent towards the moon after interacting with the earth’s atmosphere.

But why red? Well, this has the same origin as the reddish hue of the sunset. Because light scatters from atmospheric particles as $1/\lambda^4$, blue light gets scattered away much more easily than red light. Hence, the light that is left by the time it reaches the moon is primarily red.
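As a rough numerical illustration of how strong the $1/\lambda^4$ dependence is (taking ~450 nm for blue and ~650 nm for red, typical values I’ve assumed):

```python
# Rayleigh scattering strength goes as 1/lambda^4:
# compare blue (~450 nm) against red (~650 nm) light
ratio = (650 / 450) ** 4  # blue is scattered more than four times as strongly
```

A factor of four is more than enough to strip the blue out of the light grazing the earth’s atmosphere.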

It is interesting to imagine what the earth looks like from the moon during a lunar eclipse — it likely looks completely dark apart from a spectacular red halo around the earth. Anyway, one should realize that Snell’s law was first formulated in 984 by Arab scientist Ibn Sahl, and so it was possible to come to these conclusions more than a thousand years ago. Nothing new here!

Diffraction, Babinet and Optical Transforms

In an elementary wave mechanics course, the subject of Fraunhofer diffraction is usually addressed within the context of single-slit and double-slit interference. This is usually followed up with a discussion of diffraction from a grating. In these discussions, one usually has the picture that light is “coming through the slits” like in the image below:

Now, if you take a look at Ashcroft and Mermin or a book like Elements of Modern X-ray Physics by Als-Nielsen and McMorrow, one gets a somewhat different picture. These books make it seem like X-ray diffraction occurs when the “scattered radiation from the atoms add in phase”, as in the image below (from Ashcroft and Mermin):

So in one case it seems like the light is emanating from the free space between obstacles, whereas in the other case it seems like the obstacles are scattering the radiation. I remember being quite confused about this point when first learning X-ray diffraction in a solid-state physics class, because I had already learned Fraunhofer diffraction in a wave mechanics course. The two phenomena seemed different somehow. In their mathematical treatments, it almost seemed as if for optics, light “goes through the holes” but for X-rays “light bounces off the atoms”.

Of course, these two phenomena are indeed the same, so the question arises: which picture is correct? Well, they both give correct answers, so actually they are both correct. The answer as to why they are both correct has to do with Babinet’s principle. Wikipedia summarizes Babinet’s principle, colloquially, as so:

the diffraction pattern from an opaque body is identical to that from a hole of the same size and shape except for the overall forward beam intensity.

To get an idea of what this means, let’s look at an example. In the images below, consider the white space as openings (or slits) and the black space as obstacles in the following optical masks:

What would the diffraction pattern from these masks look like? Well, below are the results (taken from here):

Apart from minute differences close to the center, the two patterns are basically the same! If one looks closely enough at the two images, there are some other small differences, most of which are explained in this paper.

Hold on a second, you say. They can’t be the exact same thing! If I take the open space in the optical mask on the left and add it to the open space on the mask to the right, I just have “free space”. And in this case there is no diffraction! You don’t get the diffraction pattern with twice the intensity. This is of course correct. I have glossed over one small discrepancy. First, one needs to realize that intensity is related to amplitude as so:

$I \propto |A|^2$

This implies that the optical mask on the left and the one on the right give the same diffraction intensity, but that their amplitudes are 180 degrees out of phase. This phase doesn’t affect the intensity, though, since the intensity only depends on the magnitude of the amplitude. Therefore the masks, while giving the same intensity, produce slightly different amplitudes. The diffracted amplitudes then cancel when the optically transparent parts of the two masks are added together. It’s strange to think that “free space” is just a bunch of diffraction patterns cancelling each other out!
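This sign flip is easy to see numerically in one dimension, where the Fraunhofer amplitude is just a Fourier component of the mask. In the sketch below (my own toy masks, not the ones pictured), a slit and its complementary obstacle give identical intensities away from the forward beam, with amplitudes that are exactly 180 degrees out of phase:

```python
import cmath, math

N = 256
slit = [1.0 if 96 <= i < 160 else 0.0 for i in range(N)]  # a single open slit
obstacle = [1.0 - a for a in slit]                        # complementary opaque strip

def far_field_amplitude(mask, k):
    """Fraunhofer amplitude at diffraction order k: a Fourier component of the mask."""
    return sum(mask[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))

# For any k != 0, FT(1 - slit) = -FT(slit): same intensity, opposite phase.
# Only the k = 0 (forward beam) component differs, exactly the caveat
# in the statement of Babinet's principle.
```

Adding the two amplitudes at any nonzero order gives zero, which is the numerical version of “free space” being a sum of cancelling diffraction patterns.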

With this all in mind, the main message is pretty clear though: optical diffraction through slits and the Ashcroft and Mermin picture of “bouncing off atoms” are complementary pictures of basically the same diffraction phenomenon. The diffraction pattern obtained will be the same in both cases because of Babinet’s principle.

This idea has been exploited to generate the excellent Atlas of Optical Transforms, where subtleties in crystal structures can be explored at the optical scale. Below is an example of such an exercise (taken from here). The two images in the first row are the optical masks, while the bottom row gives the respective diffraction patterns. In the first row, the white dots were obtained by poking holes in the optical masks.

Basically, what they are doing here is using Babinet’s principle to image the diffraction from a crystal with stacking faults along the vertical direction. The positions of the atoms are replaced with holes. One can clearly see that the effect of these stacking faults is to smear out and broaden some of the peaks in the diffraction pattern along the vertical direction. This actually turns out to give one a good intuition for how stacking faults in a crystal can distort a diffraction pattern.

In summary, the Ashcroft and Mermin picture and the Fraunhofer diffraction picture are really two ways to describe the same phenomenon. The link between the two explanations is Babinet’s principle.

The Relationship Between Causality and Kramers-Kronig Relations

The Kramers-Kronig relations tell us that the real and imaginary parts of causal response functions are related. The relations are of immense importance to optical spectroscopists, who use them to obtain, for example, the optical conductivity from the reflectivity. It is often said in textbooks that the Kramers-Kronig relations (KKRs) follow from causality. The proof of this statement usually uses contour integration and the role of causality is then a little obscured. Here, I hope to use intuitive arguments to show the meaning behind the KKRs.

If one imagines applying a sudden force to a simple harmonic oscillator (SHO) and then watches its response, one would expect that the response will look something like this:

We expect the SHO to oscillate for a little while and eventually stop due to friction of some kind. Let’s call the function in the plot above $\chi(t)$. Because $\chi(t)$ is zero before we “kick” the system, we can play a little trick and write $\chi(t) = \chi_s(t)+\chi_a(t)$ where the symmetrized and anti-symmetrized parts are plotted below:

Since the symmetrized and anti-symmetrized parts cancel out perfectly for $t<0$, we recover our original response function. Just to convince you (as if you needed convincing!) that this works, I have explicitly plotted this:

Now let’s see what happens when we take this over to the frequency domain, where the KKRs apply, by doing a Fourier transform. We can write the following:

$\tilde{\chi}(\omega) = \int_{-\infty}^\infty e^{i \omega t} \chi(t) \mathrm{d}t$ $= \int_{-\infty}^\infty (\mathrm{cos}(\omega t) + i \mathrm{sin}(\omega t)) (\chi_s (t)+\chi_a(t))\mathrm{d}t$

where in the last step I’ve used Euler’s identity for the exponential and I’ve decomposed $\chi(t)$ into its symmetrized and anti-symmetrized parts as before. Now, there is something immediately apparent in the last integral. Because the domain of integration is from $-\infty$ to $\infty$, the area under the curve of any odd (a.k.a anti-symmetric) function will necessarily be zero. Lastly, noting that anti-symmetric $\times$ symmetric = anti-symmetric and symmetric (anti-symmetric) $\times$ symmetric (anti-symmetric) = symmetric, we can write for the equation above:

$\tilde{\chi}(\omega) = \int_{-\infty}^\infty \mathrm{cos}(\omega t) \chi_s(t) \mathrm{d}t$ + $i \int_{-\infty}^\infty \mathrm{sin}(\omega t) \chi_a(t) \mathrm{d}t$ = $\tilde{\chi}_s(\omega) + i \tilde{\chi}_a(\omega)$

Before I continue, some remarks are in order:

1. Even though we now have two functions in the frequency domain (i.e. $\tilde{\chi}_s(\omega)$ and  $\tilde{\chi}_a(\omega)$), they actually derive from one function in the time-domain, $\chi(t)$. We just symmetrized and anti-symmetrized the function artificially.
2. We actually know the relationship between the symmetric and anti-symmetric functions in the time-domain because of causality.
3. The symmetrized part of $\chi(t)$ corresponds to the real part of $\tilde{\chi}(\omega)$. The anti-symmetrized part of $\chi(t)$ corresponds to the imaginary part of $\tilde{\chi}(\omega)$.
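These remarks, in particular the third, can be checked numerically with a toy causal response. Below I use $\chi(t) = e^{-t}$ for $t>0$ (an exponential relaxation of my own choosing, convenient because its Fourier transform, $1/(1-i\omega)$, is known analytically) and verify that the symmetrized part produces the real part of $\tilde{\chi}(\omega)$ while the anti-symmetrized part produces the imaginary part:

```python
import math

def chi(t):
    """Toy causal response: exponential relaxation, zero before the kick at t = 0."""
    return math.exp(-t) if t >= 0 else 0.0

def chi_s(t):  # symmetrized part
    return 0.5 * (chi(t) + chi(-t))

def chi_a(t):  # anti-symmetrized part
    return 0.5 * (chi(t) - chi(-t))

def integrate(f, a=-30.0, b=30.0, n=120000):
    """Midpoint-rule integration, accurate enough for this decaying integrand."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

w = 2.0
re_part = integrate(lambda t: math.cos(w * t) * chi_s(t))
im_part = integrate(lambda t: math.sin(w * t) * chi_a(t))
# Analytic transform of chi is 1/(1 - i*w): Re = 1/(1+w^2), Im = w/(1+w^2)
```

At $\omega = 2$ the two integrals come out to $1/5$ and $2/5$, matching the real and imaginary parts of $1/(1-i\omega)$.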

With this correspondence, the question then naturally arises:

How do we express the relationship between the real and imaginary parts of $\tilde{\chi}(\omega)$, knowing the relationship between the symmetrized and anti-symmetrized functions in the time-domain?

This actually turns out to not be too difficult and involves just a little more math. First, let us express the relationship between the symmetrized and anti-symmetrized parts of $\chi(t)$ mathematically.

$\chi_s(t) = \mathrm{sgn}(t) \times \chi_a(t)$

where $\mathrm{sgn}(t)$ is the sign function, equal to $+1$ for $t>0$ and $-1$ for $t<0$, as shown below.

Now let’s take this expression over to the frequency domain. Here, we must use the convolution theorem. This theorem says that if we have two functions multiplied by each other, e.g. $h(t) = f(t)g(t)$, the Fourier transform of this product is expressed as a convolution in the frequency domain as so:

$\tilde{h}(\omega)$=$\mathcal{F}(f(t)g(t)) = \int \tilde{f}(\omega-\omega')\tilde{g}(\omega') \mathrm{d}\omega'$

where $\mathcal{F}$ means Fourier transform. Therefore, all we have left to do is figure out the Fourier transform of $\mathrm{sgn}(t)$. The result is given here (in terms of frequency, not angular frequency!), but it is a fun exercise to work it out on your own. The answer is:

$\mathcal{F}(\mathrm{sgn}(t)) = \frac{2}{i\omega}$

With this answer, and using the convolution theorem, we can write:

$\tilde{\chi_s}(\omega) = \mathcal{P}\int_{-\infty}^{\infty} \frac{2}{i(\omega-\omega')} \tilde{\chi}_a(\omega')\mathrm{d}\omega'$, where $\mathcal{P}$ denotes the principal value of the integral.

Hence, up to some factors of $2\pi$ and $i$, we can now see better what is behind the KKRs without using contour integration. We can also see why it is always said that the KKRs are a result of causality. Thinking about the KKRs this way has definitely aided in my thinking about response functions more generally.
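For the skeptical reader, the relation can also be verified numerically on the toy response $\tilde{\chi}(\omega) = 1/(1-i\omega)$, whose real and imaginary parts are $1/(1+\omega^2)$ and $\omega/(1+\omega^2)$. With the standard normalization, the Kramers-Kronig relation reads $\textrm{Re}\,\tilde{\chi}(\omega) = \frac{1}{\pi}\mathcal{P}\int \textrm{Im}\,\tilde{\chi}(\omega')/(\omega'-\omega)\,\mathrm{d}\omega'$, and a principal-value integral on a grid symmetric about the singularity reproduces the real part (a sketch of mine, not from the notes linked below):

```python
import math

def im_chi(w):
    """Imaginary part of the toy response chi(w) = 1/(1 - i*w)."""
    return w / (1 + w * w)

def kk_real_part(w, h=0.01, cutoff=500.0):
    """Re chi(w) from Im chi via the principal-value Kramers-Kronig integral.
    The midpoint grid is symmetric about w, so the divergent contributions
    on either side of the singularity cancel pairwise."""
    n = int(cutoff / h)
    total = 0.0
    for k in range(-n, n):
        wp = w + (k + 0.5) * h
        total += im_chi(wp) / (wp - w) * h
    return total / math.pi

# Expected: Re chi(2) = 1/(1+4) = 0.2 and Re chi(0) = 1
```

The small residual error comes from the finite frequency cutoff, which is also a very real headache for spectroscopists applying the KKRs to measured data.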

I hope to write a post in the future talking a little more about the connection between the imaginary part of the response function and dissipation. Let me know if you think this will be helpful.

A lot of this post was inspired by this set of notes, which I found to be very valuable.

Plasma Frequency, Screening Response Time and the Independent Electron Approximation

The plasma frequency in the study of solids arises in many different contexts. One of the most illuminating ways to look at the plasma frequency is as a measure of the screening response time in solids. I’ve discussed this previously in reference to the screening of longitudinal phonons in semiconductors, but I think it is worth repeating and expanding upon.

What I mean by “screening response time” is that in any solid, when one applies a perturbing electric field, the electrons take a certain amount of time to screen this field. This time can usually be estimated by using the relation:

$t_p = \frac{2\pi}{\omega_p}$

Now, suppose I introduce a time-varying electric field perturbation into the solid that has angular frequency $\omega$. The question then arises: will the electrons in the solid be able to respond fast enough to screen this field? Well, for frequencies $\omega < \omega_p$, the corresponding perturbation variation time is $t = 2\pi/\omega > t_p$. This means that the perturbation variation time is longer than the time it takes for the electrons in the solid to screen the perturbation, so the electrons have no problem screening this field. However, if $\omega > \omega_p$ and $t < t_p$, the electronic plasma in the solid will not have enough time to screen out the time-varying electric field.

This screening time interpretation of the plasma frequency is what leads to what is called the plasma edge in the reflectivity spectra in solids. Seen below is the reflectivity spectrum for aluminum (taken from Mark Fox’s book Optical Properties of Solids):

One can see that below the plasma edge at ~15 eV, the reflectivity is almost perfect, resulting in the shiny and reflective quality of aluminum metal in the visible range. However, above $\hbar\omega \approx 15$ eV, the reflectivity suddenly drops and light is able to pass through the solid virtually unimpeded, as the electrons can no longer respond to the quickly varying electric field.
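One can check that the free-electron picture gets the aluminum numbers about right. Here is a back-of-the-envelope sketch (the carrier density below assumes 3 valence electrons per Al atom, the standard free-electron-model value, which is my input rather than something from Fox’s book):

```python
import math

e = 1.602176634e-19       # elementary charge (C)
eps0 = 8.8541878128e-12   # vacuum permittivity (F/m)
m_e = 9.1093837015e-31    # electron mass (kg)
hbar = 1.054571817e-34    # reduced Planck constant (J*s)

# Free-electron density of aluminum, assuming 3 valence electrons per atom
n = 1.81e29               # m^-3

omega_p = math.sqrt(n * e**2 / (eps0 * m_e))  # plasma frequency (rad/s)
E_p_eV = hbar * omega_p / e                   # plasma energy (eV), ~15.8
t_p = 2 * math.pi / omega_p                   # screening response time (s)
```

This lands close to the observed ~15 eV plasma edge and gives a screening response time of a fraction of a femtosecond, which is why the electrons have no trouble screening anything that varies on “ordinary” timescales.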

Now that one can see the effect of the screening time on an external electric field such as light, the question naturally arises as to how the electrons screen the electric field generated by other electrons in the solid. It turns out that much of what I have discussed above also applies to the electrons in the solid self-consistently: the electrons near the Fermi energy also have their electric fields, by and large, screened out in a similar manner. The distance over which the electric field falls by $1/e$ is usually called the Thomas-Fermi screening length, which for most metals is about half a Bohr radius. The Thomas-Fermi approximation works well because one effectively assumes that $\omega_p \rightarrow \infty$, which is not a bad approximation for the low-energy effects in solids, considering that the plasma energy is often tens of eV.

Ultimately, the fact that the low-energy electrons near the Fermi energy are well screened by other electrons self-consistently is what permits one to use the independent electron approximation, the foundation upon which band theory is built. Therefore, in the many instances where the independent electron approximation is used to describe physical phenomena in solids, one should keep in mind the hidden role the plasmon plays in allowing these ideas to work.

Naively, from my discussion above, it would seem like the independent electron approximation should break down in a band insulator. However, this is not necessarily so. There are two things to note in this regard: (i) there exists an “interband plasmon” at high energies that plays essentially the same role that a free-carrier plasmon does in a metal for energies $E_g \ll E < \hbar\omega_p$ and (ii) whether the kinetic or Coulomb energy dominates determines the low-energy phenomenology. The image below is taken from this paper on lithium fluoride, which is a band insulator with a band gap of about 14 eV and exhibits a plasmon at ~22 eV:

The interband plasmon ultimately contributes to the background dielectric function, $\epsilon$, which reduces the Coulomb energy between the electrons in the form:

$V_{eff} = \frac{e^2}{4\pi\epsilon_0 \epsilon r}$

For example, this is the Coulomb interaction felt between an electron and hole when an exciton is formed (with opposite sign), as can be seen for LiF in the above image.

The kinetic energy can be approximated by the band width, $W$, which effectively measures the amount of wavefunction overlap between neighboring orbitals. If $W \gg V_{eff}$, the independent electron approximation remains a good one, and one gets a band insulator that is adequately described by it. In the opposite limit, however, one often gets what is called a Mott insulator. Because d- and f-electrons tend to be closely bound to the atomic site, there is usually less wavefunction overlap between these electrons, leading to a small band width. This is why Mott insulators tend to occur in materials that have d- and f-electrons near the Fermi energy.

Most studies on strongly correlated electron systems tend to concentrate on low-energy phenomenology.  While this is no doubt important, in light of this post, I think it may be worth looking up from time to time as well.