Tag Archives: Quantum Mechanics

Jahn-Teller Distortion and Symmetry Breaking

The Jahn-Teller effect occurs in molecular systems, as well as solid state systems, where a molecular complex distorts, resulting in a lower symmetry. As a consequence, the energy of certain occupied molecular states is reduced. Let me first describe the phenomenon before giving you a little cartoon of the effect.

First, consider, just as an example, a manganese atom with valence 3d^4, surrounded by an octahedral cage of oxygen atoms like so (image taken from this thesis):


The electrons are arranged such that the lower triplet of orbital states each contain a single “up-spin”, while the higher doublet of orbitals only contains a single “up-spin”, as shown on the image to the left. This scenario is ripe for a Jahn-Teller distortion, because the electronic energy can be lowered by splitting both the doublet and the triplet as shown on the image on the right.

There is a very simple, but quite elegant problem one can solve to describe this phenomenon at a cartoon level. This is the problem of a two-dimensional square well with adjustable walls. By solving the Schrodinger equation, it is known that the energy of the two-dimensional infinite well has solutions of the form:

E_{i,j} = \frac{h^2}{8m}(i^2/a^2 + j^2/b^2)                where i,j are positive integers.

Here, a and b denote the lengths of the sides of the 2D well. Since only the quantity in the brackets determines the energy levels, let me hold the area ab of the well fixed, define the aspect ratio \gamma = a/b, and write the energy dependence in the following way:

E \sim i^2/\gamma + \gamma j^2

Note that \gamma is effectively an anisotropy parameter, giving a measure of the “squareness of the well”. Now, let’s consider filling up the levels with spinless electrons that obey the Pauli principle. These electrons will fill up in a “one-per-level” fashion in accordance with the fermionic statistics. We can therefore write the total energy of the N-fermion problem as so:

E_{tot} \sim \alpha^2/ \gamma + \gamma \beta^2

where \alpha^2 and \beta^2 are the sums of i^2 and j^2, respectively, over the levels occupied by the N electrons.

Now, all of this has been pretty simple so far, and all that’s really been done is to re-write the 2D well problem in a different way. However, let’s just systematically look at what happens when we fill up the levels. At first, we fill up the E_{1,1} level, where \alpha^2 = \beta^2 = 1^2. In this case, if we take the derivative of E_{1,1} with respect to \gamma and set it to zero, we get that \gamma_{min} = 1 and the well is a square.

For two electrons, however, the well is no longer a square! The next electron will fill up the E_{2,1} level and the total energy will therefore be:

E_{tot} \sim 1/\gamma (1+4) + \gamma (1+1),

which gives a \gamma_{min} = \sqrt{5/2}!

Why did this breaking of square symmetry occur? In fact, this is very closely related to the Jahn-Teller effect. Since the level is two-fold degenerate (i.e. E_{2,1} =  E_{1,2}), it is favorable for the 2D well to distort to lower its electronic energy.

Notice that when we add the third electron, we get that:

E_{tot} \sim 1/\gamma (1+4+1) + \gamma (1+1+4)

and \gamma_{min} = 1 again, and we return to the system with square symmetry! This is also quite similar to the Jahn-Teller problem, where, when all the states of the degenerate levels are filled up, there is no longer an energy to be gained from the symmetry-broken geometry.
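If you want to check these numbers yourself, here is a minimal Python sketch. It assumes the filling order used above, (1,1), then (2,1), then (1,2), and uses the fact that E_{tot} \sim \alpha^2/\gamma + \gamma\beta^2 is minimized at \gamma = \alpha/\beta. The function name gamma_min is just something I made up for this illustration.

```python
import math

def gamma_min(levels):
    """Aspect ratio gamma = a/b that minimizes E_tot ~ alpha^2/gamma + gamma*beta^2.

    `levels` is a list of occupied (i, j) quantum numbers, one spinless fermion each.
    """
    alpha_sq = sum(i * i for i, _ in levels)  # sum of i^2 over occupied levels
    beta_sq = sum(j * j for _, j in levels)   # sum of j^2 over occupied levels
    # dE/dgamma = -alpha^2/gamma^2 + beta^2 = 0  =>  gamma = alpha/beta
    return math.sqrt(alpha_sq / beta_sq)

# Filling order assumed from the discussion above.
filling = [(1, 1), (2, 1), (1, 2)]
for n in range(1, len(filling) + 1):
    print(n, "fermion(s): gamma_min =", round(gamma_min(filling[:n]), 4))
```

The square (\gamma_{min} = 1) shows up for one and three fermions, and the rectangle only when the degenerate pair is half filled.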

This analogy is made more complete when looking at the following level scheme for different d-electron valence configurations, shown below (image taken from here).


The black configurations are Jahn-Teller active (i.e. prone to distortions of the oxygen octahedra), while the red are not.

In condensed matter physics, we usually think about spontaneous symmetry breaking in the context of the thermodynamic limit. What saves us here, though, is that the well will actually oscillate between the two rectangular configurations (i.e. horizontal vs. vertical), preserving the original symmetry! This is analogous to the case of the ammonia (NH_3) molecule I discussed in this post.

Wannier-Stark Ladder, Wavefunction Localization and Bloch Oscillations

Most people who study solid state physics are told at some point that in a totally pure sample where there is no scattering, one should observe an AC response to a DC electric field, with oscillations at the Bloch frequency (\omega_B). These are the so-called Bloch oscillations, which were predicted by C. Zener in this paper.

However, the actual observation of Bloch oscillations is not as simple as the textbooks would make it seem. There is an excellent Physics Today article by E. Mendez and G. Bastard that outlines some of the challenges associated with observing Bloch oscillations (which was written while this paper was being published!). Since the textbook treatments often use semi-classical equations of motion to demonstrate the existence of Bloch oscillations in a periodic potential, they implicitly assume transport of an electron wave-packet. Generating this wave-packet in a solid is non-trivial.

In fact, if one undertakes a full quantum mechanical treatment of electrons in a periodic potential under the influence of an electric field, one arrives at the Wannier-Stark ladder, which shows that an electric field can localize electrons! It is this ladder and the corresponding localization which was key to observing Bloch oscillations in semiconductor superlattices.

Let me use the two-well potential to give you a picture of how this localization might occur. Imagine symmetric potential wells, where the lowest energy eigenstates look like so (where S and A label the symmetric and anti-symmetric states):

Now, imagine that I start to make the wells a little asymmetric. What happens in this case? Well, it turns out that the electrons start to localize in the following way (for the formerly symmetric and anti-symmetric states):
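A two-level toy model makes this localization quantitative. The sketch below is my own illustration, not a calculation from any of the linked papers: it assumes one state per well, a tunneling amplitude between them, and an energy offset (the asymmetry) that lowers the left well. For that 2x2 Hamiltonian, the ground state's weight in the lower well is (1/2)(1 + asymmetry/gap), where gap is the splitting between the two eigenstates.

```python
import math

def left_well_weight(asymmetry, tunneling):
    """Probability of finding the ground state in the lower (left) well.

    Two-level model in the basis (left well, right well):
    H = [[-asymmetry/2, -tunneling], [-tunneling, +asymmetry/2]]
    """
    gap = math.hypot(asymmetry, 2.0 * tunneling)  # splitting between the two eigenstates
    if gap == 0.0:
        return 0.5  # fully degenerate case: no preferred well
    return 0.5 * (1.0 + asymmetry / gap)

# Illustrative numbers only: tunneling fixed at 1, asymmetry increasing.
for asymmetry in (0.0, 0.5, 2.0, 10.0):
    print(f"asymmetry = {asymmetry:5.1f}: weight in lower well = {left_well_weight(asymmetry, 1.0):.3f}")
```

With no asymmetry the weight is exactly one half (the S state), and it approaches one once the offset is large compared to the tunneling.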

G. Wannier was able to solve the Schrodinger equation with an applied electric field in a periodic potential in full and showed that the eigenstates of the problem form a Stark ladder. This means that the eigenstates are of identical functional form from quantum well to quantum well (unlike in the double-well shown above) and the energies of the eigenstates are spaced apart by \Delta E=\hbar \omega_B! The potential is shown schematically below. It is also shown that as the potential wells slant more and more (i.e. with larger electric fields), the wavefunctions become more localized (the image is taken from here (pdf!)):


A nice numerical solution from the same document shows the wavefunctions for a periodic potential well profile with a strong electric field, exhibiting a strong wavefunction localization. Notice that the wavefunctions are of identical form from well to well.
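To see both features (identical form from well to well, and stronger localization at larger field) in something you can run, here is a sketch that swaps the real problem for a much simpler one: a single-band tight-binding chain with nearest-neighbour hopping t and an energy step F = eEd per site. This model is my assumption, not the one in the linked document. In it, the state centred on site m has amplitude J_{n-m}(2t/F) on site n (a Bessel function) and energy mF. The code checks this by plugging the amplitudes back into the tight-binding equations.

```python
import math

def bessel_j(order, x, steps=400):
    """Integer-order Bessel function J_order(x) via Bessel's integral and the trapezoid rule."""
    def integrand(tau):
        return math.cos(order * tau - x * math.sin(tau))
    h = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for s in range(1, steps):
        total += integrand(s * h)
    return total * h / math.pi

def ladder_state(m, hopping, field_step, sites):
    """Amplitudes psi_n = J_{n-m}(2*hopping/field_step) of the state centred on site m."""
    x = 2.0 * hopping / field_step
    return {n: bessel_j(n - m, x) for n in sites}

def residual(m, hopping, field_step, n_max=30):
    """Largest violation of -t*(psi_{n+1} + psi_{n-1}) + n*F*psi_n = m*F*psi_n."""
    psi = ladder_state(m, hopping, field_step, range(m - n_max - 1, m + n_max + 2))
    worst = 0.0
    for n in range(m - n_max, m + n_max + 1):
        lhs = -hopping * (psi[n + 1] + psi[n - 1]) + n * field_step * psi[n]
        worst = max(worst, abs(lhs - m * field_step * psi[n]))
    return worst

def rms_width(hopping, field_step, n_max=30):
    """Root-mean-square spread (in lattice sites) of a single ladder state."""
    psi = ladder_state(0, hopping, field_step, range(-n_max, n_max + 1))
    return math.sqrt(sum(n * n * a * a for n, a in psi.items()))

# Hopping fixed at 1 (arbitrary units); larger field_step means a stronger field.
for field_step in (0.5, 1.0, 2.0):
    print(f"F = {field_step}: residual = {residual(0, 1.0, field_step):.1e}, "
          f"rms width = {rms_width(1.0, field_step):.3f} sites")
```

Shifting m by one only relabels the sites and raises the energy by F, which is the ladder. The width shrinks as the field grows, which is the localization.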


What can be seen in this solution is that the stationary states are split by \hbar \omega_B, but much like the quantum harmonic oscillator (where the levels are split by \hbar \omega), nothing is actually oscillating until one has a wavepacket (or a linear superposition of eigenstates). Therefore, the Bloch oscillations cannot be observed in the ground state (which includes the applied electric field) in a semiconducting superlattice since it is an insulator! One must first generate a wavepacket in the solid.

In the landmark paper that finally announced the existence of Bloch oscillations, Waschke et al. generated a wavepacket in a GaAs-GaAlAs superlattice using a laser pulse. The pulse was incident on a sample with an applied electric field along the superlattice direction, and they were able to observe radiation emitted from the sample due to the Bloch oscillations. I should mention that superlattices must be used to observe the Wannier-Stark ladder and Bloch oscillations because \omega_B, which scales with the superlattice period (set by the width of the quantum wells), needs to be large enough that the electrons complete an oscillation before they scatter from impurities and phonons. Here is the famous plot from the aforementioned paper showing that the frequency of the emitted radiation from the Bloch oscillations can be tuned using an electric field:
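To get a feel for the numbers, here is a rough estimate of the Bloch frequency using the standard expression \omega_B = eEd/\hbar, where d is the period of the potential. The field strength and the two periods below are illustrative values I picked, not parameters from the paper.

```python
E_CHARGE = 1.602176634e-19   # C (exact SI value)
PLANCK_H = 6.62607015e-34    # J s (exact SI value)

def bloch_frequency_hz(field_v_per_m, period_m):
    """Bloch frequency f_B = omega_B / (2*pi) = e * E * d / h."""
    return E_CHARGE * field_v_per_m * period_m / PLANCK_H

field = 1.0e6  # V/m (10 kV/cm), illustrative value
for label, period in [("atomic-scale period, d = 0.5 nm", 0.5e-9),
                      ("superlattice period, d = 10 nm", 10e-9)]:
    print(f"{label}: f_B = {bloch_frequency_hz(field, period) / 1e12:.2f} THz")
```

At the same field, the longer period gives a frequency twenty times higher, and the frequency is linear in the field, which is why the emitted radiation is tunable.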


This is a pretty remarkable experiment, one of those which took 60 years from its first proposal to finally be observed.

Consistency in the Hierarchy

When writing on this blog, I try to share nuggets here and there of phenomena, experiments, sociological observations and other people’s opinions I find illuminating. Unfortunately, this format can leave readers wanting when it comes to some sort of coherent message. Precisely because of this, I would like to revisit a few blog posts I’ve written in the past and highlight the common vein running through them.

Condensed matter physicists of the last couple generations have grown up ingrained with the idea that “More is Different”, a concept first coherently put forth by P. W. Anderson and carried further by others. Most discussions of these ideas tend to concentrate on the notion that there is a hierarchy of disciplines where each discipline is not logically dependent on the one beneath it. For instance, in solid state physics, we do not need to start out at the level of quarks and build up from there to obtain many properties of matter. More profoundly, one can observe phenomena which distinctly arise in the context of condensed matter physics, such as superconductivity, the quantum Hall effect and ferromagnetism that one wouldn’t necessarily predict by just studying particle physics.

While I have no objection to these claims (and actually agree with them quite strongly), it seems to me that one rather (almost trivial) fact is infrequently mentioned when these concepts are discussed. That is the role of consistency.

While it is true that one does not necessarily require the lower level theory to describe the theories at the higher level, these theories do need to be consistent with each other. This is why, after the publication of BCS theory, there was a slew of theoretical papers that tried to come to terms with various aspects of the theory (such as the approximation of particle number non-conservation and features associated with gauge invariance (pdf!)).

This requirement of consistency is what makes concepts like the Bohr-van Leeuwen theorem and Gibbs paradox so important. They bridge two levels of the “More is Different” hierarchy, exposing inconsistencies between the higher level theory (classical mechanics) and the lower level (the micro realm).

In the case of the Bohr-van Leeuwen theorem, it shows that classical mechanics, when applied to the microscopic scale, is not consistent with the observation of ferromagnetism. In the Gibbs paradox case, classical mechanics, when not taking into consideration particle indistinguishability (a quantum mechanical concept), is inconsistent with the idea that the entropy must remain the same when dividing a gas tank into two equal partitions.

Today, we have the issue that ideas from the micro realm (quantum mechanics) appear to be inconsistent with our ideas on the macroscopic scale. This is why matter interference experiments are still carried out in the present time. It is imperative to know why it is possible for a C60 molecule (or a 10,000 amu molecule) to be described with a single wavefunction in a Schrodinger-like scheme, whereas this seems implausible for, say, a cat. There does again appear to be some inconsistency here, though there are some frameworks, like decoherence, that attempt to get around this (none of which has reached consensus). I also can’t help but mention that non-locality, à la Bell, also seems totally at odds with one’s intuition on the macro-scale.

What I want to stress is that the inconsistency theorems (or paradoxes) contained seeds of some of the most important theoretical advances in physics. This is itself not a radical concept, but it often gets neglected when a generation grows up with a deep-rooted “More is Different” scientific outlook. We sometimes forget to look for concepts that bridge disparate levels of the hierarchy and subsequently look for inconsistencies between them.

Kapitza-Dirac Effect

We are all familiar with the fact that light can diffract from two (or multiple) slits in a Young-type experiment. After the advent of quantum mechanics and de Broglie’s wave description of matter, it was shown by Davisson and Germer that electrons could be diffracted by a crystal. In 1933, P. Kapitza and P. Dirac proposed that it should in principle be possible for electrons to be diffracted by standing waves of light, in effect using light as a diffraction grating.

In this scheme, the electrons would interact with light through the ponderomotive potential. If you’re not familiar with the ponderomotive potential, you wouldn’t be the only one — this is something I was totally ignorant of until reading about the Kapitza-Dirac effect. In 1995, Anton Zeilinger and co-workers were able to demonstrate the Kapitza-Dirac effect with atoms, obtaining a beautiful diffraction pattern in the process which you can take a look at in this paper. It probably took so long for this effect to be observed because it required the use of high-powered lasers.

Later, in 2001, this experiment was pushed a little further and an electron-beam was used to demonstrate the effect (as opposed to atoms), as Dirac and Kapitza originally proposed. Indeed, again a diffraction pattern was observed. The article is linked here and I reproduce the main result below:


(Top) The interference pattern observed in the presence of a standing light wave. (Bottom) The profile of the electron beam in the absence of the light wave.
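To see why this is a delicate experiment, it helps to estimate the diffraction angle. A standing light wave of wavelength \lambda acts as a grating of period \lambda/2, so the first order sits at sin(\theta) = \lambda_{dB}/(\lambda/2), where \lambda_{dB} is the electron's de Broglie wavelength. The sketch below uses a 380 eV electron beam and 532 nm light. Treat those numbers as assumptions chosen for illustration rather than as values quoted from the article.

```python
import math

PLANCK_H = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31   # kg
E_CHARGE = 1.602176634e-19         # C

def de_broglie_wavelength_m(kinetic_energy_ev):
    """Non-relativistic de Broglie wavelength of an electron."""
    momentum = math.sqrt(2.0 * ELECTRON_MASS * kinetic_energy_ev * E_CHARGE)
    return PLANCK_H / momentum

def first_order_angle_rad(kinetic_energy_ev, light_wavelength_m):
    """First diffraction order from a light grating of period lambda_light / 2."""
    grating_period = light_wavelength_m / 2.0
    return math.asin(de_broglie_wavelength_m(kinetic_energy_ev) / grating_period)

angle = first_order_angle_rad(380.0, 532e-9)  # assumed beam energy and laser wavelength
print(f"de Broglie wavelength: {de_broglie_wavelength_m(380.0) * 1e12:.1f} pm")
print(f"first-order angle: {angle * 1e3:.3f} mrad")
```

The angle comes out at a fraction of a milliradian, so the diffraction orders are only resolved with a well-collimated beam and a long flight path.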

Even though this experiment is conceptually quite simple, these basic quantum phenomena still manage to elicit awe (at least from me!).

Bohr-van Leeuwen Theorem and Micro/Macro Disconnect

A couple weeks ago, I wrote a post about the Gibbs paradox and how it represented a case where neglecting particle indistinguishability led to some bizarre consequences on the macroscopic scale. In particular, it suggested that the entropy of a monatomic gas should change when the gas is partitioned into two volumes. This paradox therefore contained within it the seeds of quantum mechanics (through particle indistinguishability), unbeknownst to Gibbs and his contemporaries.
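As a quick reminder of how the paradox shows up in formulas, here is a small sketch comparing the ideal-gas entropy with and without the 1/N! factor. I work in units where k_B = 1 and the thermal wavelength is 1, and I use Stirling's approximation for the indistinguishable case (the Sackur-Tetrode form). The function names are made up for this example.

```python
import math

def entropy_distinguishable(n, volume):
    """Ideal-gas entropy without the 1/N! factor (k_B = 1, thermal wavelength = 1)."""
    return n * (math.log(volume) + 1.5)

def entropy_indistinguishable(n, volume):
    """Sackur-Tetrode form: includes 1/N! via Stirling's approximation."""
    return n * (math.log(volume / n) + 2.5)

def extensivity_mismatch(entropy, n, volume):
    """Entropy of the whole gas minus the summed entropy of its two halves."""
    return entropy(n, volume) - 2.0 * entropy(n / 2.0, volume / 2.0)

n, volume = 1000.0, 50.0  # arbitrary illustrative values
print("without 1/N!:", extensivity_mismatch(entropy_distinguishable, n, volume))
print("with 1/N!   :", extensivity_mismatch(entropy_indistinguishable, n, volume))
print("N ln 2      :", n * math.log(2.0))
```

Without the 1/N! factor, the whole and the sum of its halves disagree by N ln 2. With it, the mismatch vanishes and the entropy is extensive.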

Another historic case where a logical disconnect between the micro- and macroscale arose was in the context of the Bohr-van Leeuwen theorem. Colloquially, the theorem says that magnetism of any form (ferro-, dia-, paramagnetism, etc.) cannot exist within the realm of classical mechanics in equilibrium. It is quite easy to prove actually, so I’ll quickly sketch the main ideas. Firstly, the Hamiltonian with any electromagnetic field can be written in the form:

H = \sum_i \left[\frac{1}{2m_i}(\textbf{p}_i - e\textbf{A}_i)^2 + U_i(\textbf{r}_i)\right]

Now, because the classical partition function is of the form:

Z \propto \int_{-\infty}^\infty d^3\textbf{r}_1...d^3\textbf{r}_N\int_{-\infty}^\infty d^3\textbf{p}_1...d^3\textbf{p}_N e^{-\beta\sum_i \left[\frac{1}{2m_i}(\textbf{p}_i - e\textbf{A}_i)^2 + U_i(\textbf{r}_i)\right]}

we can just make the substitution:

\textbf{p}'_i = \textbf{p}_i - e\textbf{A}_i

without having to change the limits of the integral. Therefore, with this substitution, the partition function ends up looking like one without the presence of the vector potential (i.e. the partition function is independent of the vector potential and therefore cannot exhibit any magnetism!).
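The key step is that shifting the momentum by e\textbf{A} does nothing to an integral that runs over all momenta. A one-dimensional numerical check of just that step might look like this, where shift stands in for eA at some fixed position and the choices beta = mass = 1 are arbitrary.

```python
import math

def momentum_integral(shift, beta=1.0, mass=1.0, p_max=40.0, steps=8000):
    """Trapezoid estimate of the integral of exp(-beta * (p - shift)^2 / (2*mass)) over p.

    `shift` plays the role of e*A at a fixed position; p_max stands in for infinity.
    """
    dp = 2.0 * p_max / steps
    total = 0.0
    for s in range(steps + 1):
        p = -p_max + s * dp
        weight = 0.5 if s in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-beta * (p - shift) ** 2 / (2.0 * mass))
    return total * dp

for shift in (0.0, 1.0, 5.0):
    print(f"e*A = {shift}: momentum integral = {momentum_integral(shift):.10f}")
print(f"sqrt(2*pi*m/beta) = {math.sqrt(2.0 * math.pi):.10f}")
```

Every value of the shift gives the same Gaussian integral, so the free energy built from Z has no dependence on the vector potential to differentiate.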

This theorem suggests, like in the Gibbs paradox case, that there is a logical inconsistency when one tries to apply macroscale physics (classical mechanics) to the microscale and attempts to build up from there (by applying statistical mechanics). The impressive thing about this kind of reasoning is that it requires little experimental input but nonetheless exhibits far-reaching consequences regarding a prevailing paradigm (in this case, classical mechanics).

Since the quantum mechanical revolution, it seems like we have the opposite problem, however. Quantum mechanics resolves both the Gibbs paradox and the Bohr-van Leeuwen theorem, but presents us with issues when we try to apply the microscale ideas to the macroscale!

What I mean is that while quantum mechanics is the rule of law on the microscale, we arrive at problems like the Schrodinger cat when we try to apply such reasoning on the macroscale. Furthermore, Bell’s theorem seems to disappear when we look at the world on the macroscale. One wonders whether such ideas, similar to the Gibbs paradox and the Bohr-van Leeuwen theorem, are subtle precursors suggesting where the limits of quantum mechanics may actually lie.

Precision in Many-Body Systems

Measurements of the quantum Hall effect give a precise conductance in units of e^2/h. Measurements of the frequency of the AC current in a Josephson junction give us a frequency of 2e/h times the applied voltage. Hydrodynamic circulation in liquid 4He is quantized in units of h/m_{4He}. These measurements (and similar ones like flux quantization) are remarkable. They yield fundamental constants to a great degree of accuracy in a condensed matter setting– a setting which Murray Gell-Mann once referred to as “squalid state” systems. How is this possible?
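For concreteness, the three quanta mentioned above can be evaluated in a few lines. The values of h and e are the exact SI definitions. The helium-4 mass is an approximate value that I am supplying myself.

```python
PLANCK_H = 6.62607015e-34     # J s (exact in the SI)
E_CHARGE = 1.602176634e-19    # C (exact in the SI)
ATOMIC_MASS_UNIT = 1.66053906660e-27          # kg
HELIUM4_MASS = 4.002603 * ATOMIC_MASS_UNIT    # kg, approximate mass of a 4He atom

conductance_quantum = E_CHARGE ** 2 / PLANCK_H   # quantum Hall conductance unit, e^2/h
hall_resistance = PLANCK_H / E_CHARGE ** 2       # its inverse, h/e^2
josephson_constant = 2.0 * E_CHARGE / PLANCK_H   # AC Josephson frequency per volt, 2e/h
circulation_quantum = PLANCK_H / HELIUM4_MASS    # circulation quantum in 4He, h/m

print(f"e^2/h = {conductance_quantum:.6e} S  (h/e^2 = {hall_resistance:.3f} ohm)")
print(f"2e/h  = {josephson_constant / 1e9:.1f} GHz per volt")
print(f"h/m   = {circulation_quantum:.4e} m^2/s")
```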

At first sight, it is stunning that physics of the solid or liquid state could yield a measurement so precise. When we consider the defects, impurities, surfaces and other imperfections in a macroscopic system, these results become even more astounding.

So where does this precision come from? It turns out that in all cases, one is measuring a quantity that is dependent on the single-valued nature of the (appropriately defined) complex scalar wavefunction. The aforementioned quantities are measured in integer units, n, usually referred to as the winding number. Because the winding number is a topological quantity, in the sense that it arises in a multiply-connected space, these measurements do not particularly care about the small differences that occur in its surroundings.

For instance, the leads used to measure the quantum Hall effect can be placed virtually anywhere on the sample, as long as the wires don’t cross each other. The samples can be any (two-dimensional) geometry, i.e. a square, a circle or some complicated corrugated object. In the Josephson case, the weak links can be constrictions, an insulating oxide layer, a metal, etc. Imprecision of experimental setup is not detrimental, as long as the experimental geometry remains the same.

Another ingredient that is required for this precision is a large number of particles. This can seem counter-intuitive, since one expects quantization on a microscopic rather than at a macroscopic level, but the large number of particles makes these effects possible. For instance, both the Josephson effect and the hydrodynamic circulation in 4He depend on the existence of a macroscopic complex scalar wavefunction or order parameter. In fact, if the superconductor becomes too small, effects like the Josephson effect, flux quantization and persistent currents all start to get washed out. There is a gigantic energy barrier preventing the decay from the n=1 current-carrying state to the n=0 current non-carrying state due to the large number of particles involved (i.e. the higher winding number state is meta-stable). As one decreases the number of particles, the energy barrier is lowered and the system can start to tunnel from the higher winding number state to the lower winding number state.

In the quantum Hall effect, the samples need to be macroscopically large to prevent the boundaries from interacting with each other. Once the states on the edges are able to do that, they may hybridize and the conductance quantization gets washed out. This has been visualized in the context of 3D topological insulators using angle-resolved photoemission spectroscopy, in this well-known paper. Again, a large sample is needed to observe the effect.

It is interesting to think about where else such a robust quantization may arise in condensed matter physics. I suspect that there exist similar kinds of effects in different settings that have yet to be uncovered.

Aside: If you are skeptical about the multiply-connected nature of the quantum Hall effect, you can read about Laughlin’s gauge argument in his Nobel lecture here. His argument critically depends on a multiply-connected geometry.

Friedel Sum Rule and Phase Shifts

When I took undergraduate quantum mechanics, one of the most painful subjects to study was scattering theory, due to the usage of special functions, phase shifts and partial waves. To be honest, the sight of those words still makes me shudder a little.

If you have felt like that at some point, I hope that this post will help alleviate some fear of phase shifts. Phase shifts can be seen in many classical contexts, and I think that it is best to start thinking about them in that setting. Consider the following scenarios: a wave pulse on a rope is incident on a (1) fixed boundary and (2) a movable boundary. See below for a sketch, which was taken from here.

animation of wave pulse reflecting from hard boundary

Fixed Boundary Reflection

animation of wave pulse reflecting from soft boundary

Movable Boundary Reflection

Notice that in the fixed boundary case, one gets a phase shift of \pi, while in the movable boundary case, there is no phase shift. The reason that there is a phase shift of \pi in the former case is that the wave amplitude must be zero at the boundary. Therefore, when the wave first comes in and reflects, the only way to enforce the zero is to have the wave reflect with a \pi phase shift and interfere destructively with the incident pulse, cancelling it out perfectly.

The important thing to note is that for elastic scattering, the case we will be considering in this post, the amplitude of the reflected (or scattered) pulse is the same as the incident pulse. All that has changed is the phase.

Let’s now switch to quantum mechanics. If we consider the same setup, where an incident wave hits an infinitely high wall at x=0, we basically get the same result as in the classical case.


Elastic scattering from an infinite barrier

If the incident and scattered wavefunctions are:

\psi_i = Ae^{ikx}      and      \psi_s=Be^{-ikx}

then B = -A = e^{i\pi}A because, as for the fixed boundary case above, the incident and scattered waves destructively interfere (i.e. \psi_i(0) + \psi_s(0) =0). The full wavefunction is then:

\psi(x) = A(e^{ikx}-e^{-ikx}) \sim \textrm{sin}(kx)

The last equality is a little misleading since the wavefunction is not normalizable, but let’s just pretend we have an infinite barrier at large but not quite infinite (-x). Now consider a similar-looking, but pretty arbitrary potential:


Elastic scattering from an arbitrary potential

What happens in this case? Well, again, the scattering is elastic, so the incident and reflected amplitudes must be the same away from the region of the potential. All that can change, therefore, is the phase of the reflected (scattered) wavefunction. We can therefore write, similar to the case above:

\psi(x) = A(e^{ikx}-e^{-i(kx+2\delta)}) \sim \textrm{sin}(kx+\delta)

Notice that the sine term has now acquired a phase. What does this mean? It means that the energy of the wavefunction has changed, as would be expected for a different potential. If we had used box normalization for the infinite barrier case, kL=n\pi, then the energy eigenvalues would have been:

E_n = \hbar^2n^2\pi^2/2mL^2

Now, with the newly introduced potential, however, our box normalization leads to the condition, kL+\delta(k)=n\pi so that the new energies are:

E_n = \hbar^2(n\pi-\delta(k))^2/2mL^2

The energy eigenvalues move around, but since \delta(k) can be a pretty complicated function of k, we don’t actually know how they move. What’s clear, though, is that the number of energy eigenvalues is going to be the same: we didn’t make or destroy any eigenvalues or energy eigenstates.
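Here is a small numerical illustration of that counting. It solves the quantization condition kL + \delta(k) = n\pi by bisection for a made-up phase shift \delta(k) = \delta_0/(1 + (k/k_0)^2). That functional form is purely hypothetical and chosen only because it is smooth and bounded. Energies are in units where \hbar^2/2m = 1.

```python
import math

def phase_shift(k, delta0=1.0, k0=1.0):
    """Hypothetical smooth phase shift delta(k), for illustration only."""
    return delta0 / (1.0 + (k / k0) ** 2)

def allowed_k(n, box_length):
    """Solve k*L + delta(k) = n*pi for k by bisection."""
    lo, hi = 0.0, n * math.pi / box_length  # a positive delta pushes k below n*pi/L
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * box_length + phase_shift(mid) < n * math.pi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

box_length = 50.0
for n in range(1, 6):
    k_free = n * math.pi / box_length
    k_new = allowed_k(n, box_length)
    print(f"n = {n}: E_free = {k_free ** 2:.5f}, E_shifted = {k_new ** 2:.5f}")
```

Each integer n still labels exactly one state. The levels slide (downward here, since I chose a positive phase shift), but none appear or disappear.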

Let’s now move onto some solid state physics. In a metal, one usually fills up N states in accordance with the Pauli principle up to k_F. If we introduce an impurity with a different number of valence electrons into the metal, we have effectively created a potential where the electrons of the Fermi gas/liquid can scatter. Just like in the cases above, this potential will cause a phase shift in the electron wavefunctions present in the metal, changing their energies. The amplitudes for the incoming and outgoing electrons again will be the same far from the scattering impurity.

This time, though, there is something to worry about — the phase shift and the corresponding energy shift can potentially move states from above the Fermi energy to below the Fermi energy, or vice versa. Suppose I introduced an impurity with an extra valence electron compared to that of the host metal. For instance, suppose I introduce a Zn impurity into Cu. Since Zn has an extra electron, the Fermi sea has to accommodate an extra energy state. I can illustrate the scenario away from, but in the neighborhood of, the Zn impurity schematically like so:


E=0 represents the Fermi Energy. An extra state below the Fermi energy because of the addition of a Zn impurity

It seems quite clear from the description above, that the phase shift must be related somehow to the valence difference between the impurity and the host metal. Without the impurity, we fill up the available states up to the Fermi wavevector, k_F=N_{max}\pi/L, where N_{max} is the highest occupied state. In the presence of the impurity, we now have k_F=(N'_{max}\pi-\delta(k_F))/L. Because the Fermi wavevector does not change (the density of the metal does not change), we have that:

N'_{max} = N_{max} + \delta(k_F)/\pi

Therefore, the number of extra states needed now to fill up the states to the Fermi momentum is:

N'_{max}-N_{max}=Z = \delta(k_F)/\pi,

where Z is the valence difference between the impurity and the host metal. Notice that in this case, each extra state that moves below the Fermi level gives rise to a phase shift of \pi. This is actually a simplified version of the Friedel sum rule. It means that the electrons at the Fermi energy have changed the structure of their wavefunctions, by acquiring a phase shift, to exactly screen out the impurity at large distances.

There is just one thing we are missing. I didn’t take into account degeneracy of the energy levels of the Fermi sea electrons. If I do this, as Friedel did assuming a spherically symmetric potential in three dimensions, we get a degeneracy of 2(2l+1) for each l where the factor of 2 comes from spin and (2l+1) comes from the fact that we have azimuthal symmetry. We can write the Friedel sum rule more precisely, which states:

Z = \frac{2}{\pi} \sum_l (2l+1)\delta_l(k_F).

We just had to take into consideration the fact that there is a high degeneracy of energy states in this system of spherical symmetry. What this means, informally, is that each energy level that moves below the Fermi energy has the \pm\pi phase shift distributed across each relevant angular momentum channel. They all get a little slice of some phase shift.

An example: If I put Ni  (which is primarily a d-wave l=2 scatterer in this context) in an Al host, we get that Z=-1. This is because Ni has valence 3d^94s^1.  Now, we can obtain from the Friedel sum rule that the phase shift will be \sim -\pi/10. If we move onto Co where Z=-2, we get \sim -2\pi/10, and so forth for Fe, Mn, Cr, etc. Only after all ten d-states shift above the Fermi energy do we acquire a phase shift of -\pi.

Note that when the phase shift is \sim\pm\pi/2 the impurity will scatter strongly, since the scattering cross section \sigma \propto |\textrm{sin}(\delta_l)|^2. This is referred to as resonance scattering from an impurity, and again bears a striking resemblance to the classical driven harmonic oscillator. In the case above, it would correspond to Cr impurities in the Al host, which have a phase shift of \sim -5\pi/10. Indeed, the resistivity of Al with Cr impurities is the highest among the first row transition metals, as shown below:

aluminum impurities
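The arithmetic behind this trend is short enough to script. The sketch below assumes, as in the example above, that a single l = 2 channel carries all of the screening, so that the Friedel sum rule reduces to \delta_2 = Z\pi/10, and it uses the valence differences Z = -1, ..., -5 for Ni through Cr. The \textrm{sin}^2 factor is only the angular part of the cross section, so read the output as a trend, not as resistivity values.

```python
import math

def friedel_phase_shift(valence_difference, l):
    """Phase shift at k_F when one channel l carries all of the screening:
    Z = (2/pi) * (2l + 1) * delta_l  =>  delta_l = Z * pi / (2 * (2l + 1))."""
    return valence_difference * math.pi / (2.0 * (2 * l + 1))

# Valence differences for the d-wave (l = 2) impurities discussed in the text.
impurities = {"Ni": -1, "Co": -2, "Fe": -3, "Mn": -4, "Cr": -5}
for name, z in impurities.items():
    delta = friedel_phase_shift(z, 2)
    print(f"{name}: delta_2 = {delta / math.pi:+.1f} pi, "
          f"sin^2(delta_2) = {math.sin(delta) ** 2:.3f}")
```

The scattering factor climbs steadily from Ni to Cr, where the phase shift reaches -\pi/2 and \textrm{sin}^2(\delta_2) hits its maximum of one.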

Hence, just by knowing the valence difference, we can get out a non-trivial phase shift! This is a pretty remarkable result, because we don’t have to use the (inexact and perturbative) Born approximation. And it comes from (I hope!) some pretty intuitive physics.