Elena & Fabrice's Web

(Part of the Wolverhampton Lectures of Physics's Quantum Physics Course)

We have seen in the last lecture how the algebra of angular momentum

$$[L_k,L_l]=i\hbar\varepsilon_{klm}L_m$$

could be classified, choosing the $z$-axis as a reference, through two compatible observables:

\begin{align} L_z\psi&=\mu\psi\\ L^2\psi&=\lambda\psi \end{align}

with ladder operators imposing, for a given $\lambda$, a minimum and a maximum value for $\mu$, the largest of which is connected to $\lambda$ by the relation $\lambda=\mu^2+\hbar\mu$, so that the possible $z$-projections of angular momentum (eigenvalues of $L_z$) form a chain as follows (with $\mu$ now standing for the maximum value in units of $\hbar$):

$$-\hbar\mu\quad-\hbar(\mu-1)\quad-\hbar(\mu-2)\quad\cdots\quad \hbar(\mu-2)\quad \hbar(\mu-1)\quad \hbar\mu$$

so that, if it has $N$ steps, we find for the maximum possible value of $\mu$:

$$\mu={N\over 2}\,.$$
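As a quick illustration of this counting (a worked example added here, not taken from the previous lecture), the two shortest non-trivial chains are:

$$N=1:\quad\mu={1\over2}\,,\qquad -{\hbar\over2}\quad {\hbar\over2}$$

$$N=2:\quad\mu=1\,,\qquad -\hbar\quad 0\quad \hbar$$

An odd number of steps thus yields a half-integer $\mu$ and an even number of steps an integer one.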

This gives rise to half-integer angular-momentum solutions, which, however, have no realization in the real-space differential equation $\sin\theta{d\over d\theta}\left(\sin\theta{d\Theta\over d\theta}\right)+[l(l+1)\sin^2\theta-m^2]\Theta=0$, which requires $l$ to be an integer. One would typically regard such algebraically correct solutions that have no physical realization in space as, precisely, unphysical, and just forget about them.

However, it appears that Nature took advantage of all the possibilities, since it so happens, indeed, that elementary particles do carry an intrinsic angular momentum, and this one can take, interestingly, half-integer values. This is called *spin*, in analogy to the rotation of a body on itself, although it does not correspond to such a real-space rotation (as far as we know, elementary particles are point-like).

The formalism for this intrinsic spin is, however, a carbon copy of the one for the angular momentum that corresponds to the actual rotation of the particle in space (say, of the electron orbiting around a proton). It does not involve the $L$ operator, which is linked to spatial derivatives, but another, abstract, spaceless operator $S$, which enforces this angular momentum algebra in some internal degree of freedom of the particle. $S$ is also a vector:

$$\vec S=S_x\hat\imath+S_y\hat\jmath+S_z\hat k\,.$$

The main difference between $L$ and $S$ is that the former acts on wavefunctions that have spatial coordinates, $\psi(r,\theta,\phi)$, while $S$ acts on states that have no spatial coordinates, and which we thus, conveniently, if not compulsorily, write as Dirac ket vectors.

The algebra of the $S$ operators is, from our previous discussion, otherwise the same, i.e., the *spin algebra* reads:

$$\left[S_j,S_k\right]=i\hbar\varepsilon_{jkl}S_l\,.$$

We already have done all the algebraic work. Retracing all our steps from the previous lecture, we have:

\begin{align} S^2|sm\rangle&=\hbar^2 s(s+1)|sm\rangle\\ S_z|sm\rangle&=\hbar m|sm\rangle \end{align}

where $|sm\rangle$ is the quantum state for a particle of total spin $s$ and $z$-projection (in its internal, intrinsic, abstract space) $m$. We, again, use the $z$-axis by convention; any other axis would also work.

Another important difference as compared to real-space angular momentum is that while $l$ can change (by knocking off the particle sideways, for instance, or "immobilizing" it), the magnitude of spin $s$ for a particle turns out to be a fundamental feature of the object which cannot be changed, like its mass or electric charge or other intrinsic properties. An electron, for instance, is always a spin-1/2 particle and a photon is always a spin-1 particle.

The creation and destruction (ladder) operators that add and remove a quantum of angular momentum along the $z$-axis read, similarly:

$$S_\pm=S_x\pm iS_y$$

and applying them on the states leads to:

$$S_\pm|sm\rangle\propto|s, m\pm 1\rangle$$

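which we need to normalize. Writing $S_\pm|sm\rangle=\alpha|s,m\pm1\rangle$ with normalized states, $\langle{s,m\pm1}|{s,m\pm1}\rangle=1$, we have

$$\langle{sm}|S_\mp S_\pm|{sm}\rangle=|\alpha|^2\langle{s,m\pm1}|{s,m\pm1}\rangle=|\alpha|^2$$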

since $S^\dagger_\pm=S_\mp$. If you wonder why $|{s,m\pm1}\rangle$ becomes $\langle{s,m\pm1}|$ and not $\langle{s,m\mp1}|$, it is because taking the dual only turns the ket into a bra: the label remains the same. It is true that $\langle{s,m}|S_\pm\propto\langle{s,m\mp1}|$; if that disturbs you, rewrite the operator as an adjoint, $\langle{s,m}|S_\pm=\langle{s,m}|S_\mp^\dagger$, which then acts on the bra as its sign mandates: a daggered $-$ lowers the bra while a daggered $+$ raises it, $\langle{s,m}|S^\dagger_\mp\propto\langle{s,m\mp1}|$. In any case, we will compute $S_\mp S_\pm$ on the ket. Let us compute, for instance (we leave the other case to you):

\begin{align} S_- S_+|sm\rangle&=(S_x^2+i[S_x,S_y]+S_y^2)|sm\rangle\\ &=(S_x^2+S_y^2-\hbar S_z)|sm\rangle\\ &=(S_x^2+S_y^2+S_z^2-S_z^2-\hbar S_z)|sm\rangle\\ &=(S^2-S_z(S_z+\hbar))|sm\rangle\\ &=(\hbar^2s(s+1)-\hbar^2m^2-\hbar^2 m)|sm\rangle\\ &=\hbar^2(s(s+1)-m(m+1))|sm\rangle \end{align}

so that the normalization constant is the square root of this and we are left with the final (normalized) result:

$$S_\pm|sm\rangle=\hbar\sqrt{s(s+1)-m(m\pm 1)}|s, m\pm 1\rangle\,.$$
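The formula above can be explored numerically. The following is a minimal sketch in plain Python (standard library only), working in units of $\hbar$; the function names `ladder_coefficient` and `m_values` are illustrative choices, not part of the lecture. It tabulates the coefficient $\sqrt{s(s+1)-m(m\pm1)}$ along the ladder and lets one see that it vanishes exactly at the two ends, which is what terminates the chain.

```python
from fractions import Fraction
from math import sqrt


def ladder_coefficient(s, m, sign):
    """Return c in S_sign|s, m> = hbar * c |s, m + sign>, with sign = +1 or -1."""
    s, m = Fraction(s), Fraction(m)
    # Fractions keep half-integers exact, so the radicand is exactly 0
    # at the ends of the ladder (m = s for S_+, m = -s for S_-).
    return sqrt(s * (s + 1) - m * (m + sign))


def m_values(s):
    """List the 2s + 1 allowed projections m = -s, -s + 1, ..., s."""
    s = Fraction(s)
    return [-s + k for k in range(int(2 * s) + 1)]


half = Fraction(1, 2)
for s in (half, 1, 3 * half):
    for m in m_values(s):
        up = ladder_coefficient(s, m, +1)
        down = ladder_coefficient(s, m, -1)
        print(f"s={s}, m={m}: S+ coefficient {up:.4f}, S- coefficient {down:.4f}")
```

Exact fractions are used for $s$ and $m$ so that the zeros at the ends of the ladder are not blurred by rounding.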

We now work out the particular case $s=\frac12$. This is the simplest and also possibly the most important case, not only of spin but of quantum states at large, since this embeds pretty much most of the complexity/phenomenology of such objects (the spin zero is, instead, trivial, as it has no room to change: $|00\rangle$ and that's it).

The vector-space of states is of dimension 2, since it has two basis vectors:

$$|{1\over 2}, -{1\over 2}\rangle$$ $$|{1\over 2}, +{1\over 2}\rangle$$

out of which one can reconstruct any other state, as a linear combination of the basis vectors:

$$|\psi\rangle=\alpha|{1\over 2}, -{1\over 2}\rangle+\beta|{1\over 2}, {1\over 2}\rangle\,.$$

Since $s$ never changes, it is convenient to use a shorthand notation for these vectors. It is the power of Dirac's notation that one can use graphically-inspiring representations of what the state represents. This is one of the popular notations for spin-up and down:

$$|\uparrow\rangle\equiv|{1\over 2}, {1\over 2}\rangle$$ $$|\downarrow\rangle\equiv|{1\over 2}, -{1\over 2}\rangle$$

Besides, as this defines a vector-space of dimension 2, we can map them to the canonical vector space of 2D, whose basis vectors are $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$, so we also have this identification:

$$|\uparrow\rangle=\begin{pmatrix}1\\0\end{pmatrix}$$ $$|\downarrow\rangle=\begin{pmatrix}0\\1\end{pmatrix}$$

We can now construct the spin operators that act on these states according to the rules above.

Operators on 2D vectors are $2\times2$ matrices. So we have, starting with $S^2$, the relations $S^2|\uparrow\rangle=\hbar^2{3\over4}|\uparrow\rangle$ and $S^2|\downarrow\rangle=\hbar^2{3\over4}|\downarrow\rangle$ for our basis states. Knowing the result on the basis vectors means that we know the full operator. $S^2$ thus leaves the states invariant, therefore, up to the constant, it is the identity operator:

$$S^2={3\hbar^2\over4}\begin{pmatrix}1&0\\0&1\end{pmatrix}\,.$$

From $S_z|\uparrow\rangle={\hbar\over2}|\uparrow\rangle$ and $S_z|\downarrow\rangle=-{\hbar\over2}|\downarrow\rangle$, we similarly find:

$$S_z={\hbar\over2}\begin{pmatrix}1&0\\0&-1\end{pmatrix}\,.$$

Note here that however pleasing the arrow-notation is, it sometimes is more convenient to stick with more traditional objects that allow us to collect results together. Keeping $m$ as the label, i.e., $\pm\frac{1}{2}$, we could indeed write instead $S_z|\pm{1\over2}\rangle=\pm{\hbar\over2}|\pm{1\over2}\rangle$.

To find $S_x$ and $S_y$, we pass by $S_\pm$, whose effect on $|\uparrow\rangle$ and $|\downarrow\rangle$ we can work out from $S_\pm|sm\rangle=\hbar\sqrt{s(s+1)-m(m\pm 1)}|s, m\pm 1\rangle$ as

\begin{align} S_+|\uparrow\rangle&=0\\ S_+|\downarrow\rangle&=\hbar|\uparrow\rangle\\ S_-|\uparrow\rangle&=\hbar|\downarrow\rangle\\ S_-|\downarrow\rangle&=0 \end{align}

so that

$$S_+={\hbar}\begin{pmatrix}0&1\\0&0\end{pmatrix}$$

and

$$S_-={\hbar}\begin{pmatrix}0&0\\1&0\end{pmatrix}$$

which gives us finally, inverting $S_\pm=S_x\pm iS_y$ into $S_x=(S_++S_-)/2$ and $S_y=(S_+-S_-)/(2i)$,

$$S_x={\hbar\over2}\begin{pmatrix}0&1\\1&0\end{pmatrix}$$

and

$$S_y={\hbar\over2}\begin{pmatrix}0&-i\\i&0\end{pmatrix}$$
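These matrices can be checked against the spin algebra. The sketch below, in plain Python with nested lists, assumes units where $\hbar=1$ and the basis ordering $|\uparrow\rangle$ first, $|\downarrow\rangle$ second; the helpers `matmul`, `combine`, `commutator` and `is_close` are illustrative names introduced here. It rebuilds $S_x$ and $S_y$ from $S_\pm$ and tests $[S_x,S_y]=iS_z$ together with its cyclic permutations.

```python
# Spin-1/2 operators in units of hbar (hbar = 1 is an assumption of this sketch).
# Basis order: index 0 is |up>, index 1 is |down>.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return combine(1, matmul(a, b), -1, matmul(b, a))


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


S_plus = [[0, 1], [0, 0]]    # S_+|down> = |up>,   S_+|up> = 0
S_minus = [[0, 0], [1, 0]]   # S_-|up> = |down>,   S_-|down> = 0

S_x = combine(0.5, S_plus, 0.5, S_minus)      # (S_+ + S_-)/2
S_y = combine(-0.5j, S_plus, 0.5j, S_minus)   # (S_+ - S_-)/(2i)
S_z = [[0.5, 0], [0, -0.5]]

ZERO = [[0, 0], [0, 0]]
checks = {
    "[S_x, S_y] = i S_z": is_close(commutator(S_x, S_y), combine(1j, S_z, 0, ZERO)),
    "[S_y, S_z] = i S_x": is_close(commutator(S_y, S_z), combine(1j, S_x, 0, ZERO)),
    "[S_z, S_x] = i S_y": is_close(commutator(S_z, S_x), combine(1j, S_y, 0, ZERO)),
}
for name, ok in checks.items():
    print(name, ok)
```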

The matrices $S_x$, $S_y$ and $S_z$ stripped down from their constant $\hbar/2$ are denoted $\sigma_x$, $\sigma_y$ and $\sigma_z$ and are known as the *Pauli matrices*. They have interesting algebraic properties, besides that, now familiar, of angular momentum, $[\sigma _{j},\sigma _{k}]=2i\varepsilon _{jk\ell }\,\sigma _{\ell }$. We can list, e.g., their anticommutation rules:

$$\{\sigma _{j},\sigma _{k}\}=2\delta _{jk}\,\mathbb{1}$$

Combining the commutation and anticommutation rules, we get the product rule:

$$\sigma _{j}\sigma _{k}=\delta _{jk}\mathbb{1}+i\varepsilon _{jk\ell }\,\sigma _{\ell }$$
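One possible way to see this, as a sketch, is to split the product into its symmetric and antisymmetric parts and substitute the anticommutator and the commutator, respectively:

$$\sigma_j\sigma_k={1\over2}\{\sigma_j,\sigma_k\}+{1\over2}[\sigma_j,\sigma_k]=\delta_{jk}\mathbb{1}+i\varepsilon_{jk\ell}\,\sigma_\ell\,.$$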

As for spin itself, we can define a Pauli vector ${\vec {\sigma }}=\sigma _{x}{\hat {\imath}}+\sigma _{y}{\hat {\jmath}}+\sigma _{z}{\hat {k}}$ which can enter into scalar products with "normal" vectors ${\vec {a}}=a_x{\hat {\imath}}+a_y{\hat {\jmath}}+a_z{\hat {k}}$ to yield:

$${\displaystyle {\begin{aligned}{\vec {a}}\cdot {\vec {\sigma }}&=~a_{x}\;{\begin{pmatrix}0&1\\1&0\end{pmatrix}}~+~a_{y}\;i{\begin{pmatrix}0&-1\\1&\;\;0\end{pmatrix}}~+~a_{z}\;{\begin{pmatrix}1&0\\0&-1\end{pmatrix}}~=~{\begin{pmatrix}a_{z}&a_{x}-ia_{y}\\a_{x}+ia_{y}&-a_{z}\end{pmatrix}}\end{aligned}}}$$

from which one finds that ${\displaystyle \det {\bigl (}{\vec {a}}\cdot {\vec {\sigma }}{\bigr )}=-{\vec {a}}\cdot {\vec {a}}=-\left|{\vec {a}}\right|^{2},} $ which, together with the previous relations, provides a nice relation between vector and spin algebra:

$${\displaystyle ~~\left({\vec {a}}\cdot {\vec {\sigma }}\right)\left({\vec {b}}\cdot {\vec {\sigma }}\right)=\left({\vec {a}}\cdot {\vec {b}}\right)\,\mathbb{1}+i\left({\vec {a}}\times {\vec {b}}\right)\cdot {\vec {\sigma }}~~.}$$
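A numerical spot check of this relation and of the determinant formula can be sketched as follows, in plain Python. The vectors `a` and `b` are arbitrary values chosen for illustration, and the helper names (`dot_sigma`, `cross`, and so on) are assumptions of this sketch rather than anything defined in the lecture.

```python
# Pauli matrices as nested lists, in the order x, y, z.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dot_sigma(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    return [[sum(v[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


a = [1.0, -2.0, 0.5]   # arbitrary test vectors
b = [0.3, 4.0, -1.0]

lhs = matmul(dot_sigma(a), dot_sigma(b))
a_dot_b = dot(a, b)
a_cross_b_sigma = dot_sigma(cross(a, b))
rhs = [[a_dot_b * IDENTITY[i][j] + 1j * a_cross_b_sigma[i][j] for j in range(2)]
       for i in range(2)]

max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
det_error = abs(det(dot_sigma(a)) + dot(a, a))
print("product identity, largest deviation:", max_error)
print("determinant identity, deviation:", det_error)
```

Both deviations should be at the level of floating-point rounding; a check on two vectors is of course not a proof.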

Other notable properties include the fact that the Pauli matrices are involutory:

$${\displaystyle \sigma _{x}^{2}=\sigma _{y}^{2}=\sigma _{z}^{2}=-i\,\sigma _{x}\sigma _{y}\sigma _{z}={\begin{pmatrix}1&0\\0&1\end{pmatrix}}=\mathbb{1}}$$

Their determinants and traces are such that:

$${\displaystyle {\begin{aligned}\det \sigma _{j}&~=\,-1\,,\\\operatorname {tr} \sigma _{j}&~=~{}~{}~\;0~.\end{aligned}}}$$

The latter property is known as being "traceless". You can also check the traces of products:

$${\displaystyle {\begin{aligned}\operatorname {tr} \left(\sigma _{j}\sigma _{k}\right)&=2\delta _{jk}\\\operatorname {tr} \left(\sigma _{j}\sigma _{k}\sigma _{\ell }\right)&=2i\varepsilon _{jk\ell }\\\operatorname {tr} \left(\sigma _{j}\sigma _{k}\sigma _{\ell }\sigma _{m}\right)&=2\left(\delta _{jk}\delta _{\ell m}-\delta _{j\ell }\delta _{km}+\delta _{jm}\delta _{k\ell }\right)\end{aligned}}}$$
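Since there are only three matrices, these trace relations can be checked by brute force over all index combinations. The following plain-Python sketch does so; indices run over 0, 1, 2 for $x$, $y$, $z$, and the helpers `delta`, `levi_civita` and `trace_of_product` are illustrative names introduced here.

```python
from itertools import product

# Pauli matrices as nested lists, in the order x, y, z.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def delta(j, k):
    """Kronecker delta."""
    return 1 if j == k else 0


def levi_civita(j, k, l):
    """Levi-Civita symbol for indices taken in {0, 1, 2}."""
    return (j - k) * (k - l) * (l - j) // 2


def trace_of_product(indices):
    """Trace of sigma_{i1} sigma_{i2} ... for the given index tuple."""
    result = [[1, 0], [0, 1]]
    for n in indices:
        result = matmul(result, SIGMA[n])
    return result[0][0] + result[1][1]


errors = []
for j, k in product(range(3), repeat=2):
    errors.append(abs(trace_of_product((j, k)) - 2 * delta(j, k)))
for j, k, l in product(range(3), repeat=3):
    errors.append(abs(trace_of_product((j, k, l)) - 2j * levi_civita(j, k, l)))
for j, k, l, m in product(range(3), repeat=4):
    expected = 2 * (delta(j, k) * delta(l, m)
                    - delta(j, l) * delta(k, m)
                    + delta(j, m) * delta(k, l))
    errors.append(abs(trace_of_product((j, k, l, m)) - expected))

max_error = max(errors)
print("combinations checked:", len(errors))
print("largest deviation:", max_error)
```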

We will conclude with a property that shall prove useful later on when looking at the propagation of photons in optical circuits, the complex exponential of the Pauli vector projected on a unit vector $\hat n$, with $a$ a real number:

$${\displaystyle ~~e^{ia\left({\hat {n}}\cdot {\vec {\sigma }}\right)}=\mathbb{1}\cos {a}+i({\hat {n}}\cdot {\vec {\sigma }})\sin {a}~~,}$$

which is left for you to prove as an exercise.
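Before attempting the proof, one may want to convince oneself numerically that the formula holds. The sketch below, in plain Python, compares a truncated Taylor series of the matrix exponential with the right-hand side for one arbitrary angle and one arbitrary unit vector; the values of `a` and `n_hat`, the number of series terms and the helper names are all assumptions made for illustration. It is a sanity check, not a substitute for the exercise.

```python
from math import cos, sin

# Pauli matrices as nested lists, in the order x, y, z.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dot_sigma(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    return [[sum(v[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]


def mat_exp(m, terms=30):
    """Approximate exp(m) by the first `terms` terms of its Taylor series."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms):
        # term holds m^n / n!, built from the previous term m^(n-1) / (n-1)!
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


a = 0.7                          # arbitrary real angle
n_hat = [1 / 3, 2 / 3, 2 / 3]    # unit vector, since (1 + 4 + 4)/9 = 1
n_sigma = dot_sigma(n_hat)

lhs = mat_exp([[1j * a * x for x in row] for row in n_sigma])
rhs = [[cos(a) * IDENTITY[i][j] + 1j * sin(a) * n_sigma[i][j] for j in range(2)]
       for i in range(2)]

max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest deviation:", max_error)
```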