

Physics 231





Mathematical Physics

and

Classical Electrodynamics

Part II





Spring Semester, 1997




Robert G. Brown, Instructor

Duke University Physics Department





Instructor: Robert G. Brown, Room 206, Phone: 660-2567

Texts: Arfken, Jackson (and Wyld, optional).


Course Description

In this course we will cover the following basic topics:

There will be, as you may have guessed, lots of problems. Actually, there won't be as many as last year, but they'll seem like a lot, as they'll be considerably harder. The structure and organization of the course will be (approximately!):

30% of grade: Homework.
30%: Take-Home Midterm Exam.
30%: Take-Home Final Exam.
10%: Research/Computing project.
In more detail, Homework is Homework, the Exams are Homework (fancied up a bit and with more stringent rules) and the Research Project is described below. These figures are only approximate. I may make homework worth a little more or less, but this is about right.

The way my grading scheme typically works is that if you get below a 50 and have not religiously done (well!) and handed in your homework, you fail (U). If you get less than a 60 and have not religiously handed in your homework, you get an S. If you get 60 or more you get a G or E of some sort and ``pass''. If you have religiously done your homework, but have somehow managed to end up below 60 or (worse) 50, you may make sorrowful and wounded noises and perhaps get a G- or S, respectively. If you have not done and handed in your homework on time or have not followed the rules with respect to your homework, don't bother me about your grade - it will likely be bad and you will deserve it. Note that if you get as little as 80 percent of your homework credit, you will only need 40 percent of your exam credit to get at least a G-.


The Rules

The RULES are very serious. In previous years, certain students have reportedly betrayed the trust inherent in the rules. This has led to calls that the rules for this course (and all graduate courses here) be stringently tightened. I would prefer NOT to see this happen, as I think that the rules as they stand optimize the learning process and minimize Mickey Mouse interactions as well, but IF I get any hint of misbehavior (verified or not) I'll tighten things up very, very quickly and we'll all have to work harder and be more frustrated. So, PLEASE! Follow the spirit as well as the letter of the rules. You are here to learn your chosen profession, and it has never been truer that choosing an easy path is ultimately cheating yourself. Besides, it will show up on prelims!

Rules:

Research Project: I'm offering you the following assignment instead, with several choices. You may prepare any of:

  1. A set of lecture notes on a topic, relevant to the material we will cover, that interests you. If you select this option you may be asked to present the lecture for that topic in my place, time permitting.
  2. A review paper on a mathematical physics topic, relevant to the material we will cover (or not cover!), that interests you. Typically, in the past, students going into (e.g.) FEL research have prepared review papers on the electrodynamics of the FEL. That is, relevance to your future research is indicated but not mandated.
  3. A computer demonstration or simulation of some important electrodynamical principle or system. Possible projects here include solving the Poisson and inhomogeneous Helmholtz equations numerically, evaluating and plotting radiation patterns and cross-sections for complicated but interesting time dependent charge density distributions, etc. Resources here include Mathematica, Maple, SuperMongo, Numerical Recipes, and more. Obviously now is not the time to learn to program; presumably you are all competent in f77 or c or both. If not, better late than never, and this is a possible learning project. (A minimal sketch of the sort of calculation I mean follows this list.)
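
For concreteness, here is a minimal sketch of Jacobi relaxation for the 2-d Poisson equation $\nabla^2 \phi = -\rho$ with $\phi = 0$ on the boundary (in Python rather than f77 or c, purely for brevity; the grid size, source, and tolerance are arbitrary illustrative choices, not part of the assignment):

    import numpy as np

    # Jacobi relaxation for del^2 phi = -rho on the unit square, phi = 0 on
    # the boundary.  Discretizing gives phi_ij = (sum of neighbors + h^2 rho)/4.
    n = 65
    h = 1.0 / (n - 1)
    phi = np.zeros((n, n))
    rho = np.zeros((n, n))
    rho[n // 2, n // 2] = 1.0 / h**2           # a crude point "charge"

    for sweep in range(20000):
        new = phi.copy()
        new[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1]
                                  + phi[1:-1, 2:] + phi[1:-1, :-2]
                                  + h**2 * rho[1:-1, 1:-1])
        if np.abs(new - phi).max() < 1e-9:     # stop when a sweep changes little
            break
        phi = new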

If you choose to do a project, it is due TWO WEEKS before the last class, so don't blow it off until the end. It is strongly recommended that you clear the topic with me, too.

I will grade you on: doing a decent job (good algebra), picking an interesting topic (somewhat subjective, but I can't help it and that's why I want to talk to you about it ahead of time), adequate preparation (enough algebra), adequate documentation (where did you find the algebra), organization, and Visual Aids (pictures or interactive demos are sometimes worth a thousand equations). Those of you who do numerical calculations (applying the algebra) must also write it up and (ideally) submit some nifty graphics, if possible.

I'm not going to grade you particularly brutally on this -- it is supposed to be fun as well as educational. However, if you do a miserable job on the project, it doesn't count. If you do a decent job (evidence of more than 20 hours of work) you get your ten percent of your total grade (which works out to maybe a third-of-a-grade credit and may promote you from, say, a G+ to an E-).


I will usually be available for questions after class. It is best to make appointments to see me via e-mail. My third department job is managing the computer network (teaching this is my second and doing research is my first) so I'm usually on the computer and always insanely busy. However, I will nearly always try to answer questions if/when you catch me. That doesn't mean that I will know the answers, of course ...

Our grader is Matt Sexton, Room 203A, 660-2566, and he is also generally available for help with the coursework. He has tentatively set ``office hours'' at 2:30 to 3:30 Tuesday afternoons (plus runover allowance). He will answer questions too, when he can. Otherwise he'll look busy, say that he needs to think about it, and come bug me. I, in turn, will look puzzled, say I'll think about it, and spend all night in the library trying to figure it out so I can tell him and he can tell you. I find (e.g.) Jackson problems just as hard as you do -- and I've done a bunch of them a bunch of times.

I welcome feedback and suggestions at any time during the year. I would prefer to hear constructive suggestions early so that I have time to implement them this semester.

I will TRY to put a complete set of lecture notes, printed out like this, up on the Web in both PS and HTML/gif form. Exemplary problems from other sources may also be included.



Lecture 1
General, Nth Order Linear Homogeneous ODE's



\begin{displaymath}
a_N(x) \frac{d^{N} y}{dx^{N}} + a_{N-1}(x) \frac{d^{N-1} y}{dx^{N-1}} + \ldots + a_0(x)y = 0
\end{displaymath} (1)

If we divide out the $a_N(x) \to A_k(x) = \frac{a_k(x)}{a_N(x)}$, this becomes:

\begin{displaymath}
\frac{d^{N} y}{dx^{N}} + A_{N-1}(x) \frac{d^{N-1} y}{dx^{N-1}} + \ldots + A_0(x)y = 0
\end{displaymath} (2)

(making the leading order linearity obvious). Let $x$ be complex - most general abelian algebra. Then a point $x_0$ is:

  1. an ordinary point if all the $A_k(x)$ are analytic at $x_0$;
  2. a regular singular point if some $A_k(x)$ diverges at $x_0$ but $(x - x_0)^{N-k} A_k(x)$ is analytic there for every $k$;
  3. an irregular (essential) singular point otherwise.

Fuchs' Theorem

If $x_0$ is an ordinary point or a regular singular point, the ODE has at least one series (Frobenius) solution about $x_0$, convergent out to the nearest other singular point.



Example: Bessel (Differential) Equation


\begin{displaymath}
x^2\frac{d^{2} y}{dx^{2}} + x\frac{dy}{dx} + (x^2 - m^2)y = 0
\end{displaymath} (4)

or
\begin{displaymath}
\frac{d^{2} y}{dx^{2}} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{m^2}{x^2}\right)y = 0
\end{displaymath} (5)

so (obviously) $x = 0$ is a regular singular point.

Solutions: Frobenius Method

...follows directly from Fuchs' Theorem:

  1. If you want to expand around $x_0 \ne 0$, change variables so that you are expanding around 0: $x' = x - x_0$ or (if $x_0 =
\infty$) $x' = 1/x$. This set of coordinates defines the radius of convergence of the solution derived.
  2. Decide which kind of point $x_0$ is (ordinary or regular singular).
  3. Substitute into the ODE, also expanding any $A_k(x)$ coefficients (e.g. $\sin x \to x - \frac{x^3}{3!} + \ldots$). Match up coefficients of each power of $x$.



Examples

  1. Airy's Equation


    \begin{displaymath}
\frac{d^{2} y}{dx^{2}} - xy = 0
\end{displaymath} (6)

    Note that $x = 0$ is an ordinary point; there are no singularities anywhere except as $x \to \infty$.


    $\displaystyle y(x)$ $\textstyle =$ $\displaystyle \sum_{n=0}^{\infty} a_n x^n$ (7)
    $\displaystyle \frac{dy}{dx}(x)$ $\textstyle =$ $\displaystyle \sum_{n=1}^{\infty} n a_n x^{n-1}$ (8)
    $\displaystyle \frac{d^{2} y}{dx^{2}}(x)$ $\textstyle =$ $\displaystyle \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}$ (9)

    or
    \begin{displaymath}
\sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} -
\sum_{n=0}^{\infty} a_n x^{n+1} = 0
\end{displaymath} (10)

    Examine the coefficients of each power of $x$: matching the coefficient of $x^{m+1}$ on both sides requires $(m+3)(m+2)a_{m+3} = a_m$,

    or
    \begin{displaymath}
\frac{a_{m+3}}{a_m} = \frac{1}{(m+3)(m+2)}
\end{displaymath} (11)

    We can now reconstruct the entire solution:


    \begin{displaymath}
y(x) = a_0\left(1 + \frac{x^3}{6} + \frac{x^6}{180} + \ldots\right)
 + a_1\left(x + \frac{x^4}{12} + \frac{x^7}{504} + \ldots\right)
\quad\quad (a_2 = 0 \Rightarrow a_5 = a_8 = \ldots = 0)
\end{displaymath} (12)



    Remarks:

  2. Legendre Equation:


    \begin{displaymath}
(1 - x^2)\frac{d^{2} y}{dx^{2}} - 2x\frac{d^{} y}{dx^{}} + \ell(\ell+1)y = 0
\end{displaymath} (13)

    $x = 0$ is ordinary. $x = \pm 1$ are regular singular points. Start with $x = 0$, using the same expansions for $y(x)$ as above:


    \begin{displaymath}
(1 - x^2)\sum_{n = 2}^{\infty} n (n-1)a_n x^{n-2} -
2x\sum_{n = 1}^{\infty} n a_n x^{n-1} + \ell(\ell+1)\sum_{n=0}^{\infty}
a_n x^n = 0
\end{displaymath} (14)

    With a little work (and treating the first two terms carefully) we find that


    $\displaystyle \frac{a_{n+2}}{a_n}$ $\textstyle =$ $\displaystyle \frac{-\ell^2 - \ell + n^2 + n}{(n+2)(n+1)}$  
      $\textstyle =$ $\displaystyle - \frac{(\ell-n)(\ell + n + 1)}{(n+1)(n+2)}$ (15)

    or
    $\displaystyle y(x)$ $\textstyle =$ $\displaystyle a_0\left(1 - \ell(\ell+1)\frac{x^2}{2!} +
\ell(\ell+1)(\ell-2)(\ell+3)\frac{x^4}{4!} + \ldots\right)$ (16)
      $\textstyle +$ $\displaystyle a_1\left(x - (\ell - 1)(\ell + 2)\frac{x^3}{3!} +
(\ell-1)(\ell+2)(\ell-3)(\ell+4)\frac{x^5}{5!} + \ldots\right)$  



    Remarks:

    The solution is supposed to exist on the interval $[-1,1]$ (right up to the nearest singular points). If one examines the limit of the ratio of two successive terms (the ratio test) one finds that:

    \begin{displaymath}
\lim_{n \to \infty} \frac{a_{n+2}x^{n+2}}{a_n x^n}
 = \lim_{n \to \infty} \frac{(-\ell^2 - \ell + n^2 + n)x^2}{(n+2)(n+1)}
 = \lim_{n \to \infty} -\frac{(\ell-n)(\ell + n + 1)x^2}{(n+1)(n+2)}
 = x^2
\end{displaymath} (17)

    This indeed does converge for $\left \vert x \right \vert < 1$, but the end point behavior is ambiguous.

    If $\ell$ is an integer, one of the two independent series terminates, giving a finite polynomial solution that is defined at the end points. The other (as it turns out; see, e.g., Courant and Hilbert for discussion) diverges and must be rejected. We can tabulate these solutions for each $\ell$ (including some optional normalization so that $P_\ell(1) = 1$):


    $\displaystyle 1 P_0(x)$ $\textstyle =$ $\displaystyle 1$ (18)
    $\displaystyle 1 P_1(x)$ $\textstyle =$ $\displaystyle x$ (19)
    $\displaystyle -2 P_2(x)$ $\textstyle =$ $\displaystyle 1 - 3x^2$ (20)
    $\displaystyle -\frac{2}{3} P_3(x)$ $\textstyle =$ $\displaystyle x - \frac{5}{3}x^3$ (21)
    $\displaystyle \ldots$     (22)

    or in more familiar form we have the Legendre Polynomials:
    $\displaystyle P_0(x)$ $\textstyle =$ $\displaystyle 1$ (23)
    $\displaystyle P_1(x)$ $\textstyle =$ $\displaystyle x$ (24)
    $\displaystyle P_2(x)$ $\textstyle =$ $\displaystyle \frac{1}{2}(3x^2-1)$ (25)
    $\displaystyle P_3(x)$ $\textstyle =$ $\displaystyle \frac{1}{2}(5x^3-3x)$ (26)
    $\displaystyle \ldots$     (27)

    Another solution exists around the $x = \pm 1$ points with a radius of convergence of 1 (since $\lim \left\vert \frac{a_{n+2}}{a_n} \right\vert = 1$). We'll look at this later, maybe.



  3. Bessel's Equation (with m = 0):


    \begin{displaymath}
x^2 \frac{d^{2} y}{dx^{2}} + x\frac{dy}{dx} + x^2 y = 0
\end{displaymath} (28)

    $x = 0$ is a regular singular point, so we try:

    $\displaystyle y(x)$ $\textstyle =$ $\displaystyle x^k\sum_{n = 0}^{\infty} a_n x^n = \sum_{n = 0}^{\infty} a_n x^{n+k}$ (29)
    $\displaystyle y'(x)$ $\textstyle =$ $\displaystyle \sum_{n = 0}^{\infty} a_n (n+k)x^{n+k-1}$ (30)
    $\displaystyle y''(x)$ $\textstyle =$ $\displaystyle \sum_{n = 0}^{\infty} a_n (n+k)(n+k-1)x^{n+k-2}$ (31)

    Substituting these into the ODE we obtain:

    \begin{displaymath}
\sum_{n = 0}^{\infty} \left\{(n+k)(n+k-1)a_n x^{n+k} + (n+k)a_n x^{n+k}
\right\} + \sum_{n = 0}^{\infty} a_n x^{n+k+2} = 0
\end{displaymath} (32)

    The lowest power of $x$ in these sums is $x^k$ (or $n = 0$):
    \begin{displaymath}
x^k:\quad\quad k(k-1)a_0 + k a_0 = 0 \to k^2 a_0 = 0 \to (k = 0, a_0 \ne 0)
\end{displaymath} (33)

    where the latter follows because we insist that $a_0 \ne
0$ (or we would be starting the series somewhere else). With $k = 0$, we look at the power of $x^{k+1} = x$:
    \begin{displaymath}
x^{k+1}:\quad\quad \left[(k+1)k + k\right] a_1 = 0 \to a_1 = 0.
\end{displaymath} (34)

    Finally, for general $n,k=0$:
    \begin{displaymath}
x^n:\quad\quad \left[n(n-1) + n\right] a_n + a_{n-2} = 0
\end{displaymath} (35)

    which is the recursion relation:
    \begin{displaymath}
\frac{a_n}{a_{n-2}} = -\frac{1}{n^2}
\end{displaymath} (36)

    and we see that all terms with odd powers of $x$ vanish.

    Reconstructing the series is now easy:

    \begin{displaymath}
y(x) = a_0\left(1 - \frac{x^2}{2^2} + \frac{x^4}{2^2 4^2} -
\frac{x^6}{2^2 4^2 6^2} + \ldots \right)
\end{displaymath} (37)

    and we identify the part in parentheses with:
    \begin{displaymath}
J_0(x) = \sum_{n = 0}^{\infty} \frac{(-1)^n}{(n!)^2}
\left(\frac{x}{2}\right)^{2n}
\end{displaymath} (38)

    We will shortly solve the general case for $m \ne 0$. Or, I should say, you will for homework. The general theory of solutions around a regular singular point suggests that we should try a second solution of the form:

    \begin{displaymath}
N_0(x) = J_0(x) \ln(x) + \sum_{n = 0}^{\infty} b_n x^n
\end{displaymath} (39)

    (which has a logarithmic singularity at $x = 0$!) and substitute to try to solve for the $b_n$, or proceed with the Wronskian method, or proceed according to Wyld, Section 4.3. Eventually we'll probably do one of these things, but if we don't you can always look it up. (A quick numerical check of the $J_0$ recursion is sketched below.)
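
Speaking of checking: a recursion like (36) is trivial to verify numerically. Here is a minimal sketch (in Python; the truncation order and sample point are arbitrary choices of mine) that rebuilds the series (37) term by term from the recursion and compares it with the closed form (38):

    import math

    def j0_series(x, nmax=40):
        """Sum the series for J0(x), building each a_n from a_{n-2} via (36)."""
        a, total = 1.0, 1.0              # a_0 = 1; all odd terms vanish
        for n in range(2, nmax + 1, 2):
            a *= -1.0 / n**2             # the recursion a_n / a_{n-2} = -1/n^2
            total += a * x**n
        return total

    # the closed form (38): sum_k (-1)^k (x/2)^(2k) / (k!)^2
    x = 2.0
    closed = sum((-1)**k * (x / 2)**(2 * k) / math.factorial(k)**2
                 for k in range(21))
    print(j0_series(x), closed)          # both ~ 0.223891, i.e. J0(2)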

Orthogonal Functions, Representations

Suppose we are given (by God, if you like) an interval $(a,b)$, a ``weight function'' (or density function) $w(x) > 0$, and a set of reasonably smooth linearly independent functions $u_n(x)$ defined on the entire interval. These functions are orthogonal if:

\begin{displaymath}
\left< u_m \right\vert\left. u_n \right> = \int_a^b u_m^*(x) u_n(x) w(x) dx = 0
\quad\quad\quad(m \ne n)
\end{displaymath} (40)

Let:
\begin{displaymath}
\left< u_n \right\vert\left. u_n \right> = \int_a^b u_n^*(x) u_n(x) w(x) dx = N_n > 0
\end{displaymath} (41)

then
\begin{displaymath}
\left< u_m \right\vert\left. u_n \right> = N_n \delta_{mn}
\end{displaymath} (42)

and we can always construct an orthonormal set:
\begin{displaymath}
\phi_n(x) = \frac{1}{\sqrt{N_n}} u_n(x)
\end{displaymath} (43)

such that
\begin{displaymath}
\left< \phi_m \right\vert\left. \phi_n \right> = \delta_{mn}
\end{displaymath} (44)

We could also (if we wished) absorb $w(x)$ into the orthogonal representation, but it is not always useful to do so. In fact, it is not always useful to normalize to unity - sometimes we will use an $n$-dependent normalization to help us cancel or improve a factor that appears elsewhere in our algebraic travails.
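
A concrete numerical illustration may help fix the notation. The following sketch (the trigonometric basis, interval, and trapezoidal-rule grid are merely illustrative choices) computes an inner product, checks orthogonality, and normalizes:

    import numpy as np

    x = np.linspace(0.0, 2.0 * np.pi, 20001)   # the interval (a, b)
    w = np.ones_like(x)                        # weight function w(x) = 1

    def inner(f, g):
        """<f|g> = integral of f^* g w dx, done by the trapezoidal rule."""
        return np.trapz(np.conj(f) * g * w, x)

    u1, u2 = np.sin(x), np.sin(2.0 * x)
    print(inner(u1, u2))                       # ~ 0: u1 and u2 are orthogonal
    N1 = inner(u1, u1)                         # the norm N_n of (41)
    phi1 = u1 / np.sqrt(N1)                    # construct phi_n as in (43)
    print(inner(phi1, phi1))                   # ~ 1: normalized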

Note that orthogonal (or orthonormal) functions are very useful in physics! They are the basis of functional analysis, both in quantum mechanics and in the DE's of the rest of physics as well. Hence our formalization of the process:

Notation: Let $\left\vert f \right> = f(x)$, $\left\vert \phi_n \right> = \phi_n(x)$, $\left< \phi_n \right\vert = \phi_n^*(x)$, etc. Then we can notationally formalize the relationship between functional orthogonality and the underlying vector space with its suitably defined norm.



That is, suppose $f(x)$ is a piecewise continuous function on $(a,b)$ and suppose further that the $u_n(x) \to \phi_n(x)$ are orthogonal and complete. Then:

\begin{displaymath}
f(x) = \left\vert f \right> = \sum_{n = 0}^{\infty} a_n \left\vert \phi_n \right>
\end{displaymath} (45)

where
\begin{displaymath}
a_n = \left< \phi_n \right\vert\left. f \right> = \int_a^b \phi_n^*(x) f(x) w(x) dx
\end{displaymath} (46)

so that
$\displaystyle \left\vert f \right>$ $\textstyle =$ $\displaystyle \sum_{n = 0}^{\infty} a_n \left\vert \phi_n \right>$  
  $\textstyle =$ $\displaystyle \sum_{n = 0}^{\infty} \left< \phi_n \right\vert\left. f \right> \left\vert \phi_n \right>$  
  $\textstyle =$ $\displaystyle \sum_{n = 0}^{\infty} \left\vert \phi_n \right>\left< \phi_n \right\vert\left. f \right>$ (47)

is a consistent statement for general $f(x)$ only if
\begin{displaymath}
\sum_{n = 0}^{\infty} \left\vert \phi_n \right>\left< \phi_n \right\vert \equiv 1
\end{displaymath} (48)

In these equations we repeatedly write down series sums over a (possibly and indeed usually infinite) set of functions. We must, therefore, address the issue of whether, and how, these sums converge. Note that we were pretty slack in our requirements on $f(x)$ - it only had to be piecewise continuous, for example, and we said nothing about how it behaves at the points $a$ and $b$ themselves.



Clearly we cannot converge uniformly (at each and every point) since at certain points the ``function'' really isn't one - whether or not you like to think that it has two values at the discontinuities, it clearly approaches different values from the two sides in a limiting sense that makes its value at the limit point hard to uniquely define. It turns out that the kind of convergence that we can expect is at least:

\begin{displaymath}
\lim_{N \to \infty} \int_a^b \left\vert f(x) - \sum_{n = 0}^{N} a_n \phi_n(x) \right\vert^2
w(x)dx \to 0.
\end{displaymath} (49)

This is called convergence in the mean since the mean square error measured in this way goes to zero. This is a wee bit weaker than plain old ``convergence'' or ``uniform convergence'', but for a continuous, smooth, well-behaved function $f(x)$ the convergence is effectively uniform.
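
To see convergence in the mean in action, consider this sketch (the square wave and truncation orders are illustrative; the coefficients $b_n = 4/n\pi$ for odd $n$ are the standard square-wave sine series):

    import numpy as np

    x = np.linspace(0.0, 2.0 * np.pi, 4001)
    f = np.where(x < np.pi, 1.0, -1.0)     # a square wave: merely piecewise continuous

    for N in (1, 5, 25, 125):
        # partial sum of its Fourier sine series (b_n = 4/(n pi), n odd)
        s = sum(4.0 / (n * np.pi) * np.sin(n * x) for n in range(1, N + 1, 2))
        err = np.trapz((f - s)**2, x)      # the mean-square error of (49)
        print(N, err)                      # -> 0 as N grows, jump notwithstanding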

Note the $\epsilon,\delta$ (Cauchy criterion for convergence) inherent in this statement. This must exist for us to be able to talk about the sum with a straight face instead of a smirk. This is because in the real world we cannot do infinite sums. If we make a finite approximation:

\begin{displaymath}
f(x) \approx \sum_{n = 0}^{N} a_n \phi_n(x)
\end{displaymath} (50)

then a very important conclusion that can be shown (we won't show it) is that the $a_n$'s that minimize the least squares error (the integral above) do not depend on $N$ and are in fact
\begin{displaymath}
a_n = \left< \phi_n \right\vert\left. f \right>.
\end{displaymath} (51)

This is not true if the $\phi_n(x)$ are not orthogonal. For example, if they are $x^n$ (so the sum is a power series), adding each term requires that one rearrange all the $a_n$ to obtain the new ``best fit''. You can check this for homework, too - or numerically, as in the sketch below.
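
A sketch of that check (the test function, interval, and fit orders are arbitrary; numpy's built-in Legendre fit stands in for an orthogonal basis here):

    import numpy as np

    x = np.linspace(-1.0, 1.0, 2001)
    f = np.exp(x)

    # power-series (monomial) fit: raising N rearranges ALL the coefficients
    for N in (2, 3, 4):
        print(np.polyfit(x, f, N))

    # Legendre (orthogonal) fit: the low-order coefficients barely move,
    # since the basis is very nearly orthogonal on this dense grid
    for N in (2, 3, 4):
        print(np.polynomial.legendre.legfit(x, f, N))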



In the meantime, we can understand the fundamental idea by multiplying out the convergence relation and noting that (since $w(x) > 0$):

$\displaystyle \int_a^b \left\vert f(x) - \sum_{n = 0}^{N} a_n \phi_n(x)\right\vert^2 w(x)dx$ $\textstyle =$ $\displaystyle \int \left\vert f(x) \right\vert^2 w(x) dx$  
  $\textstyle -$ $\displaystyle \sum_n a_n \int f^*(x) \phi_n(x) w(x) dx
- \sum_n a_n^* \int f(x) \phi_n^*(x) w(x) dx$  
  $\textstyle +$ $\displaystyle \sum_n\sum_m a_m^* a_n \int \phi_m^*(x) \phi_n(x) w(x) dx \;\ge\; 0$ (52)

Rearranging (the three $\phi_n$ terms combine, using $a_n = \left< \phi_n \right\vert\left. f \right>$ and orthonormality, into $-\sum_n \left\vert a_n \right\vert^2$), we get Bessel's Inequality:
\begin{displaymath}
\int \left\vert f(x) \right\vert^2 w(x) dx \ge \sum_n \left\vert a_n \right\vert^2
\end{displaymath} (53)

If and only if the representation is complete, we obtain
\begin{displaymath}
\int \left\vert f(x) \right\vert^2 w(x) dx = \sum_{n} \left\vert a_n \right\vert^2.
\end{displaymath} (54)

This is known as Parseval's Theorem.
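
Parseval's Theorem, too, is easy to watch converge numerically. A sketch (the test function and the number of retained terms are arbitrary choices), using the Fourier conventions of the example that follows:

    import numpy as np

    x = np.linspace(0.0, 2.0 * np.pi, 40001)
    f = np.exp(np.cos(x))                  # any smooth 2 pi-periodic function

    a = [np.trapz(f * np.cos(n * x), x) / np.pi for n in range(0, 25)]
    b = [np.trapz(f * np.sin(n * x), x) / np.pi for n in range(1, 25)]

    lhs = np.trapz(f**2, x)                # integral of |f|^2, with w(x) = 1
    rhs = np.pi * (a[0]**2 / 2.0
                   + sum(c**2 for c in a[1:])
                   + sum(c**2 for c in b))
    print(lhs, rhs)                        # agree once the sum has converged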

Let us examine all this in the context of a well-known example. Consider a Fourier Series for $f(x)$ on the interval $0 \le x
\le 2\pi$:

\begin{displaymath}
f(x) = \frac{a_0}{2} + \sum_{n = 1}^{\infty} a_n \cos(nx) +
\sum_{n = 1}^{\infty} b_n \sin(nx)
\end{displaymath} (55)

Thus the $u_n(x)$ functions are $\{\frac{1}{2}, \cos(nx), \sin(nx)\}$, with $w(x) = 1$. We can normalize these functions easily enough:
$\displaystyle \phi_0(x)$ $\textstyle =$ $\displaystyle \sqrt\frac{2}{\pi} \times \frac{1}{2} \quad\quad (n =
0)$ (56)
$\displaystyle \phi_n(x)$ $\textstyle =$ $\displaystyle \sqrt\frac{1}{\pi} \times u_n(x) \quad\quad (n \ne 0).$ (57)



With these definitions,

$\displaystyle a_n$ $\textstyle =$ $\displaystyle \frac{1}{\pi} \int_0^{2\pi} f(x) \cos(nx) dx$ (58)
$\displaystyle b_n$ $\textstyle =$ $\displaystyle \frac{1}{\pi} \int_0^{2\pi} f(x) \sin(nx) dx$ (59)
$\displaystyle \int_0^{2\pi} \left\vert f(x) \right\vert^2 dx$ $\textstyle =$ $\displaystyle \left[ \frac{a_0^2}{2} +
\sum_{n = 1}^{\infty}\left(\left\vert a_n \right\vert^2 + \left\vert b_n \right\vert^2 \right) \right] \pi$ (60)

where the first relation works for $a_0$ too.

If we use only the $\sin(nx)$ functions, they are orthogonal but not (by themselves) complete. The expansion may only poorly approximate the function and

\begin{displaymath}
\int_0^{2\pi} \left\vert f(x) \right\vert^2 dx \ge \sum_{n = 1}^{\infty}\left\vert b_n \right\vert^2
\end{displaymath} (61)

Closure and Completeness

Now we can state the closure relation that is directly and intimately connected to completeness. Assume that an orthonormal set has $w(x) = 1$ (or that we have absorbed the weight factor into the normalized functions). Then:

$\displaystyle a_n$ $\textstyle =$ $\displaystyle \int_a^b \phi_n^*(y) f(y) dy$ (62)
$\displaystyle f(x)$ $\textstyle =$ $\displaystyle \sum_n a_n \phi_n(x)$  
  $\textstyle =$ $\displaystyle \sum_n \int_a^b \phi_n^*(y) f(y) dy \phi_n(x)$  
  $\textstyle =$ $\displaystyle \int_a^b \left(\sum_n \phi_n^*(y) \phi_n(x) \right) f(y) dy.$ (63)



But:

\begin{displaymath}
f(x) = \int_a^b \delta(x - y) f(y) dy
\end{displaymath} (64)

(by definition of the Dirac delta function!) Thus:
$\displaystyle \left(\sum_n \phi_n^*(y) \phi_n(x) \right)$ $\textstyle =$ $\displaystyle \delta(x - y)$ (65)
$\displaystyle \int_a^b \phi_m^*(x) \phi_n(x) dx$ $\textstyle =$ $\displaystyle \delta_{mn}$ (66)

together make the closure relation. It is in this sense that we can write the identity as the sum of unit vector projections:
\begin{displaymath}
\sum_n \left\vert \phi_n \right>\left< \phi_n \right\vert = 1
\end{displaymath} (67)

To continue using Fourier representations as an example, consider the normalized function:

\begin{displaymath}
\phi_n(x) = \frac{1}{\sqrt{2 \pi}} e^{inx}
\end{displaymath} (68)

on the interval $(0,2\pi)$ with $w(x) = 1$. We can read off of the above that:
$\displaystyle \frac{1}{2\pi} \sum_{n = -\infty}^{\infty} e^{-iny}e^{inx}$ $\textstyle =$ $\displaystyle \frac{1}{2\pi} \sum_{n = -\infty}^{\infty} e^{in(x-y)}$ (69)
  $\textstyle =$ $\displaystyle \delta(x - y) = \delta(y - x)$ (70)

which is a crucial relation in physics. At some point in the near future I will probably lecture you to tears on how $\delta(x-y)$ is not a function but a distribution, or an integral operator, and how you can get into real trouble treating it as a function. However, you have probably heard this lecture four or five times already, so we'll pass it by for now.
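
Still, the truncated closure sum is a perfectly good function (the Dirichlet kernel), and watching it sharpen toward $\delta(x-y)$ is instructive. A sketch (the truncation orders are arbitrary choices):

    import numpy as np

    def closure_sum(x, N):
        """(1/2 pi) sum_{n=-N}^{N} e^{inx}: the truncated closure relation."""
        n = np.arange(-N, N + 1)
        return np.real(np.exp(1j * np.outer(x, n)).sum(axis=1)) / (2.0 * np.pi)

    x = np.linspace(-np.pi, np.pi, 4001)
    for N in (5, 50, 500):
        k = closure_sum(x, N)
        # the area stays ~1 while the central peak grows narrower and taller
        print(N, np.trapz(k, x), k.max())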



Gram-Schmidt Orthogonalization

In many cases one can obtain a linearly independent set of functions $u_n(x)$, but find upon examination that they are (alas!) not orthogonal. For example, $u_n(x) = x^n$. Fortunately, it is always possible to orthogonalize the set of functions via a simple procedure. Let us define $\{u_n(x)\}$ to be the set of non-orthogonal but linearly independent functions. We will take suitable linear combinations of them to generate an orthogonal set which we will call $\{\psi_n(x)\}$. Finally, we can normalize this orthogonal set any way we like to form an orthonormal representation $\{\phi_n(x)\}$.

To understand the Gram-Schmidt procedure, it is easiest to consider it for ordinary Cartesian vectors. Suppose $\vec{A}$ and $\vec{B}$ are two non-orthogonal, but linearly independent vectors that span a two-dimensional plane as drawn below:

[Figure: $\vec{B}$ and its projection onto $\hat{A}$; subtracting that projection from $\vec{B}$ leaves $\vec{C} \perp \vec{A}$.]
In this figure, we see that we can systematically construct $\vec{C}
\perp \vec{A}$ by projecting $\vec{B}$ onto $\vec{A}$ and subtracting its $\vec{A}$-directed component from $\vec{B}$ itself. What is left is necessarily orthogonal to $\vec{A}$. Algebraically:

\begin{displaymath}
\left\vert A \right\vert\left\vert B \right\vert \ge \vec{A} \cdot \vec{B} > 0
\end{displaymath} (71)

We want to find $\vec{C}$ such that $\vec{C} \cdot \vec{A} = 0$. We try
\begin{displaymath}
\vec{C} = \vec{B} + \lambda \vec{A}.
\end{displaymath} (72)

Putting this into the condition,
\begin{displaymath}
\vec{A} \cdot \vec{C} = \vec{A} \cdot \vec{B} + \lambda \left\vert A \right\vert^2 = 0
\end{displaymath} (73)

or
\begin{displaymath}
\lambda = -\frac{\vec{A}\cdot\vec{B}}{\left\vert A \right\vert^2}
\end{displaymath} (74)

so that
\begin{displaymath}
\vec{C} = \vec{B} - \frac{\vec{A}}{\left\vert A \right\vert}
\left(\frac{\vec{A}}{\left\vert A \right\vert}\cdot\vec{B}\right)
\end{displaymath} (75)

exactly as we expected from the figure.
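
In code, the whole construction is three lines (a sketch only; the particular vectors are arbitrary):

    import numpy as np

    A = np.array([2.0, 0.0])
    B = np.array([1.0, 1.0])          # linearly independent of A, not orthogonal

    Ahat = A / np.linalg.norm(A)
    C = B - Ahat * np.dot(Ahat, B)    # eq. (75): subtract the A-directed part
    print(np.dot(A, C))               # 0.0: C is perpendicular to A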

This procedure works just as well for $n$ sequential operations and with functional ``vectors'' instead of real space vectors. Pick any function $u_1(x)$. Let

\begin{displaymath}
\left\vert \psi_1 \right> = \left\vert u_1 \right>
\end{displaymath} (76)

and go ahead and normalize it:
\begin{displaymath}
\left\vert \phi_1 \right> = \frac{\left\vert \psi_1 \right>}{\sqrt{\left< \psi_1 \right\vert\left. \psi_1 \right>}}
\end{displaymath} (77)

(I won't explicitly include the normalization step in the future, but you'll see how/where it occurs). Assume
\begin{displaymath}
\left\vert \psi_2 \right> = \left\vert u_2 \right> - a\left\vert \phi_1 \right>
\end{displaymath} (78)

The condition $\left< \phi_1 \right\vert\left. \psi_2 \right> = 0$ leads to
\begin{displaymath}
a = \left< \phi_1 \right\vert\left. u_2 \right>
\end{displaymath} (79)

(since $\left\vert \phi_1 \right>$ is normalized) and
\begin{displaymath}
\left\vert \psi_2 \right> = \left\vert u_2 \right> - \left\vert \phi_1 \right>\left< \phi_1 \right\vert\left. u_2 \right>
\end{displaymath} (80)

We can now normalize this into $\left\vert \phi_2 \right>$.



We then try to make

\begin{displaymath}
\left\vert \psi_3 \right> = \left\vert u_3 \right> - b\left\vert \phi_1 \right> - c\left\vert \phi_2 \right>
\end{displaymath} (81)

such that
$\displaystyle \left< \phi_1 \right\vert\left. \psi_3 \right>$ $\textstyle =$ $\displaystyle 0$ (82)
$\displaystyle \left< \phi_2 \right\vert\left. \psi_3 \right>$ $\textstyle =$ $\displaystyle 0$ (83)

These are two equations, there are two unknowns, and everything is hunky-dory if $\left\vert u_3 \right>$ is linearly independent of the first two (true by hypothesis and revealed even if untrue!). So iterate until you run out of linearly independent functions to orthogonalize or you get bored.

There is a clever example in both Arfken and Wyld where they show that if you take $u_n(x) = x^n$ and apply this procedure on the interval $[-1,1]$ with $w(x) = 1$, you obtain the (gasp!) Legendre Polynomials! In fact, if one varies the interval and the weight function, one can obtain all the known orthogonal polynomials in this manner!

Insert Table

These are quite useful for both expansions and numerical integration (quadrature).
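
If you want to watch the Legendre case emerge with your own eyes, here is a sketch using symbolic integration (sympy is my illustrative choice here; any of the computer algebra systems mentioned above would do):

    import sympy as sp

    x = sp.symbols('x')

    def inner(f, g):
        return sp.integrate(f * g, (x, -1, 1))    # interval [-1, 1], w(x) = 1

    psi = []
    for n in range(4):                            # u_n(x) = x^n
        u = x**n
        for p in psi:                             # subtract earlier projections
            u -= inner(p, x**n) / inner(p, p) * p
        psi.append(sp.expand(u))

    print(psi)   # [1, x, x**2 - 1/3, x**3 - 3*x/5]: Legendre up to normalization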

The Sturm-Liouville Theorem

Suppose one has a general 2nd order linear homogeneous ODE:

\begin{displaymath}
p_0(x)y''(x) + p_1(x) y'(x) + [p_2(x) + \lambda p_3(x)]y(x) = 0
\end{displaymath} (84)

where we assume only that the $p_i(x)$'s are analytic on $(a,b)$ (although a regular singularity at $a$ or $b$ is OK). We also insist that $p_0(x) \ne 0$ on the interior - all interior points must be ``ordinary''. This sort of 2OLHODE occurs very, very frequently in physics, and hence in this course. It can always (by suitable algebraic manipulations) be put in the self-adjoint or Sturm-Liouville form:
\begin{displaymath}
\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + q(x)y(x) + \lambda w(x)y(x) = 0
\end{displaymath} (85)

Once it is in this form, we can easily show that it has a really nifty property. Let us define the linear operator ${\cal L}$ such that:

$\displaystyle {\cal L}u$ $\textstyle =$ $\displaystyle \frac{d}{dx}\left[p(x)\frac{du}{dx}\right] + q(x)u$ (86)
$\displaystyle {\cal L}v$ $\textstyle =$ $\displaystyle \frac{d}{dx}\left[p(x)\frac{dv}{dx}\right] + q(x)v$ (87)

(etc.) where $u$ and $v$ are (almost) arbitrary twice differentiable functions! We can integrate by parts twice:
\begin{displaymath}
\int_a^b dx \left[v{\cal L}u - u{\cal L}v\right] = \left[p\left(v \frac{du}{dx}
- u \frac{dv}{dx}\right) \right]_a^b
\end{displaymath} (88)

(the $q$ terms cancel identically in the integrand, and the two $\int p \frac{du}{dx}\frac{dv}{dx} dx$ terms generated by the integrations by parts cancel against each other). If we assume only the very weak condition that $u$ and $v$ are well-enough-behaved on the boundaries so that
$\displaystyle Au(a) + Bu'(a)$ $\textstyle =$ $\displaystyle 0$ (89)
$\displaystyle Cu(b) + Du'(b)$ $\textstyle =$ $\displaystyle 0$ (90)

(etc.) the boundary term vanishes and
\begin{displaymath}
\int_a^b dx \left[v{\cal L}u - u{\cal L}v\right] = 0
\end{displaymath} (91)

Thus
\begin{displaymath}
\int_a^b dx v{\cal L}u = \int_a^b dx u{\cal L}v
\end{displaymath} (92)

This is called the Hermitian property of the differential operator; Hermitian and Self-Adjoint mean almost the same thing.

With this observation in hand, we can easily proceed to prove 2/3 of the Sturm-Liouville Theorem for solutions to nearly general (self-adjoint) 2OLHODE's.

We assume (here as above) that $p(x)$, $q(x)$, and $w(x)$ are real and analytic on $(a,b)$, that $w(x) > 0$ there, and that the solutions satisfy boundary conditions of the form (89-90).

Then the solutions to the S-L 2OLHODE are a discrete set of eigenfunctions ($u_n(x)$) and corresponding eigenvalues ($\lambda_n$) where: the $\lambda_n$ are real; eigenfunctions belonging to distinct eigenvalues are orthogonal with weight $w(x)$; and the set $\{u_n(x)\}$ is complete on the interval.

In summary, given a set of $u_n(x)$'s that solve a 2OLHODE, we can always write them as a complete orthonormal set $\{\phi_n(x)\}$.

This is a really useful theorem. It immediately implies the orthogonality of all the common ODE solution sets and orthogonal function sets in use in physics, e.g. $y'' + \lambda y = 0$, with its well-known Fourier solutions on the interval $(0,L)$ with Dirichlet boundary conditions $y(0) = 0$, $y(L) = 0$: $y_n(x) = \sin(\frac{n \pi x}{L})$ with $\lambda_n = \frac{n^2\pi^2}{L^2}$. Whew!
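
A numerical check of this example is easy (a sketch; the grid resolution is an arbitrary choice): discretize $-y'' = \lambda y$ on $(0,\pi)$ with Dirichlet boundary conditions and diagonalize:

    import numpy as np

    # finite-difference matrix for -y'' on (0, pi) with y(0) = y(pi) = 0
    L, m = np.pi, 500
    h = L / (m + 1)
    T = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
         - np.diag(np.ones(m - 1), -1)) / h**2

    lam, y = np.linalg.eigh(T)    # real eigenvalues, orthogonal eigenvectors
    print(lam[:4])                # ~ 1, 4, 9, 16 = n^2 pi^2 / L^2 with L = pi
    print(y[:, 0] @ y[:, 1])      # ~ 0: distinct eigenfunctions are orthogonal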

It is easy and instructive to prove the first two properties predicted by the SL theorem. The Hermitian property above implies:

$\displaystyle \int_a^b u^*{\cal L}v dx$ $\textstyle =$ $\displaystyle \left(\int_a^b v^* {\cal L}u dx\right)^*$ (93)
  $\textstyle =$ $\displaystyle \int_a^b v \left({\cal L}u \right)^* dx$ (94)

for any $u(x)$, $v(x)$ that satisfy the B.C.'s (not necessarily the ODE). Note well that there is no $w(x)$ in these equations. So we define:
\begin{displaymath}
\left(u\vert{\cal L}v\right) = \int_a^b u^*{\cal L}v dx
\end{displaymath} (95)

in analogy with the braket notation. In this shorthand, Hermitian means:
\begin{displaymath}
\left(u\vert{\cal L}v\right) = \left(v\vert{\cal L}u\right)^* = \left({\cal
L}u\vert v\right)
\end{displaymath} (96)

and we can conclude that $\left(u\vert{\cal L}u\right)$ is real (set $v = u$ above).

NOW, let $u$ and $v$ be any two solutions $u_1(x)$ and $u_2(x)$ corresponding to $\lambda_1$ and $\lambda_2$. Then:

$\displaystyle \left(u_1\vert{\cal L}u_2\right)$ $\textstyle =$ $\displaystyle \left(u_1\vert-\lambda_2 w u_2\right) =
-\lambda_2 \left(u_1\vert w u_2\right) = -\lambda_2 \left< u_1 \right\vert\left. u_2 \right>$  
$\displaystyle \left(u_2\vert{\cal L}u_1\right)^*$ $\textstyle =$ $\displaystyle \left(u_2\vert-\lambda_1 w
u_1\right)^* = -\lambda_1^* \left< u_2 \right\vert\left. u_1 \right>^*
= -\lambda_1^* \left< u_1 \right\vert\left. u_2 \right>$ (97)

or (subtracting)
\begin{displaymath}
(\lambda_2 - \lambda_1^*) \left< u_1 \right\vert\left. u_2 \right> = 0
\end{displaymath} (98)

THUS

\begin{displaymath}
u_1 = u_2, \quad\quad \left< u_1 \right\vert\left. u_1 \right> \ne 0
\quad\Rightarrow\quad \lambda_1 = \lambda_1^*
\end{displaymath} (99)

(and we conclude that the $\lambda_i$ are real).
\begin{displaymath}
\lambda_1 \ne \lambda_2 \quad\Rightarrow\quad \left< u_1 \right\vert\left. u_2 \right> = 0
\end{displaymath} (100)

(and we conclude that the $\{u_n(x)\}$ are all orthogonal).

It turns out to be quite difficult and involved to prove completeness. One basically has to show closure of one sort or another, and closure is not immediately obvious. It has long since been proven, however, and is shown in serious books like Courant and Hilbert if you want/need to look over the proof some day.

From this day forth, then, I will assume that you just know that the solutions to nearly every 2OLHODE that we treat in this course (and the rest of your courses) form an orthogonal (appropriately normalized, in practice) basis, out of which it is perfectly clear that general solutions can be built via superposition.

It is time now for a short interlude, first on tensors (to get the trivia out of the way), then on curvilinear coordinates (to get you to where you can appreciate separation of variables in the Big Three coordinate systems) and we'll hop on back to ODE's, this time in the context of PDE's and ``real physics''.



