The Principle Of Superposition
The Need for a Quantum Theory.
Since its development by Newton, Classical Mechanics has been applied to a growing range of dynamical systems, including the Electromagnetic Field.
Its ideas and laws are elegant, and it would seem that they could not be altered without destroying their attractive features. It is nevertheless both possible and necessary to alter them in order to describe events at the atomic scale; the resulting scheme is what we call Quantum Mechanics.
This departure is necessitated by the results of experiment: the forces of classical Electrodynamics are not sufficient to account for the observed stability of atoms and molecules, which is what gives all matter its definite physical and chemical properties.
This cannot be remedied by the introduction of new forces, because there are general principles within classical mechanics that hold for all kinds of forces and yet still disagree with experiment.
For example, disturbing the equilibrium of an atomic system sets it oscillating, and the oscillations impress themselves upon the surrounding electromagnetic field, so that their frequencies can be observed with a spectroscope. Whatever laws of force govern these oscillations, one would expect to be able to arrange the observed frequencies in a scheme of fundamental frequencies and their harmonics, but this is not the case.
What is in fact observed is a new relationship between the frequencies, the Ritz Combination Law of Spectroscopy, which states that all the frequencies can be expressed as differences between certain terms, the number of terms being much smaller than the number of frequencies. This law is unintelligible from the classical standpoint.
One way to dissolve the conflict would be to assume that each spectroscopically observed frequency is a fundamental frequency with its own degree of freedom, the laws of force being such that no harmonic vibrations occur.
This, however, does not solve the problem: besides providing no explanation of the Combination Law, it conflicts with the experimental data for specific heats.
Classical statistical mechanics establishes a general connexion between the total number of degrees of freedom of an assembly of vibrating systems and its specific heat. If we assume that all of the spectroscopic frequencies of an atom correspond to different degrees of freedom, the resulting specific heat would be far greater than the observed value.
In fact, the observed values are given rather accurately by a theory that considers only each atom's motion as a whole, assigning no internal motion to it.
This results in a new contradiction between classical mechanics and the experimental data: there must be some internal motion in an atom to account for its spectrum, but the internal degrees of freedom do not contribute to its specific heat, and the reason for this cannot be explained classically. A similar contradiction arises for the energy of oscillation of the electromagnetic field in a vacuum: classical mechanics requires the specific heat corresponding to this energy to be infinite, but the observed value is finite.
From this we can conclude that high-frequency oscillations do not contribute their classical share to the specific heat. Additionally, classical mechanics fails to explain the behavior of light. Light shows interference and diffraction, which require a wave theory to explain; however, other phenomena, such as the Photoelectric Effect and scattering by free electrons, indicate that light is composed of particles. These particles, referred to as Photons, have definite energies and momenta, which depend upon the frequency of the light, and they can be considered as real as Electrons.
Experiment has shown that this is not specific to light but is general to all material particles, which exhibit wave-like behavior under certain conditions. This reveals not just an inaccuracy in the classical laws of motion, but also an inadequacy of classical concepts to describe atomic phenomena. The need to depart from classical mechanics in order to account for the fundamental structure of matter is supported not only by experimental data, but also by philosophical considerations.
Classical explanations of the structure of matter assume it to be composed of a number of constituent parts, postulate laws for the behavior of these parts, and derive the laws of matter in bulk from them. The issue is that this says nothing about the structure and stability of the constituent parts themselves.
To explore this, it becomes necessary to postulate that the constituent parts are themselves composed of smaller parts, which is the beginning of an infinite regress: there is no logical end to this process. As long as big and small are merely relative concepts, explaining the big in terms of the small achieves nothing: the ideas of classical physics must be modified to give an absolute meaning to size.
At this point, it is important to remember that science concerns itself only with the observable, that we can observe an object only by letting it interact with some outside influence, and that an act of observation is therefore necessarily accompanied by some disturbance of the observed object.
We can then define an object to be big when the act of observation carries with it a disturbance that is negligible, and small when the disturbance is non-negligible. We assume that by being careful, we can minimize the extent of the disturbance as much as we'd like: the concepts of big and small are thus relative, and really refer to the degree to which our observation creates a disturbance in the observed object.
In order to give an absolute definition of size for a fundamental theory of matter, we have to assume that there is a limit to how finely we can make observations and how small the accompanying disturbance can be: a limit which is inherent in the nature of things and can never be surpassed by improved technique or increased skill on the part of the observer.
If the limiting disturbance is negligible for the observed object, the object is big in the absolute sense and we can treat it classically; if the disturbance is not negligible, the object is small in the absolute sense and we require a new theory for dealing with it. A consequence of this is that causality must be revised, and considered as applying only to systems which are left undisturbed. If a system is small, it cannot be observed without being seriously disturbed, so we cannot expect to find a causal connexion between the results of our observations. Causality can still be assumed with regard to undisturbed systems, and the equations describing such systems will be differential equations expressing a causal relationship between conditions at one time and conditions at a later time. These equations will be closely related to those of classical mechanics, but will be connected only indirectly with the results of observations.
There is an unavoidable indeterminacy in the calculation of observational results, the theory enabling us to calculate in general only the probability of our obtaining a particular result when we make an observation.
The Polarization of Photons.
The preceding discussion of the limit to the gentleness of observation, and of the consequent indeterminacy in the results, does not by itself provide a quantitative basis for building up quantum mechanics: this requires a new set of accurate laws of nature.
One of the most fundamental of these new laws is the Principle of Superposition of States, a general formulation of which can be approached by considering special cases, such as the polarization of light. Experiment shows that when plane polarized light is used for ejecting photo-electrons, there is a preferred direction for the emission of the electrons. The polarization of light is thus closely related to its particle nature, and a polarization must be ascribed to each photon: a beam of light plane polarized in a certain direction consists of photons that are each plane polarized in that direction, and a beam of circularly polarized light consists of photons that are each circularly polarized.
Every photon is thus in a definite state of polarization, and the problem now is to fit this idea in with the known facts about resolving light into polarized components and recombining them. Suppose a beam of light is passed through a crystal of tourmaline, which has the property of letting through only light that is plane polarized perpendicular to its optic axis. Classical Electrodynamics tells us what will happen for any given polarization of the incident beam.
If the beam is polarized perpendicular to the axis, all of it will go through the crystal; if it is polarized parallel to the axis, none of it will; and if it is polarized at an angle $\alpha$ to the axis, a fraction $\sin^2\alpha$ will go through. The question now is how to make sense of this behavior at the level of individual photons.
A beam plane polarized in a certain direction consists of photons each plane polarized in that direction. This presents no difficulty if the incident beam is polarized perpendicular or parallel to the axis: we assume that each photon polarized perpendicular to the axis passes through, and each photon polarized parallel to the axis is absorbed. The obliquely polarized beam, however, presents a problem: each of its photons is obliquely polarized, and it is unclear what effect the crystal will have on such a photon. Susskind references this very same experiment using the spin of electrons and some apparatus in Spins and Qubits.
A question about what will happen to an individual photon under these circumstances is not in itself precise. To make it precise, one must imagine an experiment that bears on the question and ask what its outcome will be. In this case, the obvious experiment is to use an incident beam consisting of a single photon and observe what appears behind the crystal: according to quantum mechanics, sometimes a whole photon will be observed there, and other times nothing will be.
If one repeats the experiment a large number of times, the photon will be observed behind the crystal in roughly a fraction $\sin^2\alpha$ of the trials, and in the remaining fraction $\cos^2\alpha$ it will be absorbed, where $\alpha$ is the angle between the photon's polarization and the axis. These values agree statistically with the expectations of classical theory. This is the way in which the individuality of photons is preserved, although it is possible only because we abandon the determinism of classical theory. The most that one can predict is the probability of the occurrence of each outcome.
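The statistical statement above can be illustrated with a small simulation. This is only a sketch: the function name `transmission_fraction`, the use of a seeded pseudo-random generator, and the symbol `alpha` for the angle between the photon's polarization and the crystal axis are assumptions made for illustration, not part of the notes. Each simulated photon either passes whole or is absorbed, and only the long-run fraction approaches the classical value of sine squared of the angle.

```python
import math
import random


def transmission_fraction(alpha, trials, seed=0):
    """Estimate the fraction of single photons that pass the crystal.

    alpha is the angle in radians between the photon's plane of
    polarization and the crystal axis. Each photon is transmitted with
    probability sin(alpha)**2, otherwise it is absorbed.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    p_pass = math.sin(alpha) ** 2
    passed = sum(1 for _ in range(trials) if rng.random() < p_pass)
    return passed / trials


if __name__ == "__main__":
    angle = math.radians(30)
    # Compare the simulated fraction with the classical intensity ratio.
    print(transmission_fraction(angle, 100_000), math.sin(angle) ** 2)
```

No single trial is predictable here; the agreement with classical optics appears only in the statistics of many repetitions.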
Thoughts:
Questions about what decides whether the photon is to go through or not and how it changes its direction of polarization when it does go through cannot be investigated by experiment and should be regarded as outside the domain of science.
Furthermore, we can suppose that an obliquely polarized photon may be regarded as being partly in the parallel polarization state and partly in the perpendicular one, which is a kind of superposition of the two states of polarization. This implies a special relationship between the states of polarization, which allows any state of polarization to be resolved into, or expressed as a superposition of, two mutually perpendicular states of polarization. When we send the photon into the crystal, we are subjecting it to an observation of whether it is polarized parallel or perpendicular to the axis of the crystal: the effect of this observation is to force the photon entirely into one of the two states of polarization, so that it jumps there from the superposed state. Which of the two states it will jump into cannot be predicted, and is governed only by probability.
Interference of Photons.
Another example of superposition concerns the position and momentum of photons in space: if we take a beam of roughly monochromatic light, we know something about the location and momentum of the photons. We know that each of them is located somewhere in the region of space through which the beam is passing, and that each has a momentum in the direction of the beam whose magnitude is given by the frequency times a universal constant, according to Einstein's photo-electric law. When we have such information about a photon, we say it is in a definite translational state.
We can discuss the description quantum mechanics gives of the interference of photons by considering an experiment: say we pass a beam through an interferometer, which splits it into two components, and we then make the two components interfere. Just as in the preceding section, we can take an incident beam consisting of a single photon and inquire about the effect of the interferometer on it, which gives us an example of the conflict between the wave and particle descriptions of light.
Again, as in the preceding section, we consider the photon to be partly in each of the two component beams, so that it is in a translational state that is a superposition of the translational states associated with the two components. This leads us to a generalization of the term translational state as applied to photons: for a photon to be in a definite translational state, it need not be restricted to a single beam, but may be associated with two or more beams. In the mathematics of the theory, each translational state is associated with one of the wave functions of ordinary wave optics, and such wave functions are superposable. We also consider what takes place when we attempt to measure the energy in one of the components: the result will be either the whole photon or nothing at all. This means the photon must suddenly change from being associated partly with each of a number of beams to being associated entirely with one. This sudden change is due to the disturbance that observation necessarily makes.
It is impossible to predict in which beam the whole photon will be found; only the probability of either result can be calculated from the distribution of the photon over the two beams. It is possible to make the energy measurement without destroying the component beam, for example by reflecting the beam from a movable mirror and observing its recoil, but after such a measurement it will be impossible to bring about any interference between the two component beams.
In this manner, quantum mechanics is able to reconcile the wave and particle behavior of light, with the solution being to associate each translational state of a photon with one of the wave functions of wave optics. This association, however, cannot be explained in terms of classical theory, and is entirely novel. As opposed to the behavior of particles and waves in classical mechanics, this association can only be interpreted statistically, with the wave function providing us the information necessary to make only probabilistic statements about where the photon may be located after measurement.
Prior to the advent of quantum mechanics, people were aware of the statistical nature of the connexion between light waves and photons. What they were not aware of was that the wave function gives the probability of one photon being in a particular place, as opposed to the probable number of photons in that place. This distinction is important: if we split a beam consisting of a large number of photons into two components of equal intensity, and assume that the intensity of a beam is related to the probable number of photons in it, then half of the photons must go into each component.
If the two beam components are made to interfere, the constituent photons will do so as well, meaning sometimes these photons would have to annihilate one another, and other times they would have to produce four photons.
The behavior of these photons represents the constructive and destructive interference resulting from the wave behavior of matter, where crests and troughs cancel out, and crests meeting crests combine.
The issue here is that this would violate the law of conservation of energy.
The new theory avoids this violation by connecting wave functions with probability for a single photon, where each photon is partially within the beam components, and thus only interferes with itself.
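A minimal numerical sketch of a single photon interfering with itself is given below. It assumes an idealized, lossless arrangement that is not described in the notes: the photon is split with equal amplitudes between two component beams, one beam acquires a relative phase, and the beams are recombined into two output ports. The function name `output_probabilities` and the two-port model are illustrative assumptions. The point is that the two probabilities always add up to one, so the single photon is neither annihilated nor multiplied.

```python
import cmath
import math


def output_probabilities(phase):
    """Return the probabilities of finding the photon in each output port.

    Assumed model: amplitude 1/sqrt(2) in each component beam, a relative
    phase (in radians) on the second beam, then an ideal equal-weight
    recombination into a "sum" port and a "difference" port.
    """
    a1 = 1 / math.sqrt(2)
    a2 = cmath.exp(1j * phase) / math.sqrt(2)
    out_sum = (a1 + a2) / math.sqrt(2)   # constructive for phase = 0
    out_diff = (a1 - a2) / math.sqrt(2)  # constructive for phase = pi
    return abs(out_sum) ** 2, abs(out_diff) ** 2


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        p_sum, p_diff = output_probabilities(phase)
        print(phase, p_sum, p_diff, p_sum + p_diff)
```

In this model the probabilities work out to the squared cosine and squared sine of half the phase, which is why their sum is independent of the phase.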
The association of particles with waves is not peculiar to light, but rather is of universal applicability: all kinds of particles are associated with waves, and conversely all wave motion is associated with particles.
Superposition and Indeterminacy.
The preceding sections have presented a very novel idea, namely superposition. One may object that it does not give a satisfactory fundamental picture of single-photon processes, and one may ask what use such a strange idea is. The first objection can be answered by the realization that physical science is chiefly concerned with formulating laws that govern phenomena, not with providing pictures of these phenomena: the picture is of "secondary importance".
Additionally, for atomic phenomena one cannot expect a picture in the classical sense, one that mirrors macroscopic intuition, to exist at all. However, one may acquire a picture of sorts by becoming familiar with the inner workings of the theory, provided the meaning of the word picture is extended to include any way of looking at the fundamental laws which makes their self-consistency obvious. In terms of experimental results, for many simple experiments an elementary statistical explanation would be sufficient; however, for more complex phenomena and physical systems, a richer theory is needed, such as is provided by quantum mechanics.
It should also be noted that while quantum mechanics is unnecessary for use with simpler phenomena, it can explain them successfully. Another concern might be that the removal of determinacy may bring about undue complications, and while these complications do exist, they are a tradeoff for the simplification provided by the principle of superposition of states. However, it is also a requirement that we precisely define what we mean by state of an atomic system.
An atomic system is composed of particles (bodies with specific properties such as mass and moment of inertia) that interact according to definite laws of force. Such a system has various possible motions consistent with these laws, and each such motion is called a state of the system. Classically, one can specify a state by giving the coordinates and velocities of all the components at some instant of time, the complete motion then being determined. However, we know that we cannot observe a small system in that amount of detail, due to the limit on how finely we can make observations; this corresponds to a limitation on the amount of data that can be assigned to a state.
In short, the state of an atomic system must be specified by fewer or more indefinite data than a complete set of numerical values for all the coordinates and velocities at some instant of time. If the system is a single photon, the state would be specified entirely by the translational state and the polarization.
A state of a system can be defined as an undisturbed motion that is restricted by as many conditions or data as are theoretically possible without mutual interference or contradiction. In practice, these conditions could be imposed by a suitable preparation of the system, perhaps by passing it through various kinds of sorting apparatus, the system being left undisturbed after the preparation.
The general principle of quantum superposition applies to states, where a state may mean either the state at one particular time or the state throughout the whole of time after the preparation. It requires us to assume that between these states there exist peculiar relationships such that whenever the system is definitely in one state, we can consider it as being partly in each of two or more other states.
The original state must then be regarded as the result of a superposition of the two or more new states, in a way that cannot be conceived on classical ideas. Any state can be considered as the result of superposing two or more other states, and indeed in an infinite number of ways. The nature of these relationships cannot be explained in terms of familiar physical concepts: in the classical sense, there is no analogue of a system being partly in each of two states such that this is equivalent to its being completely in some other state. It thus requires an entirely new mathematical theory.
When a state is formed by superposing two other states, it will have properties that are in some vague way intermediate between those of the two original states, approaching more or less closely to those of either of them according to the 'weight' attached to that state in the superposition process. The new state is completely defined by the two original states when their relative weights are known, together with a certain phase difference, the exact meanings of which will be given by the mathematical theory. In the case of the polarization of a photon, the meaning is provided by classical wave optics, so that when two perpendicularly plane polarized states are superposed with equal weights, the new state may be circularly polarized in either direction, linearly polarized at an angle $\pi/4$, or else elliptically polarized, according to the phase difference.
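The dependence of the superposed polarization on the phase difference can be sketched with a pair of complex amplitudes, one for each of the two perpendicular plane-polarized states, as in classical wave optics. The helper names `superpose` and `classify` are hypothetical, and treating the 'weights' as amplitude magnitudes is an assumption made here for illustration; with equal weights the distinction does not matter.

```python
import cmath
import math


def superpose(weight_x, weight_y, phase):
    """Superpose two perpendicular plane-polarized states.

    weight_x and weight_y are non-negative amplitude magnitudes (an
    assumed reading of "weight") and phase is the phase difference in
    radians. Returns the normalized complex amplitudes (e_x, e_y).
    """
    norm = math.hypot(weight_x, weight_y)
    if norm == 0:
        raise ValueError("at least one weight must be non-zero")
    e_x = complex(weight_x / norm, 0.0)
    e_y = (weight_y / norm) * cmath.exp(1j * phase)
    return e_x, e_y


def classify(e_x, e_y, tol=1e-9):
    """Name the polarization described by the amplitudes (e_x, e_y)."""
    cross = e_x.conjugate() * e_y
    if abs(cross.imag) < tol:
        return "linear"      # components in phase or exactly out of phase
    if abs(abs(e_x) - abs(e_y)) < tol and abs(cross.real) < tol:
        return "circular"    # equal magnitudes, a quarter period apart
    return "elliptical"


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi / 3):
        print(phase, classify(*superpose(1.0, 1.0, phase)))
```

With equal weights, a phase difference of zero gives linear polarization at an angle of a quarter of pi, a quarter-period phase difference gives circular polarization, and other phase differences give elliptical polarization.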
The non-classical nature of the superposition process is brought out clearly if we consider the superposition of two states $A$ and $B$, such that an observation made on the system in state $A$ is certain to lead to one particular result, $a$, and the same observation made on the system in state $B$ is certain to lead to a different result, $b$. What will be the result of the observation when the system is in the superposed state? The answer is that the result will be sometimes $a$ and sometimes $b$, according to a probability law that depends on the relative weights of $A$ and $B$ in the superposition; it will never be different from both $a$ and $b$.
The intermediate character of the state formed by superposition thus expresses itself through the probability of a particular result for an observation being intermediate between the corresponding probabilities for the original states, not through the result itself being intermediate between the corresponding results for the original state.
The probability of a particular result for the state formed by superposition is not always intermediate between those for the original states, in the general case when those for the original states are not zero or unity, so there are restrictions on the intermediateness of a state formed by superposition.
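As a hedged worked example of such a probability law, take the simplest case in which an observation gives the result $a$ with certainty in state $A$ and a different result $b$ with certainty in state $B$. The explicit formula below is the standard Born rule, which these notes have not yet derived; it is quoted here as an assumption, with $c_1$ and $c_2$ denoting the complex coefficients with which $A$ and $B$ enter the superposition and both states taken as normalized.

```latex
% Assumption: A and B are normalized, give distinct definite results a and b,
% and enter the superposition with complex coefficients c_1 and c_2.
P(a) = \frac{|c_1|^2}{|c_1|^2 + |c_2|^2}, \qquad
P(b) = \frac{|c_2|^2}{|c_1|^2 + |c_2|^2}, \qquad
P(a) + P(b) = 1 .
```

For equal weights each result occurs half the time, while for $|c_1|^2 = 3|c_2|^2$ the result $a$ occurs three times out of four. In every case the individual result is either $a$ or $b$, never something in between; only the probabilities are intermediate.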
From this we can observe the degree of departure from classical theory, as the superposition of states is only allowable on account of the recognition of the importance of the effect disturbances that accompany observations have, as well as the resulting indeterminacy. When an observation is made on an atomic system in a given state, generally, the result will not be determinate, meaning if the experiment is repeated several times, several different results may be obtained. It is a law of nature, however, that should the experiment be repeated a large number of times, each result will be obtained in a definite fraction of the total number of times, so that we can define a probability of its being obtained: this probability is what the theory sets out to calculate.
The assumption of superposition between states leads to a mathematical theory within which the equations that define states are linear in terms of the unknowns. The consequence of this is that people have attempted to establish analogies between this and classical systems, such as vibrating strings or membranes (Thoughts: it is super interesting that the fundamental objects of String Theory were around in Dirac's day).
Mathematical Formulation of The Principle.
There has been a profound change in the relationship physicists have with the mathematics underlying the subject. Previously, the assumption was that Newtonian mechanics would provide the basis for describing the entirety of physical phenomena, and that the real work lay in developing and applying those principles. It then became evident that there is no reason to assume Newtonian and similar classical principles to be valid outside the domains in which they have been verified by experiment, and departures from them were found to be necessary. These departures are expressed through the development of new formalisms and axioms, of which quantum mechanics is a great example: it requires the states of dynamical systems and the associated variables to be related in novel, non-classical ways, and it requires the dynamical variables to be different kinds of mathematical quantities than those ordinarily found in physics.
The new scheme and physical theory become precise when the axioms and rules of manipulating the associated quantities are specified, and in addition, when laws are laid down that connect physical facts with the mathematical formalism, so that either may be inferred from the other. In application of the theory, one would have some physical information, which could be expressed by equations between the quantities: from this, new equations can be deduced via the axioms and rules of manipulation. Finally, the justification for the scheme depends on both internal consistency and agreement with experiment.
We shall begin building out the scheme by handling the mathematical relationships between the states of a dynamical system at one instant of time, which will come from the mathematical formulation of the principle of superposition. Superposition is an additive process, which implies that states can be added together to form new states, so the states must be connected with mathematical quantities of a kind that can be added to produce a quantity of the same kind. The most obvious of such quantities are vectors; however, ordinary vectors in a space of a finite number of dimensions are insufficient for describing most of the dynamical systems of quantum mechanics. We have to make a generalization to vectors in a space of infinitely many dimensions, and the mathematical treatment thus becomes complicated by questions of convergence.
First we will lay out some general properties of these vectors, which can be deduced from the associated axioms. It is also desirable to have a name for the class of vectors which are connected to the states of dynamical systems in quantum mechanics, whether they are in a finite or infinite-dimensional space. These vectors will be called kets and denoted by the symbol $|\,\rangle$; a particular ket is specified by a label inserted in the middle of the symbol, as in $|A\rangle$.
Ket vectors can be multiplied by complex numbers and added together to give new ket vectors:
$$c_1|A\rangle + c_2|B\rangle = |R\rangle, \tag{1}$$
where $c_1$ and $c_2$ are complex numbers.
In addition to this, we can perform more general linear processes, such as summing an infinite sequence of them. If we have a ket vector $|x\rangle$ that depends on a parameter $x$ taking all values in a certain range, we can integrate it with respect to $x$ to get another ket vector: $\int |x\rangle\,dx = |Q\rangle$. A ket vector which can be expressed linearly in terms of certain others is said to be dependent on them, and a set of ket vectors is called independent if no one of them can be expressed linearly in terms of the others.
"...each state of a dynamical system at a particular time corresponds to a ket vector, the correspondence being such that if a state results from the superposition of certain other states, its corresponding ket vector is expressible linearly in terms of the corresponding ket vectors of the other states, and conversely."
Thus, the state $R$ is a superposition of the states $A$ and $B$ when the corresponding ket vectors are connected by (1), that is, by $c_1|A\rangle + c_2|B\rangle = |R\rangle$.
From the preceding assumptions follow certain properties of the process of superposing states (or vectors): when two or more states are superposed, the order in which they occur in the superposition process is unimportant, so the process is symmetrical between the states that are superposed.
Again, we see from (1) that, save for the case where one of the coefficients $c_1$, $c_2$ is zero, if the state $R$ can be formed by superposing the states $A$ and $B$, then the state $A$ can be formed by superposing $B$ and $R$, and $B$ can be formed by superposing $A$ and $R$.
A state which results from superposing certain other states is said to be dependent on those states. More generally, a state is dependent on a set of states if its corresponding ket vector is dependent on the ket vectors of that set, and a set of states is said to be independent if no one of them is dependent on the others.
To proceed with building the mathematical formulation of the principle of superposition, we must introduce a further assumption, namely that superposing a state with itself results in the same state. If the original state corresponds to the ket vector $|A\rangle$, superposing it with itself gives the ket vector $c_1|A\rangle + c_2|A\rangle = (c_1 + c_2)|A\rangle$. In the event that $c_1 + c_2 = 0$, the result of the superposition process is nothing at all, the two components having canceled each other out by interference.
Our new assumption means that, aside from this special case, $(c_1 + c_2)|A\rangle$ must correspond to the same state that $|A\rangle$ does.
Because $c_1 + c_2$ is an arbitrary complex number, we can conclude that "...if the ket vector corresponding to a state is multiplied by any complex number, not zero, the resulting ket vector will correspond to the same state".
Thus a state is determined by the direction of a ket vector, and the length of the vector is irrelevant. All of the states of a dynamical system are in one-to-one correspondence with all of the possible directions of a ket vector, no distinction being made between the directions of $|A\rangle$ and $-|A\rangle$. This assumption shows very clearly the fundamental difference between quantum and classical superposition: in the case of a classical system to which a superposition principle applies, such as a vibrating membrane, when one superposes a state with itself, the result is a different state, with a different magnitude of oscillation.
There is no physical characteristic of a quantum state that corresponds to the magnitude of the classical oscillation, as distinct from its quality, which is described by the ratios of the amplitudes at different points on the membrane. While there does exist a classical state with zero amplitude of oscillation, the state of rest, there is no analogue to this state for a quantum system, the zero ket vector being associated with no state at all.
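The idea that a state is fixed by the direction of its ket vector, and not by its length, can be checked numerically in a finite-dimensional toy model. Representing a ket as a Python list of complex components is an assumption made for illustration (the notes allow infinitely many dimensions), and `same_state` is a hypothetical helper name. Two kets are taken to describe the same state when one is a non-zero complex multiple of the other, and the zero vector is treated as describing no state at all.

```python
def same_state(ket_u, ket_v, tol=1e-12):
    """Return True if two finite-dimensional kets describe the same state.

    The kets are sequences of complex components. They describe the same
    state when they are proportional with a non-zero complex factor, which
    is tested through the vanishing of all 2x2 cross products.
    """
    if len(ket_u) != len(ket_v):
        raise ValueError("kets must have the same number of components")
    if all(abs(x) < tol for x in ket_u) or all(abs(x) < tol for x in ket_v):
        return False  # the zero vector corresponds to no state at all
    n = len(ket_u)
    return all(
        abs(ket_u[i] * ket_v[j] - ket_u[j] * ket_v[i]) < tol
        for i in range(n)
        for j in range(i + 1, n)
    )


if __name__ == "__main__":
    ket_a = [1 + 0j, 1j]
    doubled = [x + x for x in ket_a]            # the state superposed with itself
    rescaled = [(2 - 3j) * x for x in ket_a]    # multiplied by a complex number
    print(same_state(ket_a, doubled), same_state(ket_a, rescaled))
    print(same_state(ket_a, [1 + 0j, -1j]))     # a different direction
```

Unlike the vibrating membrane, doubling the vector here changes nothing physical: the doubled and rescaled kets pass the test, while a ket pointing in a different direction does not.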
Given two states which correspond to the ket vectors $|A\rangle$ and $|B\rangle$, the general state formed by superposing them corresponds to the ket vector $|R\rangle$, which is determined by the two complex coefficients $c_1$ and $c_2$ in (1). Should these two coefficients be multiplied by the same non-zero complex factor, the ket vector $|R\rangle$ will be multiplied by this factor, but the state will be left unaltered: "Thus, only the ratio of the two coefficients is effective in determining the state $R$. Hence, this state is determined by one complex number, or by two real parameters. Thus, from only two given states, a twofold infinity of states may be obtained by superposition."
This result is confirmed by the examples in The Polarization of Photons and Interference of Photons. In the first, there are only two independent states of polarization of a photon, which can be taken to be the states of plane polarization parallel and perpendicular to some fixed direction, and by superposing these a twofold infinity of states of photon polarization can be generated. In the second, from the superposition of two given translational states for a photon, a twofold infinity of translational states may be obtained, the general one of which is described by two parameters, which can be taken to be the ratio of the amplitudes of the two wave functions that are added together, and their phase relationship. This shows the need for allowing complex coefficients, as restricting the coefficients to being real would allow only a simple infinity of states to be obtained by superposition.
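The counting of parameters can be written out explicitly. The following is a short worked sketch, using $c_1$ and $c_2$ for the two complex coefficients of the superposed kets $|A\rangle$ and $|B\rangle$; the symbols $z$, $r$ and $\theta$ are introduced here for illustration only.

```latex
% Assumption: c_1 \neq 0, so the ket can be divided by c_1 without changing the state.
c_1 |A\rangle + c_2 |B\rangle \;\longrightarrow\; |A\rangle + z\,|B\rangle,
\qquad z = \frac{c_2}{c_1} = r\,e^{i\theta}, \qquad r \ge 0,\quad 0 \le \theta < 2\pi .
```

The state is fixed by the single complex number $z$, that is, by the two real parameters $r$ (the ratio of the amplitudes) and $\theta$ (the phase difference). The excluded case $c_1 = 0$ is simply the state $B$ itself. If the coefficients were restricted to real values, $z$ would be a single real number, and only a simple infinity of states would result.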
Bra and Ket Vectors.
Whenever a mathematical theory makes use of the concept of a vector, a second set of vectors can be introduced, called dual vectors.
Assume we have some number $\phi$ which is a linear function of a ket vector $|A\rangle$, meaning that to each ket $|A\rangle$ there corresponds one number $\phi$, that the number corresponding to $|A\rangle + |A'\rangle$ is the sum of the numbers corresponding to $|A\rangle$ and to $|A'\rangle$, and that the number corresponding to $c|A\rangle$ is $c$ times the number corresponding to $|A\rangle$, for any complex number $c$.
This means that the value $\phi$, which is associated with some $|A\rangle$, can be considered the scalar product of $|A\rangle$ with some new vector, there being one of these new vectors for each linear function of the ket vectors. These new vectors, the dual vectors in this context, will be referred to as bra vectors, denoted by the mirror image of the ket symbol, $\langle\,|$, and labeled in the same way: $\langle B|$.
The scalar product of a bra vector $\langle B|$ and a ket vector $|A\rangle$ is denoted as $\langle B|A\rangle$. The scalar product of these vectors is a complete bracket expression, whereas either a bra or ket vector alone is an incomplete bracket expression. Any complete bracket expression denotes a number, and incomplete bracket expressions denote vectors.
The condition that the scalar product $\langle B|A\rangle$ is a linear function of $|A\rangle$ is expressed by the equations: $\langle B|\{|A\rangle + |A'\rangle\} = \langle B|A\rangle + \langle B|A'\rangle$ and $\langle B|\{c|A\rangle\} = c\langle B|A\rangle$.
A bra vector is defined completely when its scalar product with every ket is given, so that if its scalar product with every ket vanishes, the bra vector itself must be considered as vanishing. Symbolically: if $\langle P|A\rangle = 0$ for all $|A\rangle$, then $\langle P| = 0$.
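The sum of two bra vectors $\langle B|$ and $\langle B'|$ is defined by the condition that the scalar product of this sum with any ket $|A\rangle$ is the sum of the scalar products of $\langle B|$ and $\langle B'|$ with $|A\rangle$: $\{\langle B| + \langle B'|\}|A\rangle = \langle B|A\rangle + \langle B'|A\rangle$.
The product of a bra vector $\langle B|$ and some number $c$ is defined by the condition that its scalar product with any ket vector $|A\rangle$ is $c$ times the scalar product of $\langle B|$ with $|A\rangle$: $\{c\langle B|\}|A\rangle = c\langle B|A\rangle$. Bra vectors are different from ket vectors, and the two are so far connected only by the definition of the scalar product between them.
A further assumption is introduced that there is a one-to-one correspondence between bras and kets, such that the bra corresponding to $|A\rangle + |A'\rangle$ is the sum of the bras corresponding to $|A\rangle$ and $|A'\rangle$, and the bra corresponding to $c|A\rangle$ is $\bar{c}$ times the bra corresponding to $|A\rangle$, $\bar{c}$ being the complex conjugate of $c$.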
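These rules can be tried out in a finite-dimensional toy model. The sketch below assumes that a ket is a list of complex components and that the corresponding bra is obtained by conjugating each component; this concrete representation and the helper names `bra` and `scalar_product` are illustrative assumptions, not something fixed by the notes. It checks that the scalar product is linear in the ket and that multiplying a ket by a number multiplies its bra by the complex conjugate of that number.

```python
def bra(ket):
    """Return the bra corresponding to a finite-dimensional ket.

    Assumed representation: the bra has the complex conjugates of the
    ket's components.
    """
    return [x.conjugate() for x in ket]


def scalar_product(bra_b, ket_a):
    """Return the complete bracket formed from a bra and a ket."""
    if len(bra_b) != len(ket_a):
        raise ValueError("bra and ket must have the same number of components")
    return sum(b * a for b, a in zip(bra_b, ket_a))


if __name__ == "__main__":
    ket_a = [1 + 2j, 3j]
    ket_b = [2 + 0j, 1 - 1j]
    c = 0.5 - 1.5j
    scaled = [c * x for x in ket_a]
    # Linear in the ket: both numbers printed on this line agree.
    print(scalar_product(bra(ket_b), scaled), c * scalar_product(bra(ket_b), ket_a))
    # The bra of c|A> is the conjugate of c times the bra of |A>.
    print(bra(scaled), [c.conjugate() * y for y in bra(ket_a)])
```

In this representation the bracket of a ket with its own bra is a non-negative real number, which is one reason the conjugate appears in the correspondence.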