Welcome to Adaptive Signal Processing!

From Merriam-Webster's Collegiate Dictionary:
Main Entry: ad-ap-ta-tion
Pronunciation: a-dap-'ta-sh&n, -d&p-
Function: noun
Date: 1610
1 : the act or process of adapting : the state of being adapted
2 : adjustment to environmental conditions: as
    a : adjustment of a sense organ to the intensity or quality of stimulation
    b : modification of an organism or its parts that makes it more fit for existence under the conditions of its environment
3 : something that is adapted; specifically : a composition rewritten into a new form
- ad-ap-ta-tion-al adjective
- ad-ap-ta-tion-al-ly adverb

Lectures and exercises
Lectures: Tuesdays 8.5-. in room E:46
Exercises: Wednesdays 8.5-. in room E:45
Computer exercises: Wednesdays 3.5-5. in room E:45, or Thursdays .5-. in room E:45
Labs:
    Lab I: Adaptive channel equalizer, in room E:45
    Lab II: Adaptive filter on a DSP, in room E:45
Sign up on the lists on the web page from Monday, Nov.

Course literature
Book: Simon Haykin, Adaptive Filter Theory, 4th edition, Prentice-Hall. ISBN: -3-96- (Hardcover)
Chapters: Backgr., (), 4, 5, 6, 7, 8, 9, 3., 4.
(3rd edition: Intr., , (5), 8, 9, , 3, 6., 7.)
Exercise material: Exercise compendium (course home page)
Other material: Computer exercises (course home page), Labs (course home page), Lecture notes (course home page), Matlab code (course home page)

Contents - References in the 4th edition
Week 1: Repetition of OSB (Hayes, or chap. ), The method of the Steepest descent (chap. 4)
Week 2: The LMS algorithm (chap. 5)
Week 3: Modified LMS algorithms (chap. 6)
Week 4: Frequency adaptive filters (chap. 7)
Week 5: The RLS algorithm (chap. 8-9)
Week 6: Tracking and implementation aspects (chap. 3., 4.)
Week 7: Summary
Contents - References in the 3rd edition
Week 1: Repetition of OSB (Hayes, or chap. 5), The method of the Steepest descent (chap. 8)
Week 2: The LMS algorithm (chap. 9)
Week 3: Modified LMS algorithms (chap. 9)
Week 4: Frequency adaptive filters (chap. , )
Week 5: The RLS algorithm (chap. )
Week 6: Tracking and implementation aspects (chap. 6., 7.)
Week 7: Summary

Lecture 1
This lecture deals with
- Repetition of the course Optimal signal processing (OSB)
- The method of the Steepest descent

Recap of Optimal signal processing (OSB)
The following problems were treated in OSB:
- Signal modeling: either a model with both poles and zeros, or a model with only poles (vocal tract) or only zeros (lips).
- Inverse filtering: deconvolution or equalization of a channel.
- Wiener filtering: filtering, equalization, prediction and deconvolution.

Optimal Linear Filtering
[Block diagram: the input signal u(n) is passed through the filter w to give the output signal y(n); the estimation error e(n) = d(n) - y(n) is the difference between the desired signal d(n) and the output.]
We search for the filter w = [w_0 w_1 ...]^T that minimizes the estimation error, so that the output signal y(n) resembles the desired signal d(n) as closely as possible. A small code sketch of this setup follows below.
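As a complement to the block diagram, here is a minimal sketch (not part of the original slides) of the filtering setup in Python/NumPy for the real-valued case. The signals, the filter length M, the sample count N and the zero initialization of w are illustrative assumptions only.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 2, 1000                         # filter length and number of samples (assumed)
    u = rng.standard_normal(N)             # input signal u(n) (placeholder data)
    d = rng.standard_normal(N)             # desired signal d(n) (placeholder data)
    w = np.zeros(M)                        # filter coefficients w = [w_0, ..., w_{M-1}]^T

    y = np.zeros(N)                        # output signal y(n)
    e = np.zeros(N)                        # estimation error e(n)
    for n in range(M - 1, N):
        u_vec = u[n - M + 1:n + 1][::-1]   # u(n) = [u(n), u(n-1), ..., u(n-M+1)]^T
        y[n] = w @ u_vec                   # y(n) = w^T u(n)  (w^H u(n) in the complex case)
        e[n] = d[n] - y[n]                 # e(n) = d(n) - y(n)

With real data, choosing w to make e(n) small in some average sense is exactly the optimization problem treated next.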
Optimal Linear Filtering
In order to determine the optimal filter, a cost function J that punishes the deviation is introduced: the larger the deviation, the higher the cost. From OSB you know several different strategies, e.g.,

- The total squared error (LS), a deterministic description of the signal:
      J = Σ_n e²(n)
- The mean squared error (MSE), a stochastic description of the signal:
      J = E{e²(n)} = E{|e(n)|²}
- The mean squared error with an extra constraint:
      J = E{|e(n)|²} + λ · (constraint involving u(n))

The cost function J(n) = E{|e(n)|^p} can be used for any p, but most often p = 2 is chosen. This choice gives a convex cost function, which is referred to as the Mean Squared Error,
    J = E{|e(n)|²}.

In order to find the optimal filter coefficients, J is minimized with respect to the coefficients themselves. This is done by differentiating J with respect to w_0, w_1, ..., and setting the derivatives to zero. Here it is important that the cost function is convex, i.e., that there is a global minimum. The minimization is expressed in terms of the gradient operator ∇,
    ∇J = 0,
where ∇J is called the gradient vector. In particular, the choice of the squared cost function (Mean Squared Error) leads to the Wiener-Hopf system of equations.

In matrix form, the cost function J = E{|e(n)|²} can be written
    J(w) = E{[d(n) - w^H u(n)][d(n) - w^H u(n)]^*} = σ_d² - w^H p - p^H w + w^H R w
where
    w    = [w_0 w_1 ... w_{M-1}]^T                         (M x 1)
    u(n) = [u(n) u(n-1) ... u(n-M+1)]^T                    (M x 1)
    R    = E{u(n) u^H(n)} =
           [ r(0)      r(1)      ...  r(M-1) ]
           [ r*(1)     r(0)      ...  r(M-2) ]             (M x M)
           [ ...       ...       ...  ...    ]
           [ r*(M-1)   r*(M-2)   ...  r(0)   ]
    p    = E{u(n) d^*(n)} = [p(0) p(-1) ... p(-(M-1))]^T   (M x 1)
    σ_d² = E{|d(n)|²}
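For concreteness, here is a hedged sketch (real-valued case, synthetic or recorded data; the helper names estimate_R_p and cost_J are not from the slides) of how R, p and σ_d² can be estimated from data and how the quadratic cost is then evaluated.

    import numpy as np

    def estimate_R_p(u, d, M):
        """Sample estimates of R = E{u(n)u^T(n)} and p = E{u(n)d(n)} for an M-tap filter."""
        N = len(u)
        R = np.zeros((M, M))
        p = np.zeros(M)
        for n in range(M - 1, N):
            u_vec = u[n - M + 1:n + 1][::-1]   # [u(n), u(n-1), ..., u(n-M+1)]^T
            R += np.outer(u_vec, u_vec)
            p += u_vec * d[n]
        K = N - M + 1                          # number of averaged snapshots
        return R / K, p / K

    def cost_J(w, R, p, sigma_d2):
        """Quadratic cost J(w) = sigma_d^2 - 2 w^T p + w^T R w (real-valued case)."""
        return sigma_d2 - 2.0 * (w @ p) + w @ R @ w

In the complex case the middle term splits into -w^H p - p^H w, exactly as in the expression above; for real signals the two terms coincide, which gives the factor 2.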
Optimal Linear Filtering
The gradient operator yields
    ∇J(w) = ∂J(w)/∂w^* = ∂/∂w^* (σ_d² - w^H p - p^H w + w^H R w) = -p + R w.
If the gradient vector is set to zero, the Wiener-Hopf system of equations results,
    R w_o = p            (Wiener-Hopf)
whose solution is the Wiener filter,
    w_o = R^{-1} p       (Wiener filter).

The cost function's dependence on the filter coefficients w can be made clear by writing it in canonical form,
    J(w) = σ_d² - w^H p - p^H w + w^H R w = σ_d² - p^H R^{-1} p + (w - w_o)^H R (w - w_o).
Here the Wiener-Hopf equations and the expression for the Wiener filter have been used, together with the decomposition
    w^H R w = (w - w_o)^H R (w - w_o) - w_o^H R w_o + w_o^H R w + w^H R w_o.
In other words, the Wiener filter is optimal when the cost is measured by the MSE. With the optimal filter w = w_o the minimal error J_min is achieved:
    J_min = J(w_o) = σ_d² - p^H R^{-1} p        (MMSE)

The error-performance surface for an FIR filter with two coefficients, w = [w_0, w_1]^T:
[Figure: the quadratic surface J(w) and its contours, with the minimum J_min at the Wiener solution w_o; example values of p, R, σ_d², w_o and J_min accompany the plot.]

Steepest Descent
The method of the Steepest descent is a recursive method to find the Wiener filter when the statistics of the signals are known. The method of the Steepest descent is not an adaptive filter, but it serves as a basis for the LMS algorithm, which is presented in Lecture 2.
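As a point of reference for the recursion described next, here is a minimal sketch (not from the slides) of the closed-form Wiener solution and the minimum cost, assuming R, p and σ_d² are available, for instance from the estimate_R_p helper sketched earlier.

    import numpy as np

    def wiener_filter(R, p, sigma_d2):
        """Solve R w_o = p and return (w_o, J_min), with J_min = sigma_d^2 - p^T R^{-1} p."""
        w_o = np.linalg.solve(R, p)      # solves the Wiener-Hopf equations without forming R^{-1}
        J_min = sigma_d2 - p @ w_o       # MMSE (real-valued case)
        return w_o, J_min

Using np.linalg.solve avoids forming the inverse explicitly, which is numerically preferable; steepest descent, in turn, avoids even this linear solve.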
Steepest Descent
The method of the Steepest descent is a recursive method that leads to the solution of the Wiener-Hopf equations. The statistics (R, p) are known. The purpose is to avoid the inversion of R (which saves computations).

1. Set start values for the filter coefficients, w(0) (n = 0).
2. Determine the gradient ∇J(n), which points in the direction in which the cost function increases the most:
       ∇J(n) = -p + R w(n)
3. Adjust w(n+1) in the direction opposite to the gradient, but weight down the adjustment with the stepsize parameter μ:
       w(n+1) = w(n) + μ[-∇J(n)]
4. Repeat steps 2 and 3.

A code sketch of this recursion is given after the convergence discussion below.

Convergence, filter coefficients
Since the method of the Steepest Descent contains feedback, there is a risk that the algorithm diverges. This limits the choice of the stepsize parameter μ. An example of the critical choice of μ is given below; the statistics are the same as in the previous example.
[Figure: the filter coefficients w(n) versus iteration n for several choices of the stepsize μ, converging towards w_o; example values of p, R, w_o and the initial vector w(0) accompany the plot.]

Convergence, error surface
The influence of the stepsize parameter on the convergence can also be seen by analyzing J(w). The example below illustrates the convergence towards J_min for different choices of μ.
[Figure: trajectories of w(n) from w(0) towards w_o on the contours of the error surface, for different stepsizes μ.]

Convergence analysis
How should μ be chosen? A small value gives slow convergence, while a large value brings a risk of divergence. Perform an eigenvalue decomposition of R in the expression for J(w(n)):
    J(n) = J_min + (w(n) - w_o)^H R (w(n) - w_o)
         = J_min + (w(n) - w_o)^H Q Λ Q^H (w(n) - w_o)
         = J_min + ν^H(n) Λ ν(n)
         = J_min + Σ_k λ_k |ν_k(n)|²
The convergence of the cost function thus depends on ν(n), i.e., on the convergence of w(n), through the relationship ν(n) = Q^H (w(n) - w_o).
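Here is the promised sketch of the steepest-descent recursion (real-valued case; the function name, the default zero start vector and the returned trajectory are illustrative choices, not from the slides).

    import numpy as np

    def steepest_descent(R, p, mu, n_iter, w0=None):
        """Iterate w(n+1) = w(n) + mu*[p - R w(n)] and return the final w and its trajectory."""
        M = len(p)
        w = np.zeros(M) if w0 is None else np.asarray(w0, dtype=float).copy()
        trajectory = [w.copy()]
        for _ in range(n_iter):
            grad = -p + R @ w            # gradient of the cost at w(n)
            w = w + mu * (-grad)         # step against the gradient, scaled by the stepsize mu
            trajectory.append(w.copy())
        return w, np.array(trajectory)

For a stable stepsize (see the criterion derived below), the trajectory approaches the Wiener solution w_o = R^{-1} p.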
Convergence analysis
With the observation that w(n) = Q ν(n) + w_o, the update of the cost function can be derived:
    w(n+1) = w(n) + μ[p - R w(n)]
    Q ν(n+1) + w_o = Q ν(n) + w_o + μ[p - R Q ν(n) - R w_o]
    ν(n+1) = ν(n) - μ Q^H R Q ν(n) = (I - μΛ) ν(n)
    ν_k(n+1) = (1 - μλ_k) ν_k(n)        (element k of ν(n))
The latter is a first-order difference equation, with the solution
    ν_k(n) = (1 - μλ_k)^n ν_k(0).
For this equation to converge it is required that |1 - μλ_k| < 1, which leads to the stability criterion of the method of the Steepest Descent:
    0 < μ < 2/λ_max        (stability, S.D.)

Convergence, time constants
The time constants indicate how many iterations it takes until the respective error has decreased by a factor e, where e denotes the base of the natural logarithm. The smaller the time constant, the better. The time constant τ_k for eigenmode k (eigenvalue λ_k) is
    τ_k = -1 / ln(1 - μλ_k) ≈ 1/(μλ_k)  for μ ≪ 1        (time constant τ_k)
If the convergence of the whole coefficient vector w(n) is considered, the speed of convergence is limited by the largest and the smallest eigenvalues of R, λ_max and λ_min. This time constant is denoted τ_a:
    -1 / ln(1 - μλ_max) ≤ τ_a ≤ -1 / ln(1 - μλ_min)        (time constant τ_a)

Learning Curve
How fast an adaptive filter converges is usually shown in a learning curve, which is a plot of the cost J(n) as a function of the iteration n. For the SD, J(n) approaches J_min. Since SD is deterministic, it is, strictly speaking, misleading to talk about learning curves in this case.
[Figure: learning curves J(n) versus iteration n for different stepsizes μ, approaching J_min; example values of p, R, w_o and w(0) accompany the plot.]
A sketch that computes such a curve for the SD, together with the stability bound and the time constants, follows below.
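A hedged sketch (not from the slides; real-valued case) that checks the stability bound 0 < μ < 2/λ_max, computes the per-mode time constants, and evaluates the SD learning curve J(n) = J_min + Σ_k λ_k ν_k²(n). The function name and the use of the magnitude |1 - μλ_k| inside the logarithm (to cover oscillating modes) are my own choices.

    import numpy as np

    def sd_convergence_info(R, p, sigma_d2, mu, w0, n_iter):
        """Return the SD learning curve J(n) and the time constant of each eigenmode."""
        lam, Q = np.linalg.eigh(R)                   # R = Q diag(lam) Q^T
        if not (0.0 < mu < 2.0 / lam.max()):
            raise ValueError("mu violates the stability criterion 0 < mu < 2/lambda_max")
        w_o = np.linalg.solve(R, p)                  # Wiener solution
        J_min = sigma_d2 - p @ w_o                   # minimum cost (MMSE)
        tau = -1.0 / np.log(np.abs(1.0 - mu * lam))  # time constants; abs() covers 1 - mu*lam < 0
        nu = Q.T @ (w0 - w_o)                        # nu(0) = Q^H (w(0) - w_o)
        J = np.empty(n_iter + 1)
        for n in range(n_iter + 1):
            J[n] = J_min + np.sum(lam * nu**2)       # J(n) = J_min + sum_k lambda_k nu_k(n)^2
            nu = (1.0 - mu * lam) * nu               # nu_k(n+1) = (1 - mu*lambda_k) nu_k(n)
        return J, tau

A typical call would be J_curve, tau = sd_convergence_info(R, p, sigma_d2, mu=0.1, w0=np.zeros(len(p)), n_iter=50), with J_curve plotted against the iteration index.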
Summary Lecture 1
Recap of OSB:
- Quadratic cost function, J = E{|e(n)|²}
- Definition of u(n), w, R and p
- The correlations R and p are assumed to be known in advance
- The gradient vector ∇J
- The optimal filter coefficients are given by w_o = R^{-1} p
- The optimal (minimal) cost J_min

Summary of the method of the Steepest Descent:
- Recursive solution to the Wiener filter
- The statistics (R and p) are assumed to be known
- The gradient vector ∇J(n) is time-dependent but deterministic, and points in the direction in which the cost function increases the most
- Recursion of the filter weights: w(n+1) = w(n) + μ[-∇J(n)]
- The cost function J(n) → J_min as n → ∞, i.e., w(n) → w_o as n → ∞
- For convergence it is required that the stepsize satisfies 0 < μ < 2/λ_max
- The speed of convergence is determined by μ and the eigenvalues of R:
      -1/ln(1 - μλ_max) ≤ τ_a ≤ -1/ln(1 - μλ_min)

To read
Repetition of OSB: Hayes, or Haykin chapter .
Background on adaptive filters: Haykin, Background and Preview.
Steepest descent: Haykin chapter 4.
Exercises: ., ., .5, 3., 3.3, 3.5, 3.7, (3., 3.4, 3.6)
Computer exercise, theme 1: Implementation of the method of the Steepest descent.

Examples of adaptive systems
Here follow a number of examples of applications that will be discussed during the course. The material is far from exhaustive; further examples can be found in Haykin, Background and Preview. The idea of these examples is that you should already now start thinking about how adaptive filters can be used in different settings.

Formulations of the type "use a filter of the same order as the system" deserve a clarification. Normally the order of the system is not known, so several alternatives have to be investigated. If the order is increased step by step, one will, where applicable, notice that beyond a certain length a further increase of the filter length gives no further improvement.

Example: Inverse modelling
[Block diagram: an exciting signal drives the investigated system; the system output u(n) is fed to the adaptive filter, whose output is compared with a delayed copy (z^{-Δ}) of the exciting signal to form the error.]
In inverse modelling, the adaptive filter is connected in cascade with the investigated system. If the system has only poles, an adaptive filter of the corresponding order can be used. A hedged sketch of this setup is given below.
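A minimal sketch (not from the slides) of the inverse-modelling setup above; the investigated AR(1) system, the delay, the filter length and the use of a batch Wiener fit in place of a running adaptive algorithm are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    N, M, delay = 5000, 3, 1
    x = rng.standard_normal(N)                            # exciting signal
    u = np.zeros(N)                                       # output of the investigated system,
    for n in range(N):                                    # here a purely all-pole (AR(1)) example:
        u[n] = x[n] + (0.5 * u[n - 1] if n > 0 else 0.0)  # u(n) = x(n) + 0.5 u(n-1)
    d = np.concatenate([np.zeros(delay), x[:-delay]])     # desired signal: delayed excitation

    # Data matrix with rows [u(n), u(n-1), ..., u(n-M+1)], sample estimates of R and p,
    # and the corresponding Wiener solution for the inverse filter.
    U = np.column_stack([np.concatenate([np.zeros(k), u[:N - k]]) for k in range(M)])
    R_hat = U.T @ U / N
    p_hat = U.T @ d / N
    w_inverse = np.linalg.solve(R_hat, p_hat)             # expected close to [0, 1, -0.5]

Since the system here has a single pole, the cascade of system and inverse filter reduces to (approximately) a pure delay, and a short filter of matching order suffices, in line with the remark above.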
Example: Modelling/Identification
[Block diagram: the input u(n) drives both the investigated system and the adaptive filter in parallel; the difference between their outputs forms the error.]
In modelling/identification, the adaptive filter is connected in parallel with the investigated system. If the investigated system has only zeros, it is appropriate to use a filter of the corresponding length. If the system has both poles and zeros, a long adaptive FIR filter is generally required.

Example: Echo canceller I
[Block diagram: the far-end speech u(n) is fed both to the hybrid and to the adaptive filter; the filter output is subtracted from the signal returned by the hybrid, which also contains the near-end speech.]
In telephony, speech from the far-end speaker leaks through at the hybrid, so that this speaker hears his or her own voice as an echo. This echo is to be removed. The adaptation is normally stopped while the near-end speaker is talking; the filtering, however, runs all the time.

Example: Echo canceller II
[Block diagram: the far-end speech u(n) is played over a loudspeaker and also fed to the adaptive filter; the microphone picks up the echo path (impulse response) plus the near-end speech, and the filter output is subtracted from the microphone signal.]
This structure is applicable to speakerphones, video-conferencing systems and the like. Just as in the telephony case, the far-end speaker hears himself or herself as an echo. The effect is more pronounced here, however, since the microphone picks up what is emitted by the loudspeaker. The adaptation is normally stopped while the near-end speaker is talking, but the filtering runs all the time.

Example: Adaptive Line Enhancer
[Block diagram: a periodic signal s(n) plus a colored disturbance v(n) forms the observed signal; a delayed copy (z^{-Δ}) u(n) is fed to the adaptive filter, whose output is subtracted from the undelayed signal.]
Information in the form of a periodic signal is disturbed by colored noise that is correlated with itself only within a certain time span. By delaying the signal enough that the noise components (in u(n) and in the desired branch, respectively) become uncorrelated, the noise can be suppressed by means of linear prediction. A hedged sketch of this idea is given below.
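A minimal sketch (not from the slides) of the line-enhancer idea; the sinusoid, the AR(1) disturbance, the decorrelation delay Delta, the filter length M and the batch Wiener fit standing in for the adaptive algorithm are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    N = 5000
    t = np.arange(N)
    s = np.sin(2 * np.pi * 0.05 * t)                      # periodic signal of interest
    v = np.zeros(N)                                       # colored disturbance: AR(1) noise
    for n in range(1, N):
        v[n] = 0.8 * v[n - 1] + 0.5 * rng.standard_normal()
    x = s + v                                             # observed signal

    Delta, M = 25, 16                                     # decorrelation delay and filter length
    d = x[Delta:]                                         # "desired" branch (undelayed signal)
    u = x[:len(d)]                                        # delayed branch feeding the filter
    K = len(d)

    # Batch Wiener fit of the linear predictor: rows of U are [u(n), u(n-1), ..., u(n-M+1)].
    U = np.column_stack([np.concatenate([np.zeros(k), u[:K - k]]) for k in range(M)])
    R_hat = U.T @ U / K
    p_hat = U.T @ d / K
    w_ale = np.linalg.solve(R_hat, p_hat)
    s_hat = U @ w_ale                                     # enhanced, mostly periodic component

Because the sinusoid remains predictable across the delay while the AR(1) noise has essentially decorrelated, the predictor output s_hat retains the periodic component and suppresses the disturbance.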
Example: Channel equalizer
[Block diagram: the training sequence p(n) is sent through the channel C(z) and disturbed by noise v(n), giving the received signal u(n); the adaptive filter operates on u(n) and its output is compared with a delayed copy (z^{-Δ}) of p(n). The switch to the reference is closed only during training.]
A known pseudo-noise sequence is used to estimate an inverse model of the channel ("training"). Thereafter the adaptation is stopped, but the filter continues to operate on the transmitted signal. The purpose is to remove the channel's influence on the transmitted signal. A hedged sketch of the training step is given below.
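A minimal sketch (not from the slides) of the training step; the channel, the noise level, the reference delay, the equalizer length and the batch Wiener fit are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    N, M, delay = 5000, 11, 5
    p_seq = np.sign(rng.standard_normal(N))               # known +/-1 pseudo-noise sequence p(n)
    channel = np.array([1.0, 0.5, 0.25])                  # channel C(z) (assumed)
    u = np.convolve(p_seq, channel)[:N] + 0.05 * rng.standard_normal(N)   # received signal with noise v(n)
    d = np.concatenate([np.zeros(delay), p_seq[:-delay]])                 # delayed reference, used only during training

    # Batch Wiener fit of the equalizer from the training data; after training the
    # coefficients are frozen and applied to the ordinary transmitted signal.
    U = np.column_stack([np.concatenate([np.zeros(k), u[:N - k]]) for k in range(M)])
    R_hat = U.T @ U / N
    p_hat = U.T @ d / N
    w_eq = np.linalg.solve(R_hat, p_hat)
    p_est = np.sign(U @ w_eq)                             # detected symbols; should mostly match the delayed reference

The delay gives the equalizer room to approximate the (generally non-causal) inverse of the channel, which is why the reference branch in the block diagram contains the z^{-Δ} block.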