By: Ron Nicholson, HotPaw Productions (code examples in Chipmunk Basic)
Here's my recipe for a simple Q&D Windowed-Sinc FIR filter generator in about ten lines of Basic, plus setup and some comments:

                   : rem  Some example filter parameters:
     fsr = 44100   : rem  set fsr = sample rate
     fc  = 0       : rem  set fc = 0 for lowpass
                   : rem  set fc = center frequency for a bandpass filter
                   : rem  set fc = fsr/2 for a highpass
     bw  = 3000    : rem  bw = bandwidth, range 0 .. fsr/2 and bw >= fsr/nt
                   : rem  bw = 3 db corner frequency for a lowpass
                   : rem  bw = half the 3 db passband for a bandpass filter
     nt  = 128     : rem  nt = number of taps + 1 (nt must be even)
                   : rem  nt should be > fsr / bw
                   : rem  transition band will be around 3.5 * fsr / nt
                   : rem  depending on the window applied and ripple spec.
     g   = 1       : rem  g = filter gain for a bandpass
                   : rem  g = 0.5 , half the gain for a lowpass filter
                   : rem  fir() is the result array of FIR taps
     dim fir(nt) : fir(0) = 0
                   : rem  fir(1) is the first tap
                   : rem  fir(nt/2) is the middle tap
                   : rem  fir(nt-1) is the last tap

     rem  R. Nicholson's QDDS FIR filter generator cookbook recipe
     rem                 QDDS = Quick, Dirty, Dumb and Short
     rem  version 0.6b - 2006-Dec-14, 2007-Sep-30
     rem  No warranties implied.  Error checking, optimization, and quality
     rem    assessment of the "results" is left as an exercise for the student.
     rem    (consider this code Open Source under a BSD style license)

     sub wsfiltgen(fir, nt, fc, bw, g, fsr)
       local i,a,ys,yg,yw,yf                    : rem  local variables
       for i = 1 to nt-1
         a  = (i - nt/2) * 2.0*pi*bw/fsr        : rem  scale Sinc width
         ys = 1 : if a <> 0 then ys = sin(a)/a  : rem  calculate Sinc function
         yg = g * (4.0 * bw/fsr)                : rem  correct window gain
         yw = 0.54 - 0.46 * cos(i * 2.0*pi/nt)  : rem  Hamming window
         yf = cos((i - nt/2) * 2.0*pi*fc/fsr)   : rem  spectral shift to fc
         fir(i) = yf * yw * yg * ys             : rem  assign fir coeff.
       next i
     end sub

Explanation:
A Sinc function (sin(x)/x) is the Fourier transform of a rectangular function. Using a windowed Sinc is one method of designing a FIR filter: start with the desired rectangle in the frequency domain, take that rectangle's Fourier transform (a Sinc) as the impulse response in the time domain, and then truncate the Sinc's infinite extent with a finite-width window.
Note that the bw parameter sets the width of a rectangle in the frequency domain. fc sets the location of the center of the rectangle. A Sinc is then the Fourier transform of that frequency domain rectangle into the time domain (or the impulse response). A window is required to make a finite length filter out of a Sinc, which has infinite extent. The filter length nt sets this window width, which inversely affects the filter transition band. As the Sinc gets shortened by the window length, the sides of the rectangle in the frequency domain get less steep, or have a wider transition band. The wider the windowed Sinc, the narrower the transition band.
The type of window function selected (Hamming, von Hann, Blackman, Nuttall, etc.) will affect the pass band and stop band response. A Kaiser window will give you one more degree of freedom, at the cost of computing (or approximating from a table) Bessel functions. Note that the Sinc function itself can also be estimated or approximated from a table for implementation in a system without fast math libraries.
Here's someone's Java applet which creates similar FIR filters: FIR Digital Filter Design Applet
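As a usage illustration (my sketch, not part of the recipe above), here is how the generated fir() taps might be applied to a buffer of samples by direct convolution, with the parameters set up as in the listing. The indat() and outdat() arrays and the buffer length ns are hypothetical names, and the bare call-by-name syntax is an assumption about the Basic dialect (some Basics require a call keyword).

     rem - Example use (a sketch): with fsr, fc = 0, bw, and nt set up as
     rem   above, generate a lowpass and filter a hypothetical buffer
     rem   indat(0..ns-1) into outdat(0..ns-1) by direct convolution.
     g = 0.5                          : rem  half gain for a lowpass (see comments above)
     wsfiltgen(fir, nt, fc, bw, g, fsr)
     dim outdat(ns)
     for k = nt to ns-1
       s = 0
       for i = 1 to nt-1
         s = s + fir(i) * indat(k-i)  : rem  multiply-accumulate over the taps
       next i
       outdat(k) = s                  : rem  output is delayed by roughly nt/2 samples
     next k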
     rem - QDSS Windowed-Sinc ReSampling subroutine in Basic
     rem
     rem - This function can also be used for interpolation of FFT results
     rem
     rem   function parameters
     rem   : x      = new sample point location (relative to old indexes)
     rem              (e.g. every other integer for 0.5x decimation)
     rem   : indat  = original data array
     rem   : alim   = size of data array
     rem   : fmax   = low pass filter cutoff frequency
     rem   : fsr    = sample rate
     rem   : wnwdth = width of windowed Sinc used as the low pass filter
     rem
     rem   resamp() returns a filtered new sample point

     sub resamp(x, indat, alim, fmax, fsr, wnwdth)
       local i,j, r_g,r_w,r_a,r_snc,r_y     : rem  some local variables
       r_g = 2 * fmax / fsr                 : rem  Calc gain correction factor
       r_y = 0
       for i = -wnwdth/2 to (wnwdth/2)-1    : rem  For 1 window width
         j = int(x + i)                     : rem  Calc input sample index
             : rem calculate von Hann Window. Scale and calculate Sinc
         r_w   = 0.5 - 0.5 * cos(2*pi*(0.5 + (j - x)/wnwdth))
         r_a   = 2*pi*(j - x)*fmax/fsr
         r_snc = 1 : if (r_a <> 0) then r_snc = sin(r_a)/r_a
         if (j >= 0) and (j < alim) then
           r_y = r_y + r_g * r_w * r_snc * indat(j)
         endif
       next i
       resamp = r_y                         : rem  Return new filtered sample
     end sub

     rem - Ron Nicholson's QDSS ReSampler cookbook recipe
     rem             QDSS = Quick, Dirty, Simple and Short
     rem   Version 0.1b - 2007-Aug-01
     rem   Copyright 2007 Ronald H. Nicholson Jr.
     rem   No warranties implied.  Error checking, optimization, and
     rem     quality assessment of the "results" is left as an exercise
     rem     for the student.
     rem     (consider this code Open Source under a BSD style license)
     rem   IMHO. YMMV.  http://www.nicholson.com/rhn/dsp.html
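As a usage illustration (my sketch, with hypothetical names, not part of the recipe above): resampling a buffer indat(0..alim-1) from rate fsr to a new rate fsr_new by calling resamp() once per output sample, using it as a function the way the listing's return value suggests. The cutoff choice and the window width of 32 are assumptions, not requirements.

     rem - Example use (a sketch): resample indat(0..alim-1) from sample
     rem   rate fsr to a hypothetical new rate fsr_new into outdat().
     fmax = 0.45 * fsr_new                   : rem  cutoff a bit below the output Nyquist
     if fsr_new > fsr then fmax = 0.45 * fsr : rem  ...or the input Nyquist when upsampling
     wnwdth = 32                             : rem  windowed Sinc width (wider = better stopband)
     ratio  = fsr / fsr_new                  : rem  input samples per output sample
     olim   = int(alim / ratio)              : rem  number of output samples
     dim outdat(olim)
     for k = 0 to olim-1
       x = k * ratio                         : rem  fractional position in the input
       outdat(k) = resamp(x, indat, alim, fmax, fsr, wnwdth)
     next k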
Notes:
Bibliography:
CCRMA's Resampling page
CCRMA's Bandlimited Interpolation page
Here's a summary of some ideas which I've tried or intend to try for precisely estimating the pitch or frequencies produced by musical instruments (e.g. for tuning guitars and pianos, etc.). Many of these are quick and dirty methods without much theoretical backing, but they did seem to work somewhat, at least in a few simple experiments and test inputs. Note that musical pitch is a psycho-perceptual phenomenon which is different from peak spectral frequency. Some very musical sounds even have a missing fundamental frequency in their audio spectrum. Some of the techniques in this list, other than simple FFT usage, may produce more accurate pitch estimates. (Unfinished Draft - 2006Mar13, updated 2009Oct09, 2011Oct02, rhn)

A list of methods for estimating the frequency of some signal :

Techniques mostly in the Time Domain :

0. - If there is only one sinusoid with zero noise and measurement error, then it might be possible to use 3 samples to solve an equation in 3 unknowns:
       y = A * sin(2*pi * F * t + P)
     for amplitude A, frequency F, and reference phase P (but note that aliasing can occur depending on the sample spacing). Here are derivations of two closed form solutions by C. Turner:
       http://www.claysturner.com/dsp/3pointfrequency.pdf
       http://www.claysturner.com/dsp/4pointfrequency.pdf

1. - Measure the distance between 2 positive zero crossings in samples, either by counting samples in a buffer, or by triggering a counter/timer to start/stop with a comparator (analog or otherwise). The distance between zero crossings could be an estimate of the period of a sine wave if there isn't much noise in the signal and the count rate is high enough relative to the frequency of the signal. The frequency will be the reciprocal of the period distance or time, and the accuracy will be limited by the sample spacing.

2. - Linear interpolation between the two samples on either side of zero crossings for a more accurate zero crossing estimate, then measure as above between 2 interpolated zero crossings.

2B - Average the distance between zero-crossings over several pairs of zero-crossings, perhaps doing a least squares fit, and/or throwing away outlying measurements first. (Also see item 14., the Phase Vocoder, since FFT phase is a form of regression fit to the zero crossing locations of or near a particular frequency content.)

3. - Count positive zero crossings in some sample interval. Counting zero crossings per second gives f +- 1 Hz. Accuracy depends on the length of the counting interval with respect to the period being measured. Band-pass filtering before counting might help remove spurious zero-crossings caused by noise.

Note that noise or interfering signals can change or add additional zero crossings, and cause the zero-crossing methods above to produce spurious frequency estimation results. It might be possible to remove some of the noise or interference by first filtering the signal, e.g. with a narrow band bandpass filter centered around the frequency of interest. However, if the Q of this filter is too high, one might just end up measuring the frequency at which the filter "rings".

4. - Autocorrelation
     Find the sample lag with the highest correlation. Autocorrelation works better than frequency domain methods on signals which are periodic but look nothing like a sinusoid (e.g. have a missing fundamental component).
       f = sr / lag
     (A short Basic sketch of this method appears at the end of this Time Domain list.)

4B. - Autocorrelation with resampling for higher lag resolution. Resample to a much higher sampling rate, then autocorrelate. The set of integer lags will then represent a higher resolution in the frequency domain.

4C. - Window before or during autocorrelation to reduce edge effects.

5. - Least average squared difference offset (ASDF lag)
     Find the sample lag with the lowest difference. Works with signals that have near constant amplitude. Sometimes normalizing the amplitude (AGC) before calculating the ASDF lags helps. AMDF, or computing the mean absolute value difference over a range of lags, is another variation on this approach.

5B. - ASDF minima lag with parabolic interpolation
     Interpolate between the three best neighboring lags. Parabolic interpolation works best if the "continuous" lag curve matches the interpolation kernel, but it's not an exact fit using a parabola.

21. - Hilbert transform to the complex time domain
     Use a Hilbert transform filter process to produce an imaginary time domain component. Measure the rate of change of phase of the resulting complex vector (real and imaginary) to obtain instantaneous frequency. May need to window and bandpass filter some length of signal before the Hilbert filter, and low-pass filter the resulting phase, to produce reasonable results. This technique seems to be related to quadrature demodulation at baseband.
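Here is a minimal Basic sketch of the plain autocorrelation method (item 4 above), in the same style as the code recipes earlier on this page. The subroutine name acfreq, the input array indat(0..alim-1), and the lag search bounds are my names and assumptions for illustration; a real implementation would likely normalize the correlation and guard against octave errors.

     rem - Autocorrelation pitch/frequency estimate (a sketch of item 4).
     rem   indat(0..alim-1) = input samples, sr = sample rate.
     rem   minlag/maxlag bound the lag search, e.g. sr/2000 to sr/50
     rem   for musical pitch (assumed bounds, not part of the method).
     sub acfreq(indat, alim, sr, minlag, maxlag)
       local lag, i, s, best, bestlag
       best = 0 : bestlag = minlag
       for lag = minlag to maxlag
         s = 0
         for i = 0 to alim-1-lag
           s = s + indat(i) * indat(i+lag)  : rem  un-normalized autocorrelation at this lag
         next i
         if s > best then
           best = s                         : rem  remember the strongest correlation
           bestlag = lag
         endif
       next lag
       acfreq = sr / bestlag                : rem  frequency estimate from the best lag
     end sub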
Techniques mostly in the Frequency Domain (using a DFT/FFT) :

6. - FFT magnitude peak
     Take the magnitude of the complex FFT ( sqrt(b_r^2 + b_i^2) ) and search for the maximum.
       i  = index of FFT bin with maximum (abs) magnitude peak
       sr = sample rate (samples per second)
       n  = number of FFT points
       f  = i * ( sr / n )

7. - Window before the FFT to reduce the side-lobes of interference:
       // von Hann
       w[i] = 0.5 - 0.5 * cos( 2*pi * i / (n-1) )
       // Hamming
       w[i] = 0.54 - 0.46 * cos( 2*pi * i / (n-1) )
       // Blackman-Nuttall
       w[i] = 0.3635819 - 0.4891775*cos(2*pi*i/(n-1))
                        + 0.1365995*cos(4*pi*i/(n-1))
                        - 0.0106411*cos(6*pi*i/(n-1))
     Note that windowing removes information about the signal, especially near the sides, and makes the main lobe wider and less sharp. So windowing trades off lower sinusoid frequency resolution versus reducing interference from nearby signals and/or noise.

Sub-bin resolution FFT/DFT frequency domain methods :

The FFT of a non-bin-centered sinusoid (a frequency not exactly periodic in the FFT aperture), instead of appearing in one bin, will result in a sampled Sinc, or a sampled transform of the applied window for non-rectangular windows. Estimating the shift of this resulting Sinc (or other) shape may produce a frequency resolution much finer than the FFT bin resolution.

8A. - Parabolic interpolation of the FFT magnitude peak, where m2 = magnitude of the bin nearest the peak, and m1, m3 = magnitudes of the bins to the left and right:
       d = 0.5 * (m1 - m3) / (m1 - 2.0 * m2 + m3)
       f = ( i + d ) * ( sr / n )
     Works best using a window that has a parabolic shaped main lobe in the frequency domain (a Hamming or von Hann window might be close enough). See:
       http://ccrma.stanford.edu/~jos/parshl/Peak_Detection_Steps_3.html
     for a derivation of this interpolation formula. (A Basic sketch of this interpolation follows item 8B below.)

8B. - Triangle interpolation of the FFT magnitude peak, using a window which has a triangle shaped main lobe in the frequency domain so that the FFT peak is sharper.
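For item 8A, here is the interpolation step written out as a small Basic function. It assumes the FFT and its magnitudes have already been computed elsewhere into an array mag(0..n-1) (a hypothetical name), with i the index of the peak bin; only the three bins around the peak are used.

     rem - Parabolic interpolation of an FFT magnitude peak (item 8A above).
     rem   mag(0..n-1) = FFT magnitudes computed elsewhere, i = peak bin index
     rem   (with 0 < i < n/2), sr = sample rate, n = FFT length.
     sub parafreq(mag, i, sr, n)
       local m1, m2, m3, d
       m1 = mag(i-1) : m2 = mag(i) : m3 = mag(i+1)
       d = 0.5 * (m1 - m3) / (m1 - 2.0*m2 + m3)  : rem  peak offset in bins, about -0.5 .. +0.5
       parafreq = (i + d) * (sr / n)             : rem  interpolated frequency estimate
     end sub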
9. - Sinc interpolation of an un-windowed FFT
     Find the best fitting Sinc function to the complex FFT results by autocorrelation or least-squares successive approximation.
       Sinc = sin(2*pi*b) / (2*pi*b)
       fc = b * ( sr / n )
     This method allows one to use more bins (try the 5 bins containing the peak) for interpolation than the parabolic method (only 3 bins) on the complex FFT result vector. More bins should contain more information about a signal's frequency, especially if the FFT is done without a (non-rectangular) window to reduce side-lobes. The Sinc is the Fourier transform of a rectangle. The rectangle results from the finite aperture window of a DFT/FFT chopping a signal, whether or not it is periodic in that aperture.
     Variation: Estimate the Sinc fit separately on the real and imaginary component vectors of the FFT, and then take a weighted average.

Changing the size of the FFT :

10. - Zero-padded FFT
     Use a 32x longer FFT vector with 31x zeros appended (or some other amount of padding). This is mostly a very accurate way to interpolate an FFT magnitude peak (a form of Sinc interpolation, but done by modifying the time domain). Works well with a window and additional parabolic interpolation, or with no window and Sinc interpolation (the needed Sinc gets wider by the vector expansion factor). Note that this might increase the accuracy of finding a single sinusoid, but doesn't help in the separation of very close sinusoids, adjacent spectral interference, or noise. (Note #2: Zero-padded FFTs are most accurate near Fs/4. At very low frequencies, there can be interference from the negative frequency complex conjugate image. 2007-Feb-17)

Using more than one set of FFT results from the same data stream :

11. - FFT of a windowed derivative
       dx[i] = w[i] * (x[i] - x[i-1])
       f = (sr/pi) * asin( ( abs(max_fft_dx) / abs(max_fft_x) ) / 2 )
     (seems to work better than a Phase Vocoder for very low frequencies)

12. - Time-frequency reassignment
     Uses multiple windowed FFT's to reassign the center of bins (less popular because the results aren't exactly invertible).
     FFT with the Hann window and find the peak bin = i_fft
       fc = i_fft * sr / n
     Calculate fft2 using the derivative of a Hann window.
     Calculate fft3 using the Hann window * a ramp from (-1 to 1).
     (see Jez Wells' paper for descriptions of these windows:
       http://www-users.york.ac.uk/~jjw100/ReadingTheSines.pdf )
       f = fc + (sr/(2*pi)) * im( i_fft c* i_fft2 ) / mag( i_fft )
       t = re( i_fft c* i_fft3 ) / mag( i_fft )
     where t is the centroid of the sinusoid's amplitude "hump".

13. - Resampling (by windowed Sinc interpolation) to produce a signal with a period which is an exact sub-multiple of the FFT size (e.g. successive approximation to center a spectral peak into a single bin).

Frequency domain analysis using more than 1 FFT frame on the data :

14. - 2 FFT's + Phase Vocoder analysis for sequential or overlapping frames.
     This method uses phase information between two FFT's with different offset (disjoint or overlapping) apertures into the data, and assumes that the same peak bin appears in both FFT frames.
       i   = nearest FFT bin# to the peak (initial f estimate = i * sr / n)
       ph0 = atan2( imag(i), real(i) )
       ph1 = phase of the same bin in the second FFT frame or aperture
       p   = ( ph0 + i * 2 * pi * offset / n ) - ph1
     Unwrap the phase offset p to between -pi and pi. The phase vocoder corrected frequency estimate is then:
       f = (i * sr / n) - (p * sr / (2 * pi * offset))
     Try a von Hann or Blackman-Nuttall window for better accuracy. (A Basic sketch of this correction appears below, after item 14C.)
     (equation corrected 2008Jun19 by rhn)

     Note that a phase vocoder can be considered a fancy method of measuring zero crossings, but using the phase of the correlation against a sinusoid to get a better statistical estimate of where multiple zero crossings of that frequency should have been located. Note that you can interpolate phases between FFT bins by using something like an FFTShift to (re)reference the phases (real = even, imag = odd) to the center of the FFT aperture.
     (Note #2: Phase Vocoder accuracy is highest if the frame offset is close to a multiple of a full period of the frequency being estimated. This is because the shape of the "spectral leakage" is slightly different for odd/even sinusoids, especially for relatively low frequency sinusoids. 2008-Feb-18)

14B - FFT + one offset DFT bin for the phase vocoder
     A phase vocoder uses successive FFT's, but if only one frequency is needed then only one DFT bin (complex correlation) needs to be calculated for the second frame.

14C - FFT + the Goertzel algorithm for the phase vocoder's DFT bin
     This is similar to an optimization for doing a 1-bin DFT. Goertzel may require > 32 bits of precision for low f0/sr ratios.
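Here is item 14's correction step written as a Basic sketch, in the style of the code recipes above. The two FFTs themselves are assumed to be computed elsewhere: re0,im0 and re1,im1 are the complex values of the same peak bin i in the two frames, offset samples apart. The names pvfreq and myatan2 are mine; the myatan2 helper is included because a four-quadrant arctangent is not a classic Basic built-in, and the unwrap line assumes int() rounds toward minus infinity, as in classic Basic.

     rem - Phase vocoder frequency correction (a sketch of item 14 above).
     rem   re0,im0 = complex value of peak bin i in the first FFT frame
     rem   re1,im1 = same bin in a frame starting "offset" samples later
     rem   n = FFT length, sr = sample rate
     sub pvfreq(re0, im0, re1, im1, i, n, offset, sr)
       local ph0, ph1, p
       ph0 = myatan2(im0, re0)              : rem  phase of the bin in frame 1
       ph1 = myatan2(im1, re1)              : rem  phase of the bin in frame 2
       p = (ph0 + i * 2*pi*offset/n) - ph1  : rem  expected minus measured phase advance
       p = p - 2*pi * int(p/(2*pi) + 0.5)   : rem  unwrap p to between -pi and pi
       pvfreq = (i * sr / n) - (p * sr / (2*pi*offset))
     end sub

     sub myatan2(y, x)          : rem  helper: four-quadrant arctangent built from atn()
       if x > 0 then myatan2 = atn(y/x)
       if x < 0 and y >= 0 then myatan2 = atn(y/x) + pi
       if x < 0 and y < 0 then myatan2 = atn(y/x) - pi
       if x = 0 and y > 0 then myatan2 = pi/2
       if x = 0 and y < 0 then myatan2 = -pi/2
       if x = 0 and y = 0 then myatan2 = 0
     end sub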
Techniques mostly in the Quefrency Domain :

16 - Cepstrum
     Look for excitation peaks in ifft(log(mag(fft(samples)))). Useful for estimating the pitch of signals rich in harmonics, but possibly with a weak or missing fundamental component. Used in speech and singing research. See:
       http://cnx.org/content/m12469/latest/
     If a signal has a train of harmonics, then one can try averaging the estimated frequency of several harmonics divided by their harmonic number (try several octave number offsets). This is a crude variation on using Cepstral peaks.

16B - Harmonic Product Spectrum
     Compress a spectrum by 2X, 3X, 4X, etc. and overlay the spectra. A signal producing those harmonics will produce overlapping peaks in the overlay, with the maximum number of peaks overlapping at what could be a fundamental frequency. The low frequency spectra or the fundamental may need to be interpolated. Also consider the case of missing fundamentals. See:
       http://musicweb.ucsd.edu/~trsmyth/analysis/Harmonic_Product_Spectrum.html

Other Methods :

15 - Continuous quadrature (de)modulation with phase extraction (or a digital frequency discriminator)
     Fit a slope to the phase data to produce a frequency estimate (frequency is the derivative of phase). Perhaps use a least squares and/or low-degree spline fit to filter out noise. Could use a table lookup or Goertzel for the modulating signal. This seems like a boundary case of 1-sample phase vocoding, using quadrature multiplication instead of an FFT to extract phase information. This is how frequency is estimated in some FM radio and modem demodulation schemes.

15B - PLL
     Try locking a digitally synthesized PLL oscillator to the data. If it locks, use the synthesized frequency as the estimate. For stored or buffered data, the PLL can also be run backwards in "time", or even in both directions for constant frequency sources.

Other methods (which I haven't tried, YMMV) :

21 - RAPT (Robust Algorithm for Pitch Tracking)
     Appears to be similar to the Harmonic Product Spectrum, except that it uses cross-correlation lags at multiple resolutions instead of FFT spectra, plus dynamic programming to minimize a weighting function over several analysis frames. See:
       https://www.ee.columbia.edu/~dpwe/papers/Talkin95-rapt.pdf
     and:
       http://dsp.tut.su/irapt.html

21B - YAAPT (Yet Another Algorithm for Pitch Tracking)
     Uses Normalized Cross Correlation (NCCF), peak picking, and dynamic programming using transition costs. See:
       http://bingweb.binghamton.edu/~hhu1/pitch/YAPT.pdf

17 - MUSIC
     See: http://cnx.org/content/m10588/latest/

18 - ESPRIT
     Another matrix eigenvector decomposition approach.

19 - Exponential decaying phasor curve fitting

20 - State Space embedding
     Useful if you are looking at a mildly bifurcated chaotic type of signal.

References:

See also:
   MIREX / Music Information Retrieval - Melody Extraction:
     http://www.music-ir.org/mirex/wiki/2014:Audio_Melody_Extraction
   Klapuri's publications on polyphonic pitch extraction:
     http://www.cs.tut.fi/~klap/iiro/
   Wikipedia page on Pitch Detection/Estimation:
     https://en.wikipedia.org/wiki/Pitch_detection_algorithm
The Goertzel algorithm produces about the same thing as the magnitude from one bin of a DFT/FFT (there will be differences in numerical accuracy). The bandwidth of one DFT bin varies in inverse proportion to the length N of the DFT/FFT (with the bin spacing being sr/N for a sample rate of sr).
A DFT, or Goertzel filter, for a finite length data vector, acts as if a rectangular window was applied to the data. Thus the resulting spectrum is roughly a Sinc function (sin(x)/x), with a main lobe plus symmetric decaying ripples to each side.
But, more precisely, the result for real-valued data is the sum of two Dirichlet kernels, or periodic Sincs. This is because: (1) the spectrum of sampled data repeats in the frequency domain; and (2) the spectrum of real data includes two mirrored complex conjugate spectra. So you end up with two periodic trains of Sinc functions, repeating at the frequency of the window period, each periodic train having the opposite (conjugate) phase of the other.
So, including these effects for the one bin of a DFT, zero-padded DFT, or a Goertzel filter, I get this more precise result for the frequency response:
Given:
   ft     = frequency of the test sinusoid
   ph(ft) = starting phase of the test sinusoid (see the phase note below)
   sr     = sample rate
   n      = the length of the rectangular window (left justified)

Main lobe magnitude at frequency f :
   Hm(f) = sin(n * pi*(ft-f)/sr) / (n * sin(pi*(ft-f)/sr))
   (this is a periodic Sinc, or Dirichlet kernel)

Complex conjugate (negative frequency) lobe magnitude at f :
   Hc(f) = sin(n * pi*(ft+f)/sr) / (n * sin(pi*(ft+f)/sr))

The squared magnitude of the DFT/FFT bin at f (summing the two lobes by the law of cosines) is then:
   M(f)^2 = Hm(f)^2 + Hc(f)^2 + 2*Hm(f)*Hc(f)*cos(2 * ph(ft))
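For reference, here are those equations written as a pair of small Basic functions in the same style as the recipes above; dirichlet() and binmagsq() are my names for illustration, not standard functions, and the 0/0 case of the periodic Sinc is handled by returning its small-argument limit of 1.

     rem - Magnitude-squared response of one DFT/Goertzel bin at frequency f
     rem   to a real test sinusoid of frequency ft and phase ph, summing the
     rem   two Dirichlet kernel lobes per the equations above.
     sub dirichlet(x, n)
       local d
       d = sin(x)
       dirichlet = 1                        : rem  limit value as x -> 0
       if d <> 0 then dirichlet = sin(n*x) / (n*d)
     end sub

     sub binmagsq(ft, ph, f, sr, n)
       local hm, hc
       hm = dirichlet(pi*(ft-f)/sr, n)      : rem  main lobe Hm(f)
       hc = dirichlet(pi*(ft+f)/sr, n)      : rem  negative frequency lobe Hc(f)
       binmagsq = hm*hm + hc*hc + 2*hm*hc*cos(2*ph)
     end sub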
The correction term Hc is "small" when the test frequency ft is not close to 0 or to sr/2 (i.e. when the peak falls in neither the lowest nor the highest few bins), and is thus usually ignored.
Also note that n * sin(w) is close to n * w for very small w, which allows simplifying the equations to the standard (non-periodic) Sinc formulation, H(f) = sin(n*w)/(n*w).
Note that the phase of the data, relative to the window center, does make a difference for spectra that are not strictly periodic in the window or aperture. This is due to the fact that, for real data, the negative frequency Sinc convolution is the complex conjugate of the positive frequency Sinc convolution, so the real and imaginary parts of the two Sincs sum with opposite signs.
This equation seems to work even if neither f nor ft is periodic in a window of length n. That turns out to be another difference between a 1-bin DFT and Goertzel. A DFT/FFT only calculates correlations against sinusoids that are periodic in the DFT aperture. A Goertzel filter can be used for correlation against any frequency, irrespective of window length or periodicity.
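Here is a minimal Goertzel sketch in the same Basic style, usable at an arbitrary (not necessarily bin-centered) frequency f. The indat() and alim names follow the resampler listing above and are assumptions; the result is scaled roughly like one bin of an un-normalized DFT of length alim, and sqr() is the classic Basic square root.

     rem - Goertzel magnitude at an arbitrary frequency f (a sketch).
     rem   indat(0..alim-1) = input samples, sr = sample rate.
     sub goertzelmag(indat, alim, f, sr)
       local i, w, cw, s0, s1, s2, re, im
       w  = 2*pi*f/sr
       cw = cos(w)
       s1 = 0 : s2 = 0
       for i = 0 to alim-1
         s0 = indat(i) + 2*cw*s1 - s2      : rem  resonator recurrence
         s2 = s1
         s1 = s0
       next i
       re = s1 - cw*s2                     : rem  real part (up to a unit-magnitude phase factor)
       im = sin(w)*s2                      : rem  imaginary part
       goertzelmag = sqr(re*re + im*im)    : rem  magnitude of the correlation
     end sub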
The above equation can also be used to compute the magnitude spectrum for the FFT of zero-padded data (with the padding added after the data, or on the right), or equivalently rectangularly windowed data, by setting n in the above equation to be the width of the rectangular data window instead of the width of the whole FFT. The even/oddness phase should be measured or estimated relative to the center of the rectangular window, not the DFT aperture.
(revised 2008-Feb-19, rhn)
Just for fun, I thought I'd try and list many of the misconceptions that people seem to have regarding using the FFT, as seen in various previous posts to comp.dsp. The FFT is a widely used algorithm because the result has something (not precisely understood) to do with frequencies.

Some Common FFT Misconceptions:

1. The FFT will directly produce the frequency of some phenomena, without regard to the relationship between the FFT length versus the period of any oscillations (e.g. what rectangular window?), or any envelope on or modulation of some periodic waveform.

2. The FFT will find the correct frequency content without regard to the relationship between the sample rate and the highest frequency content present in the sampled phenomena, or any other need for bandlimiting (for example, of closing price data).

2b. The sample rate only needs to be twice the highest frequency present, or twice the highest frequency of interest, for an FFT to "find" that frequency.

3. Since the FFT always produces "frequencies", there must be some periodic phenomena within the sampled data.

4. The perceived pitch of a musical or speech waveform is the same as the frequency represented by the FFT magnitude peak.

5. Windowing (non-rectangular) will always result in a more accurate frequency estimation.

6. One can perfectly filter out a signal band by just zeroing out some bins between an FFT and IFFT.

7. One can filter in the frequency domain without regard to the length of the (significant) impulse response of the filter.

8. One can reconstruct a signal from just the FFT magnitude result.

9. One can reconstruct a real signal by just feeding the "positive frequency" bins to a generic IFFT.

10. One can extrapolate a signal from an FFT without inferring circular boundary conditions.

11. The (complex) FFT contains no information about the time domain envelope.

11b. The phase of a sinusoid (or how far that sinusoid is away from Fs/4) does not affect the magnitude result of an FFT.

12. Information from preceding, overlapping, or different length FFT frames is of no interest in analyzing the spectrum of some waveform.

13. The "resolution" of a frequency estimation depends only on the FFT length, and has nothing to do with anything known about the signal-to-noise ratio in the sampled data.

13b. More "resolution" of frequency estimation is created by zero-padding (...beyond that provided by other methods of interpolation).

14. An FFT is not just an efficient implementation of a DFT. (Or: "What's a DFT?")

15. Using an entire FFT is the best way to check for even a very small number of frequencies (say a DTMF tone).

16. The FFT can only operate on vector lengths that are an integer power of 2. (Or: FFT's of vectors not a power of 2 are "really" slow.)

17. The performance of an FFT implementation (on contemporary PC's) depends mostly on the number of arithmetic operations (or MACs or FMACs).

18. An FFT is less accurate than using the defining DFT formula for computation.

19. ???

The above are common FFT misconceptions, or sometimes just poorly thought-out or stated conceptions that can lead to strange misinterpretations of the results of using an FFT.

First posted: 2008-March-28 to comp.dsp
Last edited: 2008-March-31
Online Papers and Books
Ron Nicholson's Home Page & email: rhn@nicholson.com
I also post in comp.dsp signing as: rhn A.T nicholson d.0.t C-o-M
Copyright 2006,2008,2011,2012 Ronald H Nicholson, Jr.
Answering iOS/iPhone programming and DSP questions on DSP StackExchange as hotpaw2