
In the tech world, a filter is a circuit or algorithm that you can use to manipulate a signal and toss out and/or keep parts of the signal that you don't/do like. Think of the equalizer in your stereo. It allows you to emphasize more of the high frequencies (a "highpass" filter) or the lower ones ("lowpass" filter).
For a long time filters were implemented purely with passive components  capacitors, resistors and coils. These worked well enough, but always lost some percentage of the signal sent into them. Then people started using active filters with transistors and opamps. These didn't diminish (necessarily) signal power, but they always introduced noise.
(If you should ever need to design a passive or active filter circuit, I've been fairly happy with FilterFree from NuSoft for doing amateur stuff.)
Most recently, Digital Signal Processing (DSP) has become all the rage. In this technique, you digitize your signal into samples with an analog to digital converter and then use some algorithm (aka a "digital filter") on them to do what you want. The two major classes of digital filters are the Infinite (IIR) and Finite Impulse Response (FIR) filters. The major difference is that a FIR is fed with a constant input value for long enough, it will always settle down to outputting a constant value. This is known as convergence. IIRs are not guaranteed to converge, even with a stable input.
Digital filters in general work by multiplying on each of N previous samples by some chosen coefficient, and then summing up the results. CPUs specifically designed for DSP work often have special "multiply and accumulate" instructions built into hardware to make this operation as fast as possible.
A Moving Average (sometimes called a "running average") is a kind of digital lowpass filter. In other words, it's an algorithm you can use on a set of samples that will smooth them out if they're jumpy. Consider a stock chart. Some days the stock price will be higher, and some lower, but on average the stock is generally around some average price. A lowpass filter can tell you what that average price is.
A simple example is illustrative here: Suppose we take 1/3rd of today's stock price, plus 1/3rd of the previous day's stock price, plus 1/3rd of the stock price the day before that. Add them all up, and you have the average stock price over today and the two previous days. The math is strictly equivalent to doing (S1 + S2 + S3) / 3  we're just doing (1/3)S1 + (1/3)S2 + (1/3)S3 to make our lives easier(??).
This "onethird of each sample for the last three days" algorithm is known as the "Simple Moving Average." The SMA is easy enough to calculate, and its calculation can be simplified further. If you write down the equations today's 3day SMA vs yesterday's 3day SMA, you'll notice that many of the terms are identical. This means the new SMA can be computed from the current SMA by:
newSMA = oldSMA + (newSample/N  oldestSample/N)
Here, "oldestSample" is the stock price N days ago.
Unfortunately the SMA has some weaknesses. First, you have to keep around all the samples back to N days ago, since you need the last one to calculate today's SMA. And as is true of all averages, SMAs also do not respond to trends instantly. In the SMA's case, sometimes this lag can be extremely bad.
The "Linearly Weighted Moving Average." One way to allow better tracking of trends is to give greater weight to more recent samples. An easy way to do this is to multiply the sample from N days ago by (N  number of days back). This devalues the past day's prices in a linear fashion. Unfortunately its calculation is more complicated, and still involves keeping around quite a few samples from the past.
So finally, we come to the heart of things! The Exponential Moving Average places less and less weight on older samples, but in an exponential fashion. Which means today's and yesterday's samples count for a lot, but nine and ten days ago's count for very little. This is a nice feature when you're trying to smooth out a noisy input whose true value is also fast changing. You'll reject most of the noise, but also won't lag the real signal by much.
In addition, while it's not exactly the strict equivalent of the actual formula, the EWMA can be calculated extremely quickly and compactly with the equations:
newEWMA = (1/10)newSample + (9/10)oldEWMA
or, equivalently
newEWMA = (newSample + (N1)oldEWMA) / N
In the first form, this formula involves only three operations, so is quite quick. With the second form, a proper choice of N will allow the divide to be optimized to singleop bit shift. (E.g., dividing by 8 is the same as rightshifting three bits.) In either case, these equations can be reasonably accurate even with integer arithmetic. Furthermore, both formulas require only the new sample and the old EWMA to compute the new EWMA  no record keeping is necessary. And N can be twiddled to give you as quick or slowly changing a moving average as you deem necessary. The general idea ("one Nth of the new, plus N1 Nth's of the old") also has a pleasing conceptual simplicity and thus is a snap to remember and implement.
In my case, I had noisy data (gear count rates) coming in and wanted to smooth them out before comparing them for large differences. Which in this case might indicate a stuck motor that couldn't turn its gear. The EWMA worked quickly and accurately even in the restricted confines of the integers only, 1 MHz clock microcontroller I was using. So if you ever need a simple lowpass digital filter to smooth out incoming data, remember the Exponential Moving Average.