Why will voice/face recognition require 10 GHz?
I don't want to start a flame war or anything, but I don't think more MHz will solve the problems in these technologies. Whenever the media reviews these things, they always say "current computers aren't fast enough" and "more speed is needed to increase accuracy" -- these statements are just cop-outs. If you look back to the 25 MHz era, you will see the same claims: "we need 100 MHz to make this work right." Then in the 100 MHz era: "we need 1 GHz." These technologies are always two speed generations away.
Computers are giant calculators. They do what they are told very well. That is, if we can write a workable algorithm, a computer can follow it perfectly. But we haven't written good algorithms for these problems yet. I remember IBM's ViaVoice product shipping way back when... it did its job reasonably well on a 100 MHz machine. We now have 2 GHz machines and nothing has improved. That is because the algorithm is the same, so the product still makes the same mistakes -- only faster. We can, of course, use the extra speed to make the algorithms more complex, but then we run into another fundamental flaw.
Humans are extremely good at pattern recognition and matching. Our evolution has made us quick to recognize words and faces. I don't know if you have noticed this, but when people speak, they produce a constant stream of syllables. There are no pauses between the words. Peopletalklikethis. You can demonstrate this by listening to someone speak a language you don't know -- the more foreign the better. When I hear Cantonese, it is just a continuous stream of sounds. The only reason you can understand English is that your brain is good at pattern recognition: it instantly puts separators between the sounds that make up words you know. When I listen to Japanese, it is also a constant stream of sounds... except for the words I know: ogawa, hie, kaijo, ichi, ni, nani, and a few others... everything else is just a blur. Then there is the problem of words that sound the same but mean different things. To a computer, "recognizing speech" sounds a lot like "wrecking a nice beach."
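To make the segmentation point concrete, here is a toy sketch (my own illustration, nothing to do with any real recognizer -- the mini-dictionary and the "sounds" are made up, with letters standing in for syllables). Even a short unbroken stream can split into more than one sequence of perfectly valid words:

def segmentations(stream, vocab, prefix=()):
    # Yield every way to split `stream` into words from `vocab`.
    if not stream:
        yield prefix
        return
    for word in vocab:
        if stream.startswith(word):
            yield from segmentations(stream[len(word):], vocab, prefix + (word,))

# Made-up mini-dictionary; a real recognizer juggles tens of thousands of words.
vocab = {"people", "talk", "like", "this", "peop", "let", "alk"}
for split in segmentations("peopletalklikethis", vocab):
    print(" ".join(split))
# prints both "people talk like this" and "peop let alk like this"

Both splits are "legal." Picking the right one takes context and judgment, not more clock cycles, and the ambiguity only gets worse as the dictionary grows.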
Facial recognition is even worse. The current algorithms measure distances between related facial features. But those features change. If I smile, the corners of my mouth move farther apart. If I'm tired, my eyes narrow. If you look at me from an angle, all of the ratios change. If the algorithm is strict, there will be many false negatives. If it makes allowances for variation, there will be very many false positives. Iris and fingerprint matching are much easier, since those ratios can't change.
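Here is roughly what I mean, as a made-up sketch of a naive geometric matcher (the landmark names, coordinates, and tolerances are all invented for illustration, not taken from any real system):

import math

def ratios(points):
    # Reduce a few landmark distances to scale-free ratios.
    eye_span   = math.dist(points["left_eye"], points["right_eye"])
    mouth_span = math.dist(points["mouth_l"], points["mouth_r"])
    nose_drop  = math.dist(points["nose"], points["mouth_l"])
    return (mouth_span / eye_span, nose_drop / eye_span)

def same_person(a, b, tolerance):
    # Accept only if every ratio agrees within the tolerance.
    return all(abs(x - y) <= tolerance for x, y in zip(ratios(a), ratios(b)))

neutral = {"left_eye": (0, 0), "right_eye": (6, 0),
           "nose": (3, 3), "mouth_l": (2, 5), "mouth_r": (4, 5)}
smiling = {"left_eye": (0, 0), "right_eye": (6, 0),
           "nose": (3, 3), "mouth_l": (1, 5), "mouth_r": (5, 5)}  # wider mouth

print(same_person(neutral, smiling, tolerance=0.05))  # strict: same face rejected
print(same_person(neutral, smiling, tolerance=0.50))  # loose: would wave nearly anyone through

Tighten the tolerance and a smile locks out the real owner; loosen it and strangers start matching. No amount of extra GHz moves you out of that trade-off.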
The problem is that these tasks don't easily translate into bits that can be filtered. It's not like subtracting two numbers and branching if the result is zero. They are inherently more complex and look almost random to a computer. We need to develop better techniques to solve them, rather than rely on brute force.
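In code terms, the contrast looks something like this (again, just an illustration with invented numbers):

# What computers are great at: exact, filterable comparisons.
a, b = 12345, 12345
if a - b == 0:      # subtract two numbers, branch on zero -- trivial
    print("identical")

# What recognition needs: two captures of the "same" thing are never
# bit-identical, so you end up scoring similarity and picking a cutoff.
take_one = [0.81, 0.43, 0.10, 0.77]   # made-up feature vectors standing in
take_two = [0.79, 0.47, 0.12, 0.74]   # for two recordings of the same word
score = sum((x - y) ** 2 for x, y in zip(take_one, take_two)) ** 0.5
print("match" if score < 0.1 else "no match")   # the cutoff is a guess

The first check is exact and answers itself. The second is a judgment call baked into a number, and that judgment is where the hard work lives.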
Just my 2 cents, take it as you will.