I'm not sure how we got on the CPU subject, but I'm continuously fascinated by how a CPU works. I was told an introduction to processors can be a really cool class to take. Especially if you want to get into programming.
Knowing HOW and WHY things work can help when it comes to debugging software. For writing software, not so much. Sadly, most programmers seem to think that debugging is someone else's problem. That's the difference between a programmer or developer and an ENGINEER.
It seems like there would eventually be an easier way to process code than going through multiple stages. You would think a single method of compiling and decoding would be faster....
From a pure performance standpoint, it's true that you pay a penalty. This is the same discussion as assembly vs. C vs. Java vs. Perl vs. PHP vs. Ruby; how much are you willing to sacrifice in the way of performance in order to make your development easier?
I'm quite certain that a CPU designed from the ground up, with every gate, wire, and via carefully placed and calculated for optimum usage, would be able to reach power and performance thresholds current CPUs can only dream of. I'm also quite certain it would take at least a decade to design each version, let alone to move between generations. And fixing bugs that arise in the field would be orders of magnitude more difficult, if it were possible at all.
EVERYTHING, from the lowliest lifeforms to the mightiest achievements of mankind, works on the basis of modules: design concepts that are isolated in themselves and which can then be replicated, tweaked slightly, and replicated some more. This is the only way in which anything of significant complexity can be constructed.
Think of it this way:
Two objects, A and B. Each talks to the other. Two connections. Something goes wrong, easy to figure out where and fix.
Now three objects, A, B, C. Each talks to all of the others. Now you've got six connections; three times the complexity. Simple to troubleshoot, right? If A is talking to C and C starts acting strange, C must be the problem? But what if C started acting strange because B did something stupid?
Now let's go to four objects, A-D. Now we've got twelve connections. A starts acting weird and crashes. Did A crash because of a bug? First glance says 'of course'. But when a program is handed unreasonable input, sometimes the safest thing to do is to halt the system; to crash. So what caused A to crash: was it input from B, C, or D? Worse, was it input from B that caused D to misbehave, which then caused A to crash?
Now expand that to a hundred objects, or a thousand.
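Just to put numbers on that growth, here's a quick back-of-the-envelope sketch (my own illustration; I'm counting each directed A-to-B link as its own connection):

```python
def connections(n):
    # Each object talks to every other object, in both directions,
    # so there are n * (n - 1) directed connections.
    return n * (n - 1)

for n in (2, 3, 4, 100, 1000):
    print(n, "objects ->", connections(n), "connections")
# 2 objects -> 2 connections
# 3 objects -> 6 connections
# 4 objects -> 12 connections
# 100 objects -> 9900 connections
# 1000 objects -> 999000 connections
```

The number of interactions that can go wrong grows roughly with the square of the number of objects, not linearly.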
And we haven't even covered re-entrant situations, where A calls to B, which calls back to A, which calls to C, which calls D, which then calls B again.
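A toy sketch of that kind of tangled call chain (the names and the depth cap are invented purely for illustration):

```python
LIMIT = 4  # cap the depth so the illustration actually terminates

def a(depth=0):
    print("  " * depth + "A")
    if depth < LIMIT:
        b(depth + 1)

def b(depth):
    print("  " * depth + "B")
    if depth < LIMIT:
        a(depth + 1)   # B calls back into A -- re-entrancy
        c(depth + 1)

def c(depth):
    print("  " * depth + "C")
    if depth < LIMIT:
        d(depth + 1)

def d(depth):
    print("  " * depth + "D")
    if depth < LIMIT:
        b(depth + 1)   # ...and D calls B again

a()
```

Following the printed indentation makes it obvious how quickly "who called whom, and why" stops being a simple question.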
Being able to control and define the interactions between objects is the only way to successfully build complex systems in any kind of reasonable timeframe, and with any reasonable level of reliability.
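A minimal sketch of what "defining the interactions" can look like in practice (hypothetical names, assuming a Python-style codebase; the point is only that one object reaches another through a small, explicit interface instead of poking at its internals):

```python
from abc import ABC, abstractmethod

class Storage(ABC):
    """The ONLY way anyone is allowed to talk to the storage object."""
    @abstractmethod
    def put(self, key: str, value: str) -> None: ...
    @abstractmethod
    def get(self, key: str) -> str: ...

class InMemoryStorage(Storage):
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data[key]

class Cache:
    # Cache never touches storage internals; it only uses the interface,
    # so its interactions with storage are limited to two known calls.
    def __init__(self, storage: Storage):
        self._storage = storage
    def remember(self, key, value):
        self._storage.put(key, value)

Cache(InMemoryStorage()).remember("answer", "42")
```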
Do ARM processors work in the same fashion as you have explained?
AFAIK, pretty much ALL CISC architectures use microcode. The ARM processors are different because they are RISC. They don't use microcode in the traditional sense (generally no microcode ROM), but they DO use a hardware description language (HDL) to describe the instruction set. This gets compiled down to a set of instruction decode blocks, which are then implemented as gates in hardware. So the same modular concept exists, but it's all translated to the hardware level. ARM can do this because the instruction set is highly orthogonal and fairly simple in its base complexity (the whole point of RISC).
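As a very loose analogy only (a toy sketch, nothing like a real decoder; the opcodes and bit fields are invented for illustration): a microcoded CISC-style core effectively looks an instruction up in a stored table of micro-op sequences, while a RISC-style core decodes fields straight out of the instruction word with fixed logic.

```python
# CISC-style: the instruction indexes a microcode ROM holding a sequence
# of simpler micro-ops that the core then steps through.
MICROCODE_ROM = {
    "ADD_MEM": ["load_operand", "alu_add", "store_result"],
    "INC_REG": ["alu_add_one"],
}

def cisc_decode(opcode):
    return MICROCODE_ROM[opcode]

# RISC-style: no stored micro-program; fixed logic pulls fields directly
# out of the instruction word (this field layout is made up).
def risc_decode(instr_word):
    return {
        "opcode": (instr_word >> 26) & 0x3F,
        "rs":     (instr_word >> 21) & 0x1F,
        "rt":     (instr_word >> 16) & 0x1F,
    }

print(cisc_decode("ADD_MEM"))    # ['load_operand', 'alu_add', 'store_result']
print(risc_decode(0x21230005))   # {'opcode': 8, 'rs': 9, 'rt': 3}
```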
It's also worth noting that in the case of ARM, the vast majority of designs are implemented by hardware manufacturers who can use VHDL to program FPGAs to simulate the final hardware, and run the intended software on it. As such, in most cases the target system is well defined and understood, making optimizations possible that wouldn't be possible under more general-use conditions. (Note I said the vast majority of DESIGNS: ARM CPUs are present in the majority of cell phones, so the majority of ARM CPUs shipped end up in those devices, but cell phones account for only a tiny number of ARM-based DESIGNS compared to TVs, refrigerators, microwaves, smart thermostats, medical devices, cars, etc.)