Noriaki: The academic articles I've read about those types of caches (dating back 20 years or so) seemed to use the term "orthogonal," so if the term has been deprecated since then, my apologies.
Regarding the "uselessness" of the smaller L2 cache... not to nit-pick or anything, but there are several reasons why adding an L2 cache even though it is smaller than L1 would prove beneficial even if it is not "exclusive."
The main reason is that L2 caches are typically fully associative (or at least have much higher set associativity than L1 caches) and can therefore be used as "victim buffers" for conflict misses. It was once thought that conflict misses were somewhat rare, because the access patterns that consistently produce them were considered pathological. However, as system memories become larger and are used more heavily, conflict misses go up dramatically, especially in direct-mapped or two-way set associative caches.
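To make the victim-buffer idea concrete, here is a toy Python sketch. It is only an illustration under assumptions of my own choosing: the function name `simulate`, its parameters, the 64-set/64-byte geometry, and the two-address trace are all hypothetical, and a hit in the victim buffer is simply counted as a hit. It models a direct-mapped cache backed by a small fully associative LRU buffer that catches evicted lines.

```python
from collections import OrderedDict

def simulate(trace, num_sets, line_size, victim_entries):
    """Count hits for a direct-mapped cache with an optional fully
    associative LRU victim buffer (toy model, hypothetical names)."""
    l1 = [None] * num_sets      # one block per set: direct-mapped
    victim = OrderedDict()      # evicted blocks, oldest first
    hits = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_sets
        if l1[index] == block:
            hits += 1           # ordinary L1 hit
            continue
        evicted = l1[index]
        if block in victim:
            hits += 1           # conflict miss caught by the victim buffer
            del victim[block]
        l1[index] = block       # install the requested block in L1
        if evicted is not None and victim_entries > 0:
            victim[evicted] = True              # displaced line moves to the buffer
            if len(victim) > victim_entries:
                victim.popitem(last=False)      # drop the least recently inserted entry
    return hits

# Two addresses that map to the same set (assumed geometry: 64 sets x 64-byte lines).
trace = [0, 64 * 64] * 100

print(simulate(trace, 64, 64, 0))  # 0: every access is a conflict miss
print(simulate(trace, 64, 64, 4))  # 198: only the two cold misses remain
```

The ping-pong trace is the "pathological" case: without the buffer the two lines keep evicting each other, while even a few fully associative entries absorb the conflict entirely.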
However, in AMD's case you are quite correct, since a 64 KB victim buffer would most likely be dramatic overkill. In the simulations that I have run, I see virtually no improvement in hit rate on most SPEC benchmarks when increasing the size of a victim buffer beyond 1 KB.