Note that these numbers don't nearly tell the whole story.
Usually the pipes or units are differentiated. For example, all of them may be able to do one SIMD integer add per cycle, but only some can do integer multiplication. Floating-point ops quite often aren't available on all pipes. Shuffle ops generally have lower than maximum throughput. Good CPUs can do them every cycle, but there may be only two pipes capable of them.
Sometimes it gets complicated, like when Intel's client cores with AVX-512 had "half-speed AVX-512". It was way more complex than that, since the integer ops were actually full speed and only the floating-point ones were not. The unit layout was quite complex, IIRC with three 256-bit floating-point units (that were 512-bit for integer!). And on the server version of the core, it was not as simple as those units being extended to 512 bits for floating point. It worked differently: two of the 256-bit pipes coupled into one 512-bit FMA pipe, and the third pipe received an extra dedicated, fully 512-bit FMA unit (tacked on as an additional block on the side of the original floorplan) to reach 2x 512-bit FMA pipes.
Ideally, you want an instruction table that tells you how many ops of each type can be executed per cycle (throughput) as well as their latency (the delay before the result is available, due to pipelining, which can also be important!).
You can have cases like a core having seemingly beefy SIMD units, but then you find that shuffle ops have poor throughput and multi-cycle latency, for example. Some code would not mind, but some would.
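To make the throughput vs. latency point concrete, here is a rough Python sketch of how you would read such a table. The numbers are made up for illustration and don't describe any real CPU, and the names (`OpInfo`, `TABLE`, the two helper functions) are just mine. It only computes simple lower bounds and ignores everything else a real core does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpInfo:
    per_cycle: float  # throughput: how many ops of this type can start each cycle
    latency: int      # cycles before the result is available


# Hypothetical instruction table. These values are invented, not measured.
TABLE = {
    "add": OpInfo(per_cycle=4, latency=1),
    "mul": OpInfo(per_cycle=2, latency=4),
    "shuffle": OpInfo(per_cycle=1, latency=3),
}


def independent_cycles(op: str, count: int) -> float:
    """Lower bound when no op depends on another (throughput-bound)."""
    return count / TABLE[op].per_cycle


def dependent_cycles(op: str, count: int) -> int:
    """Lower bound when each op needs the previous result (latency-bound)."""
    return count * TABLE[op].latency


if __name__ == "__main__":
    for op in TABLE:
        print(op, independent_cycles(op, 100), dependent_cycles(op, 100))
```

With these made-up numbers, 100 independent adds need at least 25 cycles, while 100 shuffles need at least 100 cycles if independent and 300 if each one waits on the previous result. That is the kind of gap that some code would not mind and some would.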