Right. The central point is that whatever is displayed on a screen has to be created by some hardware and software, and the standard CPU and OS do not do that on their own. Instead, any application generating video output uses a standard set of codes for its visual elements (be they text in a particular font, or lines, or bit-mapped graphics images), and the OS merely passes those on to the video display hardware. That is commonly a video chip system installed on the mobo with its associated software in the BIOS, or an added graphics card with its own driver software loaded into Windows, or even a graphics "chip" that is integrated into the CPU chip. Anyway, the actual signals sent to the display are generated NOT by the OS, but by a dedicated separate system.
Then those signals need to be sent out to the display device, and there are standard cable and signal formats used for that. Ordinary USB data signaling of any form is NOT among those standard video formats (a USB-C port that supports DisplayPort Alt Mode is a separate case, since it carries a native DisplayPort signal over the connector rather than video packed into USB data). It should be possible, though, to design a system that generates those display data streams as just another stream of digital data packets that could be sent along a USB line. Considering the data rate required, that had better be USB3 speed or faster, on whatever connector (including USB-C). But of course those data stream formats are not standard for USB, and the other end of the cable would need a connector that meets the standard for a common digital input system such as HDMI. So the output, the cable, and the associated software used by the video card would all need to be custom stuff, and you could NOT do this through any "standard" USBx port.
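To put rough numbers on "the data rate required", here is a minimal Python sketch that compares the raw, uncompressed pixel rate of a display mode against nominal USB signaling rates. The resolutions, the 60 Hz refresh rate, the 24 bits per pixel and the function names are my own assumptions for illustration, and the figures ignore blanking intervals and USB protocol overhead, so real usable throughput is lower.

```python
# Raw (uncompressed) video bit rate versus nominal USB signaling rates.
# Resolution, refresh rate and color depth below are illustrative assumptions.

USB_RATES_MBPS = {
    "USB 2.0 (480 Mbit/s)": 480,
    "USB 3.0 (5 Gbit/s)": 5_000,
    "USB 3.1 Gen 2 (10 Gbit/s)": 10_000,
}


def raw_video_mbps(width, height, fps=60, bits_per_pixel=24):
    """Uncompressed pixel data rate in Mbit/s (no blanking, no protocol overhead)."""
    return width * height * fps * bits_per_pixel / 1_000_000


def links_that_fit(width, height, fps=60, bits_per_pixel=24):
    """Names of the USB generations whose nominal rate covers the raw video rate."""
    need = raw_video_mbps(width, height, fps, bits_per_pixel)
    return [name for name, rate in USB_RATES_MBPS.items() if rate >= need]


if __name__ == "__main__":
    modes = {"1080p": (1920, 1080), "4K": (3840, 2160)}
    for label, (w, h) in modes.items():
        need = raw_video_mbps(w, h)
        fits = links_that_fit(w, h) or ["none without compression"]
        print(f"{label}: {need:.0f} Mbit/s raw, fits: {', '.join(fits)}")
```

On these assumptions, 1080p at 60 Hz needs about 3 Gbit/s raw, which is far beyond USB 2.0 but within the nominal USB 3.0 rate, while 4K at 60 Hz exceeds even 10 Gbit/s unless the stream is compressed.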
As I indicated above, I expect that the system used by these enhanced USB Hubs involves a driver for the Hub that does the job of relaying the video data to the Hub instead of to a video card, and then that Hub contains its own video chip to generate the display signals and send them out of a standard video port. Similarly, the driver and Hub combine to process any associated audio data and generate the audio signals for output, too.
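Just to illustrate the idea of relaying display data as ordinary data packets, here is a toy Python sketch. It is NOT how any real Hub or its driver works; the `Packet` class, the function names and the chunk size are all hypothetical. One side (standing in for the driver) splits a frame into numbered packets, and the other side (standing in for the Hub) puts them back in order before its video chip would turn them into display signals.

```python
# Toy model only: a "driver" splits a frame into numbered packets and a
# "hub" reassembles them. All names and sizes here are hypothetical.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Packet:
    seq: int        # position of this chunk within the frame
    payload: bytes  # raw bytes of the chunk


def packetize(frame: bytes, chunk_size: int = 1024) -> List[Packet]:
    """Driver side: cut the frame into fixed-size, sequence-numbered chunks."""
    return [
        Packet(seq, frame[offset:offset + chunk_size])
        for seq, offset in enumerate(range(0, len(frame), chunk_size))
    ]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Hub side: restore the original frame even if packets arrive out of order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


if __name__ == "__main__":
    frame = bytes(range(256)) * 10           # stand-in for pixel data
    packets = packetize(frame, chunk_size=100)
    assert reassemble(reversed(packets)) == frame
    print(f"{len(packets)} packets sent and reassembled")
```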