As far as the monitor is concerned, there really is no concept of a pixel. Monochrome monitors used a single beam and had no mask structure; the entire inside face of the CRT was coated in phosphor. From the monitor's point of view, the pixel was defined for the most part by the size of the electron beam. The concept is the same in a color monitor, only you have three beams and a mask to ensure the correct beam hits the correct color phosphor. For all the monitor cares, the video card could be building each line out of millions of active logical pixels. I like to think of a logical pixel as the speed at which the video card modulates the beam, which determines the effective beam size.
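To put rough numbers on the "modulation speed" idea, here is a minimal sketch. The 20 microsecond active line time and 360 mm visible width are made-up illustrative figures, not taken from any particular monitor, and the function names are hypothetical. It only shows that packing more logical pixels into the same scan line means the card has to change the beam faster, and each pixel gets a narrower slice of the line.

```python
def pixel_dwell_ns(active_line_time_us: float, pixels_per_line: int) -> float:
    """Time the beam spends on one logical pixel, in nanoseconds.

    active_line_time_us is the visible part of one scan line in
    microseconds (an assumed figure for illustration).
    """
    return active_line_time_us * 1000.0 / pixels_per_line


def pixel_width_mm(visible_width_mm: float, pixels_per_line: int) -> float:
    """Width on the screen face covered by one logical pixel, in mm."""
    return visible_width_mm / pixels_per_line


if __name__ == "__main__":
    # Same scan line, different numbers of logical pixels per line.
    for pixels in (800, 1600):
        dwell = pixel_dwell_ns(20.0, pixels)
        width = pixel_width_mm(360.0, pixels)
        print(f"{pixels} px/line: {dwell:.1f} ns per pixel, {width:.3f} mm wide")
```

The monitor itself does nothing different in the two cases; only the rate at which the video signal changes along the line is different.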
The number of holes in the mask, or stripes in the aperture grille, technically sets the maximum resolution of the monitor. At lower resolutions, the logical pixels simply cover more than one hole or slot. The logical pixels do not need to line up with the physical holes or slots, nor is there any mechanism to make them do so.
At resolutions that exceed the number of holes or slots across the screen, logical pixels (electron beam size) no longer hit the phosphors accurately enough to guarantee consistent color or luminance. Some of the beam is intercepted by the mask structure. On monitors with lower horizontal dot / aperture pitch, more of the beam is intercepted by the mask. However, an image will still be displayed and in practice will look OK.
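As a rough way to see where that limit falls, the sketch below estimates how many mask openings sit across the screen and how many of them each logical pixel covers. It assumes the openings are evenly spaced at the horizontal pitch across the visible width, which is a simplification; the 360 mm width and 0.25 mm pitch are illustrative values and the function names are hypothetical.

```python
def mask_openings_across(visible_width_mm: float, horizontal_pitch_mm: float) -> float:
    """Approximate number of holes or slots across the visible screen width."""
    return visible_width_mm / horizontal_pitch_mm


def openings_per_pixel(visible_width_mm: float,
                       horizontal_pitch_mm: float,
                       pixels_per_line: int) -> float:
    """How many mask openings one logical pixel spans horizontally."""
    openings = mask_openings_across(visible_width_mm, horizontal_pitch_mm)
    return openings / pixels_per_line


def exceeds_mask(visible_width_mm: float,
                 horizontal_pitch_mm: float,
                 pixels_per_line: int) -> bool:
    """True when there are more logical pixels per line than mask openings."""
    return openings_per_pixel(visible_width_mm, horizontal_pitch_mm,
                              pixels_per_line) < 1.0


if __name__ == "__main__":
    width_mm, pitch_mm = 360.0, 0.25  # assumed example monitor
    print(f"openings across: {mask_openings_across(width_mm, pitch_mm):.0f}")
    for pixels in (800, 1280, 1600):
        ratio = openings_per_pixel(width_mm, pitch_mm, pixels)
        over = exceeds_mask(width_mm, pitch_mm, pixels)
        print(f"{pixels} px/line: {ratio:.2f} openings per pixel, exceeds mask: {over}")
```

With these example figures there are about 1440 openings across the screen, so 800 and 1280 pixels per line each cover more than one opening, while 1600 pixels per line is past the point described above. The picture still shows up; it just stops being guaranteed that every logical pixel lands on a full set of phosphors.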