The difference between a typical color camera and a monochrome ("black and white") camera is an additional layer of small color filters on the sensor, usually arranged in the so-called "Bayer pattern" (patented in 1976 by Bryce E. Bayer, an employee of Eastman Kodak).
Bayer pattern on an image sensor, © Wikipedia
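As a concrete illustration, here is a minimal Python sketch of the mosaic (assuming the common "RGGB" variant of the tile; actual sensors may start the 2×2 tile with a different corner):

```python
import numpy as np

def bayer_mask(height, width):
    """Boolean masks marking which sensor pixels sit behind R, G, B filters."""
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    red   = (rows % 2 == 0) & (cols % 2 == 0)  # every other row and column
    blue  = (rows % 2 == 1) & (cols % 2 == 1)  # offset by one in both axes
    green = ~(red | blue)                      # the remaining checkerboard
    return red, green, blue

r, g, b = bayer_mask(4, 4)
print(g.sum() / g.size)  # 0.5 -> half of all pixels are green
```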
In an ideal world, a lens would map any (arbitrarily small) point on the object to an (arbitrarily small) point on the sensor, so pixels of any size could be supported.
Unfortunately, the laws of physics only allow a lens to illuminate small disks of light ("Airy disks"). The megapixel count specified for a lens is a rough measure of the size of these disks (we are talking micrometers here): the more megapixels, the smaller the disks.
In order for the image on a monochrome sensor to be in focus, the disks must fit within the footprint of a single pixel.
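As a rough plausibility check, the Airy disk diameter can be estimated with the standard diffraction approximation d ≈ 2.44 · λ · N (wavelength λ, f-number N); the wavelength, f-number, and pixel pitch below are illustrative values, not taken from this article:

```python
# Back-of-the-envelope estimate of the Airy disk diameter (simplified
# diffraction formula; all numbers here are example values).
wavelength_um = 0.55    # green light, in micrometers
f_number = 2.8          # lens aperture setting
pixel_pitch_um = 3.45   # example pixel size of a monochrome sensor

airy_diameter_um = 2.44 * wavelength_um * f_number
print(f"Airy disk: {airy_diameter_um:.2f} um vs. pixel: {pixel_pitch_um} um")
print("disk fits in one pixel:", airy_diameter_um <= pixel_pitch_um)
```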
On a color sensor with a Bayer pattern, there are red pixels only in every other column and every other row (the same applies to blue). Here, light disks that fit within a single pixel may even be undesirable:
The disks should be large enough that one red, one blue, and two green pixels are always covered, i.e. roughly 2×2 pixels in size. This means that, compared to a lens for a monochrome application/camera, the lens can, and perhaps even should, have a lower resolution. To avoid color moiré, we can either use such a lower-resolution lens, as stated above, or an (expensive!) so-called "OLPF" ("Optical Low-Pass Filter") that guarantees a minimum blur of 2 pixels. Such filters are required, for example, when customers request videoconferencing and want to avoid color moiré.
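To see why disks smaller than 2×2 pixels cause trouble, consider this hypothetical sketch: a colorless stripe pattern at exactly the pixel pitch lands only on red pixels in the bright columns and only on blue pixels in the dark ones, so the reconstructed colors come out strongly tinted:

```python
import numpy as np

# Sketch: a colorless scene with 1-pixel-wide vertical stripes, viewed through
# an RGGB Bayer mosaic. All red pixels sit in even columns (white stripes),
# all blue pixels in odd columns (black stripes) -> the colors come out wrong.
h, w = 4, 8
scene = np.broadcast_to(np.tile([1.0, 0.0], w // 2), (h, w))

red_samples  = scene[0::2, 0::2]   # even rows, even columns
blue_samples = scene[1::2, 1::2]   # odd rows, odd columns

print("red channel sees: ", red_samples.mean())   # 1.0 (only white stripes)
print("blue channel sees:", blue_samples.mean())  # 0.0 (only black stripes)
# A gray stripe pattern is reconstructed with a strong tint: color moire.
```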
To display such an image, the two missing color values at each pixel location must be reconstructed ("demosaicing"). This can be done because every pixel has direct neighbors in the other colors. At the location of a green pixel, for example, the red intensity is merely predicted(!) from the intensities of the red pixels in the direct neighborhood. As a side effect, the RGB image on the computer is three times as large as the raw data from the sensor!
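The simplest variant of this prediction is bilinear demosaicing, sketched below (assuming an RGGB layout and using the textbook bilinear kernels, with SciPy handling the convolution; real camera pipelines use more sophisticated, edge-aware algorithms):

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(raw):
    """Reconstruct a full RGB image from a raw RGGB Bayer frame."""
    h, w = raw.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    red   = ((rows % 2 == 0) & (cols % 2 == 0)).astype(float)
    blue  = ((rows % 2 == 1) & (cols % 2 == 1)).astype(float)
    green = 1.0 - red - blue

    # Interpolation kernels: a missing green value is the mean of 4 axial
    # green neighbors; missing red/blue values are predicted from 2 or 4
    # same-color neighbors (the center weight keeps measured values intact).
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    out = np.empty((h, w, 3))
    for ch, (mask, k) in enumerate([(red, k_rb), (green, k_g), (blue, k_rb)]):
        out[..., ch] = convolve(raw * mask, k, mode='mirror')
    return out

# The raw frame stores one value per pixel, the demosaiced result three:
rgb = demosaic_bilinear(np.random.rand(8, 8))
print(rgb.shape)  # (8, 8, 3)
```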