I'm using OpenGL but this question should apply generally to rendering.
I understand that for efficient rendering in games, you want to minimize communication between the CPU and the GPU. That means pre-loading as much vertex data as possible into graphics memory before a level starts (during the loading screen), updating the view and projection matrices for your camera just once per frame, and letting the vertex shaders scale, rotate, and translate models as required to render the scene. With instanced rendering on top of this, you can also minimize draw calls.
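For concreteness, my static-geometry path looks roughly like this (assuming a GL 3.3+ context and GLM for matrix types; `Vertex`, `u_view`, `u_proj`, and the count variables are placeholders for my own code):

```cpp
// At level load: upload the static vertex data once.
// GL_STATIC_DRAW hints that the buffer contents won't change.
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, vertexCount * sizeof(Vertex),
             vertices, GL_STATIC_DRAW);

// Once per frame: update only the camera matrices...
glUseProgram(program);
glUniformMatrix4fv(glGetUniformLocation(program, "u_view"),
                   1, GL_FALSE, &view[0][0]);
glUniformMatrix4fv(glGetUniformLocation(program, "u_proj"),
                   1, GL_FALSE, &proj[0][0]);

// ...then draw many copies of the same mesh in a single call.
glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
                        nullptr, instanceCount);
```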
This works great for static geometry that never changes. But I don't understand how you're supposed to minimize CPU-GPU communication when lots of objects are moving. Isn't that impossible?
If only the CPU knows the new position of every moving object each frame, then it must somehow pass that data to the GPU. If it passes the updated model matrices to the vertex shader as uniforms, isn't that the CPU talking to the GPU? Isn't that exactly the slow path we're supposed to avoid?
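If I do the naive thing, every frame ends up looking something like this (`Object`, `movingObjects`, and `computeModelMatrix` are placeholders for my own code):

```cpp
// Every frame, for every moving object: compute the new model
// matrix on the CPU, send it to the GPU, and issue a draw call.
GLint modelLoc = glGetUniformLocation(program, "u_model");
for (const Object& obj : movingObjects) {
    glm::mat4 model = obj.computeModelMatrix(); // CPU-side transform
    glUniformMatrix4fv(modelLoc, 1, GL_FALSE, &model[0][0]);
    glDrawElements(GL_TRIANGLES, obj.indexCount, GL_UNSIGNED_INT, nullptr);
}
```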
I understand we can use uniform buffer objects instead of updating uniforms individually, but this still means sending lots of data from the CPU to the GPU.
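With a UBO I can at least batch the upload into a single call, but the data still crosses the bus every frame, e.g. (`matrixUbo` and `gatherModelMatrices` are placeholders, with the UBO created up front using GL_DYNAMIC_DRAW):

```cpp
// One uniform buffer holding a model matrix per moving object,
// re-uploaded in a single call each frame.
std::vector<glm::mat4> modelMatrices = gatherModelMatrices(movingObjects);
glBindBuffer(GL_UNIFORM_BUFFER, matrixUbo);
glBufferSubData(GL_UNIFORM_BUFFER, 0,
                modelMatrices.size() * sizeof(glm::mat4),
                modelMatrices.data());
glBindBufferBase(GL_UNIFORM_BUFFER, 0, matrixUbo); // binding point 0
```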
How can you efficiently render lots of moving objects whose model matrices change every frame?