The modular structure of the Linux kernel permits optimization of its run-time size. Network-support modules can be loaded so that the embedded system can transfer its log and data files to a central server, and then removed from the kernel until they are needed again. The same holds for file system support. On low-memory systems this size reduction can have a noticeable effect on performance.
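The load, use, unload cycle described above can be sketched as a small Python helper. This is only an illustration under stated assumptions: it assumes the standard `modprobe` utility is available and that the script runs with sufficient privileges. The names `with_modules` and `run_checked` are hypothetical, as is the module name in the usage note. On a truly minimal target the same sequence would more likely be written as a short shell script.

```python
import subprocess
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def with_modules(
    modules: Sequence[str],
    action: Callable[[], T],
    run: Callable[[Sequence[str]], None] = run_checked,
) -> T:
    """Load the given kernel modules, run `action`, then unload them again.

    Modules are unloaded in reverse order of loading, and only those that
    were actually loaded are removed, even if `action` raises.
    """
    loaded = []
    try:
        for name in modules:
            run(["modprobe", name])        # load module and its dependencies
            loaded.append(name)
        return action()
    finally:
        for name in reversed(loaded):
            run(["modprobe", "-r", name])  # remove module from the kernel
```

A call such as `with_modules(["my_nic_driver"], upload_logs)` would then keep the network driver in memory only for the duration of the hypothetical `upload_logs` function. The `run` parameter exists so the command sequence can be checked without touching the real kernel.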
A further kernel optimization is to reduce allocated resources, either through kernel parameters at boot time or by reducing resource limits in the kernel source. This does not really require kernel hacking, merely going through the kernel source files and trimming resources that an embedded system will not need (e.g., setting the maximum number of IDE devices to 2 instead of 8, or reducing the number of consoles to 4). These modifications can be made without going very deep into the kernel and are not too hard to reapply when a new kernel comes out. Trimming resources in this way will not substantially reduce the size of the compressed kernel image, but it will reduce the kernel's run-time memory footprint.
Many kernel resources can be tuned on your desktop system via the sysctl interface, which operates through /proc. By tuning settings on the full system you can find optimized values for your kernel. On a minimal system, sysctl support would be a real waste since it is very large, so the optimized values for free pages, page cache, etc. must be set in the kernel source and the kernel recompiled. This is not as bad as it sounds, since such optimization is rarely so application-specific that the values would need to change on a running system.
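For experimenting on the desktop system, sysctl values can be read and written as ordinary files under /proc/sys. The following Python sketch shows one way to do this. The helper names are hypothetical, and `vm.min_free_kbytes` in the usage note is only an example that assumes a reasonably recent kernel; the tunables actually available depend on the kernel version. Writing a value normally requires root privileges.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name to its file under /proc/sys.

    Simplification: every dot is treated as a path separator, so names
    whose components themselves contain dots are not handled.
    """
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> str:
    """Return the current value of a sysctl entry as a string."""
    return sysctl_path(name, root).read_text().strip()


def write_sysctl(name: str, value, root: str = "/proc/sys") -> None:
    """Write a new value to a sysctl entry (normally requires root)."""
    sysctl_path(name, root).write_text(f"{value}\n")
```

For example, `read_sysctl("vm.min_free_kbytes")` returns the current setting, and `write_sysctl("vm.min_free_kbytes", 4096)` changes it for the running system. Once a good value has been found this way, it can be carried over into the kernel source for the minimal system.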