Scientific applications must be tuned for performance to run efficiently on supercomputers having nodes with a CPU (or, a general-purpose host processor) and GPUs (or, accelerator device processors). Conventional wisdom suggests focusing tuning of applications for a GPU and making the CPU only have the role of offloading computation to the GPU, given the CPU's relatively miniscule amount of computational power. However, this is overly conservative for modern scientific applications, which include those using scientific workflows with real-time data constraints and AI/ML with low numerical precision requirements. This work identifies new performance opportunities for modern scientific applications via CPU-GPU tuning, a strategy that unifies and integrates tuning of the CPU and GPU performance parameters. Applying CPU-GPU tuning to a dot product representative of these applications run on the widely-used Summit supercomputer results in up to an 8.15x speedup. These results provide groundwork for auto-tuning software for applications run on supercomputers having node-level heterogeneity.