Bad smells often indicate a potential problem in a software design that can hinder comprehension and maintenance. As a system evolves through successive changes and maintenance activities, the bad smells embedded in it may also evolve, and new smells may be introduced. Existing research on bad smells mostly targets textual, code-based implementations; we found very little research on bad smells in systems designed with graphical languages, which are widely used in industry. This paper presents our analysis of the evolution of four bad smells in 575 Simulink models across 31 open-source repositories. We conducted the analysis by creating a chain of model-driven tools to support its different steps. First, we extracted the evolution history of Simulink models from GitHub. Next, we manually classified each version into a maintenance category (adaptive, preventive, corrective, or perfective). Then, we developed queries to detect instances of the four selected bad smells. Finally, we analysed the evolution of each smell across the version history of the repositories, the relationship between the smells and the size of the models, and the impact of maintenance activities on the evolution of the identified smells. The results suggest that: 1) larger models tend to contain more types of smells; 2) an increase in the number of smell instances is usually associated with an increase in model size, but an increase in model size does not necessarily imply an increase in the number of smells; 3) the majority of bad smells are introduced during the initial construction of the models, although a significant portion is introduced at later stages; and 4) adaptive maintenance tasks often lead to an increase in the number of smells in Simulink models, whereas corrective maintenance tasks often correlate with a decrease in the number of smells.
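The final analysis step, relating maintenance categories to changes in smell counts, can be illustrated with a minimal sketch. It is not the tool chain used in the study: the `Version` record, its field names, the `smell_trends` function, and the sample data are all hypothetical, and the sketch assumes that smell counts and maintenance categories have already been obtained for each version.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One committed version of a model (all fields are illustrative assumptions)."""
    model: str      # hypothetical model file name
    index: int      # position in the model's history, 0 = initial version
    size: int       # assumed size metric, e.g. number of blocks
    smells: int     # number of smell instances detected in this version
    category: str   # manually assigned maintenance category


def smell_trends(versions):
    """Tally, per maintenance category, whether the smell count rose, fell, or stayed the same."""
    by_model = defaultdict(list)
    for v in versions:
        by_model[v.model].append(v)

    trends = defaultdict(Counter)
    for history in by_model.values():
        history.sort(key=lambda v: v.index)
        # Compare each version with its predecessor in the same model;
        # the change is attributed to the category of the newer version.
        for prev, curr in zip(history, history[1:]):
            delta = curr.smells - prev.smells
            if delta > 0:
                direction = "increase"
            elif delta < 0:
                direction = "decrease"
            else:
                direction = "unchanged"
            trends[curr.category][direction] += 1
    return trends


# Made-up sample data, for illustration only (not taken from the studied repositories).
HISTORY = [
    Version("controller.slx", 0, 40, 3, "adaptive"),
    Version("controller.slx", 1, 55, 5, "adaptive"),
    Version("controller.slx", 2, 55, 4, "corrective"),
    Version("plant.slx", 0, 20, 1, "perfective"),
    Version("plant.slx", 1, 30, 1, "adaptive"),
]

if __name__ == "__main__":
    for category, counts in smell_trends(HISTORY).items():
        print(category, dict(counts))
```

In this sketch the net change in smell count between consecutive versions is used as a proxy, so a version that removes one smell and introduces another is counted as unchanged; a finer-grained analysis would need to track individual smell instances.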