Mirroring of file collections on the global Internet is widely practiced; a recent study estimates that 10% of all WWW hosts carry mirrored content. Conventional mirroring tools, however, are not well suited to the large-scale, multiple-site replication services envisioned by projects such as the Internet2 Distributed Storage Infrastructure (I2-DSI) project. This paper presents a scalable design for the automated synchronization of large collections of files replicated across multiple hosts, as in I2-DSI, and outlines how the design has been realized using rsync+, a modification of the popular open-source mirroring tool rsync. A performance study based on an instrumented mirror using rsync+ empirically characterizes server-side processing costs under realistic, large-scale workloads, and supplementary measurements of network throughput across Internet2 links illustrate the achievable network performance in a high-speed wide-area network. These experimental results confirm the validity of the scalability arguments for the design, uncover key system parameters for rsync+ that must be tuned for efficient operation, and indicate the limitations of TCP-only transport solutions as the number of mirror sites grows. (C) 2000 Academic Press.