In-sensor processing of dynamic and static information about visual objects avoids exchanging redundant data between physically separated sensing and computing units, holding promise for computer vision hardware. To this end, gate-tunable photodetectors, if built in a highly scalable array form, would lend themselves to large-scale in-sensor visual processing because of their potential for volume production and, hence, parallel operation. Here we present two scalable in-sensor visual processing arrays based on dual-gate silicon photodiodes, enabling parallelized event sensing and edge detection, respectively. Both arrays are built in CMOS-compatible processes and operate with zero static power. Furthermore, their bipolar analog output captures the amplitude of event-driven light changes and the spatial convolution of optical power densities at the device level, a feature that helps boost their performance in classifying dynamic motions and static images. Capable of processing both temporal and spatial visual information, these retinomorphic arrays suggest a path towards large-scale in-sensor visual processing systems for high-throughput computer vision.
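
As a purely numerical illustration of the two operations named above, the following minimal sketch models event sensing as a signed frame-to-frame intensity difference and edge detection as a spatial convolution with signed weights. It is not a model of the dual-gate photodiodes themselves: the function names, the `threshold` parameter, and the Laplacian-style kernel are assumptions chosen for illustration, standing in for weights that would in practice be set by the gate voltages.

```python
# Illustrative sketch only: names, threshold, and kernel are assumptions,
# not the measured response of the devices described in the text.

def event_signal(prev_frame, curr_frame, threshold=0.0):
    """Signed per-pixel intensity change between two frames.

    Changes whose magnitude does not exceed `threshold` are reported as 0,
    so a static scene produces no output.
    """
    out = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        row = []
        for p, c in zip(prev_row, curr_row):
            d = c - p
            row.append(d if abs(d) > threshold else 0.0)
        out.append(row)
    return out


def convolve_valid(image, kernel):
    """Weighted sum of every image patch with signed weights.

    Only positions where the kernel fits entirely inside the image are
    computed ('valid' region); the kernel is applied without flipping.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(w - kw + 1)
        ]
        for r in range(h - kh + 1)
    ]


# Hypothetical edge-detection kernel with positive and negative weights.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

# Temporal example: brightening, dimming, and an unchanged pixel.
events = event_signal([[1.0, 4.0, 2.0]], [[3.0, 1.0, 2.0]])

# Spatial example: a vertical step edge between dark (0) and bright (1) columns.
step = [[0, 0, 1, 1]] * 4
edges = convolve_valid(step, LAPLACIAN)
```

In this sketch both outputs are bipolar: `events` takes positive and negative values according to the direction of the light change, and `edges` changes sign across the step while a uniform image yields zero everywhere.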