This paper considers a three-layer perceptron with one hidden layer and an output layer consisting of a single neuron. This is a commonly used architecture for regression problems, in which one seeks a perceptron minimizing the mean squared error criterion over the data points (x(k), y(k)), k = 1, ..., N. It is shown that in the model y(k) = g0(x(k)) + epsilon(k), k = 1, ..., N, where x(k) is independent of the zero-mean error term epsilon(k), this procedure is consistent as N --> infinity, provided that g0 can be represented by a three-layer perceptron with the Heaviside transfer function. The same result holds when the transfer function is an arbitrary continuous function with bounded limits at +/- infinity and the hidden-to-output weights in the considered family of perceptrons are bounded.
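As a concrete illustration of the setup (not of the paper's proof), the following minimal sketch fits such a perceptron, one hidden layer with a continuous transfer function bounded at +/- infinity (tanh) and a single linear output neuron, by gradient descent on the empirical MSE for data generated from the model y(k) = g0(x(k)) + epsilon(k). The choice of g0, the hidden width, the optimizer, and the weight bound are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from the model y(k) = g0(x(k)) + epsilon(k):
# g0 is an illustrative true regression function; epsilon(k) is
# zero-mean noise drawn independently of x(k).
N = 2000
x = rng.uniform(-3.0, 3.0, size=(N, 1))

def g0(x):
    return np.sin(x)

y = g0(x) + 0.1 * rng.standard_normal((N, 1))

# Three-layer perceptron: one hidden tanh layer (continuous, with
# bounded limits at +/- infinity) and one linear output neuron.
H = 16                                  # hidden width (illustrative)
W1 = 0.5 * rng.standard_normal((1, H))
b1 = np.zeros(H)
W2 = 0.5 * rng.standard_normal((H, 1))  # hidden-to-output weights
b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # Forward pass.
    h = np.tanh(x @ W1 + b1)            # hidden activations, shape (N, H)
    yhat = h @ W2 + b2                  # network output, shape (N, 1)
    resid = yhat - y
    mse = np.mean(resid ** 2)           # empirical MSE criterion

    # Backward pass: gradients of the MSE w.r.t. all parameters.
    d_yhat = 2.0 * resid / N
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = (d_yhat @ W2.T) * (1.0 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = x.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient step; clipping W2 keeps the hidden-to-output weights
    # bounded, mirroring the boundedness assumption in the result.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 = np.clip(W2 - lr * dW2, -10.0, 10.0)
    b2 -= lr * db2

print(f"final empirical MSE: {mse:.4f} (noise variance is 0.01)")
```

Consistency here means that, as N --> infinity, the minimizer of the empirical MSE over the considered family of perceptrons converges to the true regression function g0; the sketch only exhibits the estimation procedure whose consistency the paper establishes.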