We consider non-parametric estimation and inference of conditional moment models in high dimensions. Formally, we consider the problem of finding a parameter vector $\theta(x) \in \mathbb{R}^p$ that solves a set of conditional moment equations of the form
$$\mathbb{E}[\psi(Z; \theta(x)) \mid X = x] = 0, \qquad (1)$$
given $n$ i.i.d. samples $(Z_1, \ldots, Z_n)$ from the distribution of $Z$, where $\psi : \mathcal{Z} \times \mathbb{R}^p \to \mathbb{R}^p$ is a known vector-valued moment function, $\mathcal{Z}$ is an arbitrary data space, and $X \in \mathcal{X} \subseteq \mathbb{R}^D$ is the feature vector contained in $Z$. We show that even when the dimension $D$ of the conditioning variable is larger than the sample size $n$, estimation and inference are feasible as long as the distribution of the conditioning variable has small intrinsic dimension $d$, as measured by locally low doubling measures. Our estimator is based on a sub-sampled ensemble of the $k$-nearest neighbors ($k$-NN) $Z$-estimator. It solves a locally weighted empirical conditional moment equation
$$\hat{\theta}(x) \text{ solves: } \sum_{i=1}^{n} K(x, X_i, S)\, \psi(Z_i; \theta) = 0, \qquad (2)$$
where $K(x, X_i, S)$ is a kernel capturing the proximity of $X_i$ to the target point $x$. We consider weights $K(x, X_i, S)$ that take the form of an average over $B$ base weights,
$$K(x, X_i, S) = \frac{1}{B} \sum_{b=1}^{B} K(x, X_i, S_b)\, 1\{i \in S_b\},$$
where each $K(x, X_i, S_b)$ is computed from a randomly drawn sub-sample $S_b$ of size $s < n$ from the original sample. We show that if the intrinsic dimension of the covariate distribution equals $d$, then the finite-sample estimation error of our estimator is of order $n^{-1/(d+2)}$ and our estimate is $n^{-1/(d+2)}$-asymptotically normal, irrespective of $D$. The sub-sample size required to achieve these results depends on the unknown intrinsic dimension $d$. We propose an adaptive, data-driven approach for choosing this parameter and prove that it achieves the desired rates. We discuss extensions and applications to heterogeneous treatment effect estimation.
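To make the ensemble weighting in (2) concrete, the following is a minimal sketch, not the paper's implementation: it assumes the simple moment function $\psi(Z; \theta) = Y - \theta$ (so the local Z-estimator reduces to a locally weighted mean), a uniform $k$-NN base kernel, and illustrative choices of $s$, $B$, and $k$. The function names `knn_base_weights` and `subsampled_knn_estimate` are hypothetical.

```python
import numpy as np

def knn_base_weights(x, X_sub, k):
    """Uniform k-NN kernel on a sub-sample: weight 1/k on the k nearest
    neighbors of the target point x, zero elsewhere (illustrative choice)."""
    dists = np.linalg.norm(X_sub - x, axis=1)
    nn = np.argsort(dists)[:k]
    w = np.zeros(len(X_sub))
    w[nn] = 1.0 / k
    return w

def subsampled_knn_estimate(x, X, Y, s, B, k, seed=None):
    """Sub-sampled ensemble of k-NN estimators for the simple moment
    psi(Z; theta) = Y - theta. The ensemble weight K(x, X_i, S) is the
    average over B base k-NN weights, each computed on a random
    sub-sample S_b of size s < n, as in the displayed weight formula."""
    rng = np.random.default_rng(seed)
    n = len(X)
    K = np.zeros(n)
    for _ in range(B):
        S_b = rng.choice(n, size=s, replace=False)  # draw sub-sample S_b
        w_b = knn_base_weights(x, X[S_b], k)        # base kernel K(x, X_i, S_b)
        K[S_b] += w_b / B                           # (1/B) sum_b K(x, X_i, S_b) 1{i in S_b}
    # Solve sum_i K(x, X_i, S) (Y_i - theta) = 0  =>  theta = weighted mean of Y.
    return K @ Y / K.sum()

# Toy usage: D is large, but the covariates lie on a one-dimensional curve,
# so the intrinsic dimension d = 1 even though D = 100.
rng = np.random.default_rng(0)
n, D = 500, 100
t = rng.uniform(-1, 1, size=n)
X = np.outer(t, rng.normal(size=D))          # covariates with intrinsic dimension 1
Y = np.sin(3 * t) + 0.1 * rng.normal(size=n)
x0 = X[np.argmin(np.abs(t - 0.3))]           # target point near t = 0.3
print(subsampled_knn_estimate(x0, X, Y, s=100, B=50, k=5, seed=1))
```

For a general moment function $\psi$, the last line of `subsampled_knn_estimate` would instead pass the weights $K(x, X_i, S)$ to a numerical root-finder for equation (2); the closed-form weighted mean above is specific to the assumed $\psi(Z; \theta) = Y - \theta$.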