In recent years, extreme learning machines (ELM) have been used to accurately predict a variety of hydrological variables (e.g., streamflow, precipitation, river water quality). Using the same model structure, ELM often performs comparably to multi-layer perceptron (MLP) networks without the need for an iterative learning process (backpropagation), resulting in faster training. However, despite the increasing popularity of ELM, the hydrology literature has not focused on training algorithms that can be used to generate probabilistic predictions for this method. This is an important research gap, as it is generally accepted that quantifying hydrological prediction uncertainty and producing probabilistic predictions (instead of point predictions or mean value predictions) is a prerequisite for reliable water resource management. Thus, for the first time, Bayesian ELM (BELM) and sparse BELM (SBELM) methods are adopted and applied for probabilistic streamflow simulation and multi-step ahead forecasting (1-3 days), using four watersheds in Mexico, Germany, Canada, and Belgium as case studies. Using deterministic and probabilistic metrics, BELM and SBELM are compared against Bayesian linear regression (BLR), MLP combined with Monte-Carlo dropout weights, and a deep learning method: long short-term memory network (LSTM) coupled with Monte-Carlo dropout weights. Adding time-lagged observations of streamflow and meteorological variables (up to 14 days), such as precipitation and potential evapotranspiration, improves the simulation accuracy by up to a factor of 10. In general, both BELM and SBELM show more accurate point predictions than MLP, BLR, and LSTM. BELM and SBELM show similar performance in terms of accuracy and reliability, although BELM marginally outperforms SBELM by generating, on average, narrower prediction intervals. The sparsity feature of SBELM reduces (on average) the network size by 14-83%.
BELM and SBELM produce more accurate and reliable predictions than LSTM and are up to 122 and 125 times more computationally efficient to train, respectively. The case study suggests that BELM and SBELM are promising probabilistic machine learning models for hydrological prediction that are attractive alternatives to physical (e.g., lumped conceptual) models and common deep learning models (e.g., LSTM).
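To illustrate the general idea behind a Bayesian ELM, the following is a minimal sketch and not the implementation used in this study. It assumes a single hidden layer with fixed random weights and a tanh activation, followed by a Bayesian linear regression on the hidden-layer outputs. The prior precision `alpha` and noise precision `beta` are held fixed here for simplicity, which is an assumption; in practice they would typically be estimated from the data. The function names `fit_belm` and `predict_belm` are hypothetical.

```python
import numpy as np


def fit_belm(X, y, n_hidden=50, alpha=1e-2, beta=25.0, seed=0):
    """Fit a simplified Bayesian ELM (fixed alpha and beta are assumptions)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights are drawn once at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Gaussian posterior over the output weights: N(m, S).
    S_inv = alpha * np.eye(n_hidden) + beta * H.T @ H
    S = np.linalg.inv(S_inv)
    m = beta * S @ H.T @ y
    return {"W": W, "b": b, "m": m, "S": S, "beta": beta}


def predict_belm(model, X):
    """Return the predictive mean and standard deviation for each row of X."""
    H = np.tanh(X @ model["W"] + model["b"])
    mean = H @ model["m"]
    # Predictive variance = noise variance + uncertainty in the output weights.
    var = 1.0 / model["beta"] + np.einsum("ij,jk,ik->i", H, model["S"], H)
    return mean, np.sqrt(var)
```

Because the output weights have a closed-form Gaussian posterior, no iterative training is needed, and an approximate 95 % prediction interval follows as `mean ± 1.96 * std`.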