||The fundamental problem in modeling complex phenomena such as human perception using probabilistic methods is that of deducing a stochastic model of interactions between the constituents of a system from observed configurations. Even in this era of big data, the complexity of the systems being modeled implies that inference methods must be effective in the difficult regimes of small sample sizes and large coupling variability. Thus, model inference by means of minimization of a cost function requires additional assumptions such as sparsity of interactions to avoid overfitting. In this paper, we completely divorce iterative model updates from the value of a cost function quantifying goodness of fit. This separation enables the use of goodness of fit as a natural rationale for terminating model updates, thereby avoiding overfitting. We do this within the mathematical formalism of statistical physics by defining a formal free energy of observations from a partition function with an energy function chosen precisely to enable an iterative model update Minimizing this free energy, we demonstrate coupling strength inference in nonequilibrium kinetic Ising models, and show that our method outperforms other existing methods in the regimes of interest. Our method has no tunable learning rate, scales to large system sizes, and has a systematic expansion to obtain higher-order interactions. As applications, we infer a functional connectivity network in the salamander retina and a currency exchange rate network from time-series data of neuronal spiking and currency exchange rates, respectively. Accurate small sample size inference is critical for devising a profitable currency hedging strategy.