Publications

Below is a collection of my publications, including any preprints or technical reports.

Papers

  1. Hoffman, M., Shahriari, B., Aslanides, J., Barth-Maron, G., Behbahani, F., Norman, T., Abdolmaleki, A., Cassirer, A., Yang, F., Baumli, K., Henderson, S., Novikov, A., Colmenarejo, S. G., Cabi, S., Gulcehre, C., Paine, T. L., Cowie, A., Wang, Z., Piot, B., and de Freitas, N. (2020). Acme: A Research Framework for Distributed Reinforcement Learning. arXiv:2006.00979. [pdf] [bibtex]

  2. Gu, A., Gulcehre, C., Paine, T. L., Hoffman, M., and Pascanu, R. (2019). Improving the Gating Mechanism of Recurrent Neural Networks. arXiv:1910.09890. [pdf] [bibtex]

  3. Paine, T. L., Gulcehre, C., Shahriari, B., Denil, M., Hoffman, M., Soyer, H., Tanburn, R., Kapturowski, S., Rabinowitz, N., Williams, D., Barth-Maron, G., Wang, Z., de Freitas, N., and Worlds Team (2019). Making Efficient Use of Demonstrations to Solve Hard Exploration Problems. arXiv:1909.01387. [pdf] [bibtex]

  4. Shillingford, B., Assael, Y., Hoffman, M. W., Paine, T., Hughes, C., Prabhu, U., Liao, H., Sak, H., Rao, K., Bennett, L., Mulville, M., Coppin, B., Laurie, B., Senior, A., and de Freitas, N. (2019). Large-scale visual speech recognition. In INTERSPEECH. [pdf] [bibtex]

  5. Paine, T. L., Colmenarejo, S. G., Wang, Z., Reed, S., Aytar, Y., Pfaff, T., Hoffman, M. W., Barth-Maron, G., Cabi, S., Budden, D., and de Freitas, N. (2018). One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL. arXiv:1810.05017. [pdf] [bibtex]

  6. Barth-Maron, G., Hoffman, M. W., Budden, D., Dabney, W., Horgan, D., TB, D., Muldal, A., Heess, N., and Lillicrap, T. (2018). Distributed Distributional Deterministic Policy Gradients. In International Conference on Learning Representations. [pdf] [bibtex]

  7. Cabi, S., Colmenarejo, S. G., Hoffman, M. W., Denil, M., Wang, Z., and de Freitas, N. (2017). The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously. In Conference on Robot Learning. [pdf] [bibtex]

  8. Chen, Y., Hoffman, M. W., Colmenarejo, S. G., Denil, M., Lillicrap, T. P., Botvinick, M., and de Freitas, N. (2017). Learning to learn without gradient descent by gradient descent. In International Conference on Machine Learning. [pdf] [bibtex]

  9. Wichrowska, O., Maheswaranathan, N., Hoffman, M. W., Colmenarejo, S. G., Denil, M., de Freitas, N., and Sohl-Dickstein, J. (2017). Learned optimizers that scale and generalize. In International Conference on Machine Learning. [pdf] [bibtex]

  10. Andrychowicz, M., Denil, M., Colmenarejo, S. G., Hoffman, M. W., Pfau, D., Schaul, T., and de Freitas, N. (2016). Learning to learn by gradient descent by gradient descent. In Neural Information Processing Systems. [pdf] [bibtex]

  11. Hernández-Lobato, J. M., Gelbart, M. A., Adams, R. P., Hoffman, M. W., and Ghahramani, Z. (2016). A general framework for constrained Bayesian optimization using information-based search. Journal of Machine Learning Research, 17. [pdf] [bibtex]

  12. Hoffman, M. W., and Ghahramani, Z. (2015). Output-Space Predictive Entropy Search for Flexible Global Optimization. In NIPS workshop on Bayesian optimization. [pdf] [bibtex]

  13. Hernández-Lobato, J. M., Gelbart, M. A., Hoffman, M. W., Adams, R. P., and Ghahramani, Z. (2015). Predictive Entropy Search for Bayesian Optimization with Unknown Constraints. In International Conference on Machine Learning. [pdf] [bibtex]

  14. Shahriari, B., Wang, Z., Hoffman, M. W., Bouchard-Côté, A., and de Freitas, N. (2015). An Entropy Search Portfolio for Bayesian Optimization. arXiv:1406.4625. [pdf] [bibtex]

  15. Hoffman, M. W., and Shahriari, B. (2014). Modular mechanisms for Bayesian optimization. In NIPS workshop on Bayesian optimization. [pdf] [bibtex]

  16. Hernández-Lobato, J. M., Hoffman, M. W., and Ghahramani, Z. (2014). Predictive Entropy Search for Efficient Global Optimization of Black-box Functions. In Neural Information Processing Systems. [pdf] [bibtex]

  17. Hoffman, M. W., Shahriari, B., and de Freitas, N. (2014). On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In International Conference on Artificial Intelligence and Statistics. [pdf] [bibtex]

  18. Hoffman, M. W., and de Freitas, N. (2012). Inference strategies for solving semi-Markov decision processes. In L. E. Sucar, E. F. Morales, and J. Hoey (Eds.), Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions. IGI Global. [pdf] [bibtex]

  19. Hoffman, M. W., Lazaric, A., Ghavamzadeh, M., and Munos, R. (2012). Regularized Least Squares Temporal Difference Learning with Nested ℓ₂ and ℓ₁ Penalization. In European Workshop on Reinforcement Learning. [pdf] [bibtex]

  20. Ghavamzadeh, M., Lazaric, A., Hoffman, M. W., and Munos, R. (2011). Finite-Sample Analysis of Lasso-TD. In International Conference on Machine Learning. [pdf] [bibtex]

  21. Hoffman, M. W., Brochu, E., and de Freitas, N. (2011). Portfolio Allocation for Bayesian Optimization. In Uncertainty in Artificial Intelligence. [pdf] [bibtex]

  22. Hoffman, M. W., de Freitas, N., Doucet, A., and Peters, J. (2009). An Expectation Maximization algorithm for continuous Markov Decision Processes with arbitrary reward. In International Conference on Artificial Intelligence and Statistics. [pdf] [code] [bibtex]

  23. Hoffman, M. W., Kueck, H., de Freitas, N., and Doucet, A. (2009). New inference strategies for solving Markov decision processes using reversible jump MCMC. In Uncertainty in Artificial Intelligence. [pdf] [bibtex]

  24. Kueck, H., Hoffman, M. W., Doucet, A., and de Freitas, N. (2009). Inference and Learning for Active Sensing, Experimental Design and Control. In Iberian Conference on Pattern Recognition and Image Analysis. [pdf] [bibtex]

  25. Hoffman, M. W., Doucet, A., de Freitas, N., and Jasra, A. (2007). Bayesian policy learning with trans-dimensional MCMC. In Neural Information Processing Systems. [pdf] [bibtex]

  26. Hoffman, M. W., Doucet, A., de Freitas, N., and Jasra, A. (2007). On solving general state-space sequential decision problems using inference algorithms (No. TR-2007-04). University of British Columbia, Computer Science. [pdf] [bibtex]

  27. Hoffman, M. W., Grimes, D. B., Shon, A. P., and Rao, R. P. N. (2006). A probabilistic model of gaze imitation and shared attention. Neural Networks, 19. [pdf] [bibtex]

  28. Shon, A. P., Grimes, D. B., Baker, C. L., Hoffman, M. W., Zhou, S., and Rao, R. P. N. (2005). Probabilistic gaze imitation and saliency learning in a robotic head. In International Conference on Robotics and Automation. [pdf] [bibtex]

Thesis

  1. Hoffman, M. W. (2013). Decision making with inference and learning methods (PhD thesis). University of British Columbia. [pdf] [bibtex]