Publications

Cheng, T.S. et al. (2024) “Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum”, in Proceedings of Machine Learning Research. ML Research Press, pp. 8141–8162.   
Compagnoni, E.M. et al. (2024) “SDEs for Minimax Optimization”, in Proceedings of Machine Learning Research. ML Research Press, pp. 4834–4842.   
Francazi, E., Lucchi, A., Baity-Jesi, M. (2024) “Initial Guessing Bias: How Untrained Networks Favor Some Classes”, in Proceedings of Machine Learning Research. ML Research Press, pp. 13783–13839.   
Lucchi, A., Kohler, J. (2023) “A sub-sampled tensor method for nonconvex optimization”, IMA Journal of Numerical Analysis, 43(5), pp. 2856–2891. Available at: 10.1093/imanum/drac057.   
Kersting, H. et al. (2023) “Mean first exit times of Ornstein-Uhlenbeck processes in high-dimensional spaces”, Journal of Physics A: Mathematical and Theoretical, 56(21). Available at: 10.1088/1751-8121/acc559.   
Cheng, T.S. et al. (2023) “A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression”, in Advances in Neural Information Processing Systems. Neural Information Processing Systems Foundation.   
Anagnostidis, S. et al. (2023) “Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers”, in A. Oh et al. (eds.) 37th Conference on Neural Information Processing Systems (NeurIPS 2023). New Orleans, Louisiana, USA (Advances in Neural Information Processing Systems 36). Available at: https://proceedings.neurips.cc/paper_files/paper/2023.   
Anagnostidis, S., Lucchi, A., Hofmann, T. (2023) “Mastering Spatial Graph Prediction of Road Networks”, in Proceedings of the IEEE International Conference on Computer Vision. Institute of Electrical and Electronics Engineers Inc, pp. 5385–5395. Available at: 10.1109/ICCV51070.2023.00498.   
Singh, S.P. et al. (2022) “Phenomenology of Double Descent in Finite-Width Neural Networks”, in International Conference on Learning Representations. Available at: https://openreview.net/forum?id=lTqGXfn9Tv.   
Orvieto, A. et al. (2022) “Vanishing Curvature in Randomly Initialized Deep ReLU Networks”, in International Conference on Artificial Intelligence and Statistics.   
Fluri, J. et al. (2022) “Full wCDM analysis of KiDS-1000 weak lensing maps using deep learning”, Physical Review D, 105(8), p. 083518.   
Anagnostidis, S. et al. (2022) “Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse”, in Advances in Neural Information Processing Systems. Available at: https://openreview.net/forum?id=FxVH7iToXS.   
Diouane, Y., Lucchi, A., Patil, V.P. (2022) “A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning”, in International Conference on Artificial Intelligence and Statistics.   
Bachmann, G., Hofmann, T., Lucchi, A. (2022) “Generalization Through the Lens of Leave-One-Out Error”, in International Conference on Learning Representations. Available at: https://openreview.net/forum?id=7grkzyj89A_.   
Lucchi, A. et al. (2022) “On the Theoretical Properties of Noise Correlation in Stochastic Optimization”, in Advances in Neural Information Processing Systems. Available at: https://openreview.net/forum?id=cNrglG_OAeu.   
Orvieto, A. et al. (2022) “Anticorrelated Noise Injection for Improved Generalization”, in Proceedings of Machine Learning Research. PMLR.   
Yang, J. et al. (2022) “Faster single-loop algorithms for minimax optimization without strong concavity”, in International Conference on Artificial Intelligence and Statistics.