MLOPT Idea Seminar is an informal, student-led seminar series in which junior researchers from around the world are invited to share their recent ideas and findings. To bring students closer together and engage them in discussion, the seminar provides an informal platform for freely sharing published or half-baked papers and getting feedback. The biweekly series covers a variety of recent research topics at the intersection of Machine Learning and Optimization.

## Upcoming Events

## May 13, 2022: Xuefeng Du (UW-Madison)

- Title: Unknown-aware learning for object detection
- Abstract: Out-of-distribution (OOD) detection is indispensable for deploying machine learning models in the wild. One of the key challenges in OOD detection is that models lack supervision signals from unknown data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this talk, we will present our recent approaches, virtual outlier synthesis and spatial-temporal unknown distillation, which generate unknowns in a convenient way. We then design sophisticated regularization terms that help our model achieve strong OOD detection performance on a new benchmark for object detection.
- Short bio: Xuefeng is a second-year CS Ph.D. student at UW-Madison. His research focuses on trustworthy machine learning, including adversarial robustness, learning with label noise, and out-of-distribution detection. He is also interested in neural architecture search, graph mining, and high-level vision tasks such as object detection and segmentation.
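
The virtual-outlier idea mentioned in the abstract can be caricatured in a few lines: fit a class-conditional Gaussian to in-distribution features and treat its low-likelihood tail as synthetic unknowns. The following is only an illustrative 1-D sketch under that reading of the abstract, not the speaker's actual method or code; all function names are invented here.

```python
import math
import random

def fit_gaussian(features):
    """Fit a 1-D Gaussian (mean, std) to in-distribution features."""
    mu = sum(features) / len(features)
    var = sum((f - mu) ** 2 for f in features) / len(features)
    return mu, (math.sqrt(var) if var > 0 else 1e-8)

def log_likelihood(x, mu, sigma):
    """Log-density of x under the fitted Gaussian."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def sample_virtual_outliers(features, n_candidates=1000, keep=10, seed=0):
    """Draw candidates from the fitted class-conditional Gaussian and keep
    the lowest-likelihood ones as virtual outliers (the low-density tail)."""
    rng = random.Random(seed)
    mu, sigma = fit_gaussian(features)
    candidates = [rng.gauss(mu, sigma) for _ in range(n_candidates)]
    candidates.sort(key=lambda x: log_likelihood(x, mu, sigma))
    return candidates[:keep]
```

In the actual line of work the Gaussian is fit in a learned feature space and the synthesized outliers feed an uncertainty-regularization loss; the sketch shows only the sampling step.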

## May 27, 2022: Dung (June) Thai (UMass Amherst)

## Jun. 10, 2022: Sewon Min (UWashington)

## Jun. 24, 2022: Yuxin Sun (UW-Madison)

## July 8, 2022: Minhui Huang (UC-Davis)

## July 22, 2022: Tuan Dinh (UW-Madison)

## Aug. 5, 2022: TBD

## Aug. 19, 2022: TBD

## Sep. 2, 2022: Andrew Lowy (USC)

## Sep. 16, 2022: Jiefeng Chen (UW-Madison)

## Past Talks

## Apr. 29, 2022: Andrew Wagenmaker (UWashington)

- Links: [Video]
- Title: Towards the True Complexity of Learning Near-Optimal Policies in Markov Decision Processes
- Abstract: Recent advances in reinforcement learning (RL) demonstrate its potential for solving real-world problems. However, significant challenges remain, limiting its widespread deployment. In particular, the large sample complexity—the number of interactions with the environment—required by RL algorithms to learn near-optimal policies has proved a critical difficulty. In this work, we take steps towards addressing this challenge by seeking to understand the true sample complexity of RL, with the goal of developing algorithms that take into account problem structure to learn efficiently. We consider two directions in particular. First, we study the instance-dependent complexity of learning near-optimal policies in tabular Markov Decision Processes (MDPs). We seek to obtain sample complexities that scale not with worst-case measures such as the number of states and actions, but with quantities that depend on the inherent difficulty of each individual MDP. We propose a new instance-dependent measure of complexity, the gap-visitation complexity, and an algorithm with sample complexity scaling as the gap-visitation complexity. Furthermore, by analyzing the gap-visitation complexity on several examples, we show that existing approaches to learning near-optimal policies can have sample complexity that is arbitrarily suboptimal. Next, we study how the structure of the MDP affects the ability to generalize across reward functions. We show that, when the MDP has a particular underlying linear structure, the sample complexity of learning a near-optimal policy for an arbitrary number of reward functions is, in the worst case, no larger than the sample complexity of learning a near-optimal policy for a single reward function, implying that reward-free RL is no harder than reward-aware RL in linear MDPs.
- Short bio: Andrew is a fourth-year Ph.D. student in Computer Science at the University of Washington working with Kevin Jamieson. His research interests are in active learning, reinforcement learning, and control theory. He is supported by an NSF Graduate Research Fellowship.

## Apr. 15, 2022: Jeongyeol Kwon (UTexas)

- Time: 12:30 pm CDT
- Location: Orchard view room, Wisconsin Institutes for Discovery (WID), 330 N Orchard St.
- Title: Reinforcement Learning with Latent Contexts
- Abstract: Learning a near-optimal policy in a partially observable system remains an elusive challenge in contemporary reinforcement learning. One example where such a challenge arises is a system associated with latent contexts. In this talk, we consider episodic reinforcement learning in a reward-mixing Markov decision process (RM-MDP). There, a reward function is drawn from one of multiple possible reward models at the beginning of every episode, but the identity of the chosen reward model (context) is not revealed to the agent. With no assumptions on system dynamics, even for two switching-reward models, the problem requires several new ideas beyond existing algorithmic and analysis techniques for efficient exploration. We develop the first efficient algorithm that does not require any assumptions for two switching-reward models. I will then briefly discuss the scalability and generalizability of the proposed method in more challenging scenarios.
- Short bio: Jeongyeol is a Ph.D. Candidate in Electrical and Computer Engineering at the University of Texas at Austin. He is broadly interested in theoretical aspects of machine learning, with a focus on Statistical Learning Theory, Optimization, Bandits/Reinforcement Learning, and Robust Statistics. His current research aims to develop efficient and robust algorithms for sequential decision-making problems arising from real applications.

## Apr. 1, 2022: Defeng Liu (PolyMtl)

- Links: [Video]
- Title: Learning to search in local branching for solving mixed-integer programs
- Abstract: Finding high-quality solutions to mixed-integer linear programming problems (MILPs) is of great importance for many practical applications. In this respect, the refinement heuristic local branching (LB) has been proposed to produce improving solutions and has been highly influential for the development of local search methods in MILP. For an LB algorithm, the choice of the neighborhood size is critical to performance. Although it was initialized to a conservative value in the original LB scheme, our new observation is that the best size is strongly dependent on the particular MILP instance. In this work, we investigate the relation between the size of the search neighborhood and the behavior of the underlying LB algorithm, and we devise a learning-based framework for guiding the neighborhood search of the LB heuristic. We computationally show that the neighborhood size can indeed be learned, leading to improved performance, and that the overall algorithm generalizes well both with respect to the instance size and, remarkably, across instances.
- Short Bio: Defeng Liu is currently pursuing his Ph.D. in Mathematics at Polytechnique Montréal, under the supervision of Prof. Andrea Lodi. His research focuses on Mixed-Integer Programming, Mathematical Optimization, and Machine Learning. In particular, Defeng is interested in applying deep learning and reinforcement learning techniques to mixed-integer programs to solve complex problems, aiming to bring machine learning and operations research closer together.
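
For readers unfamiliar with local branching: given an incumbent solution, the LB constraint restricts the search to binary solutions within Hamming distance k of the incumbent, and the talk's contribution is learning how to set k per instance. Below is a toy sketch of the neighborhood test plus a naive, non-learned size update standing in for the learned policy; all names are invented for illustration.

```python
def hamming_distance(x, incumbent):
    """Number of binary variables flipped relative to the incumbent."""
    return sum(a != b for a, b in zip(x, incumbent))

def in_lb_neighborhood(x, incumbent, k):
    """True iff x satisfies the local branching constraint
    sum_{j: xbar_j = 1} (1 - x_j) + sum_{j: xbar_j = 0} x_j <= k."""
    return hamming_distance(x, incumbent) <= k

def adapt_k(k, improved, k_min=5, step=5):
    """Naive neighborhood-size update: shrink on success to stay local,
    grow on failure to widen the search."""
    return max(k_min, k - step) if improved else k + step
```

The paper replaces a hand-tuned rule like `adapt_k` with a learned model that predicts a good neighborhood size from instance features.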

## Mar. 18, 2022: Yae Jee Cho (CMU)

- Links: [Video]
- Title: Leveraging Biased Client Selection in Federated Learning
- Abstract: Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing. Previous works have analyzed federated learning while accounting for data heterogeneity, communication/computation limitations, and partial client participation. However, most assume unbiased client participation, where clients are selected such that the aggregated model update is unbiased. In our work, we investigate how biased client selection can be leveraged to improve many aspects of federated learning, including convergence speed, fairness, and client incentives.
- Short Bio: Yae Jee Cho is a Ph.D. student in the ECE department at Carnegie Mellon University advised by Prof. Gauri Joshi. Her main research interests are in distributed machine learning, in particular, federated learning with a focus on client systems and data heterogeneity. She received her B.S. and M.S. degrees in Electrical & Computer Engineering from Yonsei University. She has worked with the Privacy in AI team at Microsoft Research as a research intern on federated learning and is a recipient of the Korean Government Doctoral Scholarship.
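
One concrete biased selection strategy studied in this line of work samples a candidate set of clients and keeps those with the largest current local losses. A minimal sketch, assuming the server can query each candidate's local loss; the function name and setup are illustrative, not the paper's code.

```python
import random

def power_of_choice(client_losses, d, m, seed=0):
    """Biased client selection sketch: draw a candidate set of d clients
    uniformly at random, then keep the m with the highest local loss.
    `client_losses` maps client id -> current local loss."""
    rng = random.Random(seed)
    candidates = rng.sample(list(client_losses), d)
    candidates.sort(key=lambda c: client_losses[c], reverse=True)
    return candidates[:m]
```

The bias toward high-loss clients speeds up convergence at a potential cost in fairness, which is exactly the trade-off the talk analyzes.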

## Mar. 4, 2022: Boxin Zhao (UChicago)

- Links: [arXiv]
- Title: Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback
- Abstract: In federated learning (FL), client sampling plays a key role in the convergence speed of the training algorithm. However, despite its importance, client sampling has received comparatively little study. In this paper, we propose an online learning with bandit feedback framework to understand the client sampling problem in FL. By adapting an Online Stochastic Mirror Descent algorithm to minimize the variance of the gradient estimate, we propose a new adaptive client sampling algorithm. In addition, we use an online ensemble method and the doubling trick to automatically choose the tuning parameters in the algorithm. Theoretically, we show a dynamic regret bound whose comparator is the theoretically optimal sampling sequence; we also include the total variation of this sequence in our upper bound, which is a natural measure of the intrinsic difficulty of the problem. To the best of our knowledge, these theoretical contributions are novel to the existing literature. Moreover, through both synthetic and real-data experiments, we show empirical evidence of the advantages of our proposed algorithms over the widely used uniform sampling and other online-learning-based sampling strategies from previous studies. We also examine its robustness to the choice of tuning parameters. Finally, we discuss possible extensions to sampling without replacement and personalized FL objectives. While the original goal is to solve the client sampling problem, this work has more general applications to stochastic gradient descent and stochastic coordinate descent methods.
- Short bio: Boxin Zhao is a Ph.D. student in Statistics and Econometrics at the University of Chicago, advised by Prof. Mladen Kolar. His research interests include machine learning, federated learning, probabilistic graphical models, high-dimensional statistics, and optimization, with a focus on developing novel methodologies with both practical applications and theoretical guarantees.
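
The core loop described in the abstract, an online mirror descent step on the sampling distribution driven by bandit feedback from the one client actually sampled, can be sketched schematically as a multiplicative-weights update. This is a simplified stand-in for the paper's algorithm (no doubling trick or online ensemble), with illustrative names and a schematic gradient.

```python
import math

def update_sampling_probs(probs, sampled, feedback, lr=0.1, floor=0.01):
    """One schematic mirror-descent (multiplicative-weights) step on the
    client-sampling distribution.  `feedback` is the squared gradient
    norm observed for the single sampled client (bandit feedback)."""
    grads = [0.0] * len(probs)
    # Variance of the importance-sampled estimate is sum_i g_i^2 / p_i,
    # whose partial derivative in p_i is -g_i^2 / p_i^2: a large observed
    # norm pushes the sampled client's probability up.
    grads[sampled] = -feedback / probs[sampled] ** 2
    weights = [p * math.exp(-lr * g) for p, g in zip(probs, grads)]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Mix with uniform so no probability collapses to zero.
    n = len(probs)
    return [(1 - floor * n) * p + floor for p in probs]
```

Clients with repeatedly large gradient norms end up sampled more often, which is the variance-reduction effect the paper formalizes via dynamic regret.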

## Feb. 18, 2022: Sungryull Sohn (LG AI Research US)

- Links: [Video] [arXiv]
- Title: Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
- Abstract: Operating in the real world often requires agents to learn about a complex environment and apply this understanding to achieve a breadth of goals. This problem, known as goal-conditioned reinforcement learning (GCRL), becomes especially challenging for long-horizon goals. Current methods have tackled this problem by augmenting goal-conditioned policies with graph-based planning algorithms. However, they struggle to scale to large, high-dimensional state spaces and assume access to exploration mechanisms for efficiently collecting training data. In this work, we introduce Successor Feature Landmarks (SFL), a framework for exploring large, high-dimensional environments so as to obtain a policy that is proficient for any goal. SFL leverages the ability of successor features (SF) to capture transition dynamics, using it to drive exploration by estimating state-novelty and to enable high-level planning by abstracting the state-space as a non-parametric landmark-based graph. We further exploit SF to directly compute a goal-conditioned policy for inter-landmark traversal, which we use to execute plans to “frontier” landmarks at the edge of the explored state space. We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces and outperforms state-of-the-art baselines on long-horizon GCRL tasks.
- Short bio: Sungryull Sohn is a research scientist at LG AI research US. He received B.S. and M.S. degrees in Electrical Engineering from KAIST, and a Ph.D. degree from the University of Michigan. Before his Ph.D., he worked at Electronics and Telecommunications Research Institute (ETRI) as a researcher. During his Ph.D., he worked at Microsoft Research as a research intern and Google Brain as a research intern and student researcher. His research interests include various areas in deep reinforcement learning, such as compositional task learning, hierarchical reinforcement learning, meta-reinforcement learning, multi-task reinforcement learning, offline reinforcement learning, and neural combinatorial optimization.
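
Successor features, which the abstract leans on, satisfy a Bellman-style recursion ψ(s,a) = φ(s) + γ E[ψ(s',a')], so they can be learned with TD updates; their magnitude also gives a cheap visitation signal for exploration. A tabular one-step sketch under that reading, with invented names, not the SFL codebase:

```python
def sf_td_update(psi, phi, s, a, s_next, a_next, gamma=0.99, lr=0.1):
    """One TD(0) step on tabular successor features:
    target = phi(s) + gamma * psi(s', a')."""
    target = [f + gamma * p for f, p in zip(phi[s], psi[(s_next, a_next)])]
    psi[(s, a)] = [p + lr * (t - p) for p, t in zip(psi[(s, a)], target)]
    return psi

def novelty(psi, s, a):
    """Exploration signal: a small successor-feature norm suggests a
    rarely visited state-action pair."""
    norm = sum(x * x for x in psi[(s, a)]) ** 0.5
    return 1.0 / (1e-8 + norm)
```

In the paper these quantities drive landmark selection and graph-based planning; the sketch shows only the underlying update and novelty score.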

## Feb. 4, 2022: Jiaxin Hu (UW-Madison)

- Links: [Video] [arXiv]
- Title: Multiway Spherical Clustering via Degree-Corrected Tensor Block Models
- Abstract: We consider the problem of multiway clustering in the presence of unknown degree heterogeneity. Such data problems arise commonly in applications such as recommender systems, neuroimaging, community detection, and hypergraph partitioning in social networks. Allowing degree heterogeneity provides great flexibility in clustering models, but the extra complexity poses significant challenges in both statistics and computation. Here, we develop a degree-corrected tensor block model with estimation accuracy guarantees. We present the phase transition of clustering performance based on the notion of angle separability, and we characterize three signal-to-noise regimes corresponding to different statistical-computational behaviors. In particular, we demonstrate that an intrinsic statistical-to-computational gap emerges only for tensors of order three or greater. Further, we develop an efficient polynomial-time algorithm that provably achieves exact clustering under mild signal conditions. The efficacy of our procedure is demonstrated through two data applications, one on human brain connectome data and another on a Peru legislation network dataset.
- Short Bio: Jiaxin is a second-year Ph.D. student in Statistics at the University of Wisconsin-Madison working with Prof. Miaoyan Wang. Her research interests lie at the intersection of statistics and machine learning. Currently, she is working on higher-order tensor methods with applications to networks. Prior to the Ph.D. program, she received a Master’s degree in Statistics from UW-Madison in 2020 and a Bachelor’s degree in Statistics from Wuhan University in 2019.
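
The "spherical" in the title refers to projecting data rows onto the unit sphere before clustering: dividing by the row norm cancels multiplicative degree effects, leaving only the angular information that angle separability is defined on. A minimal sketch of that normalization step, with invented names and not the paper's code:

```python
import math

def normalize_to_sphere(rows):
    """Divide each row by its Euclidean norm: multiplicative degree
    effects cancel, leaving only angular (cluster) information."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        out.append([x / norm for x in row])
    return out

def angle_separation(u, v):
    """Cosine similarity of two normalized rows; clusters are recoverable
    when within-cluster angles are small and between-cluster angles large."""
    return sum(a * b for a, b in zip(u, v))
```

Rows that differ only by a degree scaling collapse to the same point on the sphere, which is why degree correction and angle-based clustering fit together.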

## Jan. 21, 2022: Jy-yong Sohn (UW-Madison)

- Links: [Video] [arXiv]
- Title: GenLabel: Mixup Relabeling using Generative Models
- Abstract: Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves prediction performance, it sometimes degrades performance instead. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling algorithm designed for mixup. In particular, GenLabel helps the mixup algorithm correctly label mixup samples by learning the class-conditional data distribution using generative models. Via extensive theoretical and empirical analysis, we show that mixup, when used together with GenLabel, can effectively resolve the aforementioned phenomenon, improving generalization performance and adversarial robustness.
- Short Bio: Jy-yong is a post-doctoral researcher in the Department of Electrical and Computer Engineering (ECE) at the University of Wisconsin-Madison, working with Prof. Dimitris Papailiopoulos and Prof. Kangwook Lee. He is interested in the intersection of machine learning, information theory, and distributed algorithms. He received his Ph.D. degree in 2020 from KAIST, under the supervision of Prof. Jaekyun Moon. He is a recipient of the IEEE ICC Best Paper Award, Qualcomm Innovation Awards, KAIST Global Leader Fellowship, KAIST EE Best Research Achievement Award, and NRF Korea Post-doctoral Fellowship.
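
The relabeling step described in the abstract can be caricatured in 1-D: mix two inputs, then label the mixture with the posterior of learned class-conditional Gaussians rather than the usual lam*y1 + (1-lam)*y2. This is a hedged sketch with invented names, assuming Gaussian generative models; it is not the GenLabel implementation.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def genlabel_mixup(x1, x2, lam, class_params):
    """Mix two 1-D inputs, then relabel the mixture with the posterior of
    learned class-conditional Gaussians.  `class_params` maps
    class -> (mean, std)."""
    x_mix = lam * x1 + (1 - lam) * x2
    dens = {c: gaussian_pdf(x_mix, mu, sd) for c, (mu, sd) in class_params.items()}
    total = sum(dens.values())
    return x_mix, {c: d / total for c, d in dens.items()}
```

When the mixed point lands deep inside one class's support, the generative posterior assigns it almost entirely to that class, unlike the linear mixup label.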

## Jan. 7, 2022: Hongyi Wang (CMU)

- Links: [Video] [arXiv]
- Title: On the Utility of Gradient Compression in Distributed Training Systems
- Abstract: A rich body of prior work has highlighted the existence of communication bottlenecks in synchronous data-parallel training. To alleviate these bottlenecks, a long line of recent research proposes gradient and model compression methods. In this talk, I will first present an evaluation of gradient compression methods’ efficacy and comparisons of their scalability with optimized implementations of synchronous data-parallel SGD across more than 200 realistic distributed setups. The observation is that, surprisingly, gradient compression methods provide a promising speedup in only 6 of the 200 cases. I will then introduce our extensive investigation to identify the root causes of this phenomenon and present a performance model that can be used to identify the benefits of gradient compression for a variety of system setups. Finally, I will propose a list of desirable properties that gradient compression methods should satisfy in order to provide meaningful utility in practice.
- Short Bio: Hongyi Wang is currently a postdoctoral research fellow in the Machine Learning Department at CMU working with Eric Xing. He obtained his Ph.D. degree from the CS department at UW-Madison, advised by Dimitris Papailiopoulos. His research interests are machine learning algorithms and systems. Hongyi has developed several well-known frameworks for communication-efficient and robust distributed learning, which have appeared at ICML, NeurIPS, ICLR, and MLSys. His recent work on federated learning has received significant attention, as well as the Baidu Best Paper Award at the SpicyFL workshop at NeurIPS 2020.
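
As a concrete example of the methods being evaluated, top-k sparsification keeps only the largest-magnitude gradient coordinates, and whether it helps end-to-end depends on a compute/communication trade-off of the kind the talk's performance model captures. An illustrative sketch with invented names, not the talk's code or model:

```python
def topk_compress(grad, k):
    """Keep the k largest-magnitude coordinates and zero the rest; the
    residual is what error-feedback schemes would carry to the next step."""
    keep = set(sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k])
    sparse = [g if i in keep else 0.0 for i, g in enumerate(grad)]
    residual = [g - s for g, s in zip(grad, sparse)]
    return sparse, residual

def compression_speedup(t_compute, t_comm, compress_ratio, t_encode):
    """Schematic per-step speedup: compression only pays off when the
    saved communication outweighs the encode/decode overhead."""
    return (t_compute + t_comm) / (t_compute + t_comm * compress_ratio + t_encode)
```

The second function makes the talk's headline finding plausible: when computation dominates or encoding is slow, the ratio drops below 1 and compression hurts.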

## Dec. 10, 2021: Angeliki Giannou (UW-Madison)

- Links: [Video] [arXiv]
- Title: Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information
- Abstract: In this work, we examine the Nash equilibrium convergence properties of no-regret learning in general N-player games. For concreteness, we focus on the archetypal follow the regularized leader (FTRL) family of algorithms, and we consider the full spectrum of uncertainty that the players may encounter – from noisy, oracle-based feedback, to bandit, payoff-based information. In this general context, we establish a comprehensive equivalence between the stability of a Nash equilibrium and its support: a Nash equilibrium is stable and attracting with arbitrarily high probability if and only if it is strict (i.e., each equilibrium strategy has a unique best response).
- Short Bio: Angeliki is a first-year Ph.D. student at UW-Madison, advised by Prof. Dimitris Papailiopoulos. Prior to that, she received Bachelor’s and M.Sc. degrees in ECE from the National Technical University of Athens, working with Prof. Panayotis Mertikopoulos. Her research interests involve machine learning theory and optimization.
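
The FTRL family in the abstract includes, for the entropic regularizer, the familiar exponential-weights update: play each action with probability proportional to exp(eta * cumulative payoff). A minimal full-information sketch with invented names; at a strict equilibrium the cumulative-payoff gap grows over rounds, so the play concentrates on the unique best response.

```python
import math

def ftrl_exp_weights(payoff_history, eta=0.5):
    """FTRL with an entropic regularizer (exponential weights): play each
    action with probability proportional to exp(eta * cumulative payoff).
    `payoff_history` is a list of per-round payoff vectors."""
    cumulative = [sum(col) for col in zip(*payoff_history)]
    weights = [math.exp(eta * c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]
```

The talk's bandit and noisy-feedback settings replace the exact payoff vectors with estimates, but the same regularized-leader structure drives the stability analysis.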

## Nov. 12, 2021: Chulhee Yun (MIT)

- Links: [Video] [arXiv]
- Title: Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond
- Abstract: In distributed learning, local SGD (also known as federated averaging) and its simple baseline minibatch SGD are widely studied optimization methods. Most existing analyses of these methods assume independent and unbiased gradient estimates obtained via with-replacement sampling. In contrast, we study shuffling-based variants: minibatch and local Random Reshuffling, which draw stochastic gradients without replacement and are thus closer to practice. For smooth functions satisfying the Polyak-Łojasiewicz condition, we obtain convergence bounds (in the large epoch regime) which show that these shuffling-based variants converge faster than their with-replacement counterparts. Moreover, we prove matching lower bounds showing that our convergence analysis is tight. Finally, we propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings. Joint work with Shashank Rajput (UW-Madison) and Suvrit Sra (MIT).
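
The distinction at the heart of this talk is how stochastic gradient indices are drawn: with-replacement sampling gives independent, unbiased estimates, while Random Reshuffling visits every example exactly once per epoch, making within-epoch gradients dependent. A sketch of the two index streams (names invented for illustration):

```python
import random

def with_replacement_batches(n, epochs, seed=0):
    """Index stream assumed by standard SGD analyses: each draw is
    i.i.d. uniform over the n examples."""
    rng = random.Random(seed)
    return [[rng.randrange(n) for _ in range(n)] for _ in range(epochs)]

def random_reshuffling_batches(n, epochs, seed=0):
    """Index stream for Random Reshuffling: each epoch is a fresh
    permutation, so every example is seen exactly once per epoch and
    within-epoch gradients are *not* independent."""
    rng = random.Random(seed)
    out = []
    for _ in range(epochs):
        perm = list(range(n))
        rng.shuffle(perm)
        out.append(perm)
    return out
```

Minibatch and local Random Reshuffling partition each such permutation across workers, which is the setting the talk's upper and lower bounds address.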

Organizers: Jy-yong Sohn and Yuchen Zeng

Get the latest MLOPT Idea Seminar news: join the MLOPT Idea Seminar mailing list on Google Groups (a Google account is required to join).