In-depth analysis: Automated machine learning from the perspective of bilevel optimization

(a) Illustration of the key issues of an ML task. (b) Formulation of the AutoML paradigm from the perspective of bilevel optimization. Credit: Science China Press

Recently, Professors Risheng Liu of Dalian University of Technology and Zhouchen Lin of Peking University co-authored an opinion article published in the National Science Review (NSR). The article examines AutoML from the perspective of bilevel optimization, modeling various AutoML tasks in a unified way and discussing the remaining challenges and opportunities. It will be included in the NSR's special topic on "Automating Machine Learning."

Generally, AutoML requires the automation of three key tasks: meta-feature learning, neural network architecture search, and hyperparameter optimization. Bilevel optimization (BLO) is an effective mathematical tool for modeling these tasks, providing a unified AutoML framework. This framework serves the core objective of AutoML: constructing high-performance models with minimal human intervention.

Specifically, the core variables of the upper-level optimization are the "meta-parameters" (such as meta-features, network structures, and hyperparameters). The upper level seeks the optimal "methodology," that is, the meta-parameters that give the machine learning model its best performance on the validation set. The core variables of the lower-level optimization, on the other hand, are the "model parameters," which are optimized for model performance on the training set.
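The two levels described above can be written compactly as a nested problem. The notation below is an assumption made for illustration and is not taken from the article: x denotes the meta-parameters, y the model parameters, F the validation loss, and f the training loss.

```latex
\min_{x \in \mathcal{X}} \; F\bigl(x, y^{*}(x)\bigr)
  := \mathcal{L}_{\mathrm{val}}\bigl(x, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) \in \arg\min_{y \in \mathcal{Y}} \; f(x, y)
  := \mathcal{L}_{\mathrm{train}}(x, y)
```

Read from the inside out: for a fixed choice of meta-parameters x, training produces model parameters y*(x); the upper level then scores that trained model on validation data and adjusts x.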

Currently, ML/AutoML technologies, represented by gradient-based BLO algorithms, have gradually gained prominence. However, they still face numerous challenges in practical applications.

For instance, some algorithms rely heavily on the lower-level problem being convex and having a unique solution (the "singleton" condition), which limits their practicality in real-world scenarios. Additionally, when approximations are substituted for exact computations in practice, rigorous theoretical analysis of the algorithms' convergence is often lacking.
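As a rough illustration of what a gradient-based BLO method does, the sketch below tunes a single ridge penalty by following the gradient of the validation loss with respect to that penalty (often called a hypergradient). Everything in it is an assumption chosen for illustration and none of it comes from the article: the toy data, the scalar model, and the function names `solve_lower`, `val_loss`, `hypergradient`, and `tune`. The lower-level problem here is strongly convex with a unique closed-form solution, which is exactly the favorable setting that such methods tend to assume.

```python
import math

# Hypothetical toy data: (x, y) pairs. The training targets are biased upward,
# so some regularization improves validation performance.
TRAIN = [(1.0, 1.5), (2.0, 2.5), (3.0, 3.9)]
VAL = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]


def solve_lower(lam, train):
    """Lower level: w*(lam) = argmin_w 0.5*sum((w*x - y)^2) + 0.5*lam*w^2."""
    sxx = sum(x * x for x, _ in train)
    sxy = sum(x * y for x, y in train)
    return sxy / (sxx + lam)  # unique minimizer because the problem is strongly convex


def val_loss(w, val):
    """Upper-level objective: squared error on the validation set."""
    return 0.5 * sum((w * x - y) ** 2 for x, y in val)


def hypergradient(lam, train, val):
    """d val_loss / d lam, obtained by differentiating through w*(lam)."""
    sxx = sum(x * x for x, _ in train)
    w = solve_lower(lam, train)
    dF_dw = sum((w * x - y) * x for x, y in val)
    # Implicit differentiation of the lower-level optimality condition
    # (sxx + lam) * w - sxy = 0 gives dw/dlam = -w / (sxx + lam).
    dw_dlam = -w / (sxx + lam)
    return dF_dw * dw_dlam


def tune(train, val, lam=1.0, lr=0.5, steps=200):
    """Gradient descent on theta = log(lam), which keeps lam positive."""
    theta = math.log(lam)
    for _ in range(steps):
        lam = math.exp(theta)
        theta -= lr * hypergradient(lam, train, val) * lam  # chain rule through exp
    return math.exp(theta)


if __name__ == "__main__":
    best = tune(TRAIN, VAL)
    print("tuned penalty:", best)
    print("validation loss:", val_loss(solve_lower(best, TRAIN), VAL))
```

For this toy data the validation-optimal penalty can be worked out by hand (the lower-level solution 18.2 / (14 + lam) equals 1 at lam = 4.2), which gives a simple check on the descent. When the lower-level problem is non-convex or has several minimizers, the closed-form solve and the implicit derivative used here are no longer available in this form.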

Looking ahead, the challenges facing BLO in the field of AutoML, and the promising research directions, mainly include the following:

Compute Acceleration: As datasets grow and tasks become more complex, there is an urgent need to speed up BLO algorithms on large-scale, high-dimensional AutoML tasks. Parallel and distributed computing could be an effective way to address this issue.

Theoretical Breakthroughs: At present, gradient-based BLO methods rely heavily on stringent theoretical assumptions, such as uniqueness of the solution and convexity of the lower-level problem. To meet the demands of real-world applications, new theoretical analysis frameworks and efficient computational methods are needed to better handle more challenging practical scenarios involving non-convexity and discreteness.

Optimization-Derived Learning: From the new perspective of bilevel optimization, we can explore disruptive AutoML technologies that integrate the Simulating Learning Methodology (SLeM), especially in combination with large models. This exploration involves delving deeper into the underlying logic of AutoML to design more efficient and precise learning strategies.
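To make the non-convexity issue raised under Theoretical Breakthroughs more concrete: when the lower-level problem is non-convex, training may have several equally good solutions, so "the" trained model for a given set of meta-parameters is no longer well defined. One common way to state the problem in that case is the so-called optimistic formulation, sketched below. The notation is an assumption for illustration and is not taken from the article: x denotes the meta-parameters, y the model parameters, F the upper-level (validation) objective, f the lower-level (training) objective, and S(x) the set of lower-level solutions.

```latex
\min_{x \in \mathcal{X}} \; \min_{y \in \mathcal{S}(x)} \; F(x, y),
\qquad
\mathcal{S}(x) := \arg\min_{y \in \mathcal{Y}} \; f(x, y)
```

If S(x) contains exactly one point for every x, this reduces to the usual single-solution case, where the upper-level objective can be differentiated through the lower-level solution.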

In summary, this article has achieved unified modeling of different AutoML tasks from the perspective of BLO. It extensively analyzes the current state and future directions of AutoML centered around the development of BLO algorithms. The novel viewpoints presented in this article contribute to advancing AutoML, empowering artificial intelligence technology to progress toward more intelligent and efficient realms.

More information: Risheng Liu et al, Bilevel optimization for automated machine learning: a new perspective on framework and algorithm, National Science Review (2023). DOI: 10.1093/nsr/nwad292

Provided by Science China Press
