TY  - JOUR
T1  - Database Native Model Selection
T2  - 50th International Conference on Very Large Data Bases, VLDB 2024
AU  - Xing, Naili
AU  - Cai, Shaofeng
AU  - Chen, Gang
AU  - Luo, Zhaojing
AU  - Ooi, Beng Chin
AU  - Pei, Jian
N1  - Publisher Copyright: © 2024, VLDB Endowment. All rights reserved.
PY  - 2024
Y1  - 2024
N2  - The growing demand for advanced analytics beyond statistical aggregation calls for database systems that support effective model selection of deep neural networks (DNNs). However, existing model selection strategies are based on either training-based algorithms that deliver high-performing models at the expense of high computational cost, or training-free algorithms that enhance computational efficiency with reduced effectiveness. These strategies often disregard computational cost and response time Service-Level Objectives (SLOs), which are of concern to average or budget-conscious machine learning users. In addition, they lack a well-designed integration of the model selection algorithms with DBMSs, which hinders efficient in-database model selection. This paper presents TRAILS, a resource-efficient and SLO-aware in-database model selection system. To leverage the strengths of both training-free and training-based model selection, we first characterize nine state-of-the-art training-free model evaluation metrics and propose a more effective one named JacFlow, and then, restructure the conventional model selection procedure into two phases: filtering and refinement. A novel coordinator is also introduced to strike a balance between the high efficiency of train-free algorithms and the high effectiveness of training-based algorithms, ensuring high-performing model selection while adhering to target SLOs. Moreover, we incorporate the proposed algorithm into PostgreSQL to develop TRAILS, thereby both enhancing resource efficiency and reducing model selection latency. This integration establishes a foundation for declarative model definition and selection within DBMSs. Empirical results demonstrate that our TRAILS reduces model selection time and computational expenses considerably by up to 24.38x and 29.32x respectively compared to existing model selection systems.
AB  - The growing demand for advanced analytics beyond statistical aggregation calls for database systems that support effective model selection of deep neural networks (DNNs). However, existing model selection strategies are based on either training-based algorithms that deliver high-performing models at the expense of high computational cost, or training-free algorithms that enhance computational efficiency with reduced effectiveness. These strategies often disregard computational cost and response time Service-Level Objectives (SLOs), which are of concern to average or budget-conscious machine learning users. In addition, they lack a well-designed integration of the model selection algorithms with DBMSs, which hinders efficient in-database model selection. This paper presents TRAILS, a resource-efficient and SLO-aware in-database model selection system. To leverage the strengths of both training-free and training-based model selection, we first characterize nine state-of-the-art training-free model evaluation metrics and propose a more effective one named JacFlow, and then, restructure the conventional model selection procedure into two phases: filtering and refinement. A novel coordinator is also introduced to strike a balance between the high efficiency of train-free algorithms and the high effectiveness of training-based algorithms, ensuring high-performing model selection while adhering to target SLOs. Moreover, we incorporate the proposed algorithm into PostgreSQL to develop TRAILS, thereby both enhancing resource efficiency and reducing model selection latency. This integration establishes a foundation for declarative model definition and selection within DBMSs. Empirical results demonstrate that our TRAILS reduces model selection time and computational expenses considerably by up to 24.38x and 29.32x respectively compared to existing model selection systems.
UR  - http://www.scopus.com/pages/publications/85190424999
U2  - 10.14778/3641204.3641212
DO  - 10.14778/3641204.3641212
M3  - Conference article
AN  - SCOPUS:85190424999
SN  - 2150-8097
VL  - 17
SP  - 1020
EP  - 1033
JO  - Proceedings of the VLDB Endowment
JF  - Proceedings of the VLDB Endowment
IS  - 5
Y2  - 24 August 2024 through 29 August 2024
ER  -