Abstract: In recent years, data management and analysis techniques that support Artificial Intelligence (AI) have become a prominent research topic in the fields of big data and AI. Developing and applying the theory and technology of data management and analysis provides fundamental support for improving the efficiency and effectiveness of the AI system life cycle, and is expected to further promote the development and wider application of AI technology based on big data. In particular, AI technology represented by machine learning extracts knowledge by modeling data, and a single training process comprises multiple sub-processes, such as data selection, feature extraction, algorithm selection, hyper-parameter tuning, and effect evaluation. Once the evaluation result is obtained at the end of training, the model's performance usually has to be analyzed manually to uncover how it relates to the data, features, and algorithms; the training sub-processes are then adjusted and iterated over multiple rounds based on this analysis and on human experience. Machine learning tasks are therefore considerably more complicated than the query and analysis tasks of database systems. Because machine learning involves many training sub-processes and iterative adjustments, and many of these sub-processes require manual participation, the training process remains task-oriented: each sub-process is customized and optimized according to the characteristics of the individual task. This approach incurs high labor costs and cannot reuse resources such as data, features, and models across tasks, which leads to high cost, low efficiency, and high energy consumption. How to reduce the management cost of AI computing processes such as machine learning, and thereby improve the efficiency of intelligent computing, has thus become a core challenge in this field.