Research


The research themes at the IDEAS lab include, but are not limited to, the following areas; most of the work concerns either AI for SE or SE for AI:

Modeling and analyzing the behaviors of software systems is important for any form of reasoning about the effects of adaptation solutions on non-functional quality, such as response time, throughput and energy consumption. The challenge is concerned with (1) how to efficiently build an accurate model that captures the correlation between a wide range of adaptable features (with or without environmental factors) and the performance attribute; and (2) how to identify the root cause of performance problems. The produced model often serves as an inexpensive tool for evaluating a possible adaptation solution, which in turn can greatly speed up the profiling and decision-making process, either at design time or at runtime.

Example work:

  • Hybrid feature selection and adaptive machine learning modeling using information theory [TSE17] [SEAMS13] [UCC14]
  • A framework that exploits debt-aware and machine learning-driven modeling [ICPE18]
  • The first empirical comparison between incremental and retrained machine learning algorithms for modeling adaptable software [SEAMS19]
  • An empirical study about the encoding schemes for configuration performance learning [MSR22]
  • A dividable learning framework for dealing with sparsity in configuration performance learning [FSE23]
  • A sequential learning framework for configuration performance learning under diverse environments [FSE24]
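
To make the idea of an inexpensive performance model concrete, the following is a minimal sketch, not the method of any work listed above. It assumes a hypothetical encoding where each configuration is a tuple of integer option values, and it estimates the latency of an unmeasured configuration as the mean of its k nearest measured neighbours under Hamming distance. The option names and all measurements are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: one integer per configuration option.
Config = Tuple[int, ...]


def hamming(a: Config, b: Config) -> int:
    """Number of options on which two configurations differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(measured: List[Tuple[Config, float]], query: Config, k: int = 2) -> float:
    """Estimate performance as the mean over the k closest measured configurations."""
    nearest = sorted(measured, key=lambda item: hamming(item[0], query))[:k]
    return sum(perf for _, perf in nearest) / len(nearest)


# Made-up measurements: (cache_on, compression_on, threads) -> latency in ms.
measured = [
    ((0, 0, 1), 120.0),
    ((1, 0, 1), 80.0),
    ((1, 1, 2), 60.0),
    ((0, 1, 2), 100.0),
]

# Evaluate an unmeasured configuration without running the system.
estimate = predict(measured, (1, 0, 2), k=2)
print(estimate)
```

Such a surrogate trades accuracy for speed: each query costs a scan over the measured samples rather than a run of the real system, which is the property that makes a learned model useful for evaluating candidate adaptations.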

Software architecture serves as the blueprint for engineering software systems, and it is of particular importance for adaptable software, which involves many different decisions during the engineering process. Yet the difficulty lies in the fact that the engineering of complex and adaptable software systems often lacks general guidance at the architecture level, due to the high degree of dynamics and uncertainty involved. Clear guidance on architecting such software systems will not only reduce the cost of the engineering process, but also improve the quality of the developed software system.

Example work:

  • A set of architectural patterns and guidelines for engineering self-aware software systems [IEEE Com15a] [IEEE Com15b] [Arxiv14]
  • Symbiotic architecture style with dynamic architectural granularity of software systems [SEAMS14]
  • A survey and taxonomy of the architecture styles for self-aware and self-adaptive cloud autoscaling systems, including the underlying algorithms [CSUR18]
  • A patternized architecture guideline that synergizes SE domain expertise and AI algorithms to engineer self-aware and self-adaptive software systems [Proc. IEEE20]

Performance quality, e.g., latency and energy consumption, is a crucial but challenging property to maintain due to the complexity of the software's code, configuration and architecture. The key difficulty lies in the increasing complexity of software systems, e.g., the large number of features, dependencies and non-functional qualities to be tuned, as well as the different forms of domain knowledge that software engineers hold. Further, the optimization objectives often conflict to a great extent, e.g., latency versus energy consumption. The problem becomes even more difficult at runtime due to uncertainty about when changes occur and how extensive they are. To tackle these challenges, a promising solution is to develop automatic and dynamic reasoning processes that rely on stochastic search algorithms, enabling effective and efficient optimization, profiling and management of the software system's behaviours.

Example work:

  • A framework for feature and knee point guided multi-objective evolutionary algorithm to optimize software performance [TOSEM18]
  • Seeding evolutionary algorithms for multi-objectively optimizing software services [IST19]
  • Dynamic and tailored ant colony optimization for cloud software services with balanced trade-off results across different tenants [TSC17]
  • An empirical study on the optimization for adaptable cross-project defect prediction software [ICSE20]
  • A two-level reasoning algorithm to self-adapt service composition [SEAMS20]
  • A bi-level optimization framework to build models and tune highly configurable cross-project defect predictors [ASE20]
  • A tailored multi-objectivization model for tuning software configuration with mitigated local optima [FSE21]
  • A multi-modal optimization approach for lifelong performance planning of self-adaptive systems [SANER22]
  • A methodology for planning landscape analysis in self-adaptive systems [SEAMS22]
  • A long-term debt-aware adaptation framework for service composition in the cloud [TSC23]
  • An empirical study on whether and how performance preferences/aspirations can affect configuration tuning [TOSEM23a]
  • A theory and normalization improvement on the multi-objectivization model for tuning software configuration [TSE24]
  • An adaptive multi-objectivization model for tuning software configuration [FSE24]
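
The conflict between objectives such as latency and energy consumption can be illustrated with a small sketch of Pareto dominance. This is a generic textbook construction, not the algorithm of any work listed above. It assumes both objectives are minimized, and the candidate (latency, energy) values are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical objective vector: (latency in ms, energy in joules), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up measurements for five candidate configurations.
candidates = [(100.0, 5.0), (80.0, 7.0), (120.0, 4.0), (110.0, 6.0), (90.0, 7.5)]

front = pareto_front(candidates)
print(front)
```

Here (110.0, 6.0) and (90.0, 7.5) are discarded because another candidate is at least as good on both objectives; the remaining three are trade-offs among which no single configuration is best on both latency and energy.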

Search-based software engineering (SBSE) often relies on advanced metaheuristic optimization algorithms, e.g., evolutionary algorithms, to optimize various complex tasks in software engineering. However, many general research questions in SBSE remain open, e.g., how should a quality indicator be selected? Which objective normalization strategy is more suitable under multiple objectives? How can the search operators be better specialized?

Example work:

  • A criticism and corrected guidance on quality indicator selection for SBSE problems [ICSE-NIER18]
  • A discussion on what was going wrong with the current way of evaluating solution sets in SBSE, together with comprehensive guidance on the selection of quality indicators and evaluation methods [TSE21]
  • An empirical comparison between weighted search and Pareto search on multi-objective SBSE from different perspectives [TOSEM23b]
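
As one illustration of what a quality indicator computes, the sketch below implements the hypervolume of a two-objective solution set under minimization. Hypervolume is used here only as a familiar example and is not presented as the indicator recommended by the works above. The reference point and both solution sets are assumptions made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(solutions: List[Point], ref: Point) -> float:
    """Area dominated by the solutions and bounded by ref (both objectives minimized)."""
    # Only points strictly better than the reference point contribute.
    pts = sorted(p for p in solutions if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        # A point adds area only if it improves the second objective so far.
        if y < best_y:
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


# Two made-up solution sets, e.g. produced by two different search algorithms.
set_a = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
set_b = [(2.0, 3.0), (3.0, 2.0)]
ref = (4.0, 4.0)  # assumed reference point, worse than every solution

hv_a = hypervolume_2d(set_a, ref)
hv_b = hypervolume_2d(set_b, ref)
print(hv_a, hv_b)
```

A larger value means the set covers more of the objective space relative to the reference point. The result depends on the choice of reference point and on how the objectives are scaled, which is one reason indicator selection and normalization are treated as research questions rather than fixed defaults.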

Assuring software quality and reliability is an important concern. This includes intelligent testing, finding bugs and eventually fixing them. In particular, it is challenging to account for the naturalness of source code, together with the many other documents that are produced during the software engineering process.

Example work:

  • A transformer-based framework that learns the semantics of SQL and generates test cases to find SQL injection vulnerabilities [ISSTA20]

Assuring software quality and reliability is an important concern. This includes intelligent testing, finding bugs and eventually fixing them. In particular, it is challenging to account for the naturalness of source code, together with the many other documents that are produced during the software engineering process.

Example work:

  • An empirical study of the practice on how performance bugs are reported for deep learning frameworks [EASE22]
  • A multifaceted hierarchical attention network for automatically identifying performance-related bug reports for deep learning frameworks [APSEC22]