TU München - Fakultät für
Students present their completed diploma theses and system development projects.
On Friday, 23.03.18, from 10:00, in room "McCarthy" (01.11.051):
Implementation and Evaluation of Static Feature Location in Practice
Software is often developed following an incremental development process: once an initial version is finished, it is constantly adapted and extended. Some software projects grow so large that a single person cannot know where every feature is located in the code. Because this information is essential for implementing new functionality or fixing a bug, feature location aims to help find the correct location in the codebase. For this thesis, we had the following usage scenario in mind. A developer is assigned the task of fixing a bug or implementing a feature that is described textually, usually in the form of an issue from an issue tracker. They then use this description as input for the feature location algorithm to find promising points in the code at which to start the implementation. Since the program is never executed in this scenario, we decided to exclude dynamic feature location techniques and focus on static ones. We analyzed existing approaches in terms of their usability for our usage scenario and discovered that, although there are many publications on potential approaches, publicly available source code or implementations are mostly missing. Therefore, we implemented our own TF-IDF-based algorithm on top of Apache's Lucene search library. We then evaluated the viability of this algorithm using five open-source projects and one industrial project. Our results show that there are major drawbacks to using solely TF-IDF for feature location, namely the inability to identify synonyms and even obvious results, such as classes from a stack trace in the description. For our evaluation, we implemented a benchmark system that is easily extendable with other static feature location techniques or data sources, so that it can be used for further research. It executes feature location algorithms on a set of benchmarks and calculates the precision, recall, F1 measure and top-5 precision for the individual queries of the benchmarks.
One can use it to experiment with or compare different static feature location techniques.
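
To make the idea concrete, the following is a minimal Python sketch of TF-IDF-based feature location for a single query, together with the four measures named above. It is not the implementation from the thesis and does not use Lucene or reproduce Lucene's scoring; it uses a plain tf * ln(N / df) weight. The file names, file contents, query, ground truth and the helper names (tokenize, rank_by_tfidf, evaluate) are all hypothetical. Two further assumptions: a file counts as "retrieved" if its score is greater than zero, and top-5 precision is the number of relevant files among the first five results divided by five.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_tfidf(query, documents):
    """Return (name, score) pairs with score > 0, best match first."""
    term_counts = {name: Counter(tokenize(body)) for name, body in documents.items()}
    query_terms = set(tokenize(query))
    scored = []
    for name, counts in term_counts.items():
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for c in term_counts.values() if term in c)
            if doc_freq:
                # Plain TF-IDF weight: tf * ln(N / df). Assumption, not Lucene's formula.
                score += counts[term] * math.log(len(documents) / doc_freq)
        if score > 0:
            scored.append((name, score))
    # Sort by descending score; ties are broken alphabetically by name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def evaluate(ranked_names, relevant, k=5):
    """Compute precision, recall, F1 and top-k precision for one query."""
    hits = sum(1 for name in ranked_names if name in relevant)
    precision = hits / len(ranked_names) if ranked_names else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    top_k = sum(1 for name in ranked_names[:k] if name in relevant) / k
    return {"precision": precision, "recall": recall, "f1": f1, "top_k_precision": top_k}


# Hypothetical code base: each file is reduced to a bag of identifier words.
DOCUMENTS = {
    "LoginController.java": "login user password authenticate session",
    "PasswordReset.java": "password reset email token",
    "ReportExporter.java": "export report pdf csv",
    "SessionStore.java": "session store cache expire",
}
# Hypothetical issue text and hypothetical ground truth for that issue.
QUERY = "User cannot login with correct password"
RELEVANT = {"LoginController.java", "SessionStore.java"}

ranked = [name for name, _ in rank_by_tfidf(QUERY, DOCUMENTS)]
metrics = evaluate(ranked, RELEVANT)
print(ranked)
print(metrics)
```

In this toy data, SessionStore.java is part of the assumed ground truth but shares no word with the query, so a purely term-based ranking cannot retrieve it. This is the kind of vocabulary mismatch that the synonym drawback mentioned above refers to.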