Analyzing the Impact of Similarity Measures in Duplicate Bug Report Detection
Author | : Som Gupta |
Release | : 2020 |
ISBN-10 | : OCLC:1376891372 |
Book excerpt: Duplicate bug report detection is an important task performed when bug reports are assigned to the responsible developer. Because bug reports for open-source projects are submitted by people in many different geographical locations, the submission process is uncoordinated, and this uncoordinated submission leads to duplicate bug reports. The bug report triager usually has to go through the tedious process of detecting duplicates manually. Automatic duplicate bug report detection eases this work. A survey shows that comparing bug reports on the basis of similarity measures is the best way to perform duplicate bug report detection, because the unbalanced data causes an imbalance problem for machine learning approaches. In this paper, we analyze how different similarity measures affect the task of duplicate bug report detection. For our analysis, we use the Levenshtein, Jaccard, Cosine, BM25, LSI and K-Means similarity measures. Together, these measures cover Natural Language Processing, Machine Learning and Information Retrieval techniques.
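To make three of the named measures concrete, here is a minimal Python sketch of Levenshtein, Jaccard and Cosine similarity applied to two bug report summaries. It is an illustration only, not the method used in the paper: the tokenizer, the normalization of the edit distance to a 0..1 score, the use of raw term counts for Cosine, and the two sample summaries are all assumptions made here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def levenshtein(a, b):
    """Character-level edit distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Edit distance rescaled to 0..1 by the longer string (an assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_similarity(a, b):
    """Size of the shared token set divided by the size of the combined token set."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cosine_similarity(a, b):
    """Cosine of the angle between raw term-count vectors (no TF-IDF weighting)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical bug report summaries, invented for this example.
    report_a = "App crashes on startup"
    report_b = "App crashes on login"
    print("Levenshtein:", levenshtein_similarity(report_a, report_b))
    print("Jaccard:    ", jaccard_similarity(report_a, report_b))
    print("Cosine:     ", cosine_similarity(report_a, report_b))
```

A detector built on such scores would typically flag a new report as a candidate duplicate when its score against an existing report exceeds a chosen threshold; the threshold itself would have to be tuned on labeled data.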