A big problem in software development today, both in practice and in academia, is the Bug Duplication Problem [see the first discussion about it here]. RiSE Labs is working in this direction too, and I have been wondering about possible solutions to it. I would like to discuss them with you here. The discussion is based on scenarios and some possibilities.
General Case: Don’t mess up my Repository
In this general case, the idea is that a tool can help you avoid the mess. Maybe we [user and tool?] can do it together.
1st – Scenario – I cannot avoid it but I can help you with it (No intelligence on either side)
The idea here is that the user sends a bug report to the environment without checking previous bug reports, and the tool cannot do anything to prevent it. The tool just confirms it (Bug sent with success!). You can run some searches afterwards, look at the reports, and remove the new one [if it is a duplicate].

2nd – Scenario – I can try to avoid it and maybe I can help you (Some intelligence on the tool side)
In this case the user sends a bug report, again without checking previous ones, and the tool shows the user a message saying that it may be a duplicate bug report. Yes, the tool can do some analysis, compute some data and say that. The user has the opportunity to look at the possible duplicate and confirm it; in any case, the manager also receives a notification about this possibility.
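To make "do some analysis, compute some data" a bit more concrete, here is a minimal sketch of one possible check: word overlap (Jaccard similarity) between the new report and the reports already in the repository. The function names and the 0.5 threshold are my own assumptions for illustration, not something an existing tool defines.

```python
import re

def tokens(text):
    """Return the set of lowercase word tokens in a report."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Word-overlap similarity between two reports, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def possible_duplicates(new_report, existing_reports, threshold=0.5):
    """List (score, report) pairs at or above the threshold, best first.

    The threshold is an arbitrary illustrative value.
    """
    scored = [(jaccard(new_report, old), old) for old in existing_reports]
    hits = [pair for pair in scored if pair[0] >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)
```

If the returned list is not empty, the tool could show the message to the user and send the notification to the manager. A real tool would probably need something stronger than plain word overlap.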
3rd – Scenario – Yes, we can have some intelligence in the process.
In this case the user has some templates and a well-defined vocabulary for writing the bug report. With a restricted vocabulary we can perhaps be more precise, and the tool can be more accurate in its prediction. Thus, the user sends reports based on the templates, and the tool can do some analysis, better predict whether a report is a duplicate, and send the notifications.
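As a rough illustration of why a template helps, the sketch below assumes a hypothetical template with three fields and a small controlled vocabulary for two of them. The field names and vocabulary values are invented for the example. Because the fields are restricted, two reports can be compared field by field instead of through free text.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabularies, invented for this example.
COMPONENTS = {"editor", "login", "network"}
SYMPTOMS = {"crash", "freeze", "wrong-output"}

@dataclass(frozen=True)
class TemplateReport:
    component: str
    symptom: str
    version: str

    def __post_init__(self):
        # Reject values outside the controlled vocabulary.
        if self.component not in COMPONENTS:
            raise ValueError(f"unknown component: {self.component}")
        if self.symptom not in SYMPTOMS:
            raise ValueError(f"unknown symptom: {self.symptom}")

def field_match_score(a, b):
    """Fraction of template fields on which two reports agree."""
    fields = ("component", "symptom", "version")
    same = sum(getattr(a, f) == getattr(b, f) for f in fields)
    return same / len(fields)
```

Two reports with the same component and symptom but different versions would score 2/3 here, which the tool could treat as a likely duplicate worth a notification.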
4th – Scenario – Intelligent tool
Whether a template is used or not, while the user is writing the bug report the tool can perform some analysis, notify the user with some confidence that the report may be a duplicate, and show why. Yes, the tool can combine semantic and textual information and present it to the user.
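A minimal sketch of the "notify with some confidence and show why" idea, covering only the textual part (no semantic analysis): the tool could call a function like the hypothetical `check_draft` below each time the draft changes. The confidence is simply the fraction of the draft's keywords that also appear in an existing report, and the shared keywords serve as the explanation. The stopword list and all names are assumptions made for the example.

```python
import re

# Small illustrative stopword list, not a standard one.
STOPWORDS = {"a", "an", "the", "when", "is", "in", "on", "to", "of", "and", "it"}

def keywords(text):
    """Lowercase word tokens of a text, without stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}

def check_draft(draft, existing_reports):
    """Return (confidence, report, shared_terms) for the best match, or None.

    confidence is the share of draft keywords found in the existing report,
    so it also works while the draft is still short.
    """
    draft_terms = keywords(draft)
    if not draft_terms:
        return None
    best = None
    for report in existing_reports:
        shared = draft_terms & keywords(report)
        confidence = len(shared) / len(draft_terms)
        if best is None or confidence > best[0]:
            best = (confidence, report, sorted(shared))
    return best
```

The sorted list of shared terms is what the tool would show the user as the "why".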
What do you think about it? Can you suggest more scenarios? Another possible scenario is this: my repository already has duplicate reports [I cannot say that it is a mess, or I may have problems with some companies], so what can a tool do to help me be more productive? That is the case when I am starting a bug report and my tool cannot support the previous scenarios.

