ml4debugging
I want to propose a three-point manifesto that governs the communication between a computer and its programmer. The manifesto reads as follows:
We can use the same technology by treating the error message generated by a computer language interpreter, together with the line of code containing the error, as the "source" language, and the natural language description of the error as the "target" language. An example:
This is the present error message of a Python interpreter in response to a fragment of Python code:
The idea is not necessarily to replace the interpreter's error message or stack trace, but possibly to supplement it with a message in natural language. You can see a more complete list of such message pairs at this link: Present and Correct Message Pairs.
The primary challenge in this project is assembling the data that will be used to train the neural network, and I am still searching for a clever way to do it that does not require hand-writing all the message pairs.
There are a number of efforts underway to improve the quality of error messages, specifically for Python, the language discussed on this website. Some of these come from the creators of Python itself, in the most recent versions of the language. This is the only site I know of that proposes a machine learning approach.
In recent years, large-scale text generation models such as GPT-3 by OpenAI and GPT-J and GPT-NeoX by EleutherAI have come into play, and they may provide a clue as to how to proceed.
My personal website has some information about me.
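As a rough illustration of the "source" and "target" pairing described above, here is a minimal Python sketch of how the source half of one training pair might be captured automatically: it runs a faulty fragment, then records the offending line together with the interpreter's message. The function name `make_source`, the `<fragment>` pseudo-filename, the sample fragment, and the hand-written target sentence are all assumptions made for this example; they are not taken from the message-pair list mentioned above, and the target half would still need to be written or generated some other way.

```python
import traceback
from typing import Optional


def make_source(code: str) -> Optional[str]:
    """Run a code fragment and return 'offending line + interpreter message'.

    Returns None when the fragment runs without raising an error.
    (Hypothetical helper: the name and output format are illustrative only.)
    """
    try:
        # Compile under a recognisable pseudo-filename so that frames
        # belonging to the fragment can be found in the traceback later.
        exec(compile(code, "<fragment>", "exec"), {})
    except SyntaxError as err:
        # Syntax errors carry the offending source line themselves.
        line = (err.text or "").strip()
        message = f"{type(err).__name__}: {err.msg}"
    except Exception as err:
        frames = traceback.extract_tb(err.__traceback__)
        # Innermost frame that belongs to the fragment, if any.
        lineno = next(
            (f.lineno for f in reversed(frames) if f.filename == "<fragment>"),
            None,
        )
        lines = code.splitlines()
        line = lines[lineno - 1].strip() if lineno else ""
        message = f"{type(err).__name__}: {err}"
    else:
        return None
    return f"{line}\n{message}"


# One (source, target) pair. The target sentence is hand-written here
# purely as a placeholder for what a trained model would produce.
fragment = "count = 3\nprint(total)\n"
pair = (
    make_source(fragment),
    "The name 'total' is used here, but no value was ever assigned to it.",
)
print(pair[0])
print(pair[1])
```

Running many small faulty fragments through a helper like this would yield the source side in bulk, which leaves only the natural language side to be produced by hand or by a text generation model.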