It has been a month since GitHub launched its Copilot service. This AI programming tool is trained on vast amounts of publicly available code and English text drawn from GitHub. It can autocomplete entire lines of code or whole functions, generate code from comments, write tests, and quickly suggest alternative solutions to a problem, bringing a significant boost in efficiency. However, since the official announcement a month ago, discussion about copyright, open source licenses, privacy, and security has never ceased.
Recently, the Free Software Foundation (FSF) also raised doubts, putting out a call for white papers discussing the philosophical and legal questions surrounding Copilot.
The Free Software Foundation is a private non-profit organization dedicated to promoting free software. Its main work is supporting the GNU Project and developing more free software. Since the mid-1990s, FSF employees and volunteers have focused mainly on the legal and structural issues of the free software movement.
FSF Licensing and Compliance Manager Donald Robertson argued in a blog post that Copilot is unacceptable and unjust: it requires running nonfree software (such as Visual Studio, or parts of Visual Studio Code), and it is a Service as a Software Substitute (SaaSS), meaning users hand their computing over to someone else's server.
The FSF said that Copilot's use of freely licensed software raises questions for much of the free software community. Developers want to know whether training a neural network on their software counts as fair use; people considering using Copilot want to know whether code fragments and other elements copied from GitHub-hosted repositories could result in infringement; and even if all of this turns out to be legally defensible, developers want to know whether it is fundamentally fair for a proprietary software company to build a service out of their work.
To help the community get the answers it needs, and to identify the best opportunities for defending user freedom in this space, the FSF announced that it will fund a call for white papers addressing Copilot, copyright, machine learning, and free software. The FSF stated that it will read the submitted white papers and publish those that help clarify the problem; each published paper will receive a $500 reward. The deadline is August 23, 2021.
In addition, the FSF listed some questions of interest:
- Does Copilot's training on public repositories infringe copyright? Is it fair use?
- How likely is the output of Copilot to give rise to actionable infringement claims on GPL-licensed works?
- How can developers ensure that code to which they hold the copyright is protected against violations generated by Copilot?
- Is there a way for developers using Copilot to comply with free software licenses such as the GPL?
- If Copilot is trained on code covered by the AGPL, does Copilot itself violate the AGPL?
- If code generated by Copilot does give rise to a violation of a free-software-licensed work, how can the copyright holder detect that violation?
- Is a machine-learning model produced by training more like a compiled version of the training data, or more like source code that users can modify through further training?
- Is the AI/ML model trained by Copilot itself copyrighted? If so, who holds the copyright?
- Should ethical advocacy organizations like the FSF push for changes to copyright law related to these issues?
In response to the FSF's protest, GitHub said it is open to discussing any of these issues: "This is a new field, and we look forward to engaging in discussion on these topics with developers and leading the industry in setting appropriate standards for training AI models."