Legal framework of textual data processing for Machine Translation and Language Technology research and development activities/Licensing
Types of licences
The licence is the most important exploitation and dissemination instrument in copyright law. It is essentially a set of permissions covering the rights that the right-holder/licensor wishes to grant to the licensee. In the case of MP/MT, licences are important both as an instrument for receiving permissions (licensing in) and as an instrument for granting permissions (licensing out). The following set of definitions provides the key types of licensing, making reference to PSI licences as well. PSI licences are of particular interest as they constitute a key instrument for the release of huge amounts of information by Public Sector Bodies (PSBs).
- Non-transactional Licences are the licences that take effect through the actual use of the licensed material, without the need for any additional transactions.
- Re-use Licences are the licences that allow the re-user to use the PSI in a fashion other than the one originally intended by the PSB. They are not necessarily non-transactional, open or standardised. These licences are used almost exclusively in the PSI context.
- Re-usable Licences are standard licences that are publicly available and may be re-used by any licensor without any modifications. A highly re-usable licence is normally stored at a permanent URI and has a community supporting its updating.
- Standard Licences are licences that are addressed to a non-specified range of recipients and are not the result of individual negotiations between the licensor and the licensee.
- Open Access (OA) is online access to peer-reviewed scholarly research. OA has two degrees: (a) gratis OA, which is online access to scholarly resources for free, and (b) libre OA, which is online access to scholarly resources for free, with some additional freedoms for the end/re-user, which are normally granted through Creative Commons or other Open Licences. These are described, in this Report, as Open Access Licences.
- Open Licences are all standard, non-transactional licences that, to some extent, allow the end user to engage in the 4Rs, i.e. Re-use, Revise, Remix and Redistribute. Licences that allow all 4Rs under the sole condition of attribution or share-alike comply with the Open Knowledge Definition and constitute Open Knowledge Definition (OKD) Conformant Licences. Of the Creative Commons licences, only Creative Commons Attribution and Creative Commons Attribution Share-Alike are OKD Conformant Licences. Of the licences maintained by the Open Knowledge Foundation (OKF), the Open Data Commons Open Database License (ODbL) and the Open Data Commons Attribution (ODC-BY) licence are OKD conformant. According to the OKF, the UK's OGL 2.0 is characterised as a conformant but "non-reusable" licence, in the sense that it cannot be re-used by any PSB in any EU Member State. However, it should be stressed that the OGL was drafted to be used by any UK PSB, not just the UK government departments and agencies, and covers all PSI, including but not limited to Crown Copyright. The PSI released under the UK OGL 2.0 may be licensed under CC-BY or ODC-BY.
- Copyleft Licences are all licences containing terms that allow modifications to the licensed work on the condition that the work is further disseminated under the same terms and conditions. In the Creative Commons set of licences, all Share-Alike licences are copyleft licences.
- Input Licences are all licences acquired by a PSB in order to be able to release PSI.
- Output Licences are all licences under which a PSB makes content available to re-users.
- Implied Licences are licences that are not expressed in the form of a text but are rather implied by the conduct of the licensor. An example of an implied licence would be allowing web robots (bots) and crawlers to obtain data or content from a web page by not instructing them otherwise through the HTML code of the relevant web page.
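The implied-licence example above relies on a page not instructing robots otherwise in its HTML. One common way of giving such an instruction is the robots meta tag. The following is a minimal sketch, using only the Python standard library, of how a crawler might look for that signal; the class and function names are hypothetical, and the code detects only the technical opt-out. Whether its absence actually amounts to an implied licence remains a legal question that depends on the jurisdiction.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots" ...> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def page_opts_out(html_text):
    """Return True if the page asks robots not to index or follow it."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return bool(parser.directives & {"noindex", "nofollow", "none"})


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(page_opts_out(sample))
```

A crawler collecting text for language resources could run such a check before storing a page and skip pages that opt out.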
Exemplary case: Creative Commons Licences
All Creative Commons (CC) licences are a combination of four high-level Licence Elements. These are the core terms of the CC licences that may be combined with each other in order to produce the different CC licences. The CC licence elements are the following:
- Attribution
- Non-commercial
- Share-Alike
- No Derivatives
The licensor may choose a combination of the above in order to build the licence that best suits her needs. In the rest of this subsection we present each of the elements separately.
Attribution (BY)
While licensors may choose any of the above elements, all the CC licences contain the Attribution element. This reflects the need to accommodate moral rights within the CC licences, and the fact that in the first year of operation of the CC licences the element was chosen by all CC licensors. One of the key functions of the CC licences is to allow the maximum dissemination of the work in order to increase the reputation of the creator and, in that sense, the Attribution element is a quintessential element of any open content licence. For rights holders and authors not wishing to use the Attribution element, there is always the CC Zero tool, which waives all economic rights and does not contain any positive attribution requirements. However, even with CC Zero it is possible to ask for attribution, not as a formal legal requirement but as a soft norm attached to CC Zero. The CC wizard allows licensors to describe expressly how they would like to be attributed. This is an important aspect of the CC licences, as it allows the licensor/author to opt for the attribution model that is closest to her objectives and goals.
Non-commercial (NC)
The Non-commercial element is one of the most widely used. It grants the licensee permission to copy, distribute, display, perform, and remix the licensed work for non-commercial purposes only. The licensee therefore cannot use the work commercially unless she receives additional permission from the licensor. However, because CC licences are non-exclusive, the Non-commercial element still allows the licensor to exploit the work commercially herself and to grant licences to others to use the work for commercial purposes.
Share-Alike (SA)
It is important to highlight that the Share-Alike element refers to derivative works. It means that if the licensee creates a derivative work and decides to further disseminate it or otherwise make it available, this needs to be done under the same terms and conditions as the original licence. For instance, if the original work is disseminated under a CC BY NC SA licence, then the derivative work also has to be disseminated under the same licence. The CC BY SA licence is the licence used by Wikipedia, and it allows licensing of derivative works under any licence that has been approved as CC BY SA compatible by Creative Commons. The SA element does not refer to the original works, which are always licensed directly from the licensor and hence always under the same terms and conditions, no matter how many copies are made. The SA element is viral in the sense that it triggers a proliferation of the CC licences as more derivative works are produced, and in that sense the CC SA licences are copyleft licences.
No Derivatives (ND)
The No Derivatives licence element allows licensees to copy, distribute, display, and perform only verbatim copies of the work. The creation of derivative works is not allowed unless explicit permission is obtained from the licensor. The ND licence element is used where the licensor does not wish to allow any changes to the original work but wants to encourage people to disseminate it as widely as possible.
List of possible combinations
The combination of the aforementioned elements results in the following possible licences:
- Attribution
- Attribution Non-commercial
- Attribution Non-commercial Share Alike
- Attribution Non-commercial No Derivatives
- Attribution No Derivatives
- Attribution Share-Alike
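The set of combinations can be derived mechanically. The sketch below is only an illustration under two stated assumptions: Attribution is present in every licence, as described earlier, and Share-Alike and No Derivatives never appear together, since Share-Alike governs how derivative works are licensed while No Derivatives does not permit them. The function name and the short labels (BY, NC, SA, ND) are chosen here for illustration.

```python
from itertools import product


def cc_licence_combinations():
    """Enumerate the CC licences that result from combining the four elements.

    Assumptions: BY is always included; NC is optional; at most one of
    SA or ND may be chosen, because they are mutually exclusive.
    """
    combinations = []
    for non_commercial, derivative_term in product((False, True), (None, "SA", "ND")):
        parts = ["BY"]
        if non_commercial:
            parts.append("NC")
        if derivative_term is not None:
            parts.append(derivative_term)
        combinations.append("CC " + "-".join(parts))
    return combinations


if __name__ == "__main__":
    for name in cc_licence_combinations():
        print(name)
```

Under these assumptions the enumeration yields six licences: plain Attribution plus the five combinations that add Non-commercial, Share-Alike or No Derivatives.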
Summing up the basic features of the CC project
The CC project has features that may be useful in projects that go beyond copyright law. The licences were originally built to accommodate the need for a user-centric system in which the individual is able to re-use content without having to obtain permission for every subsequent use of the work. This is particularly useful in cases of user-generated content, and it raises the question of whether elements of this model could be used for the management of personal data. The most important features of the CC project are as follows:
- It is a human-centric project: unlike standard licence agreements issued by companies that have their own legal teams, the CC licences are public, that is, they are offered to copyright holders to license their works and are addressed to potential re-users of the work. This means that they need to be easy to read and understand for both the licensor and the licensee.
- It is a licensor-centric, author-respecting project: it respects the rights of the author in the sense that she decides whether or not to license the work, and moral rights are respected in all licences.
- It is machine readable: it acknowledges the fact that most searching for works is done automatically, so search engines need to be able to “read” and recognise the respective licence elements. This facilitates the easy identification, re-use and marking of the work and is thus in accordance with current creative practices on the Internet.
- It is standardised and modular: the licences are standard, so their operation is simple and clear, and once interoperability is achieved it holds for the entirety of the CC project. At the same time, the CC licences are modular, i.e. they allow a limited set of combinations that provide the necessary level of flexibility to prospective licensors and users.
- It has paid great attention to its organisational and institutional roll-out: the project has been very successful not only because of the features of the licences but also because it has been rolled out through universities and the law professors who taught the use of the licences to future lawyers. The process of transferring the licences to different legal systems involved the key academic and often professional institutions in different countries, and in that sense secured the institutional acceptance of the licences. Any similar process in any other area, such as privacy, should involve a similar procedure.
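The machine-readability feature listed above can be illustrated with a short sketch. It assumes that a page marks its licence with a rel="license" link, which is the convention Creative Commons recommends for web pages, and that the licence URL follows the pattern creativecommons.org/licenses/<elements>/<version>/. The class and function names are hypothetical, and real pages may express licences in other ways that this sketch does not cover.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LicenceLinkParser(HTMLParser):
    """Collects the href of every <a> or <link> element marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.licence_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "license" in rel_values and attr.get("href"):
            self.licence_urls.append(attr["href"])


def cc_elements(licence_url):
    """Return the licence elements encoded in a CC licence URL, or [] if unrecognised."""
    parsed = urlparse(licence_url)
    segments = [segment for segment in parsed.path.split("/") if segment]
    if (
        parsed.netloc.lower() != "creativecommons.org"
        or len(segments) < 2
        or segments[0] != "licenses"
    ):
        return []
    # The second path segment holds the elements, e.g. "by-nc-sa".
    return segments[1].upper().split("-")


if __name__ == "__main__":
    page = (
        '<p>Licensed under <a rel="license" '
        'href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA 4.0</a></p>'
    )
    parser = LicenceLinkParser()
    parser.feed(page)
    for url in parser.licence_urls:
        print(url, cc_elements(url))
```

A corpus-building tool could use such a check to record the licence elements of each harvested page alongside the text itself.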