Wikibooks:Artificial intelligence
This page documents an official Wikibooks policy that the Wikibooks community has accepted and Wikibookians must follow. Except for minor edits, please use the discussion page to propose changes to this policy.
This policy has an unstable branch for proposing changes.
The following policy outlines the Wikibooks community's consensus on the use of content from generative artificial intelligence models ("Gen AI").
Text generation
Large language models (LLMs), often referred to as "AI chatbots" or simply "AI", can be beneficial in certain circumstances. Well-known models include Gemini, ChatGPT, and Copilot. However, like human-generated text, machine-generated text can contain errors or flaws, or even be entirely useless. In particular, asking a language model to write a book or an essay can produce complete fabrications, including fictitious references. The output can be biased, libel living people, infringe on copyrights, or simply be of poor quality. This might not pose a large risk for rote tasks within closed communities; however, these issues quickly become problematic in large communities and in environments where knowledge transfer, verifiability, accountability, and critical thinking are important. In particular, the high volume and speed at which LLMs can generate content, all of which would need to be verified, means that their use poses an outsize risk. As such, LLMs may not be used to generate or summarize material and ideas at Wikibooks, and sources they supply should not be trusted without verification.
Translation
Translations made by LLMs are not allowed on Wikibooks due to potential issues as described above. Please see Wikibooks:Content translation for further information.
Media
Many generative AI models can generate media, such as images and videos, from prompts. If you are interested in uploading such media, please be aware of the relevant policies: the licensing policies at our sister project Wikimedia Commons if you are uploading there, or our local policy on images if you are uploading here.
Required disclosure
Any permissible media made with the help of an LLM must be explicitly marked as such in both the edit summary and the page's discussion page. The following information must be provided:
- The date of generation/addition
- The tool and tool version used (e.g. Gemini, ChatGPT, Midjourney)
- The prompt(s) fed into the tool
This applies to every use of AI-generated content. If you generate content from new prompts and incorporate it into a page multiple times, each instance must be documented, including on the talk page.
Detection and enforcement
As of this policy's adoption, there are no reliable, high-quality tools capable of detecting AI-generated materials. Any tools that claim to positively identify AI-generated materials should not be relied on to enforce this policy. Instead, editors will have to be on the lookout for various issues, such as:
- Illogical or meaningless sentences
- Sentences, phrases, or arguments that seem coherent on the surface but do not hold up to scrutiny
- Word changes that inappropriately change the meaning of a sentence
- Phrasing that suggests it was generated in response to a prompt
- Flawed or 'too perfect' images
- Hallucinations (incorrect information about which the AI is confident; often uncited, citing a nonexistent source, or citing a source that does not actually support the claims made)
- Rapid, bulk additions of problematic content with no evidence of in-progress work
- Non-wiki formatting and markup
Most of these issues, however, can occur without the use of generative AI tools. If you detect them, first engage with the contributor in good faith to point out and address the problematic content, with the goal of resolving the issues. If good-faith discussion and guidance fail, or if repeated, unambiguous violations of this policy are found, problematic editors may be subject to warnings and subsequent editing restrictions.
Copyright violation detectors (e.g. Earwig's Copyvio Detector) can help identify text copied verbatim from online sources.