IFIN Generative AI Policy
Introduction
"Generative AI," or "GenAI," is defined as any tool, application, service, or algorithm that employs machine learning techniques against a corpus of training today to produce material resembling human creative output. This is distinct from "predictive" AI, which produces numerical or categorical predictions and classifications based on input data. While generative AI uses predictive AI components, its output requires specific consideration.
The value proposition of GenAI is the productivity increase afforded by the models performing the "heavy lifting" of creative output. Provided with a prompt, the model will produce material that is similar in form to what a human would have produced manually. This value proposition is confounded by fundamental flaws in the technology, as well as the negative legal, societal, and environmental externalities of the technology at scale.
Policy
The use of genAI in contributions to IFIN is strictly prohibited. "Contributions" is defined in the IFIN Collaboration Agreement.
Hallucination/Misinformation
"Hallucination," or "confabulation," is the tendency of generative models to produce output that, while authoritative in tone, has no factual basis. This is an unavoidable result of the nature of model training and optimization in its current form. Consequently, all model output must be scrutinized and validated for accuracy and veracity. This labor increases with the size of the model output to review. Especially in text (i.e. not computer code) output, significant domain expertise is required to validate the model's text. Having an expert produce the text in the first place is almost always a better use of resources.
For computer code specifically, hallucinations can result in hard-to-identify bugs or vulnerabilities. This is especially true when codebases are created in whole or large part by generative models. Lack of familiarity with the codebase leads to an inability to verify the output of code. While a mature software development lifecycle may catch these issues beforehand, there is the possibility that unit and integration tests themselves are AI-generated. Put simply, the greater the amount of trust in genAI code, the higher the probability of bugs and vulnerabilities.
Misinformation is another unavoidable consequence of the nature of genAI. Models have no "understanding" of the vectors they evaluate to produce their predictions and output. Causality, chronology, and other logical structures common to human reasoning are not present in genAI, regardless of marketing terminology. Conclusions and interpretations within model output are simply probabilistic results based on input, and can be easily manipulated by any of the input data to produce inaccurate results.
So far, the risks discussed have not considered malicious intent from any of the data sources involved in producing the model's inferences. When the internet itself becomes a source of input values, that can not be safely assumed. Indirect prompt injection occurs when malicious instructions are fed to the model without the knowledge of the user. These instructions affect the output of the model, sometimes altering factual statements, possible phishing attacks, and even direct data leakage. Data leakage in particular is a major concern in agentic systems, where models are given direct control over tools that can make contact with the outside world, possessing any number of secrets in their context.
Copyright/Licensing
Generative models were developed using copyrighted materials for "training" (preparing models to produce novel output based on novel input). Litigation has already begun to reward copyright holders for genAI companies' appropriation of their intellectual property. Other cases are ongoing, and the liabilities for producing potentially copyrighted material with generative AI are unclear. The safest way to insulate a given organization from legal liability is to avoid the use of genAI in business products.
Environmental
Although specifics differ in reporting the current level of environmental impact of both training generative models and ongoing inference, the overall picture is clear: the insatiable demand for computing capacity and the energy necessary to power it has significant detrimental impacts in the forms of emissions and water usage. Construction of both data centers and energy production facilities—few of which are renewable—deleteriously impacts residents near these sites. While some AI advocates claim these systems will somehow invent solutions to the problems they now cause, this is magical thinking that downplays the actual risks posed by the technology today.
Conclusion
With so many downside risks, and with such dubious benefit to usage, the choice for IFIN is clear: we choose not to participate in the toxic cult of generative AI.