Australian AI needs a copyright-free zone

March 25, 2026

Australia’s copyright laws block domestic AI development, leaving its institutions dependent on foreign models that foreign governments can compromise.

No Five Eyes partner has a legal framework that supports domestic training of frontier foundation models, the general-purpose systems trained at the largest scales of computing and data and capable of reasoning across domains such as code, strategy and law. Yet defence and intelligence systems across those countries will increasingly run on exactly these models. If they are trained elsewhere, under conditions Canberra won’t have set and will be unable to verify, that is not just a copyright-policy question; it’s a supply-chain vulnerability.

In October 2025, Attorney-General Michelle Rowland confirmed that Australia would not suspend copyright on material used for AI training, favouring instead a model that prioritised copyright licensing to protect creative industries. The stance is principled. It is also becoming a strategic liability, because Australian institutions depend on models built elsewhere. Since it is impracticable for AI developers to negotiate with and pay thousands or millions of copyright holders, copyrighted material is effectively off-limits, and developers and their models are left with the debilitating limitation of training only on non-copyright material.

Australia’s only domestically trained large language model project, Maincode’s Matilda, is trained on copyright-free material and narrow domain data; it is not a frontier system and does not claim to be. Nor can it be without access to material that’s generally copyrighted.

US tech company Meta downloaded more than 160 terabytes of copyrighted material to train its AI models from shadow libraries such as Anna’s Archive, which preserve repositories of otherwise unavailable media. Internal emails unsealed through litigation revealed downloads from multiple shadow libraries between April and July 2024. In June 2025, a US federal court ruled the training use was fair, but the judge said the ruling was driven by the plaintiffs’ failure to develop evidence of market harm; it was not a general endorsement of training on copyrighted works.

In this area, China is permissive by design: it lets its AI developers freely use copyrighted material. Representatives from Anna’s Archive, the largest shadow library, said in early 2025 that the archive provided high-speed data access to about 30 companies training large language models, most of them Chinese. The European Union has imposed licensing that slows its own developers while doing nothing to constrain models trained elsewhere. Australia, Britain and New Zealand have no legal pathway at all. Constraint binds those who observe it. Advantage accrues to those who do not.

The security risk is compounded by how these models learn. Pre-training shapes a model’s foundational representations of language, reasoning and knowledge; fine-tuning cannot adjust what was never learned. The pre-AI, human-authored body of work is finite and non-renewable, and research confirms that models trained iteratively on synthetic, AI-generated outputs degrade in robustness and generalisation. Shadow libraries preserve what commercial licensing systematically excludes: out-of-print texts, non-English works, academic materials and culturally significant works without clear rights-holders. No single national institution could feasibly assemble a comparable collection, and current copyright laws make it illegal to try. Once these works become inaccessible, they cannot be recreated.

Countries that cannot train their own frontier-scale AI systems become consumers of foreign AI models, inheriting assumptions and failure modes reflecting someone else’s design decisions across every domain those models touch. Data governance is now a strategic-resource issue comparable to critical minerals and advanced semiconductors.

This matters because foundation models can be compromised with surprising ease. A bugdoor is a vulnerability that appears accidental but functions as a covert access vector. In AI systems, bugdoors are spread across millions of model parameters rather than identifiable in discrete code, making them exceptionally difficult to detect. Research by Anthropic, Britain’s AI Security Institute and the Alan Turing Institute demonstrated that as few as 250 malicious documents can produce a backdoor in a large language model, regardless of model size or total training data volume. Separate research by Anthropic has shown that once embedded, such backdoors survive every standard safety technique applied to them.

The 2025 US House Select Committee report on Chinese AI company DeepSeek illustrates what Australia could open itself up to by using foreign models: systematic censorship in accordance with Chinese law, with user data flowing through infrastructure subject to foreign national-security legislation. But the risk is not confined to Chinese systems.

The United States and Five Eyes partners maintain legal authorities to compel technology providers to grant intelligence access and, in some cases, to build new technical capabilities into their systems. It would be strategically incoherent to assume that AI models, far more general-purpose than any prior infrastructure layer, would be exempt from similar obligations. What matters is whether the leverage is visible, and whose interests it serves.

If Australia cannot verify the source of the foundation models its defence systems depend on, it has a strategic vulnerability it has chosen not to close. These models cannot be meaningfully audited after training; the behaviours embedded in their weights are not legible to external review. What can be verified is what went in, and a lack of transparency on this data introduces strategic risk.

Making Australia’s copyright principles strategically viable requires three things.

First, a sovereign corpus trust. Licensing alone is insufficient if the bodies of work it seeks to govern are taken offline or made inaccessible before frameworks are in place. Australia’s national and state libraries have existing collection mandates that offer a strategic starting point, but preserving the largest pre-AI text archives as public-interest infrastructure may require enabling legislation to do so at scale.

Second, supply-chain procurement. External foundation models deployed in defence and high-trust government contexts should require verifiable training-data provenance as a condition of use. The Tech Policy Design Institute’s AI Agency framework provides an emerging tool for assessing where these capability gaps lie, mapping 101 capabilities across six layers of the AI ecosystem, from electricity and compute infrastructure through data assets and models to governance. The framework scores each capability on national maturity and strategic dependence.

Third, a scalable licensing mechanism analogous to the collecting societies that resolved similar tensions when broadcast radio emerged. It should be author-governed and jurisdiction-aware, and the architecture is achievable. Licensing does not constrain developers who are willing to train without permission. What it does is give states that currently restrict training a legal pathway for their own developers to build domestically, rather than forcing them to choose between creator backlash and permanent dependence on foreign models.

These decisions compound. Every month without a domestic training pathway deepens reliance on models trained elsewhere, under conditions Australia did not set. Sovereignty in AI is not declared; it is built deliberately, early and with a clear understanding that the window is closing. It must be Australian made.

This article was published by The Strategist.