Can AI Empower Indigenous Language Revitalization?
From Alice Springs to AI Advocacy: Lawson Stapleton's unique path to language preservation
When we talk about the power of artificial intelligence in the language space, our minds often go to high-volume use cases: global brands, massive datasets, dozens of languages translated at scale. But there’s another side to this revolution, one that has the potential to make an even more meaningful impact: the use of AI to revitalize and preserve Indigenous languages.
In this edition of AI in Loc, I sat down with Lawson Stapleton, an Australian localization expert who resides in Finland and is a passionate advocate for Indigenous language equity. From his unique upbringing straddling Western and Indigenous worlds, to his pioneering work building Australia’s largest Indigenous interpreting program, Lawson brings a deeply personal and practical perspective to the challenges facing endangered languages and how small language models (SLMs) might be the technological lifeline we didn’t know we had.
What follows is a wide-ranging conversation about cultural resilience, linguistic justice, and the quietly revolutionary role that community-driven AI can play in empowering the languages most at risk of disappearing forever.
Stefan Huyghe: Lawson, we’ve known each other for a while, and I’ve always been fascinated by your passion for Indigenous language work. Can you give us a bit of background on who you are and why this topic is so close to your heart?
Lawson Stapleton: Sure. To be honest, I’m no one important, but I do have a real passion for this. I grew up basically between Adelaide and Alice Springs, and that part of Australia has a much higher density of Indigenous communities compared to places like Sydney, Melbourne, or Brisbane. So I had the blessing of growing up with one foot in the Western world and one in the Indigenous world.
Later, while working in the LSP industry in Adelaide, that background became a kind of calling. Over eight years, my main focus was building what turned out to be the largest Indigenous interpreting wing in the country. It took off in a big way, and that experience gave me a deep understanding of both the obstacles and the potential in this space.
Stefan Huyghe: And now that AI is entering the picture, do you see a new opportunity?
Lawson Stapleton: Absolutely. Given everything I’ve seen and experienced, I honestly believe we now have a revolutionary tool at our fingertips. I’m not saying that for the sake of drama; it’s real. If we apply AI in the right way, particularly small language models, we can create breakthroughs that were unimaginable just a few years ago.
The Scope of the Crisis: Understanding the scale and urgency of Indigenous language endangerment
Stefan Huyghe: Not many people realize just how many Indigenous languages exist, or how many are at risk of disappearing. Can you share some numbers to help us frame the scale of the issue?
Lawson Stapleton: Sure. Globally, we’re looking at somewhere between 6,900 and 7,200 languages spoken today. Of those, around 3,100 are considered endangered. And here’s the thing: regardless of population size or current health, Indigenous languages almost always fall into that endangered category. They’re almost all on the path toward extinction unless something changes.
Take Canada, for instance. Post-colonial era, there were around 450 Indigenous languages. Today, that number is closer to 70. In Australia, we used to have about 400 to 500; now we’re down to between 60 and 100, depending on how you count them. So the loss is massive.
Stefan Huyghe: And I imagine most of these are under-resourced as well?
Lawson Stapleton: Exactly. “Endangered” and “under-resourced” often go hand-in-hand. These languages tend to lack written materials, trained linguists, documentation, and technological infrastructure. And that makes the challenge even more complex, because without foundational resources, it’s hard to build any kind of preservation effort, let alone incorporate AI. But that’s also where the opportunity lies.
The Missing Middle: Why LSPs Matter
Stefan Huyghe: One thing that really stood out in the presentation you originally prepared, and in our conversations, is your perspective on the role of LSPs. How do you see them contributing to Indigenous language revitalization beyond just traditional translation work?
Lawson Stapleton: That’s such a good and big question. I think LSPs have a real future in this space, not just because of their technical expertise, but because this space desperately needs a mediator. You’ve got Indigenous communities on one hand, and government departments on the other. There's often mistrust between them, and a history of projects that didn’t go well. You need someone who can bridge the cultural, political, and technical gaps.
Stefan Huyghe: So LSPs as more than vendors, almost like facilitators?
Lawson Stapleton: Exactly. LSPs can bring these groups to the table and help them actually talk to each other. But it goes beyond that. The way LSPs typically work, taking a project, sending it off, delivering it back, doesn’t apply here. When you're working with Indigenous languages, it’s far more hands-on. It’s collaborative. You’re working with the community, not just for them.
Stefan Huyghe: That kind of cross-sector collaboration, between governments, LSPs, and Indigenous groups, seems rare. Why is that?
Lawson Stapleton: It comes down to structure and inertia. In many post-colonial countries, language programs for Indigenous communities often get dropped into generic departments that aren’t equipped for this kind of work. For example, a Department of Languages might be great at handling immigrant languages, like English to Japanese or German, but Indigenous languages are different. They fall through the cracks. The departments aren't specialized, they don’t have the right relationships or expertise, and the bureaucratic machine can’t move fast enough.
And to be honest, most LSPs either don’t get involved because they don’t know how, or they try and give up because it’s too complicated. There’s also the issue of mismatched timelines: governments work on financial year cycles, communities work at their own pace, and LSPs are stuck in the middle. But again, that’s exactly why they matter. If done right, LSPs are the perfect mediators. The problem is, very few are even trying.
Lessons from the Land: What Indigenous languages can teach us about sustainability, learning, and storytelling
Stefan Huyghe: Let’s zoom out for a second. You've done some recent research on Canada as a kind of microcosm for global Indigenous language loss. What can we learn from the Canadian context?
Lawson Stapleton: Canada’s an interesting case. Under the Trudeau government, they allocated a staggered budget between 2020 and 2024 that reached about a quarter of a billion Canadian dollars for Indigenous language revitalization. So, the money is there. The legal mandates are there. But despite all that, many Indigenous languages are still in decline.
What that tells me is that this issue isn’t just about funding, it’s about how deeply these languages are tied to ways of life. Losing a language means losing historical knowledge, sustainable practices, and unique worldviews. And here’s the irony: at a time when we’re increasingly worried about climate change and environmental degradation, we’re ignoring the very communities who have been living sustainably for thousands of years.
There’s a great book I came across called Dark Emu by Bruce Pascoe. He argues that before we push forward with modern environmental policies, we should actually pause and learn from Indigenous communities: how they manage land, how they use controlled burn-offs, how they fish or farm without destroying the ecosystem. In many cases, scientific experiments are now backing up these methods as being more effective than our own.
Stefan Huyghe: That’s fascinating. And it seems like that ties back into language, not just as a means of communication, but as a repository of knowledge.
Lawson Stapleton: Absolutely. Language shapes how people think and how they teach. In Indigenous communities, learning isn’t based on textbooks and lectures, it’s about storytelling, kinesthetic learning, communal memory. And if we’re being honest, not everyone learns well through reading and writing. There’s something powerful in recognizing that storytelling and oral transmission aren’t primitive, they’re just different, and in many ways, more inclusive.
When we talk about revitalizing Indigenous languages, we’re not just preserving words. We’re preserving worldviews, ways of teaching, and modes of understanding that could actually make us better learners, better citizens, and better stewards of the planet.
Enter AI, A Workflow for Revitalization: The six steps to supporting Indigenous languages through AI
Stefan Huyghe: Let’s bring AI into the picture now. This is where your passion for Indigenous language work intersects with mine. You’ve outlined a six-step process for using AI to support minority languages. Can you walk us through what that looks like?
Lawson Stapleton: Absolutely. This is where things get exciting, because it brings hope and action together. I developed this workflow with a friend who’s an AI engineer. We wanted to make sure we were grounding the approach in both linguistics and tech.
The first step is establishment. That means choosing the right model and API hosting for the language. It’s not just about picking what’s popular, it’s about what’s appropriate. What’s the capacity? What are the limitations? What are the regional obligations? Many Indigenous communities, especially in Canada or Australia, want to retain ownership of their data, so local or government-secure hosting is often essential.
Step two is data acquisition. That includes raw textual data, speech-to-text recordings, and even old scanned documents. Legal rights over the content matter here too, so there has to be care around copyright and consent.
Step three is digitalization and modernization. This is where you take old material, often biblical or colonial documents, and make them usable. For instance, the Bible is surprisingly central to a lot of revitalization projects. It’s massive: the New and Old Testaments contain nearly 800,000 words, which turns into around 100,000 usable sentences. And since these texts exist in dozens of endangered languages, they’re a valuable data source, if you can modernize the language.
Stefan Huyghe: I had no idea the Bible played such a role.
Lawson Stapleton: It’s wild, right? Hebrew’s revitalization used it as a foundation. And languages like Hawaiian and Māori saw major revitalization movements in the ’70s and ’80s using similar source material. Even in my hometown of Adelaide, we’re seeing this pattern with the Kaurna language: sifting through old texts and bringing them back to life.
The fourth step is validation and enrichment. That means cleaning the data, restructuring it for AI models, and enriching it through back-translation and community consultation. You can’t do this without linguists; it’s not just a technical job.
Then comes step five, model fine-tuning: adjusting the AI to reflect culturally specific language use, including what can and can’t be said, and ensuring the model can go both ways, translation and conversation.
Finally, step six is testing and deployment: building the model into real-world workflows. And to be honest, projects can stall at any one of these steps if the right people aren’t involved.
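Editor's note: as a rough illustration of step four (validation and enrichment), the sketch below cleans a list of parallel sentence pairs, drops duplicates, and uses a back-translation check to flag pairs for human review. It is a minimal sketch under stated assumptions, not the workflow Lawson and his colleague built. The function names, the 0.6 threshold, the `TARGET_A`-style placeholder strings, and the lookup table standing in for a back-translation system are all hypothetical, and character-level similarity is only a crude stand-in for the linguist and community review described above.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def similarity(a: str, b: str) -> float:
    """Rough case-insensitive character similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def validate_pairs(pairs, back_translate, threshold=0.6):
    """Split (source, target) sentence pairs into kept pairs and pairs for review.

    back_translate is a hypothetical callable that maps a target-language
    sentence back into the source language.
    """
    kept, review, seen = [], [], set()
    for source, target in pairs:
        source, target = normalize(source), normalize(target)
        if not source or not target:
            continue  # drop pairs with an empty side
        key = (source.lower(), target.lower())
        if key in seen:
            continue  # drop exact duplicates after normalization
        seen.add(key)
        score = similarity(source, back_translate(target))
        # Low agreement does not mean the pair is wrong, only that a
        # linguist or community reviewer should look at it.
        (kept if score >= threshold else review).append((source, target))
    return kept, review


# Toy data: the target side uses placeholders, not real language content.
pairs = [
    ("The clinic opens at nine.", "TARGET_A"),
    ("The  clinic opens at nine. ", "TARGET_A"),  # duplicate once normalized
    ("Please sign here.", "TARGET_B"),
    ("", "TARGET_C"),  # empty source side
]
# Hypothetical back-translations; a real project would call a model here.
lookup = {"TARGET_A": "The clinic opens at nine", "TARGET_B": "Water."}

kept, review = validate_pairs(pairs, lambda s: lookup.get(s, ""))
print("kept:", kept)
print("needs review:", review)
```

In this toy run the first pair is kept, its duplicate and the empty pair are dropped, and the pair whose back-translation disagrees with the source is routed to review rather than deleted, so the final call stays with the people who know the language.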
The Power of Small Language Models (SLMs): Less data, more relevance, why SLMs may outperform LLMs in this domain
Stefan Huyghe: You mentioned earlier that we’re often leaving languages behind. Is that why you’re focusing on small language models instead of the big, headline-grabbing ones?
Lawson Stapleton: Exactly. When people hear “AI for language,” they think of large language models, LLMs, or neural machine translation. But when you’re dealing with Indigenous or under-resourced languages, that’s not always the right fit. These languages often don’t have massive datasets. That’s where small language models, or SLMs, come in.
An SLM can be conversational, searchable, and translatable, even with a smaller dataset. That means we can create tools that preserve the language, contribute to its use, and most importantly, empower the community to engage with it in daily life. It’s not just about building a museum piece, it’s about bringing the language back into the world.
And technically speaking, SLMs are a lot cheaper to run. They’re designed to do specific tasks, which means you need less data, less compute power, and less infrastructure. They’re also easier to host locally, which is huge, especially in communities with limited internet access.
Stefan Huyghe: Right. Infrastructure must be a big issue in remote Indigenous communities.
Lawson Stapleton: Absolutely. A lot of communities either don’t have consistent internet or any at all. That’s why local hosting and offline capability are so important. SLMs can run on a local server, be maintained and updated by the community, and provide immediate value, whether it’s at a clinic, a job center, or a school.
And here’s something we often overlook: a community doesn’t just want language access, they want control. SLMs give them that. They own the model, they shape it, and they decide how it’s used. That’s a powerful shift from most traditional tech deployments.
In fact, we’ve already tested this idea. A friend and I developed a small-scale model for Yumplatok, a language spoken in the Torres Strait between Australia and Papua New Guinea. It’s spoken by about 30,000 people, but only six certified translators exist. That’s not sustainable.
With a conversational SLM, we could create tools for check-ins at employment offices, registration at hospitals, or basic court intake processes. It takes pressure off the few available linguists and makes services more accessible, while still respecting the language’s integrity. It’s the kind of hybrid model that can truly make a difference.
Ethical AI for Indigenous Empowerment: Avoiding tokenism and building lasting value
Stefan Huyghe: We’ve talked a lot about potential, but what about the risks? How do we make sure that using AI for Indigenous languages doesn’t turn into a tokenistic gesture or another top-down project that doesn’t really serve the community?
Lawson Stapleton: That’s a really important question. And I’m going to be blunt: tokenism has been a real issue in this space. There’s a long history of governments and organizations launching flashy “language projects” just to check a box or justify funding. The results? Sometimes it’s a single children’s book, something like, “Hi, I’m Gerald, and this is my story”, and that’s it. Was that worth $60,000 in funding per language? I don’t think so.
We have to move beyond symbolic gestures. If we’re going to invest in revitalization, we need real, measurable outcomes, and that’s where AI can help. A small language model isn’t just a deliverable. It’s an active, living tool that creates access, builds skills, and actually supports the community’s needs.
But the ethics go deeper. Communities have to be involved from start to finish. They need to co-design these tools, not just be consulted at the end. That means understanding the culture, respecting their pace, and acknowledging the mistrust that often exists because of past failures.
Stefan Huyghe: And for those of us in the language or tech industries, what can we do to engage ethically?
Lawson Stapleton: First and foremost, go in with an open mind. This work isn’t about you, it’s about them. It’s about the community’s voice, their goals, and their way of doing things. You have to earn their trust, and that takes time. Learn the culture. Respect the timing. Consult on everything. If you skip that step, the project will either fail, or worse, cause harm.
It’s also important to understand that while these communities often lack resources, they do not lack perspective. They’ve been thinking about language loss and preservation for a long time. What they’ve lacked is a platform to be heard. So step one is: listen.
What Success Looks Like: Lawson’s vision for inclusive language tech
Stefan Huyghe: Let’s end on a hopeful note. What does success look like to you in this space? What would make you feel like all this work has paid off?
Lawson Stapleton: Success has a few layers for me. On a personal level, it’s seeing a community become truly bilingual in its daily life, not just in a ceremonial or symbolic way, but embedded in real systems: schools, banks, universities, health clinics. I want to see a world where Indigenous people can move between their languages with pride and ease, and where both are respected as equal.
But success is also systemic. I’d love to see a workflow where the government funds a project, and let’s be clear, the money is already there, then lets the LSP do what they do best: manage the process with professionalism and cultural sensitivity. At the same time, the Indigenous community leads the way on linguistic and cultural expertise. That’s the ideal triangle of collaboration.
And most importantly, success looks like economic empowerment. If we can build small language models and create language tech that genuinely supports community use, we also create employment opportunities. More Indigenous linguists hired. More local ownership. More money flowing back into the community, not just to consultants in the city.
That’s the full circle. When we preserve the language, we’re not just saving words, we’re investing in identity, knowledge, sustainability, and dignity. And that’s worth everything.
🎯 AI Enterprise Strategist ✔Globalization Consultant and Business Connector 💡 Localization VP 🎉Content Creator 🔥 Podcast Host 🎯 LocDiscussion Brainparent ➡️ LinkedIn B2B Marketer 🔥 LangOps Pioneer
7mo
The full recording of our interview can be found here on the LocDiscussion YouTube Page: https://youtu.be/CHB9HXTKAUM
Translator & Software Developer
7mo
One of the challenges in Africa is that many languages do not have a standard written form and some are not written at all. Maybe speech-to-speech will be the way forward? Btw I have tried using the Bible as a dataset in NMT and it doesn't get you very far when you step into the modern world.
Research leader bridging technology and global languages and cultures | Director at Bold Insight | EPIC Board Member | Speaker, author & translator (JP–EN) | Good friend & mountain lover
7mo
Great to read this! There’s some exciting work happening in this area by folks like Michael Running Wolf, Claudio Pinhanez, Muhammad Abdul-Mageed, Masakhane, Lelapa AI, Te Hiku Media, LingoAI, and Sunbird AI, to name a few. I’m moderating a public forum on small models for the AI FOR DEVELOPING COUNTRIES FORUM that Edwin Trebels is part of next weekend; some discussion will be on language tech and indigenous considerations. ✌️ And hey, just came across a wonderful new summary article worth checking out too! https://www.brookings.edu/articles/can-small-language-models-revitalize-indigenous-languages/
Engineering & Manufacturing Translation Services • Founder & Principal Consultant @ TechParlance
7mo
This is exactly the conversation we need to amplify! 🔥 You've hit the crucial counter-narrative: true linguistic equity won't come from imposing massive, data-hungry AI models onto all languages. That often risks perpetuating digital colonialism. Great piece, Stefan!
Founder/Director at RoundTable Studio
7mo
Thank you Stefan Huyghe and Lawson Stapleton for sharing your interesting insights on AI's potential role in preserving indigenous languages. It reminded me of my post from 3 years ago on the importance of revitalizing endangered languages by linking them with modern technology: https://www.linkedin.com/posts/teddy-bengtsson-89b32_roundtable-studio-helps-motorola-embrace-activity-6796431129927786496-vrz3?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAABBXcBC6GYe5pekZMjbmjNs0tvectgsLk