Mā te mahi tahi a ngā hunga pāpāho ngā taputapu reo Māori e whanake
Media collaboration to develop Māori language tools

11 Sep 2024
Ngā Taonga has entered a new collaboration with Te Hiku Media and Radio New Zealand to develop Māori language tools

Media collaboration to develop Māori language tools

Te Reo Irirangi o Te Hiku o Te Ika (Te Hiku Media), Radio New Zealand (RNZ) and Ngā Taonga Sound & Vision have entered into a new collaboration to further develop bilingual transcription and speech recognition for te reo Māori and New Zealand English.

In a first for the public media organisation, RNZ has agreed to supply archival radio broadcasts to the iwi-led charitable trust to support its groundbreaking natural language processing tools. Ngā Taonga, the audiovisual archive of Aotearoa, which cares for the RNZ archive, is providing support to ensure the recordings are cared for and handled appropriately.

Natural language processing is a branch of artificial intelligence (AI) that uses machine learning (ML) technology. Machine learning is the science of developing algorithms and statistical models to perform complex tasks such as automatic speech recognition (ASR).

Te Hiku Media, as part of its Papa Reo data science project, has been developing tools that can transcribe te reo Māori, give pronunciation feedback, and turn text into speech. This work has expanded to include NZ English, and work has begun on sister languages of te reo Māori, such as ʻōlelo Hawaiʻi. The goal of using the RNZ material is to train the tools to recognise a wider spectrum of speech, dialects and accents.

Te Hiku Media CEO Peter-Lucas Jones said: “Our tools already outperform attempts by major tech companies at te reo Māori ASR, with an accuracy of 92%. This accuracy and overall quality are directly related to the quality of the data, the quality of our relationships, and the Māori language and data science capabilities of our team. This collaboration reflects all of these things, with our three organisations spending time building trust and understanding in our relationship and demonstrating our commitment to quality te reo Māori outcomes and data sovereignty.”
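For readers unfamiliar with how ASR accuracy is usually scored: this release does not say how the 92% figure was measured, but a common convention is word accuracy, defined as one minus the word error rate (WER). The following is a minimal sketch of that convention only, not Te Hiku Media's evaluation method; the function names and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy under the 1 - WER convention (an assumption, see lead-in)."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Illustrative example: one substituted word out of four.
print(word_accuracy("kia ora koutou katoa", "kia ora kotou katoa"))  # 0.75
```

Under this convention, a 92% score would correspond to roughly eight word-level errors per hundred reference words. Because insertions count as errors, the score can fall below zero for very poor transcripts.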

Te Hiku Media has been recognised globally as an advocate for Indigenous data sovereignty and community-led data science. Ngā Taonga and RNZ have put in place strict parameters on Te Hiku Media’s use of the archival material.

RNZ regularly shares its content with other media partners, but this collaboration is the first time it is granting access to its content to help develop AI tools.

RNZ Chief Technology Officer Mark Bullen said it was very important for RNZ to partner with organisations that shared its value of te reo Māori as a taonga when considering the use of AI.

“RNZ is particularly interested in tools that help unlock our online audio, supporting improved search, discovery and accessibility in the form of captioning of content. This will support both our internal production capabilities and RNZ’s audiences. And indeed, part of the collaboration with Te Hiku Media allows for RNZ to have access to these tools. But we also felt that the values of Te Hiku Media, their well-documented position on data sovereignty and their support of whānau in preserving their oral histories aligned with our values and our responsibilities under RNZ’s Charter.”

Ngā Taonga Tumu Whakarae (Chief Executive) Honiana Love said the collaboration with RNZ and Te Hiku Media was an extension of some of the early testing the archive did with the te reo Māori transcription tool.

“We definitely saw merit in the tool given our own cataloguing work and the number of requests we have for te reo Māori transcriptions from whānau. We care for the largest body of the historical recordings of te reo Māori and mātauranga Māori in the world and in recent years have made a commitment to cataloguing te reo Māori material in te reo Māori. Supporting this work to help recognise a wider spectrum of speech, dialects and accents is important to our work to protect taonga Māori.”

RNZ and Te Hiku Media signed a Memorandum of Understanding in December 2023, agreeing in principle to give access to RNZ’s recordings. Ngā Taonga, which holds the archival kōrero, then led a consultation process with iwi/Māori stakeholders.

Following this consultation process, which included a webinar as well as direct engagement, the three parties have agreed to take the following points into account as part of the collaboration:

a) there should be no pecuniary gain from the access to the recordings;

b) the material supplied is only that which has been previously broadcast and has no restrictions;

c) the taonga cannot be rebroadcast or reused in any form other than what has been agreed to;

d) a review of the tools to check for improved handling of dialect nuances and the accuracy of transcription, in particular for names no longer in use.

The first tranche of archival material being delivered is a selection from the already publicly available He Rerenga Kōrero collection. RNZ is in the process of integrating the Papa Reo API into its workflows, initially to add transcribed audio to its search capability and, in the near future, to add captions to its player.

MEDIA END

Media enquiries for Te Hiku Media to Suzanne Duncan, 027 368 9241, media@tehiku.nz

Media enquiries for RNZ to Kim Grade, 021 391 599, kim.grade@rnz.co.nz

Media enquiries for Ngā Taonga to Julie Warmington, 021 879 886, juliewarmington@ngataonga.org.nz

Mā te mahi tahi a ngā hunga pāpāho ngā taputapu reo Māori e whanake

E mahi tahi ana a Te Reo Irirangi o Te Hiku o Te Ika (Te Hiku Media) rātou ko Irirangi Aotearoa (RNZ), ko Ngā Taonga Sound & Vision ki te whakawhanake i te hangarau tuhi i te reo Māori me te reo Ingarihi o Aotearoa.

He whakaaetanga tuatahitanga mā RNZ kia tuku i ngā kōrero reo irirangi o mua ki te tarahiti ohaoha e ārahi ā-iwi nei ki te tautoko i ngā taputapu tukatuka reo māori. Mā Ngā Taonga, te pūranga ataata-rongo o Aotearoa e pupuri ana i ngā kōrero o nehe RNZ, e tautoko te tiaki tika me te whakahaere tika o ngā taonga kōrero.

Ko te tukatuka reo māori he peka o te atamai hangahanga (AI) e whakamahi ana i te pūrere ako-aunoa. Ko te pūrere ako-aunoa te pūtaiao e whakawhanake nei i ngā hātepe me ngā tauira tatauranga hei kawe i ngā mahi uaua pēnei i te āhukahuka reo aunoa (ARA).

Hei wāhanga o te kaupapa o Papa Reo, kua whakawhanake a Te Hiku Media i ngā taputapu e taea ai te tuhi reo Māori, te whakahoki kōrero mō te whakahua reo, me te whakawhiti i te reo ā-tuhi ki te reo ā-waha. Kua whai wāhi hoki tēnei ki te reo Ingarihi o Aotearoa, ā, kua tīmata te toro atu ki ngā reo whanaunga ki te reo Māori, pēnei i te ‘ōlelo Hawai’i. Ko te whāinga o te whakamahi i ngā rauemi RNZ hei whakangungu i ngā taputapu ki te mōhio whānui i ngā reo, ngā reo ā-iwi me ngā mita hoki.

E kī ana te Upoko o Te Hiku Media, a Peter-Lucas Jones, “He pai ake kē ā mātou taputapu i ngā mea a ngā umanga hangarau matua ki te reo Māori ARA, e 92% te pāpātanga tika. Ko tēnei pāpātanga tika me te kounga whānui e hono ana ki te kounga o ngā raraunga, te kounga o ō mātou whanaungatanga, me ngā pūkenga reo Māori me te pūtaiao raraunga o tō mātou tīma. Ko tēnei mahi tahi e whakaatu ana i ēnei mea katoa. I kaha wānanga mātou i runga anō i te whai whanaungatanga me tō mātou manawanui ki ngā tino hua reo Māori me te mana raraunga.”

Kua rongo whānui ā-ao a Te Hiku Media hei kaitiaki mō te mana raraunga ā-Iwi taketake me te pūtaiao raraunga ā-hapori i te ao. Kua whakatakotoria e Ngā Taonga me RNZ ētahi here mātāmua mō te whakamahinga a ngā rauemi pūranga e Te Hiku Media.

Tohaina ai e RNZ tōna pātaka kōrero ki ētahi atu hoa pāpāho, engari ko tēnei mahi tahi te wā tuatahi i whakaaetia ai te uru ki te pātaka hei tautoko i te whakawhanaketanga o ngā taputapu AI.

E kī ana te Tumuaki Hangarau a RNZ, a Mark Bullen, he tino hira ki a RNZ te mahi tahi me ngā kaupapa e piri ana ki ngā uara o te reo Māori hei taonga i ngā whakamahinga i te AI.

"He tino pārekareka a RNZ ki ngā taputapu whakapūaho kōrero e āwhina ana ki te whakawātea i ā mātou oro ipurangi, e tautoko, e whakamāmā hoki ana i te mahi rapu me te tiki i ngā taonga. Ka tautoko tēnei i ā mātou ake mahi hanga hotaka me te hunga whakarongo ki a RNZ. Āna, mā te mahi tahi me Te Hiku Media e wātea ai a RNZ ki te whakamahi i ēnei taputapu. Heoi, i kitea te ritenga o tā Te Hiku Media me ō rātou uara, tā rātou tūnga ki te mana raraunga me te tautoko mārika i te pupuri i ngā taonga kōrero o te whānau ki ā mātou kawenga i te Rīpene o RNZ.”

E ai ki te Tumu Whakarae o Ngā Taonga, a Honiana Love, ko tēnei mahi tahi ki te taha o RNZ me Te Hiku Media he haere tonutanga i ngā whakamātautau o mua i mahia e mātou mō te taputapu tuhi reo Māori.

"I kitea e mātou te hua o te taputapu, i runga i ā mātou ake mahi whakarārangi kōrero me ngā tono a ngā whānau i ngā tuhinga e pā ana ki ngā kōrero kei ngā taonga reo Māori. E tiaki ana mātou i ngā oro kōrero ā-hītori o te reo Māori me ngā mātauranga Māori nui ake i te ao, ā, i ngā tau tata nei, kua whakaū mātou ki te whakapuaki, ki te whakarārangi i ngā rauemi reo Māori i te reo Māori anō. He mea nui ki ā mātou mahi te tautoko i tēnei kaupapa hei āwhina i te mōhio whānui i ngā reo, i ngā reo ā-iwi, me ngā mita, hei pupuri i ngā taonga Māori i runga, ka tika."

I hainatia e RNZ me Te Hiku Media tētahi Mahere Whakaaetanga i te marama o Hakihea 2023, ko tōna tikanga e whakaae ana i ngā kaupapa mō te toro atu ki ngā kōrero o RNZ. Kātahi ka kōrero a Ngā Taonga, te kaipupuri i ngā kōrero pūranga, ki te iwi Māori.

I tēnei whakawhiti kōrero tētahi hui-topa me ētahi kōrero pū ki ngā tāngata me ngā iwi, ā, anei ngā tohutohu i whakaaetia e ngā rōpū e toru mō te kaupapa:

a) kāore e whiwhi moni i tēnei kaupapa;

b) ko ngā rauemi ka tukua, he mea whakapāho kē, kāore he herenga ā-ture;

c) kāore e taea te whakapāho anō, te whakamahi anō, i ngā āhua kāore i whakaaetia;

d) ko tētahi arotake i ngā taputapu hei tirotiro mō te whakahaere pai ake i ngā whakamāramatanga reo me te pūmau i te whakamaori, ā, mō ngā ingoa kāore i te whakamahia ināianei.

Ko te wāhanga tuatahi o ngā rauemi pūranga e tukuna ana he tīpakonga mai i te kohinga He Rerenga Kōrero e wātea kē ana ki te marea. Kei te whakamahere a RNZ ki te whakauru i te API o Papa Reo ki āna mahi.

KUA OTI

Mō ngā tono pāpāho mō Te Hiku Media, whakapā atu ki a Suzanne Duncan, 027 368 9241, media@tehiku.nz

Mō ngā tono pāpāho mō RNZ, whakapā atu ki a Kim Grade, 021 391 599, kim.grade@rnz.co.nz

Mō ngā tono pāpāho mō Ngā Taonga, whakapā atu ki a Julie Warmington, 021 879 886, juliewarmington@ngataonga.org.nz

About Te Hiku Media

Te Hiku Media is a charitable media and technology organisation, collectively belonging to the Far North iwi of Ngāti Kuri, Te Aupōuri, Ngai Takoto, Te Rarawa and Ngāti Kahu.

Te Hiku Media was established in 1990 and is part of Te Whakaruruhau, the iwi radio network. Māori language revitalisation is a core focus of Te Hiku Media.

Te Hiku Media successfully applied for Data Science Platform Funding in 2019 for Papa Reo, the only non-university-based project to do so. Papa Reo is funded for seven years starting in 2020 by the Strategic Science Investment Fund held by the Ministry of Business, Innovation and Employment.

Papa Reo has a long, rich whakapapa and recognises and celebrates the work of the many te reo activists and innovators who have contributed to the important work of revitalising the language. Te Hiku Media is a product of that grit and determination and of the vision of kaumātua who continue to guide its activities.

About RNZ

RNZ is New Zealand’s independent non-commercial public media organisation and has proudly been so for almost 100 years.

RNZ delivers a diverse range of content that reflects New Zealand’s cultural, social and regional diversity. It serves as a platform for quality journalism, creating a space for open dialogue and informed discussion on topics that matter to New Zealanders.

RNZ has more than 60 content-sharing partnerships and collaborations in place. This means RNZ content is available to many online media, print, radio and television services in New Zealand and the wider Pacific. This improves the accessibility of RNZ content for New Zealand and overseas audiences and provides a valuable source of unique local content for other media.

About Ngā Taonga

Ngā Taonga Sound & Vision is Aotearoa New Zealand’s audiovisual archive with a collection spanning more than 100 years of rich history. We hold a unique position as the only dedicated audiovisual archive capturing the history of New Zealand, told in sound and moving image.

There are over 800,000 items in our care, dating back to 1895, including film and television, sound and radio recordings. Ngā Taonga cares for the largest body of historical recordings of te reo Māori and mātauranga Māori in the world.

Ngā Taonga operates as an independent charitable trust, governed by a Board of Trustees and funded predominantly by Manatū Taonga, the Ministry for Culture and Heritage; the Lottery Grants Board; and Te Māngai Pāho.