Reading time : 19 min
Framasoft invites you to try out the prototype of Lokas, a new speech-to-text transcription application that respects your privacy. This functional demo is also an experiment by Framasoft in the field of AI, accompanied by the Framamia website, which we present here (in French).
🎈Framasoft is 20 years old🎈 : Help fund a 21st year !
Thanks to your donations (66 % tax deductible), the Framasoft association has been working for 20 years to advance the ethical and user-friendly Web. Find out more about some of our actions in 2024 on the Support Framasoft website.
➡️ Read the series of articles from this campaign (Nov. – Dec. 2024)
Please note that this article is also available in French here.
Make note-taking easier with Lokas
Lokas is an application (for Android or iOS smartphones) that allows you to transcribe the sound of your voice into a text file.
Basically, during a meeting : you put the phone in the middle of the table, press the ‘Record’ button at the start of the meeting and the ‘Stop’ button at the end. A few minutes later, the application sends you a text file containing the sentences spoken by everyone.
Lokas can and will do many more things, but we’ll come back to that at the end of this announcement.
Who is Lokas for ?
Lokas is aimed at anyone who takes part in meetings. That’s a lot of people on the planet :)
However, we can share a few use cases.
First example : a nonprofit’s Annual General Meeting
Let’s imagine a nonprofit AGM. There are 15 people in the room, 2 moderators and 1 note taker. And a 2-hour meeting.
Concerns :
- Note-taking is exhausting
- The person taking the notes has limited participation
- The notes may be incomplete (a ‘blank’ due to a bathroom break).
What does Lokas offer ?
Lokas assists the note-taker, making it easier for him or her to participate (while still allowing for a pee break !).
Example of a transcription of a voice exchange using the Lokas application.
Second example : a workshop with teenagers
A workshop run by the ‘Les petits débrouillards’ association. 3 groups of 5 teenagers. A majority of girls in the groups.
Concerns :
- Note-taking can be very complicated.
- Boys monopolise the floor
What does Lokas offer ?
Lokas makes it possible to keep a record (audio and written) of what was said. It also makes it possible to compile statistics on speaking time, particularly by gender, so that we can see for ourselves that boys leave very little speaking time for girls.
Third example : a video meeting in a foreign language
Your activist collective is close to a Spanish association. Camille, a volunteer from your group who speaks a little Spanish, will be holding a video call with her contact in Madrid. The call will therefore take place in a foreign language.
Concerns :
- You need to be able to listen again at leisure, with a clear head
- You need a French transcript to share with board members.
What does Lokas offer ?
With Lokas, Camille will be able to listen to the call again, automatically transcribe it into French, and share it from her smartphone (by email, via Signal, Matrix, WhatsApp, Telegram, etc).
AI isn’t magic ✨. Neither is Lokas 🤷.
Lokas is just a tool. It can assist you in taking notes. However, like any tool, it shouldn’t exempt you from using your brain !
Writing (another highly sophisticated technology) was invented at least 3,000 years ago. So humanity has been able to get together and keep written records for at least that long. Without AI. Without smartphones. Don’t throw out several millennia of technology with the bathwater of AI. A tool like Lokas could be useful in some cases, and completely gimmicky, even counterproductive, in others. This is reminiscent of the Pharmakon, a concept dear to the French philosopher Bernard Stiegler : Lokas, like any technical object, is simultaneously poison, remedy and scapegoat.
The web, for example, is both a technological device enabling participation, and an industrial system dispossessing Internet users of their data in order to subject them to omnipresent marketing that is individually traced and targeted by user profiling technologies.
In the same way, Lokas can be emancipating (by facilitating participation rather than note-taking), or on the contrary restrictive (meetings in a noisy bar can be interesting, and we shouldn’t give them up just because the tool works better in a quiet environment), or frustrating (« The application has crashed, I don’t have any backup notes ! Technology is shite ! »).
Lokas, like a car, a hammer or a pen, is not a ‘neutral’ tool. It’s up to you, collectively, to decide whether and how you want to use it.
‘This is the story of an app…’
We thought it would be interesting to tell you how the Lokas app came about. It means lifting the curtain on what goes on behind the scenes at Framasoft, and understanding how we decide to take on (or not to take on) a given project. It’s also about showing that sometimes, with a bit of luck and a bit of (keyboard) elbow grease, you can do things that might seem impossible. However, as this part is not essential, we’ll leave it up to you to decide whether or not you want to read it.
Click here to read the (improbable and fabulous) origin story of Lokas
The idea for Lokas has been in the head of pyg, a member of Framasoft, for three or four years now.
The original idea (code name : ‘Brewawa’) was mainly to come up with an application that would be able to calculate the speaking time of participants in a meeting. The (not at all hidden) aim was to easily demonstrate that during a discussion with people of different genders, it is overwhelmingly men who monopolise the conversation.
Various tests have been carried out in recent years (hi Gee, hi bnjbvr !) to study the feasibility of such an application. But the fact is that in 2020, even if the technical possibilities were there, they weren’t really within reach of our tiny association, especially for a project that would pile on top of all those Framasoft was already carrying out.
‘It’s all about technical improvements…’.
However, with the evolution of software such as Vosk and Whisper, audio transcription capabilities (i.e. the ability to transform the sound of sentences into text) have considerably improved.
So much so that today, these technologies are used by a huge number of software applications (from YouTube and PeerTube to BigBlueButton and WhatsApp), and are often even integrated directly into devices (Samsung has clearly made this a selling point).
The last decade has also seen improvements in ‘diarization’ processes. This rather barbaric term simply refers to the technique used to identify the different speakers in a discussion. For example, if Alex, Camille and Fred are having a meeting, diarization will attribute each sentence to the person who said it. No, the software won’t guess anyone’s first name, but it will more or less manage to identify that there were three participants, and say ‘This sentence was uttered by person #1. This sentence was said by person #2.’, etc.
This is obviously an essential phase in being able to understand ‘who said what’ in a meeting.
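To make the ‘who said what’ step more concrete, here is a minimal sketch of one common way to combine a transcript with diarization output : each sentence goes to the speaker whose turn overlaps it the most. This is an illustration only, not a description of what Lokas or Whisper actually do internally ; the names `Turn`, `Sentence` and `attribute`, as well as the sample timestamps, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A stretch of audio attributed to one anonymous speaker (hypothetical structure)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str  # e.g. "person #1": real names are never guessed


@dataclass
class Sentence:
    """A transcribed sentence with its timestamps (hypothetical structure)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute(sentences, turns):
    """Pair each sentence with the speaker whose turn overlaps it the most."""
    result = []
    for s in sentences:
        best = max(
            turns,
            key=lambda t: overlap(s.start, s.end, t.start, t.end),
            default=None,
        )
        if best is None or overlap(s.start, s.end, best.start, best.end) == 0.0:
            speaker = "unknown"  # nobody was detected speaking at that moment
        else:
            speaker = best.speaker
        result.append((speaker, s.text))
    return result


# Made-up sample data: three speakers, three sentences.
turns = [
    Turn(0.0, 4.0, "person #1"),
    Turn(4.0, 9.0, "person #2"),
    Turn(9.0, 12.0, "person #3"),
]
sentences = [
    Sentence(0.5, 3.5, "Shall we start?"),
    Sentence(3.8, 8.0, "Yes, first item on the agenda."),
    Sentence(9.5, 11.0, "I have a question."),
]
attributed = attribute(sentences, turns)
for speaker, text in attributed:
    print(f"{speaker}: {text}")
```

The second sentence starts slightly before person #2’s turn, which is exactly the kind of fuzzy boundary (people cutting each other off, for instance) that makes real-world diarization imperfect.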
This process is still imperfect, but it is improving month by month. We therefore need to look ahead to 2026 or 2027 to imagine truly reliable diarization, but today it is ‘sufficient’ in 60 to 80 % of uses under ‘good conditions’.
‘It’s the story of an alignment of planets…’.
It just so happened that Framasoft had the skills needed to develop such an application.
Chocobozzz, developer of PeerTube, had already worked hard on the process of integrating Whisper into PeerTube, in order to be able to automatically generate subtitles for a video. So he’s very familiar with Whisper, its configuration options, its performance and so on.
Wicklow, developer of the PeerTube application, has been working for several months with the Dart language and the Flutter SDK, which make it possible to develop an application for different devices (Android, iPhone, computer/tablet, web, etc.) from a single code base.
Luc, our favourite system administrator (it’s not complicated, mind you, we only have the one 😅 ) manages Framasoft’s entire technical infrastructure (around sixty physical computer servers). So setting up the machine that manages the transcriptions, installing it, securing it, etc, was child’s play for him.
pyg, former director of Framasoft, now the association’s digital services coordinator, has managed countless projects for Framasoft over the last 20 years. So one more, even in the middle of a campaign, wasn’t going to stop him.
With this range of skills, and the technical capabilities of the transcription and diarization software, the planets were aligned to launch such a project.
‘It’s all about luck…’
However, as is often the case, you also have to rely a little on chance or luck.
Indeed, pyg had more or less dropped the idea of this application, simply because he was unaware of the technical advances in diarization.
It was while discussing the idea of this application at the last Framacamp, in July 2024, that Wicklow dropped a piece of information in the nick of time : ‘Ah, but you know, Whisper now does proper diarization.’
BIM 💣
‘Ah, very interesting ! But I imagine it would take a long time to develop such a free transcription application ?’ asked pyg.
‘Oh, I’d say in 3 days I can have a working prototype if Chocobozzz takes care of the server part.’
BANG 💥
So instead of enjoying his evening playing poker, pyg went off to his room and prepared a presentation of a dozen slides on a potential application project, which he presented to the association the following morning.
Some members were enthusiastic, others less so. And we can understand them : first, because it meant adding yet more work to an already particularly busy and exhausted association. Moreover, this project would use software derived from artificial intelligence, a technology about which we are (unanimously) very critical.
However, this application, which was to become Lokas, seemed to us to be a good way of ‘embodying’ the social purpose of Framasoft : to educate the public about the challenges of digital technology and the cultural commons.
This enabled us to go beyond the purely pedagogical approach, which is essential but insufficient when it comes to appropriation and self-determination. By creating a ‘manipulable digital object’, we could use Lokas as an additional opportunity to explain what AI is, its possibilities, but also its weaknesses. And so return to our ‘Pharmakon’ mentioned above.
What’s more, as well as being able to assist any collective holding meetings, this enabled us to put into practice, in concrete terms, an application bearing our values : a user-friendly tool, not exploiting users’ data, under an open licence, aimed above all at people who are changing the world for more social progress and social justice.
In the end, the majority of members present said : ‘Let’s go for it !’.
‘It’s (also) a story of limits’.
As mentioned above, the constraints were considerable.
A project inevitably costs time and money. Time and money that can’t be used elsewhere.
As you know, Framasoft lives off donations. So we have to run donation campaigns. And the end of the year was already particularly busy with the finalisation of various projects and their announcements.
In discussions with Thomas and Pouhiou, co-directors of the association, it was decided that Lokas should remain a project subject to strict limitations : it should cost less than €10,000 all-included ; it should not have a major impact on the missions of Chocobozzz, pyg or Wicklow ; and it should be completed (in ‘spare time’) between mid-September and mid-November (in particular because of the validation deadlines for the Android and iOS stores, which we don’t control).
With such constraints, it was impossible for us to produce a well-finished product. So we’ve decided to focus instead on making a prototype available. Think of this prototype as a showroom house. We’ve produced this version not by focusing on a long-term project, with solid foundations, but rather as a ‘proof of concept’, developed rapidly, to see if the concept is sufficiently attractive and interesting for us to prioritise the development of this application in 2025 (if donations are sufficient, that is !).
To give you enough ‘desire’ to see a version 1.0 of Lokas arrive one day, we called on the skills of Atelier Domino to create a logotype and a graphic charter. This led us to create the project website in-house : lokas.app
At the same time, Wicklow and Chocobozzz set about developing the prototype and the transcription server.
‘It’s a story just waiting to be written…’.
A fortnight’s work later (and an estimated cost of €7,500 all-in, split roughly half and half between time spent by Framasoft and services : Atelier Domino, server hire, domain names, Google & Apple app store validation), we can proudly and somewhat anxiously present our prototype !
How does Lokas work ?
1. Get in the right conditions
Lokas, like all transcription tools, is imperfect. Outside noise, poor articulation, a faint voice in the background, people cutting each other off… These are just some of the reasons why transcription can be difficult.
As a result, plan to be in a quiet room, place the telephone in the centre of the table (the better the sound quality, the better the transcription), don’t hold several discussions at the same time, and… take ‘old-fashioned’ notes (paper+pencil, computer+pad, etc.) in case of problems.
Once you’ve done that, it’s very simple.
2. Start recording
Simply click on the ‘Record’ button. Position the phone so that it can best pick up the exchanges. And start your meeting.
To limit abuse, recordings are limited to 5 per day per device.
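As an illustration of what such a limit can look like, here is a minimal sketch of a ‘5 per day per device’ counter. It is an assumption-laden example, not Lokas’s actual code : the class `DailyQuota`, the method `try_record` and the device identifiers are hypothetical, and a real service would persist its counters rather than keep them in memory.

```python
from collections import defaultdict
from datetime import date

MAX_RECORDINGS_PER_DAY = 5  # the limit stated above


class DailyQuota:
    """In-memory counter of recordings per device and per day (hypothetical)."""

    def __init__(self, limit=MAX_RECORDINGS_PER_DAY):
        self.limit = limit
        self._counts = defaultdict(int)  # (device_id, day) -> recordings accepted

    def try_record(self, device_id, day):
        """Return True and count the recording if the device is under its quota."""
        key = (device_id, day)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True


quota = DailyQuota()
monday = date(2024, 12, 2)
tuesday = date(2024, 12, 3)

# The first five recordings of the day are accepted, the sixth is refused.
results = [quota.try_record("device-a", monday) for _ in range(6)]
print(results)

# The counter is per device and starts again the next day.
print(quota.try_record("device-b", monday))
print(quota.try_record("device-a", tuesday))
```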
Note that the language model managed by Lokas means that it can already be used in around fifty languages, including : Dutch, Spanish, Korean, Italian, German, Thai, Russian, Portuguese, Polish, Indonesian, Mandarin, Swedish, Czech, French, Japanese and, of course, English ! Other languages are supported, but recognition will be less effective.
At the end of the meeting, click ‘Finish’.
3. Send your file for transcription (and be patient)
You may wish to listen to your file again before clicking on ‘Send’.
Your file will then be sent to our server where it will be queued for transcription.
This stage can take from a few minutes to a few hours, depending on the number of files in the queue.
You can check manually whether your file has been transcribed, or wait quietly for the notification (the verification task is carried out every 15 minutes).
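The ‘check every 15 minutes’ logic can be pictured as a simple polling loop. The sketch below only illustrates the principle and assumes a hypothetical `check_status` function that returns "done" once the transcript is ready ; the real application relies on the phone’s background tasks and on Framasoft’s server, neither of which is shown here.

```python
import time

CHECK_INTERVAL_SECONDS = 15 * 60  # one check every 15 minutes, as stated above


def wait_for_transcript(check_status, interval=CHECK_INTERVAL_SECONDS,
                        max_checks=48, sleep=time.sleep):
    """Poll check_status() until it returns "done".

    Returns the number of checks that were needed, or None if the
    transcript was still not ready after max_checks attempts.
    """
    for attempt in range(1, max_checks + 1):
        if check_status() == "done":
            return attempt
        if attempt < max_checks:
            sleep(interval)  # wait before asking the server again
    return None


# Simulated server answers (hypothetical status names), with a fake sleep
# so that the example runs instantly instead of waiting 45 minutes.
statuses = iter(["queued", "queued", "processing", "done"])
waits = []
checks = wait_for_transcript(lambda: next(statuses), sleep=waits.append)
print(f"Transcript ready after {checks} checks")
```

Passing `sleep` as a parameter is a design choice made for this example : it lets you test the loop without actually waiting, while the default behaviour still pauses for real between checks.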
Once the transcript has been received
Once you have received the transcript, you can display it in Lokas.
You can of course share it (with the application of your choice : email, Signal, WhatsApp, etc.) to correct it.
You can also see the speaking time statistics (NB : this feature is relatively experimental). If you wish, you can assign a first name (or pseudonym) to each participant to make it easier to read the notes. To obtain speaking times by gender, you can also assign genders manually, obviously ensuring that you have the consent of the people concerned to communicate this information. Note that this information is deliberately entered by hand, does not leave your phone, and is therefore not transmitted to Framasoft or anyone else.
Confidentiality point : one of the special features of Lokas is that we respect your privacy : the audio file is recorded on your phone. At your request, it is sent to our servers, which will then transcribe it. Once the transcription is complete, a notification is sent to your phone ; when you open (in ‘My files’) the meeting in question, the transcription is then downloaded to your phone. Once this stage has been completed, and after a slight delay to ensure that everything has gone well technically, everything is deleted from our server : the audio file as well as the transcript. And if you enter names, pseudonyms or genders for statistical purposes, please note that they stay on your phone and that we do not process this information in any way.
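For the curious, speaking time statistics boil down to simple additions once each segment of speech has been attributed to a speaker. The following sketch is only an illustration with made-up data ; the function names (`speaking_time`, `by_group`, `shares`) and the segment format are hypothetical and do not come from the Lokas code.

```python
from collections import defaultdict


def speaking_time(segments):
    """Total speaking time in seconds per speaker, from (speaker, start, end) tuples."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


def by_group(totals, groups):
    """Add up speaking times per group, using labels assigned by hand."""
    grouped = defaultdict(float)
    for speaker, seconds in totals.items():
        grouped[groups.get(speaker, "unspecified")] += seconds
    return dict(grouped)


def shares(totals):
    """Convert totals into percentages of the overall speaking time."""
    whole = sum(totals.values())
    if not whole:
        return {}
    return {name: round(100 * seconds / whole, 1) for name, seconds in totals.items()}


# Made-up diarized segments: (speaker, start, end), in seconds.
segments = [
    ("person #1", 0.0, 60.0),
    ("person #2", 60.0, 90.0),
    ("person #1", 90.0, 150.0),
    ("person #3", 150.0, 180.0),
]
totals = speaking_time(segments)

# Labels entered manually, with consent; in this sketch, as in Lokas,
# nothing here needs to leave the device.
genders = {"person #1": "man", "person #2": "woman", "person #3": "woman"}
gender_totals = by_group(totals, genders)
gender_shares = shares(gender_totals)
print(totals)
print(gender_shares)
```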
What about AI ?
At Framasoft, we are not at all fans of AI. We think that this technology (or rather this set of technologies) poses more problems than it solves. In fact, we tried to summarise our position on AI on the Framamia website, which we present here on the Framablog (in French).
So, isn’t it contradictory to use AI in Framasoft applications such as Lokas or PeerTube ?
In our opinion, no. For several reasons.
Firstly, as we wrote on the Framamia website, not all artificial intelligence models are created equal. Whisper, the software used for transcription, is a ‘specialised’ AI, not a ‘generalist’ AI like ChatGPT, for example.
‘Specialised models are optimised to solve a specific task efficiently. Their impact is often controlled, and may correspond to that of other software’.
Framasoft, on the Framamia.org website
Whisper is certainly an AI, but it runs ‘in isolation’ on our servers.
The algorithms used are more complex than a ‘Remove the red eyes from this photo’ filter in GIMP or Photoshop, but it remains a relatively simple model (with an input/output process). Inference (the process of using the model to perform a task) consumes far less energy than training a model. For example, running Whisper to transcribe an audio file lasting a few minutes requires relatively modest computing power.
Secondly, a project like Lokas does not require the purchase of 350,000 GPU chips for $9 billion (roughly equivalent to Togo’s GDP in 2023), as Meta/Facebook recently did. We don’t think we’ll be taking part in the growth of the AI financial bubble, or in the runaway growth of algorithmic capitalism.
Finally (and most importantly), with Lokas or PeerTube, we remain consistent with one of the values at the heart of Framasoft, namely respect for the confidentiality of your data. Indeed, we do not make any use of your files, apart from the task explicitly requested, for example transcription. They are not used to enrich an AI model based on your discussions, your identity, etc. We don’t keep audio or text files, we don’t have access to the names/first names/genders that you manually assign to participants in a discussion (that stays on your phone), etc. And, of course, your data is NEVER monetised.
In short, Framasoft doesn’t care about the content of your data : it belongs to you and is nobody’s business but yours.
Despite this, we respect the point of view of people who wish to boycott AI, and we understand the contradiction they might find in a technocritical association like Framasoft proposing projects using AI.
Our aim is to offer a tool that will enable people to think about the issues in a concrete way, so that they can form their own opinions and come to their own conclusions.
When is Lokas coming ?
You can download the Lokas app from the Play Store and from the App Store (it was initially stuck in TestFlight on iOS, because they are 🤬… let’s say picky. EDIT : it’s now available), soon from F-Droid, or get the Android APK directly from us here. But keep in mind it is a prototype (if you haven’t already, take two minutes to read ‘The Lokas Story’ and understand why), so it’s normal that lots and lots of things don’t work !
We’ve already taken time, energy and a bit of money out of limited resources (did anyone ever tell you that we only live off your donations ? ;-) ). And, obviously, this POC is open source : the code is published here on our forge.
So before going any further, we need to confirm that you are interested in this project. If the donations aren’t big enough, or if the contradictions are too strong : we’ll stop there (the code is free, so it won’t be ‘lost’).
If, on the other hand, you find it relevant, there are countless possibilities for future developments. For example :
- Complete redesign and accessibility (in prototyping mode, we went very fast, and Lokas therefore has a lot of room for improvement) ;
- Ability to (re)transcribe the file of your choice (from Lokas, a video or another application, for example) ;
- Add a ‘web’ mode to the application. This means you can use Lokas from your computer (similar to the Scribe server used by our friends at the Céméa) ;
- Add the possibility of automatic summaries of the transcripts, to quickly find the key points ;
- Translate the application (and the website) into languages other than French and English ;
- Ability to edit and correct the transcript directly from your phone ;
- Provide the option of obtaining the transcript in the language of your choice (e.g. a meeting in English transcribed into French, or vice versa) ;
- etc
But to do this, we’re going to need some staff time, and therefore money. So, at the risk of sounding insistent, we invite you, if you can, to make a donation.
Make a donation to support Lokas
The challenge : 20,000 times €20 donations for Framasoft’s 20th anniversary !
Framasoft is funded by your donations ! Every €20 you donate will be a new balloon to celebrate 20 years of adventures and help us continue and take off for a 21st year.
Framasoft is a model of solidarity :
- 8,000 donors in 2023 ;
- over 2 million beneficiaries every month ;
- your donation (66 % tax deductible) can benefit 249 other people.
To date, we have raised €58,625 of our campaign target. We still have 29 days to convince our friends and raise enough money to get Framasoft off the ground.
So, challenge accepted ?
Pierre
Hi,
It looks very neat!
Will it be usable as a self-hosted offline server-client application?
P.
pyg
I don’t see what could prevent it from working offline (well, if you use the client app, it will still have to be able to communicate with the server).
The client part and the server part are free.
On the other hand, the documentation for setting it up is brief (not least because running software like Whisper is not always easy). https://framagit.org/framasoft/lokas/
So, for the time being, the answer is ‘yes, but you need good technical knowledge’.
If we have the financial resources, we can free up some time to improve the documentation and document different installation scenarios.