useMYvoice
An open investigation into voice recognition for voices it was never trained on.
This is not a finished product. It is a structured experiment with published methodology, open data, and honest results. The work is in progress. The findings will be shared as they emerge.
Voice-to-text for the statistical centre
If your voice sits outside the norm, no setting fixes it.
Voice-to-text is built into every device you own. It was trained on the statistical centre of human speech — standard accents, typical vocal patterns, expected pacing. It works well for the majority of users because it was built to work well for the majority of users.
For people with dysarthria, cerebral palsy, motor neurone disease, a stutter, a non-standard accent, or simply a voice shaped by a life outside the mainstream — these systems fail in ways that no setting can correct. The failure is not a bug. It is the predictable result of optimising for the statistical centre.
Existing professional-grade solutions offer meaningful personalisation, but at a price point and complexity level that puts them out of reach for the people who need them most.
An open investigation
useMYvoice is a structured research programme that benchmarks how well current voice-to-text systems can be trained to hear atypical voices — and publishes what it actually finds, including results that are disappointing.
- Honest benchmarking: Three systems tested under identical conditions. Results published as they are, not as we would like them to be.
- Open methodology: All training scripts, datasets used, and evaluation methods published for individual and charitable use.
- Practical questions: Not academic benchmarks for their own sake, but answers to what an individual or small organisation could actually deploy.
- Lived experience first: The research is shaped by the experiences of people whose voices the systems fail, not by assumptions about what those experiences must be.
Would your voice be a useful test case?
If mainstream voice recognition consistently fails you — whatever the reason — I would like to hear about it. Not to sell you anything. To understand whether what we are measuring reflects what people actually experience. There is no form.
- Call or WhatsApp: 07709 968 153
Methodology — for those who want the detail
Three systems. One question.
The study begins with a standard language model and a structured training programme across three systems, chosen to represent different points in the cost and complexity spectrum available to an individual or small organisation. Each is trained and retrained iteratively using the TORGO database, an established academic dataset of dysarthric speech from speakers with cerebral palsy or amyotrophic lateral sclerosis (ALS, a form of motor neurone disease). Accuracy is measured after each training round to map the learning curve and identify where improvement plateaus.
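To make "accuracy is measured" concrete, here is a minimal sketch of word error rate (WER), a common way to score voice-to-text output. It is an assumption that WER is the metric used here; the study's own evaluation scripts may differ. The function name `word_error_rate` and the example sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: lowercases and splits on whitespace, with no
    punctuation handling. Not taken from the study's published scripts.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One missing word out of three reference words gives a WER of one third.
print(word_error_rate("the cat sat", "the cat"))
```

A lower score is better, and a score above 1.0 is possible when the system inserts many extra words. Scoring every system with the same function on the same held-out recordings is what keeps the comparison like for like.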
- Dragon Professional 15
- Role in study: The established professional standard. Tested as a baseline: the best currently available commercial solution before any open alternatives are considered.
- Whisper + LoRA refinement
- Role in study: A recent large-model approach using parameter-efficient fine-tuning. Tested to establish whether state-of-the-art accuracy is achievable without full retraining infrastructure.
- VOSK
- Role in study: A lightweight offline framework sized for low-cost ARM hardware. The candidate for eventual hardware deployment; this study is designed to validate that choice before committing to it.
Three practical questions
The goal is not to declare a winner. It is to produce honest, reproducible benchmarks that answer three questions a real person or organisation would actually need answered.
- What can each system actually achieve with atypical speech after training?
- How much training data does it take to reach meaningful accuracy?
- Does the accuracy plateau high enough to be useful in a real deployment?
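The second and third questions can be sketched in code. The example below assumes each training round produces one error rate (lower is better) and treats the curve as having plateaued once several consecutive rounds each improve by less than a small margin. The function name `find_plateau`, the thresholds, and the numbers in the usage example are all hypothetical; none of them are study results.

```python
from typing import Optional, Sequence


def find_plateau(
    error_rates: Sequence[float],
    min_gain: float = 0.01,
    patience: int = 2,
) -> Optional[int]:
    """Return the index of the round where improvement levels off.

    A plateau is declared when `patience` consecutive rounds each reduce the
    error rate by less than `min_gain`. Returns None if that never happens.
    Both thresholds are illustrative defaults, not values from the study.
    """
    flat_rounds = 0
    for i in range(1, len(error_rates)):
        gain = error_rates[i - 1] - error_rates[i]
        flat_rounds = flat_rounds + 1 if gain < min_gain else 0
        if flat_rounds >= patience:
            # The last round that still delivered a meaningful gain.
            return i - patience
    return None


# Made-up error rates after successive training rounds (not real results).
example_rates = [0.60, 0.45, 0.38, 0.375, 0.372, 0.371]
plateau_round = find_plateau(example_rates)
print(plateau_round)

# Hypothetical usability bar: is the plateau low enough to be worth deploying?
usable_threshold = 0.25
if plateau_round is not None:
    print(example_rates[plateau_round] <= usable_threshold)
```

With these made-up numbers the curve levels off at round index 2, and the error rate there is still above the hypothetical usability bar, which is exactly the kind of disappointing but useful result the study commits to publishing.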
Why VOSK matters for hardware deployment: VOSK's architecture runs efficiently on a Raspberry Pi Zero — the same hardware used in other Inclusion Vault projects. If the benchmarking confirms its accuracy plateau is sufficient, it becomes the basis for a low-cost, fully offline, personalised dictation device.
If the evidence points elsewhere, that finding is equally valuable and will be published as clearly.
Why this is being done openly
All methodology, training scripts, datasets used, and results will be published for individual and charitable use as the study progresses. There is no commercial incentive to present the results favourably. The point is to produce something useful to anyone trying to solve the same problem — whether that is another researcher, a charity, or an individual trying to decide which tool to use.