- Documentation is now shipped as HTML docs, PDFs, an epub, and via the interactive tutorial page. This is an attempt to bring the documentation up to date, to make it easy to read in the format you prefer (and on the device you prefer, if you want to load that epub onto your iPad), and to let folks who just want to dive in and get going use the interactive tutorial and then follow up with the docs when they have questions later.
- Fix for a small bug in the singleton implementation of AudioSessionManager.
- A significant fix to CMUCLMTKModel that improves probability assignment for unigrams. Previously there was a chance that one of your unigrams would be given an incorrectly-low backoff weight (there's a brief ARPA-format illustration after this list showing where that weight lives).
- LanguageModelGenerator no longer has to be used with the cmu07a.dic pronunciation dictionary. If you have another dictionary in the identical format and an acoustic model that matches the phonemes in that dictionary, you can use it instead (there's a dictionary-format example after this list).
- LanguageModelGenerator can now generate a language model from a text file (a usage sketch follows this list). Since directing developers to the CMU online language tool has caused some issues, this is the start of the process of doing all language model generation through LanguageModelGenerator and phasing out the recommendation to use the CMU online tool.
- PocketsphinxController now defaults to not returning a hypothesis if the hypothesis has no content, so you no longer need to guard against empty hypotheses yourself (see the delegate sketch below the list). This can be overridden if your app logic requires receiving empty hypotheses.
- A lot of code cleanup and simplification of both the framework and the sample app.
- The Mandarin Chinese acoustic models and all of the pre-rolled language models are now part of the separate OpenEarsExtras distribution. I’ve moved the Mandarin acoustic models because they are large and not the most commonly used elements, and I really wanted to reduce the download size for the average user. I hope it isn’t too much of an inconvenience to download them separately; if it is annoying or a hardship, let me know and I will reconsider. I’m happy that there are interesting Chinese projects being done with OpenEars and this is purely a bandwidth decision; the plan is to gradually add all non-English acoustic models to OpenEarsExtras.
- I have moved the OpenEars documentation and the Framework folder to the root of the distribution folder, since there is no longer any reason for developers to interact directly with the framework’s Xcode project unless they want to. If you are linking to the framework by reference, you may need to update that reference to point to the new location of the Framework folder.
- FliteController now uses packaged voices of the class FliteVoice, which are compiled into frameworks. OpenEars ships with one of these frameworks for the SLT voice (see the sample app for the implementation), and the other eight voices are available for download in the OpenEarsExtras distribution. There are several reasons for this:
- Because the voices now have their own initializers, speech is faster: the voice is no longer registered inside FliteController’s say: method.
- Voice-specific settings such as pitch and speed are now encapsulated inside the voice’s packaging rather than being set in a long list of conditionals in FliteController.
- Since most developers were using one voice at most, and it was usually 16-bit SLT, it was wasteful to have everyone downloading a 100MB distribution of which 85MB were uncommonly-used voices. All the other voices are still available for download in the OpenEarsExtras package at BitBucket, but it’s no longer necessary to download them in order to get started with OpenEars.
- Most developers found the steps for recompiling the framework with only the voices they used confusing, so most apps shipped with every voice enabled and their binaries were extremely large. Because this was difficult and counterintuitive, it also led to the impression that OpenEars required shipping a large binary, which isn’t the case. It is much more intuitive to add only the voices you are using than to have every voice added automatically and then have to find the instructions for removing them. It also means there is never a requirement to recompile the framework project now; all of the OpenEars options can be controlled by your app at runtime.
- Using a voice that you haven’t successfully added to your project will now result in standard Objective-C errors and warnings, whereas previously, using a voice that you had commented out of the framework caused a crash at runtime without any warning at build time.
- The OpenEars framework can now be compiled in vastly less time.
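For context on the unigram fix mentioned above: the language models that CMUCLMTKModel produces are ARPA-format text files in which each unigram line carries a log10 probability, the word, and a log10 backoff weight (the value used when a longer n-gram containing that word isn’t found). The bug could leave that third column too low for one of your words. The numbers below are invented purely to show the layout:

```
\1-grams:
-1.0792 <s>     -0.2924
-1.0792 </s>    -0.3010
-0.7782 HELLO   -0.2553
-0.7782 WORLD   -0.2553
```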
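On the dictionary-format point: cmu07a.dic is a plain-text pronunciation dictionary with one entry per line, the word followed by its phonemes, and alternate pronunciations marked with a parenthesized number. A replacement dictionary needs to use the same layout and only phonemes that actually exist in your acoustic model. An invented excerpt for illustration:

```
EARS        IY R Z
HELLO       HH AH L OW
HELLO(2)    HH EH L OW
OPEN        OW P AH N
```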
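And here is a minimal sketch of generating a language model from a text-file corpus with LanguageModelGenerator. The method name and error-handling convention are written from memory, and MyRecognitionSetup and MyCorpus.txt are just placeholders, so treat this as an outline and check LanguageModelGenerator.h and the sample app for the exact signatures:

```objc
#import <OpenEars/LanguageModelGenerator.h>

@interface MyRecognitionSetup : NSObject // placeholder class for the sketch
- (void) generateLanguageModel;
@end

@implementation MyRecognitionSetup

- (void) generateLanguageModel {
    LanguageModelGenerator *generator = [[LanguageModelGenerator alloc] init];

    // A plain-text corpus containing the words and phrases the app should recognize.
    NSString *corpusPath = [[NSBundle mainBundle] pathForResource:@"MyCorpus" ofType:@"txt"];

    // Assumed method name; verify against LanguageModelGenerator.h.
    NSError *error = [generator generateLanguageModelFromTextFile:corpusPath
                                                    withFilesNamed:@"MyLanguageModel"];

    if ([error code] == noErr) {
        // The generated language model and phonetic dictionary are written out
        // under the requested name for PocketsphinxController to use.
        NSLog(@"Language model generation succeeded.");
    } else {
        NSLog(@"Language model generation failed: %@", error);
    }
}

@end
```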
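On the empty-hypothesis change: previously it was common to guard against content-free hypotheses yourself in the OpenEarsEventsObserver delegate callback; with the new default that guard should no longer be necessary. Here is a sketch of the old-style guard, with the delegate method signature as I recall it (verify it against OpenEarsEventsObserver.h) and MyListener as a placeholder class:

```objc
#import <OpenEars/OpenEarsEventsObserver.h>

@interface MyListener : NSObject <OpenEarsEventsObserverDelegate>
@end

@implementation MyListener // other delegate methods omitted for brevity

- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis
                         recognitionScore:(NSString *)recognitionScore
                              utteranceID:(NSString *)utteranceID {
    // With the new default this guard is redundant, because content-free
    // hypotheses are no longer delivered unless you opt back in to receiving them.
    if ([hypothesis length] == 0) {
        return;
    }
    NSLog(@"Heard: %@ (score %@, utterance %@)", hypothesis, recognitionScore, utteranceID);
}

@end
```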
To return to the voice change: I hope it’s clear why I took the step of making an API change here in order to switch over to FliteVoice as a new class with packaged voices. OpenEars still ships with the SLT voice, and the documentation and tutorial page have explanations and examples of how to use the new voice packages (there is also a minimal sketch just below). Please feel free to bring any stumbling blocks to the forums as well; I know that API changes can be jarring, and this was one where providing backwards compatibility would not have been practical.
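For anyone who wants to see roughly what the new speech call looks like before opening the sample app, here is a minimal sketch using the bundled SLT voice package. The Slt class, the Slt.framework import, and FliteController’s say:withVoice: method are as I remember them from the current headers, and MyViewController is just a placeholder, so check the sample app if anything here doesn’t match:

```objc
#import <OpenEars/FliteController.h>
#import <Slt/Slt.h> // the SLT voice framework that ships with OpenEars

@interface MyViewController ()
@property (nonatomic, strong) FliteController *fliteController;
@property (nonatomic, strong) Slt *slt; // the packaged FliteVoice subclass for SLT
@end

@implementation MyViewController

- (void) sayHello {
    if (!self.fliteController) {
        self.fliteController = [[FliteController alloc] init];
    }
    if (!self.slt) {
        // Pitch, speed, and the other voice-specific settings now live inside the voice package.
        self.slt = [[Slt alloc] init];
    }
    // The voice object is created by your app and passed in explicitly,
    // instead of FliteController registering every voice internally.
    [self.fliteController say:@"Hello from OpenEars." withVoice:self.slt];
}

@end
```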