Major overhaul of single-syllable audio

As some of you might have already noticed, we have uploaded newly recorded audio for all single-syllable words in the new apps (that’s the Android app and Skritter: Write Chinese on iOS). This includes all individual character readings and single-syllable words. This audio update will also be coming to the website in the future once we update to the v3 endpoint for vocabs.

Since the recordings for single characters are used very often, we thought it was worthwhile to go through all of them and make sure both pronunciation and audio quality are good throughout. It’s now also consistent, with all audio being recorded by the same person, Xiaolu, who has recorded for Skritter before and ought to be familiar already.

If you’re using the new apps and notice any single-syllable audio that is wrong in some way or where the quality isn’t top-notch, please let us know so we can fix it! I have manually listened to every single recording, but since there are about 1600 of them, errors could have slipped through.

For the curious of you out there, there are roughly 400 syllables in Mandarin, which leads to 1600 combinations if each syllable can be pronounced in all four tones. Naturally, that is not the case; there are plenty of syllables that can only be pronounced with three, two or even one single tone. Rather than trying to keep track of these, we recorded all of them. Thus, this really should cover everything.
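For anyone who wants to see the arithmetic spelled out, here is a minimal Python sketch of how such a recording list could be generated. The syllable list below is a tiny made-up sample, not our actual inventory, and the function name is just for illustration.

```python
from itertools import product

# Hypothetical, heavily truncated sample; the real inventory has roughly 400 syllables.
SAMPLE_SYLLABLES = ["ma", "shi", "chang", "zhang"]
TONES = [1, 2, 3, 4]


def recording_list(syllables, tones=TONES):
    """Return every syllable/tone combination, whether or not it occurs in real words."""
    return [f"{syllable}{tone}" for syllable, tone in product(syllables, tones)]


items = recording_list(SAMPLE_SYLLABLES)
print(len(items))  # 4 sample syllables x 4 tones
print(400 * 4)     # rough size of the full list
```

Recording every combination, including ones that never occur, trades a bit of extra recording time for not having to maintain a list of which syllables exist in which tones.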

For the really curious, we also grabbed a list of all single-syllable items listed with a neutral tone in Skritter and recorded those. This is a bit tricky, because neutral tones usually don’t exist in a vacuum and can’t really be produced naturally without context, but a shorter, lighter pronunciation is used instead, somewhere slightly above the middle of the tone range.

Enjoy the new audio!


Hi Olle,

I don’t want to ruin your party, but purely out of curiosity, why didn’t Skritter buy some male and female voice audio files? In the meantime, Skritter could have built functionality that made it possible for us to choose between listening to a male or a female voice. :thinking:

As is often the case with good stuff (and I agree that providing more audio is good, of course), it’s a matter of priority. We offer manually recorded audio for tens of thousands of words, but there are actually lots of words that don’t have any audio or where the audio was recorded a very long time ago with inadequate quality. So we naturally prioritise recording new audio or replacing bad audio over providing alternatives for words where we already have audio of high quality!

Yes Olle, I totally agree with you, it’s always a matter of priority. But no one dares to tell the management that recording 1600 single-syllable items is a waste of time if you can also buy them at the same (or even better) quality.

Aren’t we talking about different things here? How the audio is acquired is probably a separate discussion (and in my experience, having done this more than once, is not as easy as it sounds) from whether it’s worthwhile to add alternatives to already existing audio. The added utility for users of having high-quality audio for common words is orders of magnitude higher than having alternative, male audio. That doesn’t mean that it wouldn’t be sweet to have additional audio, but it means it probably won’t happen while there’s still a backlog of vocabulary that has no audio or low-quality audio. But maybe you meant something else?

Glad to hear.

By the way, when is the new app going to include the sound for individual characters within words, as exists in the legacy app?

The new app is oddly silent!

This is a much missed learning opportunity both for extra aural practice (repetition, repetition, repetition) and especially for practice distinguishing between how characters are intoned on their own as opposed to how they are intoned within a compound.

Please make Skritter natter away at me again. Going silent is going backwards.

Our main reasons for not playing single-syllable audio within multi-syllable words were that:

  1. It would be jarring to hear several different speakers say different parts of the word
  2. Individual syllable audio does not necessarily match word-level audio (sandhi)

Now that we have updated all the single-syllable audio, the first is no longer an issue, although the second one is. This is maybe not a problem for non-beginners who already know about tone changes, but it’s pedagogically awful to teach people to say e.g. 老師 with a full third tone on the first syllable.

We discussed this in a meeting yesterday and concluded that now that the first reason mentioned above is no longer an issue, giving users the option to toggle single syllable audio within multi-syllable words is a worthwhile option to have. However, because of the second reason, we decided that it should not be the default behaviour.

As for when this will be implemented in the app, 3.8 is a reasonable estimate. We have other, more urgent updates and fixes that have higher priority, but we will get around to adding what you want into the app. Hope you can endure the silence until then! :slight_smile:


That’s wonderful news, thank you! An option is a great idea.

Olle, I’ve been noticing more often recently some Taiwanese-variant pronunciations in the audio. I’m assuming from the spelling of her name that Xiaolu is not from Taiwan, so I suppose I am hearing some older audio and it is just a coincidence that I am noticing it now.

This is somewhat complicated to address, but I’ll do my best.

I assume that by “Taiwanese-variant pronunciations”, you mean cases where the standard pronunciation differs, such as 研究 being pronounced yánjiū in PRC standard and yánjiù according to Taiwanese standard. The pronunciation of these words has not been updated or changed in any systematic way ever, as far as I know. Also, they (almost) only affect polysyllabic words rather than single characters, which were the focus of this overhaul. As Skritter only supports one reading for each (polysyllabic) word, that reading should be the PRC standard. If you find cases where this is not true, please let us know!

But if you mean that the actual recordings don’t match the pronunciation shown, that’s another issue entirely. Xiaolu is from Beijing and has passed the relevant pronunciation exams required of teachers in China. I have also manually listened to all the uploaded audio. That doesn’t mean that there are no errors, but they ought to be very rare! If you find any of these, please let us know as well. :slight_smile:

There is another possibility though, but that ought to have gone away when this update was rolled out fully (i.e. around when this thread was started). The way we did it was to give Xiaolu audio the highest priority, which for some single characters with multiple pronunciations meant that her audio was being played first, even if it was not the pronunciation listed first. So, for example, if 长 is listed with both cháng and zhǎng, it would play zhǎng first if we had Xiaolu audio for that, but not cháng. Is this maybe what you are referring to? That problem should have gone away completely once all audio was uploaded, because now we have Xiaolu audio for every conceivable syllable. If you’re still seeing this, there’s something wrong.
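To make that priority rule concrete, here is a rough Python sketch of the selection logic as described above. This is not our actual code; the function name, the speaker labels and the data layout are all assumptions made up for illustration.

```python
def pick_audio(readings, recordings, preferred_speaker="xiaolu"):
    """Pick which reading to play for a character.

    readings: pronunciations in the order they are listed for the character.
    recordings: dict mapping a reading to a list of speakers with audio for it.
    Returns a (reading, speaker) tuple, or None if no audio exists.
    """
    # The preferred speaker wins, even if her reading is not listed first.
    for reading in readings:
        if preferred_speaker in recordings.get(reading, []):
            return reading, preferred_speaker
    # Otherwise fall back to the first listed reading that has any audio.
    for reading in readings:
        speakers = recordings.get(reading, [])
        if speakers:
            return reading, speakers[0]
    return None


readings = ["cháng", "zhǎng"]

# During a partial rollout (hypothetical data): only zhǎng has the new audio.
partial = {"cháng": ["legacy"], "zhǎng": ["xiaolu"]}
print(pick_audio(readings, partial))

# After the full rollout: every reading has the new audio, so listed order wins.
full = {"cháng": ["legacy", "xiaolu"], "zhǎng": ["xiaolu"]}
print(pick_audio(readings, full))
```

With the partial data, the second-listed reading is played first; once every reading has a recording by the preferred speaker, the first-listed reading wins again, which is why the problem should only have been temporary.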

Another possibility is that you’re using legacy iOS, in which case none of the audio updates (neither this nor earlier ones) will have any effect. But I assume that’s not the case since you’re discussing the update; I’m just mentioning it here in case someone else strolls by later and wonders where all the awesome audio is!

Your assumption is correct; I was referring to words like 研究 and 头发 where the tones differ between the mainland standard and the Taiwan standard.

Separately, I have noticed a few mistakes in the sample sentences recorded by Fiona. Sometimes a word she says is mispronounced, or else it doesn’t correspond to the text. Other times I think it is an audio issue where the beginning of the sentence is cut off. I am mostly using the most current web version of Skritter when I am studying the HSK lists for which she recorded the sentences, and I don’t see any place for the user to report an error.

Okay, I see. There’s no reason to believe that these differences in polysyllabic words should have become more common recently, so my guess is that it’s just coincidence or some kind of noticing effect. Ideally, we should restructure our data to allow multiple pronunciations for polysyllabic words, but that’s not at the top of the priority list at the moment.

However, we should always follow the standards we have set. If we don’t, we’d appreciate it if you (or anyone else) contacted us. A “report error” button is on its way, but in the meantime, just email and specify what it says now and what it’s supposed to say. That goes for sentences as well! We do our best to quality control, but we have to rely on user feedback to catch errors!

The male voice saying 披萨 just made me laugh, he’s using autotune :joy:

Yes, not very good. I removed the audio since we already had a second (and better) recording for that word. Note that the update I’m talking about in this post only covers single-syllable audio. We’re working on overhauling very common (HSK1-4) word-level audio too, but we’re not done yet.

But… But… But… But… Okay, no laughing only studying. Got it. :woozy_face:

Did it reach “so bad it’s good” territory?!?

Depends on what you fancy, I guess, typical “husky robot”. :slight_smile:

Sounded like T-Pain was talking Chinese in his bathroom :smile: