誤 font change/inconsistency

powelliptic · March 9, 2020, 4:19pm

Today Skritter stumped me with a character I had never seen before:

Somehow I’ve made it this far without ever having seen the non shu-zhe-zhe 誤. It’s not a bad thing to be confronted with – I’d like to be able to recognize common variants – but it’s a sudden change and (judging by moedict) not canonical. It also makes Skritter internally inconsistent:

Therebackagain · March 11, 2020, 4:22pm

Hmmm. When I go into any of my many Pleco dictionaries I cannot find the 言 form used with 吳 anywhere.

The traditional variants given in Pleco for the simplified 误 are 悞 (the traditional 吳 with radical 忄), or else 吴 with radical 言.

So if Skritter is giving simplified 误‘s traditional form as 吳 with the radical 言, something would seem wrong, according to Pleco.

Except that, my phone’s keypad writes the traditional form of simplified 误 as 誤！ Yet this form is not in any of Pleco’s dictionaries.

Pleco’s PLC and CC dictionaries give only these alternate traditional forms: 吳 with 忄, or 吴 with 言.

It would appear Skritter’s form agrees with our keypads.

Anyone have insight into this?

powelliptic · March 11, 2020, 5:36pm

I noticed the same thing with the plain text font in Pleco, though I hadn’t remembered seeing it before. Pleco’s stroke page is quite explicit, though:

I’m pretty sure it’s a font question – i.e. it is considered a stylistic change, not a different character – so I don’t think it matters which dictionary you’re looking at. One theory I have is that both Skritter and Pleco are using device-provided fonts, and they are what changed. (Another possibility is that I just never noticed Pleco before and Skritter switched which font it uses.)

moedict, mdbg, yellowbridge, and Skritter all graphically depict the SZZ form, but have inconsistent results (except for moedict) when they delegate rendering to the browser. yellowbridge, for example, shows the SZZ form on one page:

Screen Shot 2020-03-11 at 09.54.53

and the H form on another:

Screen Shot 2020-03-11 at 10.12.06

For a dictionary, I think that is rather sloppy.

Apomixis · March 11, 2020, 6:02pm

If you press the button to switch the Pleco stroke order font (the lower left corner of the stroke diagram), the font switches and you see two different stroke diagrams showcasing the difference you’re highlighting… At least, it shows the two different forms on my Pleco.

So, it does appear to be font-related.

SkritterOlle · March 12, 2020, 1:33pm

This is complicated and this answer will be long, but I’ll do my best to sort it out. For a more thorough discussion, please refer to this article I wrote a few years back.

Tl;dr: Yes, it’s at least partially a font issue, but it’s much more complicated than that. Sorting out and displaying all characters correctly is something we are striving for, but we are currently prioritising other areas of the app.

To start with, how characters are rendered in text (so the prompt in Skritter, for example, not the canvas) is determined by what font is being used. We currently use the same font for simplified and traditional, which works most of the time, but not always. If a certain character does not exist in that font, it defaults to the system font.
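The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not Skritter's actual code: the font names `AppFont` and `SystemFont`, the coverage set, and both function names are hypothetical.

```python
def pick_font(char: str, primary_coverage: set,
              primary: str = "AppFont", fallback: str = "SystemFont") -> str:
    """Return the (hypothetical) font name that would render `char`.

    `primary_coverage` stands in for the set of characters the app font
    contains; anything outside it falls back to the system font.
    """
    return primary if char in primary_coverage else fallback


def fonts_for_text(text: str, primary_coverage: set) -> list:
    """Pair every character of `text` with the font that would draw it."""
    return [(ch, pick_font(ch, primary_coverage)) for ch in text]


if __name__ == "__main__":
    # Assumption for the example: the app font covers 误 (U+8BEF) but not 誤 (U+8AA4).
    coverage = {"\u8bef"}
    for ch, font in fonts_for_text("\u8bef\u8aa4", coverage):
        print(f"U+{ord(ch):04X} -> {font}")
```

A per-character fallback like this is one way a single line of text can end up mixing glyph styles from two different fonts.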

Some character variants have different code points, meaning that they are, from the font’s point of view, different items. So it’s theoretically possible to have a font that shows 吳 and 吴 differently (if they show up as being the same, the font in your browser does not distinguish between them). Here’s a picture so you know what I’m talking about in case they look the same for you:
Screenshot from 2020-03-12 13-58-49
So, in a sense, this is similar to separating simplified and traditional characters, which is obviously not merely a font issue. It requires both that the right variant is used and that both variants can be displayed properly in that font.
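If you want to check the code point claim yourself, here is a minimal Python sketch using only the standard library. The characters are written as escapes so the result does not depend on which font your browser uses; the helper name `describe` is just for this example.

```python
import unicodedata


def describe(char: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


# The two variants discussed above, written as escapes.
WU_TAIWAN = "\u5433"    # 吳, the form preferred in e.g. Taiwan
WU_MAINLAND = "\u5434"  # 吴, the form preferred on the mainland

if __name__ == "__main__":
    for ch in (WU_TAIWAN, WU_MAINLAND):
        print(ch, describe(ch))
    # Unicode normalization leaves each variant unchanged, so it does not
    # fold one into the other; they stay separate characters.
    print(unicodedata.normalize("NFKC", WU_TAIWAN) == WU_MAINLAND)
```

The two characters report different code points, which is why a font is free to draw them differently, or identically.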

The problem is that these characters are not listed in any table of simplified characters. They are the same from an everyday, communication perspective; they just happen to be treated as separate characters on computers.
However, there’s a strong overlap between simplified and traditional characters and these variants. The first variant, 吳, is preferred in e.g. Taiwan and the second, 吴, is preferred on the mainland. But you can’t say one is the simplification of the other.

In your case, 誤 looks extra weird because it is obviously a traditional character that is being rendered with a font that can’t handle that particular component according to the relevant standard (Taiwan MoE, in our case).

So, what you’re seeing is a font that can’t handle all of this properly. To my knowledge, there are no such fonts that can handle both correctly (but if anyone knows of one, I’d be really, really interested in hearing about it).

One option to fix this specific instance would be to use a separate font when displaying what might be traditional characters, but that comes with its own problems and introduces its own inconsistencies, which I’ll explain in a bit. We do care about traditional (a majority of the team uses traditional characters), so this is something we want to fix. However, we also know that a very large majority of users aren’t studying traditional, so we’re prioritising other things that improve the experience for all users before fixing this.

To make things even more complicated, there are characters that share the same code point, meaning that fonts (and by extension, computers) cannot, even in theory, differentiate between them.

For example, 骨 is always rendered with the small box inside the top box on the same side in any given font. In the Taiwan standard, it should be on the right, but in the PRC standard, it’s on the left. Since these are identical from the computer’s point of view, there is no way to render them differently, unless two separate fonts are used (one which only follows Taiwan MoE, the other which follows PRC standards) and switched between somehow based on context.
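One generic way to supply that context in HTML is the `lang` attribute, which browsers can take into account when choosing a font. The sketch below is an illustration only, not how Skritter does it: the region keys, the choice of language tags, and the function name `wrap_for_region` are assumptions, and whether the glyph actually changes depends on the fonts installed on the device.

```python
from html import escape

# Assumption: these BCP 47 language tags are used to signal which
# regional standard the text should follow.
LANG_TAGS = {
    "taiwan": "zh-Hant-TW",
    "prc": "zh-Hans-CN",
}


def wrap_for_region(text: str, region: str) -> str:
    """Wrap `text` in a span whose lang attribute names the target region."""
    lang = LANG_TAGS[region]
    return f'<span lang="{lang}">{escape(text)}</span>'


if __name__ == "__main__":
    bone = "\u9aa8"  # 骨: one code point, two regional glyph shapes
    print(wrap_for_region(bone, "taiwan"))
    print(wrap_for_region(bone, "prc"))
```

The character data is identical in both spans; only the surrounding markup differs, which is exactly the kind of out-of-band context a single code point cannot carry by itself.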

And this is where the writing canvas comes in. In theory, we can do anything we want there, because it’s not a rendered font, it’s a manually constructed collection of strokes. However, since we (and every other learning app I have ever seen) treat 骨 as being the same in simplified and traditional Chinese, it won’t work, because the box is supposed to be on the right in one standard and on the left in another.

This makes things fairly complicated. For example, if you know 丰 (S), we don’t assume that you know 豐 (T), but it is safe to say that if you know 骨 (with the box on the right) you also know 骨 (with the box on the left). This is not a traditional/simplified split, it’s another split that largely (but not entirely if you include Hong Kong, for example) mirrors the traditional/simplified split. And of course, if you know 吳 you also know 吴.

None of this is insurmountable. We can and want to provide better support for this in Skritter, but it won’t happen right now because other things have higher priority. I wrote this answer to show that this is actually a very complicated question and it’s not just a matter of changing fonts or fixing an error specific to Skritter. These things can be fixed, but they require substantial changes to character-related data and how it’s handled and displayed! We’ve been building a system to allow us to bypass tricky font and Unicode limitations like this, so this will be fixed eventually.

Apomixis · March 12, 2020, 11:45pm

Thanks for the post on the standard references. I found a digital scan of the stroke order dictionary for Simplified here.

It has always bugged me that Skritter wants the last 3 strokes of 懂 written differently than Pleco demonstrates in its stroke-order diagram.

However, the linked reference agrees with Pleco. Does that mean Skritter is not following its standard?

Gwilym · March 13, 2020, 7:04am

These are the stroke order variations we have. The teaching stroke order matches that of Pleco and 现代汉语通用字笔顺规范, and we also allow 3 other variations, including the MOE Taiwanese variation (minus the separated 艸) and alternate ways of writing 忄. Let me know if there are any other ways you want to write it based on relevant sources.

Direct Link to image

Apomixis · March 14, 2020, 12:42am

But the problem is that when you are going through the regular writing prompt, the stroke hints DO NOT give you the order correctly. Here are the pics, where I hold my finger down on the screen to have Skritter show me the next suggested stroke during the regular writing prompt (not the Learning/Teaching writing prompt).

So, it appears that the teaching writing prompt hints and the regular writing prompt hints are mechanically separate on the Skritter backend, and also incorrectly different.

It never occurred to me that the teaching order would be different from the writing hint order. I can’t really get back to the teaching prompt (I don’t see how to “Learn” a word that I’ve already learned). I can only see the regular writing prompt… And I had assumed they were both the same.

Looking back again at your image, it appears that the second block of script is what the writing hints are using, instead of the first block for both teaching and writing?

My opinion is that the hints given during the writing prompts should be the same as the order shown during the teaching/learning prompts. Do you agree? Or are they purposefully different?

SkritterJake · March 14, 2020, 6:34pm

They’re not purposefully different; it’s just a bug. We’ll fix it up!

No need to “learn” a character again. If you wanna see our official stroke order for any character just tap the gesture button on the far right below the canvas.

Thanks for the clarification on where you’re seeing the issue. That was helpful!

SkritterJake · March 19, 2020, 7:30pm

@Apomixis next suggested stroke issue should be fixed in the latest beta build. We’re rolling it out now!
