
The History of LanguageConverter on Wikipedia-zh


At first, Wikipedia-zh wrote each article on two pages: one in Simplified Han script (a writing system; zh-Hans) and another in Traditional Han script (zh-Hant), with a link to the other version at the top of each page.

However, as the number of articles grew, the community acknowledged that it would be hard to maintain the variants on separate pages, since changes had to be synced back and forth between them. Templates and help pages also had to be "doubled", and manual conversion took a lot of time, especially for long content, source code, and/or syntax.

So, something like there being two writing systems?

But with some common parts.


I have always wondered why there is that language region switcher. Does it have something to do with mainland China?

It has to do with Wikipedia-zh.

You mean variant menu + LanguageConverter, right?


Moving on: then the proposal for "making a content converter in MediaWiki" came up.

"With a content converter, we can easily display both writing systems in the same place."

Many kinds of solutions were brought into the discussion.

The first implementation was more like a plugin or extension. It used the simplest approach: replace.

It also provided only two variants: Simplified script and Traditional script.

Then, the community discovered a big issue:

Character-by-character conversion doesn't meet the need to read with "native terms".

This left another barrier to reading the article even after conversion.

Thus, term conversion functionality was added to the converter.

Example: lift (BrE) <=> elevator (AmE); web log <=> blog.
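The difference between the two approaches can be sketched roughly as follows. This is only an illustration in Python, not the MediaWiki implementation: the tables `CHAR_TABLE` and `TERM_TABLE` and both function names are hypothetical, and the tables hold just enough entries to show one well-known case, 软件 (Mainland China) versus 軟體 (Taiwan) for "software". The term pass here assumes a simple longest-match-first lookup.

```python
# Hypothetical, tiny tables for illustration only.
CHAR_TABLE = {"软": "軟", "电": "電"}   # Simplified -> Traditional, per character
TERM_TABLE = {"软件": "軟體"}            # Mainland term -> Taiwan term


def convert_chars(text: str) -> str:
    """Plain replace: map every character on its own."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in text)


def convert_terms(text: str) -> str:
    """Try the longest known term at each position, else fall back to characters."""
    max_len = max(len(term) for term in TERM_TABLE)
    out = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in TERM_TABLE:
                out.append(TERM_TABLE[chunk])
                i += length
                break
        else:
            # No term starts here: convert the single character instead.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert_chars("软件"))  # script is converted, but the term stays the Mainland one
print(convert_terms("软件"))  # the term itself is swapped for the Taiwan one
```

With only the character table, the reader gets Traditional characters but still a "foreign" term; the term table is what lets the text read as native.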

With that, we had the design and the functional requirements of the feature.

As it is a key feature, LanguageConverter was then implemented in MediaWiki core. It was first bound to the Language PHP class.

As local term functionality is part of the feature, zh-Hans-CN, zh-Hans-SG, zh-Hant-TW, and zh-Hant-HK became valid variants in the converter.

But their scope was not the same as it is nowadays.

zh-Hans-SG was a combination of Singapore Simplified Chinese and Malaysia Simplified Chinese.

zh-Hant-HK was a combination of Hong Kong Traditional Chinese and Macau Traditional Chinese.

I don't remember the order, but:

These 2 variants were then divided into 4: zh-Hans-MY and zh-Hant-MO were split out and added.

So now, together with "unconverted", "Simplified Han script", and "Traditional Han script", we have 9 variants in zh:

zh: unconverted
zh-hans / zh-Hans: Simplified Han script
zh-hant / zh-Hant: Traditional Han script
zh-cn / zh-Hans-CN: Mainland China Simplified Chinese
zh-sg / zh-Hans-SG: Singapore Simplified Chinese
zh-my / zh-Hans-MY: Malaysia Simplified Chinese
zh-tw / zh-Hant-TW: Taiwan Traditional Chinese
zh-hk / zh-Hant-HK: Hong Kong Traditional Chinese
zh-mo / zh-Hant-MO: Macau Traditional Chinese
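As a small sketch, the list above can be written down as a lookup table from the short variant code to the longer tag. The names `VARIANTS` and `script_of` are made up for this example and are not MediaWiki identifiers; the mapping itself is copied from the list.

```python
# Short variant code -> longer tag, as listed above.
# None marks "zh", which is the unconverted variant.
VARIANTS = {
    "zh": None,
    "zh-hans": "zh-Hans",
    "zh-hant": "zh-Hant",
    "zh-cn": "zh-Hans-CN",
    "zh-sg": "zh-Hans-SG",
    "zh-my": "zh-Hans-MY",
    "zh-tw": "zh-Hant-TW",
    "zh-hk": "zh-Hant-HK",
    "zh-mo": "zh-Hant-MO",
}


def script_of(code: str):
    """Return the script part ("Hans" or "Hant") of a variant, or None for "zh"."""
    tag = VARIANTS[code]
    if tag is None:
        return None
    return tag.split("-")[1]


print(len(VARIANTS))       # number of variants in zh
print(script_of("zh-my"))  # script used by the Malaysia variant
```

Keeping the script in the longer tag makes it easy to see which regional variants share a writing system.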

And guess what happened next?

LanguageConverter was split out into a separate, independent class in core MediaWiki.

Those converter files have been moved twice.

Also, the new VisualEditor still doesn't support LanguageConverter functionality, which has become a blocker to making the 2017 editor the default editor on WMF zh sites.

And then it was moved to a subdirectory of /language again.

So, what will come next?

[ To be continued ]
