This repository is for the F-Droid version of the Android app 'Pinyin Web & EPUB and topolects'.  The following details are from https://ssb22.user.srcf.net/adjuster/pinyin.html



## Pinyin Web & EPUB and related apps

I maintained these Android applications under the pseudonym "Referenced
Expressions" between 2014 and 2024, and then a Google Play policy change
enforced the use of real names for individual developers going into 2025.

My reasons for using a pseudonym had been twofold. After having used material
from a certain religious website in developing and testing the app, I felt I
should link that website from the app's start page to acknowledge what I had
"tested on". But I didn't want to give anyone the impression that I was trying
to take credit for that site myself. I also had an ill relative who might have
become more ill if he knew I was linking to a religious site. For both of
these reasons I wanted to avoid putting my name on the app. In my case the
second reason is no longer present, and the first is rendered moot by the new
policy (I didn't _want_ to put my name there, I _had_ to). The number of that
site's readers using my app also dropped off when the site implemented more
browsable pinyin of its own, so the issue had become quantitatively smaller,
and I felt I could now "unmask" this project. I don't know if any other
developer had to delete their work rather than be unmasked.

### Google Play Store listings

Since I saw an elderly lady still using the app on an eight-year-old Galaxy S2
phone in 2019, I decided to minimise the size of my apps by keeping the
different topolects in different apps and by removing some of the more obscure
words from the dataset for the Google Play versions (I'm assuming for example
that users outside of China will not require so many mainland place names).
The following are available on Play Store:

  * [Pinyin Web & EPUB](https://play.google.com/store/apps/details?id=org.ucam.ssb22.pinyinwol) (generally good-quality pinyin as far as automatic pinyin goes; should recognise when a site is providing its own pinyin and defer to it; paragraph audio supported on some devices) 
  * [Cantonese Web & EPUB](https://play.google.com/store/apps/details?id=org.ucam.ssb22.cantonesewol) (generally good-quality Cantonese as far as automatic annotation goes; choice of Yale, Sidney Lau or Jyutping; isolated word audio via [Gradint](https://ssb22.user.srcf.net/gradint/) server + paragraph audio supported on some devices) 
  * [Teochew Web & EPUB](https://play.google.com/store/apps/details?id=org.ucam.ssb22.teochewb) (I'm told the Teochew annotation is OK although I wasn't completely sure of my data sources) 
  * [Wenzhou Web & EPUB](https://play.google.com/store/apps/details?id=org.ucam.ssb22.wenzhoub) (beware this one does not choose the correct reading for ambiguous characters in as many cases as the above apps do) 
  * [Fuqing Web & EPUB](https://play.google.com/store/apps/details?id=org.ucam.ssb22.fuqingb) (highly experimental annotator which I'm told is not very good and my test users have not yet been able to provide specific details for improvement) 
  * [Thai Romanising Browser](https://play.google.com/store/apps/details?id=org.ucam.ssb22.thaib) (this one did not work out so well, it's not maintained and I should probably remove it) 

### Side-loading links

If you are not able to use the Play Store, you may download the same APK files
from here, but you'll need to enable your phone's "Unknown sources" setting
and it won't update automatically.

  * [Pinyin Web & EPUB side-loading APK](https://ssb22.user.srcf.net/adjuster/PinyinWol.apk)
  * [Cantonese Web & EPUB side-loading APK](https://ssb22.user.srcf.net/adjuster/CantoneseWol.apk)
  * [Teochew Web & EPUB side-loading APK](https://ssb22.user.srcf.net/adjuster/TeochewB.apk)
  * [Wenzhou Web & EPUB side-loading APK](https://ssb22.user.srcf.net/adjuster/WenzhouB.apk)
  * [Fuqing Web & EPUB side-loading APK](https://ssb22.user.srcf.net/adjuster/FuqingB.apk) 

### Huawei AppGallery version

In 2019 US legislation forced Google to ban Huawei from using the Play Store
and consequently there was a period when Huawei sold Android phones in the UK
without the Play Store, some of which ended up with low-end customers of H3G
(Three), at least one of whom wanted to install my app without side-loading,
so as a service to these, and to other Huawei users, I applied to put my app
on the Huawei AppGallery as well.

This version merges multiple annotators, since I assumed app size was not an
issue on the 2019 phones and I really didn't want to go through the process of
setting up multiple different apps again. It also includes the mainland place
names I withheld from the Google Play version for app size reasons.

I wanted to give it a different variation of the app name in case anyone had
both installed on the same device, but the checkers didn't like me calling it
"Huawei edition" or "AppGallery edition" and they weren't even sure whether
"merged edition" might count as a promotional suffix, so in the end we went for
"Pinyin Web & EPUB and dialects" (I wanted to say "topolects" but that would
have taken me over the 30-character limit).

This version originally omitted the link to the religious site, which is
blocked in China. Nevertheless, by the end of 2022 regulations had been
tightened and Huawei was obliged to say _all_ apps require an ICP license to
be allowed in China, whether they link to religious sites or not. ICP
licenses are not available to people without Chinese citizenship (unless you
want to pay the price of a house for some company to do it for you, or burden
a Chinese friend with the risk of legally vouching for your code), so Huawei
made my app available in every country _except_ China. I therefore restored
the test link for the convenience of the UK users who wanted that site, but a
technical problem prevented those releases from being submitted for approval
until 2025. A Huawei tester then marked the app as "having religious content",
which resulted in its removal from a bunch of Islamic countries (Afghanistan,
Bahrain, Iraq, Kuwait, Oman, Pakistan, Palestine, Qatar, Saudi Arabia, Turkey
and the United Arab Emirates) as well as China. Three of those
countries---Pakistan, Palestine and Turkey---were in fact allowing that
religion to legally operate, so perhaps Huawei is muddled about which
countries say what.

Meanwhile in 2024 Huawei rated all third-party apps containing any
unrestricted Web browser as "age 18+" globally, and disallowed this age rating
in China. (Google continued to rate browsers as suitable for all ages on the
grounds that the age rating of individual websites should be assessed
separately.)

AppGallery will pretend the app doesn't exist if you try to search for it from
a country where it's not listed, and I think people there who got it before
are not receiving updates (and Huawei regulations didn't allow my own update
checks, so, unlike the Google Play version, this version doesn't prompt you to
update when it's more than a year old).

Here is the [AppGallery link for all other
countries](https://appgallery1.huawei.com/#/app/C102545265) (labelled "18+" as
discussed above---although I admit it seems bizarre that I can say "I made an
_X-rated_ app to help a retired lady learn Chinese").

### F-Droid version

This is currently a slightly modified build of the AppGallery version with a
different package name.

[F-Droid link](https://f-droid.org/en/packages/org.ucam.ssb22.pinyinfdroid/)

### Browser extensions

These are switchable between Pinyin, Cantonese, Teochew and Wenzhou.

  * [Mozilla Firefox "Pinyin Web" add-on](https://addons.mozilla.org/addon/pinyin-web/) (desktop or mobile) 
  * [Google Chrome "Pinyin Web" extension](https://chrome.google.com/webstore/detail/gdpihnamgpclgpigocgfmhckbkhelken) (desktop)   
(yes, this was the one I [deliberately broke at the Oxford China
Forum](https://youtu.be/aLMpbvh5udQ?t=11199) to show "AI" has limits)

If you're stuck on an old Windows PC with no Internet:

  1. Download [Pinyin-Clipboard](https://ssb22.user.srcf.net/adjuster/Pinyin-Clipboard.zip) or [Cantonese-Clipboard](https://ssb22.user.srcf.net/adjuster/Cantonese-Clipboard.zip), transfer to the PC via removable media or whatever and unpack
  2. Put the text you want on the clipboard and run. No "installation" necessary, just run the EXEs.
  3. On Windows 7+, click the **small "More options" link** to reveal the "Run anyway" option. You should have to do this only once. Sorry I haven't paid Microsoft to be a "known publisher" to make this warning go away.
  4. Please update manually from time to time as there is no auto update with these.

### iOS

I have not been able to port my main app to iOS or iPadOS because it is
fundamentally an extended Web browser and Apple has strict policies about
third-party Web browsers so I'm not confident I'll pass their test and I don't
think it's worth paying to try. (I was already rejected by Amazon on the
grounds that they didn't want any Web browser in their store other than their
Silk browser, so I really didn't think it was worth the hassle and expense of
trying for iOS.)

However, other iOS developers have managed to incorporate my code and/or data
into iOS apps that are not Web browsers. But **some of them had to charge
money** to cover Apple's ongoing developer fees and the costs of keeping their
Mac hardware sufficiently current for development (it's far more expensive to
be an Apple developer than to be a Google Play developer, so you won't find so
many "free with no ads" hobbyist public service apps on Apple). Anyway their
apps include:

  * Matthew Delmarter's [Equipd Bible](https://apps.apple.com/us/app/equipd-bible/id998059934) and [ServicePlanner](https://apps.apple.com/us/app/serviceplanner/id1268476525) apps (both paid) use a version of my annotator code and data for Mandarin, Cantonese and Japanese, 
  * Jon Hargett's [3lines.org app](https://apps.apple.com/us/app/3lines-org/id1352194221) (gratis) or [Chinese Annotator](https://apps.apple.com/us/app/download/id1195474397) (paid) uses some of my data for Mandarin and Cantonese, although it's not always up-to-date (this can sometimes be mitigated by manually refreshing the dictionary), and it's not able to use my _code_ but I was able to generate weightings to kludge its own code into giving a nearly-right result, 
  * Michael Buen's "Chinese Words Separator" extension (paid on Safari, gratis on Chrome) now at least includes words from CedPane, and Pleco with paid Flashcards component can also import CedPane and use it in its Reader, which is at least something even if it can't run the rest of my code.

### Online version

I'm no longer running a full [domain-rewriting
proxy](https://ssb22.user.srcf.net/adjuster/) with this functionality (at
least not on my public server) as in 2013 it was mistaken for a subversion
tool in a Russian religious extremism trial. I still have a [CGI that lets you
paste in your own
text](https://ssb22.user.srcf.net/annotate.cgi) and it
can set up bookmarklets to annotate pages if you're adventurous. Since this
requires sending your text to my server, I suggest you use it only as a last
resort.

### Source material

My _code_ to compile annotators from examples is [Annotator
Generator](https://ssb22.user.srcf.net/adjuster/annogen.html) which I've
liberally licensed and also written up in an Overload paper, but the _corpus_
I use is not in a publishable state. It is made up of:

  * the 1990/91 PH Corpus, with some manual corrections by me (which I'm [not sure I can republish](https://ssb22.user.srcf.net/law/ph-corpus.html)), 
  * my [CedPane project](https://ssb22.user.srcf.net/cedpane/) (both the main file and the auxiliary gloss file), plus some of the unpublished entries that I'm insufficiently sure qualify for CedPane but think they're still acceptable for my own app, 
  * some extra gloss data from a friend's text-annotation project I had permission to use in my own apps, 
  * fallback single-character readings from the Unihan database with some manual corrections by me to get the most preferred reading, 
  * a bunch of extra example sentences I've put in as and when I've found it needs more help to get a particular case right, 
  * and quite a bit of Chinese-with-Pinyin text from [JW publications](https://www.jw.org/en/library/) with some normalisations and a few dozen (rare) typos corrected by myself (I used a UK legal exemption for responsible private download for non-commercial text mining but not to republish).

I'm frequently asked why I don't use CC-CEDICT, at least for definitions. I
avoid it because their "strong copyleft" license is not compatible with my use
of non-republishable sources---I'm not in a position to redistribute the
combined data under that license as apparently required. I did try asking
CC-CEDICT for an exception, but they asked to see my app and then stopped
communicating. Maybe they didn't like the mention of JW, as CEDICT was founded
by a Mormon and at least some Mormons see the JW religion as an enemy. They've
not merged CedPane into CEDICT either---I did _try_ to make sure CedPane
contains nothing JW-specific but I can't _guarantee_ anything so perhaps
that's a risk CEDICT would rather not take.

I don't mean to offend anyone but LDS websites were not able to provide me
with good-quality Chinese-with-Pinyin publications in a format I could parse
with correct word grouping, whereas JW.org could. It might make Mormons feel
better to know that I _did_ use an LDS Quad for **Japanese** data when making
the Japanese annotator for Matthew Delmarter's apps; I just couldn't find
decent _Chinese_ data from that source.

(And LDS are not the only ones frightened of JW data: I've also been hastily
cursed and shunned by Evangelicals for mentioning I used it in language
learning. I'm not sure which misconception causes them to overestimate the
threat level of a simple group to the point of such impetuousness: if data
were _that_ powerful, wouldn't growth rates be higher? _I_ don't like rock
music and it's not going on my playlist but I wouldn't write someone off for
saying it helped them, although sadly Dennett reported a surgeon cut out the
part of his patient's brain that had been linked to _Out ta Get Me_---perhaps
even some of the most qualified people need more discernment between curating
their own input vs understanding others. I suspect most anti-JW sentiment is
actually a vestige of wartime hysteria when they were vilified
for refusing to fight, subsequently perpetuated by other groups overly afraid
of attrition, and rejustified using whatever additional unpopular opinions JW
happened to publish, but I'm not equipped to make a definitive diagnosis.)


### Copyright and Trademarks

All material (C) Silas S. Brown unless otherwise stated.  
Android is a trademark of Google LLC.  
Apple is a trademark of Apple Inc.  
Firefox is a registered trademark of The Mozilla Foundation.  
Google is a trademark of Google LLC.  
Google Play is a trademark of Google LLC.  
H3G is a trademark of Hutchison Whampoa Enterprises Limited.  
Huawei is a trademark of Huawei Technologies Co., Ltd registered in China and
other countries.  
JW.org is a trademark of Watch Tower Bible and Tract Society of Pennsylvania.  
LDS is possibly registered as a trademark in some countries by Intellectual
Reserve Inc (owned by the Corporation of the President of The Church of Jesus
Christ of Latter-day Saints) but I was unable to find which countries.  
Mac is a trademark of Apple Inc.  
Microsoft is a registered trademark of Microsoft Corp.  
Mormon is registered as a trademark in Europe held by Intellectual Reserve
Inc. which is owned by the Corporation of the President of The Church of Jesus
Christ of Latter-day Saints.  
Mozilla is a registered trademark of The Mozilla Foundation.  
Safari is a registered trademark of Apple Inc.  
Windows is a registered trademark of Microsoft Corp.  
Any other [trademarks](https://ssb22.user.srcf.net/trademarks.html) I
mentioned without realising are trademarks of their respective holders.

  *[APK]: Android Package Kit
  *[ICP]: Internet Content Provider

