S¯ã: A Syllabically Fixed-Length Romanization of Mandarin Chinese

The remarkable stringency of Mandarin Chinese phonotactics makes for a set of possible syllables restricted and systematic enough that one could reasonably endeavor to devise a not-too-complicated romanization system where each possible syllable has the same number of characters. Here, I provide such a system where this number of characters is 3.

Here’s the plan: we take advantage of how Mandarin phonotactics prohibit any consonant clustering whatsoever (unless one decides to interpret certain affricates as actually being clusters, but why would one want to do that?) and how only two consonants (both being nasals) are feasible for a syllabic coda. Since the Mandarin consonant inventory isn’t particularly large either (in fact only comparable in size to the English consonant inventory after counting all allophones as separate sounds), we just dedicate one character to a possible onset consonant, and absorb the possible ending nasal into characters dedicated to the syllable’s vowel(s), like how Polish uses ogoneks (ą) and Portuguese uses tildes (ã).

The hardest part of making a mapping is representing the wide palette of vowels Mandarin has to offer. Hence, the remaining two characters of a syllable that are not the onset consonant are dedicated to the vowel, including the absorption of a possible end nasal.

Here are the mappings for the first character of the syllable, the onset consonant. In each cell representing a phone, the unparenthesized portion is the sound in IPA, and the parenthesized portion is the character used in this transliteration.

consonants_3mandarin

(For comparison, this is the chart of mappings for Pinyin romanization.)

chinese_pinyin

No character needs to be assigned to the velar nasal (ŋ), as it never occurs at the start of syllables. Most character assignments are intuitive from the perspective of most Latin-script languages. The c/s/z system is modeled after the usage of these letters in Polish, Czech, and Slovak, which have similar affricate situations to Mandarin. The diacritic assigned to retroflex (the caron) is chosen in correspondence to the diacritic used for the slightly-more-anterior postalveolar sounds in Czech and Slovak. The circumflex is chosen to be the diacritic to represent alveolopalatal sounds because it is a caron upside-down, which reflects how Mandarin’s alveolopalatal series is in complementary distribution to its retroflex series.


Gradescope is a Pleasant Surprise of Well-Thought-Out Design in Academic Software

Having been a TA for 3 semesters (and a college student for 11) has taught me that most academic software is insanely bad. Like, really, really bad. Gradescope is, among this vast wasteland of despair, not only an oasis, but a really pleasant one.

Given the baseline, I of course fear that I may just be judging Gradescope against too low a bar. Am I giving it credit just for having the expected middle-click functionality?

I don’t think so. I believe that for a reasonable bar for software quality, Gradescope not only meets expectations, but exceeds them. Gradescope is actively nice to use, particularly from a staff perspective, and from what I’m used to with academic software, this is completely incredible, and deserves a treasure trove of praise.

In short: most software comes with negative surprises, realizations that it is harder to use than it looked like it was. Gradescope often comes with positive surprises, realizations that it is easier to use than expected.

Someone on the Gradescope team really understands quality user interface design. Elements of Gradescope typically do precisely what one expects them to do; they are given helpful names that describe their functionality well. Where one would want to directly click and edit text, one can in fact use such direct input. (To edit rubric items, one simply clicks on them and they become text boxes. There is no indirection via an edit button or the like. And oh hey! These text boxes support LaTeX!)

Common functionality comes with an assortment of hotkeys, exactly what a grader would be seeking once they have done the same actions many times in a row. Hotkeys that duplicate a button’s functionality pop up when the mouse hovers over that button. Hotkeys for rubric items are simply presented next to the items themselves, with no hover necessary: since these are numbers, one would naturally want to see them at a glance rather than memorize them.

A common regret of graders when grading papers by hand is realizing, upon seeing certain submissions, that a certain penalty or credit on the rubric is probably too harsh or too lenient, and then realizing that one would have to go through the entire stack of papers again to find the students whose grades should be adjusted to meet the new standard. Does one have to do the same, but electronically, when using Gradescope? Of course not. Gradescope allows you to filter by a rubric item to see all submissions which have already been assigned that rubric item, and immediately have all the papers that should be reconsidered. If one is only changing the point value of that particular rubric item, one doesn’t even need to go through the papers; one just edits the score associated with it.

Both students and staff benefit from an easy-to-use regrade request feature, which allows for a nice communication channel with which to deal with regrades. As staff, you could have all the submissions in front of you and compare one student’s submission with others and more quickly decide what a fair thing to do is.

Gradescope is software that actually makes grading massively more efficient; there is none of what the rest of academic software does in making you wish you were still doing things the old way.

And every so often, Gradescope rolls out new updates. These updates are well tested, are actually features (more useful than shiny), and play along nicely with what has been around before. Recently Gradescope rolled out a prototype of a handwriting recognizer. I’m already really happy with how many names it successfully recognized that we don’t have to manually match anymore.

Gradescope is proper technological innovation.

331 Hours Below Freezing

On the evening of last December 25, Boston dipped below freezing.

This in itself is not unusual; this is quite expectable for Boston in the winter. What’s different is that this time the temperature did not return to above freezing until just last hour. Boston spent 13 consecutive days and 19 hours, from then until now, in the negative Celsius.

Specifically, these were the highs and lows of the days in passing:

Dec 26: 27°F/-3°C | 19°F/-7°C
Dec 27: 20°F/-7°C | 12°F/-11°C
Dec 28: 12°F/-11°C | 5°F/-15°C
Dec 29: 14°F/-10°C | 2°F/-17°C
Dec 30: 18°F/-8°C | 6°F/-14°C
Dec 31: 13°F/-11°C | 4°F/-16°C
Jan 01: 13°F/-11°C | 0°F/-18°C
Jan 02: 19°F/-7°C | 4°F/-16°C
Jan 03: 29°F/-2°C | 16°F/-9°C
Jan 04: 30°F/-1°C | 22°F/-6°C
Jan 05: 24°F/-4°C | 6°F/-14°C
Jan 06: 12°F/-11°C | 1°F/-17°C
Jan 07: 17°F/-8°C | -2°F/-19°C

For reference, the average Bostonian December high and low are respectively 41°F/5°C and 28°F/-2°C, and for January 36°F/2°C and 22°F/-6°C.
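The arithmetic behind the title and the Celsius figures is easy to double-check. Here is a minimal Python sketch; the function name and the choice of which days to sample are illustrative assumptions, and the Fahrenheit values are copied from the list above.

```python
def fahrenheit_to_celsius(f: float) -> int:
    """Convert Fahrenheit to Celsius, rounded to the nearest whole degree."""
    return round((f - 32) * 5 / 9)


# Length of the streak: 13 days and 19 hours, expressed in hours.
streak_hours = 13 * 24 + 19

# A sample of (high, low) pairs in Fahrenheit, copied from the list above.
readings_f = {
    "Dec 26": (27, 19),
    "Jan 01": (13, 0),
    "Jan 07": (17, -2),
}

# The same pairs converted to whole degrees Celsius.
readings_c = {
    day: tuple(fahrenheit_to_celsius(t) for t in pair)
    for day, pair in readings_f.items()
}

print(streak_hours)
print(readings_c)
```

The streak works out to the 331 hours in the title, and the converted pairs match the Celsius values listed for those days.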

Winds weren’t forgiving for much of this time either. It often got too cold even inside my dorm, so I actually spent a large portion of this time elsewhere, wandering from location to location, working on my thesis.

But yes, those of you who went home away from MIT for the winter vacation, this is what you missed.

Categories MIT

Dual Frontier Analysis

I. Introduction, with Example in Population and Area of Countries and Country-Like Entities

In this post, I introduce a way of looking at correlated data I will term “dual frontier analysis”.

What motivates this idea? Often, we like to compare entities via a certain “rate”, how much of one quantity there is for a unit amount of another quantity, across a set of entities. One example of this is population density. But if you, like me, have glanced at a population density chart of, say, the countries, you may have had one of the same first reactions as I have had: “the top of the chart is pretty much just a listing of city-states!” You might then proceed with questioning whether it really makes sense to compare this quantity for city-states versus for “more normal” countries. Maybe we want a way of looking at this data that better captures our prior idea of what an “impressively high” or “impressively low” population density is: Bangladesh’s population density definitely “feels” more impressive, even if it’s not as numerically high as Bahrain’s.

There are probably solutions to this problem involving designing a prior distribution of likeliness of one variable in terms of the other, and then comparing percentiles along respective distributions, but going down this path requires crunching a lot of numbers and, more importantly, extensive prior knowledge of the ideas being analyzed.

Here is another solution: output the data on the dual frontiers. If two attributes are somewhat correlated, a scatterplot for entities in these attributes probably looks something like this.

scatterplot_example

What we’re outputting is this.

scatterplot_example_2

That is, we’re outputting entities for which no other entity has both more of one attribute and less of the other attribute than this entity.

In this way, we would capture, for instance, the country with the highest population density among countries of similar size. (We could even extend this to become a quantitative metric for entities not on this frontier: the percentage of the way an entity is from one frontier to the other.)

One could also look at an entity in this data and compare it to neighboring entities and see how much larger in one attribute another entity must be to be larger in the other attribute as well (as otherwise, this entity would also be in the frontier), which shows how prominently impressive a particular entity is in the ratio.
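The selection rule described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Entity` type, the `dual_frontiers` function, and the toy data are all made up here, and the all-pairs check is simply the most direct reading of the definition rather than an efficient or definitive implementation.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    name: str
    x: float  # first attribute, e.g. area
    y: float  # second attribute, e.g. population


def dual_frontiers(entities: list[Entity]) -> tuple[list[Entity], list[Entity]]:
    """Return (high, low) frontiers of y relative to x.

    An entity is on the high frontier if no other entity has both
    less x and more y; it is on the low frontier if no other entity
    has both more x and less y.
    """
    high = [
        e for e in entities
        if not any(o.x < e.x and o.y > e.y for o in entities)
    ]
    low = [
        e for e in entities
        if not any(o.x > e.x and o.y < e.y for o in entities)
    ]
    return high, low


# Hypothetical toy data, not real countries.
toy = [
    Entity("A", 1, 5),
    Entity("B", 2, 1),
    Entity("C", 3, 9),
    Entity("D", 4, 4),
    Entity("E", 5, 10),
    Entity("F", 6, 3),
]

high, low = dual_frontiers(toy)
print([e.name for e in high])  # → ['A', 'C', 'E']
print([e.name for e in low])   # → ['B', 'F']
```

With x as area and y as population, the high frontier collects the entities that are unusually dense for their size, and the low frontier the ones that are unusually sparse. D sits on neither frontier, since C is both smaller and more populous and F is both larger and less populous.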


dzaefn No Longer

Long story short: I no longer wish to go by dzaefn. You can call me by my real name, or other options listed three paragraphs below. I’m going to try to stop referring to myself by this, and I’d like for you to stop referring to me as such as well. I don’t intend to scrub this name from everywhere I possibly can; I just want to ease it out, let it stay where it is not easily changed, and indicate when appropriate that I prefer other names.

There have definitely been people that I’ve expressly told that I am dzaefn, particularly at times in my life that I just didn’t like that my real name was what it happened to be. (Some of these people even concluded that was actually my real name. Oops.) In any case, both from this and from other effects, a lot of people, maybe you, primarily refer to me by the name ‘dzaefn’. I’m sorry to tell you that I just do not wish to be called this name anymore, and I apologize for inconveniences in mental nomenclature reassignments.

Why has ‘dzaefn’ fallen out of my favor? Honestly, I’ve just gotten to a point where I feel the things with what it stands for and the z-replacing-s got too silly. It derived from a previous username that I used, ‘d684n’ (which, in fact, I first used when making this here very blog, 6.5 years ago), which stood for ‘dotted sixth and eight-fourth-th note’ (an expansion of ‘dotted eighth note’), in which I replaced the s with a z because I thought the string ‘dsa’ looked too qwertylike. I’m serious, this is the origin of this username. I’ve had at least a few really confused faces and at least someone ask me if I was trolling (trust me, I’m not nearly that good at trolling) among the various times I’ve explained this username. Eventually, after changing what it stood for a few times, I decided it really doesn’t stand for anything at all. And eventually (read: now) I decided if I’ve gotten to this point I should really just discard this. It really wasn’t a put-together-that-well name, and it didn’t become better. I don’t feel any remote sense of juiciness about the name anymore.

So what should you call me? Several options. My full first name is fine. Unlike the times in which I’ve told people to call me dzaefn, I’ve come to better terms with my own name. In case you’re not sure what my full first name is, it has six letters. For more online contexts, there are at least three usernames that, over the past year, I’ve become convinced I will keep as monikers permanently (which is honestly the first time in my entire life I’ve actually felt so): 004413 (my main username these days), 0xGG (the username I use in gaming), and xer0a (pronounced hay-ru-ah, [‘heɪɹuɑ]), the last of which you might have noticed is my username on WordPress now. If you don’t like that these names take more than a syllable to pronounce, the second half of my full first name is a fine shortening.

There will be several places where I’ll just let the fact that my username is dzaefn carry on. It turns out that my starting to use ‘dzaefn’ falls quite near my coming to MIT, and my current decision falls very near the end of my formal times at MIT, so it just makes sense and is convenient to have it associate with my MIT presence: I’ll still use it for logistical MIT business, and I’ll retain the /u/dzaefn account for posting to /r/mit, and I’ll just remind people that it only happens to be my username, and that I don’t wish to be referred to as such any longer.

That’s all. Happy birthday to Satyendra Bose today, and Isaac Asimov tomorrow.

Categories MIT

Slamming Donald Trump

What do you criticize, joke about, or insult Donald Trump for? Is it one of these things…

List A

  • Funding a costly, useless, and inflammatory border wall
  • Creating a fraudulent university
  • Bragging about sexual assault
  • Withdrawing from the Paris climate deal
  • Incredibly high frequency of lying
  • Bragging that he could shoot someone and still not lose voters
  • (this list goes on)

…or is it one of these things…

List B

  • His hair
  • Being ‘orange’
  • His voice
  • Being fat
  • ‘Small hands’

…?

If you take from List A, cool. If you on occasion take from List B, okay, people sometimes feel a need to make memetic references.

But if you sincerely consider yourself someone against Trump, and reference items in List B more than items in List A in your jokes and your slams, maybe you should take a moment and reflect on what it is about Trump that you are intending to fight against.

If you sincerely believe items in List B are more prominent disqualifiers than items in List A for Trump being in a position of power, then even though you’re against Trump and I’m against Trump, I’m very definitely not with you. In fact, I think I’m against what you stand for.

If you don’t sincerely believe so and believe List A has the important stuff, but you’re invoking List B’s items with greater frequency anyway, congratulations, it seems you and I agree there’s a nearly endless list of disturbingly problematic characteristics of Donald Trump you could pick on, and you decided that rather than anything in that list, you’d rather go for the low-hanging fruit of List B. In particular, you’re promoting the judging of people by their physical characteristics, rather than their actions, so you’re probably even exhibiting what you’re nominally fighting.

But everyone should reflect on why society has molded List B to indeed be the low-hanging fruit, the items that evoke the largest laughter or applause in the room. Think about how quickly people react or resort to the physical level, and that, in the supposedly mostly liberal circles in which anti-Trump is the norm, attacks on the physique stay. Maybe one should consider whether resorting to such attacks is naturally human, and whether one should really be dismissing judging on physical appearance as inhuman rather than a fault of humans to work against. And if one can have leeway for people making List B remarks on the left side, then one ought to consider possible exceptional natures of corresponding remarks on the right side.

And if you’re the type of person who watches liberal late-night comedy that’s saturated with List B insults, and pat yourself and your friends on the back for being the morally righteous ones, check what you pat yourself with.

Oh, and if what you make fun of is receiving golden showers in Russian hotels:

  1. So what’s your position on kink shaming?
  2. I hope you don’t complain about conspiracy theories. You’re invoking the literally unfounded, the sort of surmising that brings us delicacies like Pizzagate.