Common Names that are Also Common Words or Homophonic to Common Words

(and thus very vulnerable to puns)

Amber
Bob
Carol
Grace
Harry
Hunter
Jack
Joy
Justin (two words)
Kat
Lane
Mark
Matt
Max
Mike
Miles
Nick
Ray
Rich
Roger
Rose
Shirley
Taylor
Van
Victor
Wayne


Totals over Time

once a month for an average year: 12 times
once an hour for an average day: 24 times
once a day for an average month: 30 times
once a second for an average minute: 60 times
once a minute for an average hour: 60 times
once a year for an average American lifespan: 79 times
once a day for an average year: 365 times
once an hour for an average month: 730 times
once a month for an average American lifespan: 945 times
once a minute for an average day: 1440 times
once a second for an average hour: 3600 times
once an hour for an average year: 8766 times
once a day for an average American lifespan: 28 760 times
once a minute for an average month: 43 830 times
once a second for an average day: 86 400 times
once a minute for an average year: 525 960 times
once an hour for an average American lifespan: 690 235 times
once a second for an average month: 2 629 800 times
once a second for an average year: 31 557 600 times
once a minute for an average American lifespan: 41 414 090 times
once a second for an average American lifespan: 2 484 845 424 times
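
The totals above can be reproduced from a handful of unit lengths. Here is a minimal sketch, assuming an average year of 365.25 days, an average month of one twelfth of that year, and an average American lifespan of 78.74 years. These three constants are inferred from the totals rather than stated anywhere in the list, and the names `SECONDS` and `total` are only illustrative.

```python
from fractions import Fraction

# Length of each unit in seconds. Fractions keep the arithmetic exact.
SECONDS = {
    "second": Fraction(1),
    "minute": Fraction(60),
    "hour": Fraction(3600),
    "day": Fraction(86400),
}
# Assumption: an average year is 365.25 days.
SECONDS["year"] = Fraction(36525, 100) * SECONDS["day"]
# Assumption: an average month is one twelfth of an average year.
SECONDS["month"] = SECONDS["year"] / 12
# Assumption: an average American lifespan is 78.74 average years.
SECONDS["lifespan"] = Fraction(7874, 100) * SECONDS["year"]


def total(every, span):
    """Exact number of occurrences of 'once per `every`' over one `span`."""
    return SECONDS[span] / SECONDS[every]


for every, span in [("hour", "year"), ("month", "lifespan"), ("second", "lifespan")]:
    # round() on a Fraction returns the nearest integer.
    print(f"once a {every} for an average {span}: {round(total(every, span))} times")
```

Under these assumptions, some totals are not whole numbers (an average month works out to exactly 730.5 hours), so the list entries are rounded values.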

S¯ã: A Syllabically Fixed-Length Romanization of Mandarin Chinese

The remarkable stringency of Mandarin Chinese phonotactics makes for a set of possible syllables restricted and systematic enough that one could reasonably endeavor to devise a not-too-complicated romanization system in which every possible syllable has the same number of characters. Here, I provide such a system, where this number of characters is 3.

Here’s the plan: we take advantage of how Mandarin phonotactics prohibit any consonant clustering whatsoever (unless one decides to interpret certain affricates as actually being clusters, but why would one want to do that?) and how only two consonants (both nasals) can serve as a syllable coda. Since the Mandarin consonant inventory isn’t particularly large either (in fact, it is only comparable in size to the English consonant inventory after counting all allophones as separate sounds), we just dedicate one character to the onset consonant and absorb the possible ending nasal into the characters dedicated to the syllable’s vowel(s), like how Polish uses ogoneks (ą) and Portuguese uses tildes (ã).

The hardest part of making a mapping is representing the wide palette of vowels Mandarin has to offer. Hence, the remaining two characters of a syllable that are not the onset consonant are dedicated to the vowel, including the absorption of a possible end nasal.

Here are the mappings for the first character of the syllable, the onset consonant. In each cell representing a phone, the unparenthesized portion is the sound in IPA, and the parenthesized portion is the character used in this transliteration.

consonants_3mandarin

For comparison, this is the chart of mappings for Pinyin romanization:

chinese_pinyin

No character needs to be assigned to the velar nasal (ŋ), as it never occurs at the start of a syllable. Most character assignments are intuitive from the perspective of most Latin-script languages. The c/s/z system is modeled after the usage of these letters in Polish, Czech, and Slovak, which have affricate situations similar to Mandarin’s. The diacritic assigned to the retroflex sounds (the caron) is chosen to correspond to the diacritic used for the slightly-more-anterior postalveolar sounds in Czech and Slovak. The circumflex is chosen as the diacritic for the alveolopalatal sounds because it is an upside-down caron, which reflects how Mandarin’s alveolopalatal series is in complementary distribution with its retroflex series.


Gradescope is a Pleasant Surprise of Well-Thought-Out Design in Academic Software

Having been a TA for 3 semesters (and a college student for 11) has taught me that most academic software is insanely bad. Like, really, really bad. Gradescope is, among this vast wasteland of despair, not only an oasis, but a really pleasant one.

Given the baseline, I of course fear that I may just be judging Gradescope against too low a bar. Am I giving it credit just for having the expected middle-click functionality?

I don’t think so. I believe that for a reasonable bar for software quality, Gradescope not only meets expectations, but exceeds them. Gradescope is actively nice to use, particularly from a staff perspective, and from what I’m used to with academic software, this is completely incredible, and deserves a treasure trove of praise.

In short: most software comes with negative surprises, realizations that it is harder to use than it looked. Gradescope often comes with positive surprises, realizations that it is easier to use than expected.

Someone on the Gradescope team really understands quality user interface design. Elements of Gradescope typically do precisely what one expects them to do; they are given helpful names that describe their functionality well. Where one would want to directly click and edit text, one can in fact use such direct input. (To edit rubric items, one simply clicks on them and they become text boxes. There is no detour through an edit button or the like. And oh hey! These text boxes support LaTeX!)

Common functionality comes with an assortment of hotkeys, exactly what a grader would be seeking once they have done the same actions many times in a row. Hotkeys that duplicate a button’s function pop up when the mouse hovers over that button. Hotkeys for rubric items are simply shown next to the items themselves, no hover necessary: since these are numbers, one would naturally want to see them at a glance rather than memorize them.

A common regret of graders when grading papers by hand is the realization, upon reaching certain submissions, that a certain penalty or credit on the rubric is probably too harsh or too lenient, followed by the realization that one would have to go through the entire stack of papers again to find the students whose grades should be adjusted to meet the new standard. Does one have to do the same, but electronically, when using Gradescope? Of course not. Gradescope allows you to filter by a rubric item to see all submissions which have already been assigned that rubric item, and immediately have all the papers that should be reconsidered. If one is only changing the point value of that particular rubric item, one doesn’t even need to go through the papers; one just edits the score associated with it.

Both students and staff benefit from an easy-to-use regrade request feature, which allows for a nice communication channel with which to deal with regrades. As staff, you could have all the submissions in front of you and compare one student’s submission with others and more quickly decide what a fair thing to do is.

Gradescope is software that actually makes grading massively more efficient; unlike the rest of academic software, it never makes you wish you were still doing things the old way.

And every so often, Gradescope rolls out new updates. These updates are well tested, are actually features (more useful than shiny), and play along nicely with what has been around before. Recently Gradescope rolled out a prototype of a handwriting recognizer. I’m already really happy with how many names it successfully recognized, names that we no longer have to match manually.

Gradescope is proper technological innovation.

331 Hours Below Freezing

On the evening of last December 25, Boston dipped below freezing.

This in itself is not unusual; it is quite expected for Boston in the winter. What’s different is that this time the temperature did not return to above freezing until just this past hour. Boston spent 13 consecutive days and 19 hours, from then until now, in the negative Celsius.

Specifically, these were the highs and lows of the intervening days:

Dec 26: 27°F/-3°C | 19°F/-7°C
Dec 27: 20°F/-7°C | 12°F/-11°C
Dec 28: 12°F/-11°C | 5°F/-15°C
Dec 29: 14°F/-10°C | 2°F/-17°C
Dec 30: 18°F/-8°C | 6°F/-14°C
Dec 31: 13°F/-11°C | 4°F/-16°C
Jan 01: 13°F/-11°C | 0°F/-18°C
Jan 02: 19°F/-7°C | 4°F/-16°C
Jan 03: 29°F/-2°C | 16°F/-9°C
Jan 04: 30°F/-1°C | 22°F/-6°C
Jan 05: 24°F/-4°C | 6°F/-14°C
Jan 06: 12°F/-11°C | 1°F/-17°C
Jan 07: 17°F/-8°C | -2°F/-19°C

For reference, the average Bostonian December high and low are respectively 41°F/5°C and 28°F/-2°C, and for January 36°F/2°C and 22°F/-6°C.
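
The paired readings above can be checked with the usual Fahrenheit-to-Celsius formula. Below is a small sketch, assuming the Celsius figures are the Fahrenheit readings converted and rounded to the nearest degree (that rounding rule is an assumption, as are the names `fahrenheit_to_celsius` and `readings_f`). It also converts the 13 days and 19 hours of the freeze into hours.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


# Three (high, low) pairs in °F copied from the list above.
readings_f = {"Dec 26": (27, 19), "Jan 01": (13, 0), "Jan 07": (17, -2)}

for day, (high, low) in readings_f.items():
    # Assumption: Celsius values are rounded to the nearest whole degree.
    high_c = round(fahrenheit_to_celsius(high))
    low_c = round(fahrenheit_to_celsius(low))
    print(f"{day}: {high}°F/{high_c}°C | {low}°F/{low_c}°C")

# Length of the freeze: 13 days and 19 hours, expressed in hours.
freeze_hours = 13 * 24 + 19
print(freeze_hours)
```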

The winds weren’t forgiving for much of this time either. It often got too cold even inside my dorm, so I actually spent a large portion of this time elsewhere, wandering from location to location, working on my thesis.

But yes, those of you who left MIT to go home for the winter vacation, this is what you missed.


Dual Frontier Analysis

I. Introduction, with Example in Population and Area of Countries and Country-Like Entities

In this post, I introduce a way of looking at correlated data I will term “dual frontier analysis”.

What motivates this idea? Often, we like to compare entities via a certain “rate”, how much of one quantity there is for a unit amount of another quantity, across a set of entities. One example of this is population density. But if you, like me, have glanced at a population density chart of, say, the world’s countries, you may have had one of the same first reactions as I have had: “the top of the chart is pretty much just a listing of city-states!” You might then proceed to question whether it really makes sense to compare this quantity for city-states versus for “more normal” countries. Maybe we want a way of looking at this data that better captures our prior idea of what an “impressively high” or “impressively low” population density is: Bangladesh’s population density definitely “feels” more impressive, even if it’s not as numerically high as Bahrain’s.

There are probably solutions to this problem involving designing a prior distribution of the likelihood of one variable in terms of the other, and then comparing percentiles along the respective distributions, but going down this path requires crunching a lot of numbers and, more importantly, extensive existing knowledge of the subject being analyzed.

Here is another solution: output the data on the dual frontiers. If two attributes are somewhat correlated, a scatterplot for entities in these attributes probably looks something like this.

scatterplot_example

What we’re outputting is this.

scatterplot_example_2

That is, we’re outputting entities for which no other entity has both more of one attribute and less of the other attribute than this entity.

In this way, we would capture, for instance, the country with the highest population density among countries of similar size. (We could even extend this to become a quantitative metric for entities not on this frontier: the percentage of the way an entity is from one frontier to the other.)

One could also look at an entity in this data and compare it to neighboring entities and see how much larger in one attribute another entity must be to be larger in the other attribute as well (as otherwise, this entity would also be in the frontier), which shows how prominently impressive a particular entity is in the ratio.
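
The membership rule described above can be sketched in a few lines of Python. This is only an illustrative, quadratic-time sketch: the function name `dual_frontiers`, the `(name, x, y)` tuple layout, and the toy `points` are all made up for the example (they are not real country data), and ties are handled by reading “more” and “less” as strict comparisons.

```python
def dual_frontiers(entities):
    """Return (upper, lower) frontier names for a list of (name, x, y) tuples.

    upper: no other entity has strictly less x and strictly more y.
    lower: no other entity has strictly more x and strictly less y.
    With x as area and y as population, the upper frontier holds the
    impressively dense entities and the lower frontier the impressively sparse.
    """
    upper, lower = [], []
    for name, x, y in entities:
        others = [(ox, oy) for oname, ox, oy in entities if oname != name]
        if not any(ox < x and oy > y for ox, oy in others):
            upper.append(name)
        if not any(ox > x and oy < y for ox, oy in others):
            lower.append(name)
    return upper, lower


# Hypothetical toy data: (name, x, y).
points = [
    ("A", 1, 2), ("B", 2, 6), ("C", 3, 3), ("D", 4, 9),
    ("E", 5, 5), ("F", 6, 10), ("G", 7, 7),
]

upper, lower = dual_frontiers(points)
print("upper frontier:", upper)
print("lower frontier:", lower)
```

In this toy data the smallest entity, A, lands on both frontiers, since nothing has less x than it and nothing has less y than it.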
