Thursday, November 29, 2012

Serendipity in languages

The three posts (one, two, three) below form a set on the topic of serendipity in languages. Please do send me a note if you have thoughts or comments on this.

I really do prefer to use the terms 'pseudoreduplication'/'pseudoreduplicates' although I am proposing the (somewhat less elegant) terminology of 'serendreduplication'/'serendreduplicates'. Also, I would prefer to use the terms 'pseudohomophones'/'pseudohomophony' where I am using the (somewhat less elegant) terminology of 'serendhomophones'/'serendhomophony'. I will be using these terms interchangeably. The use of 'serend-' terminology is there to disambiguate matters if any confusion should arise.

Update: Thanks to my friend Krishna Kunchithapadam for his suggestions, including and particularly the point on retaining the use of the terms 'pseudoreduplication' and 'pseudohomophony'.

Second Update: Note that 'pseudohomophony'/'serendhomophony' is a statistical concept. If a significant fraction of speakers of language B think that word XB in language B is a homophone of the word XA in language A, then XB and XA are 'pseudohomophones'/'serendhomophones'.
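The statistical definition above can be sketched in a few lines of Python. This is only an illustration: the function name is_serendhomophone, the idea of collecting one yes/no judgment per speaker, and the 0.5 cutoff standing in for "a significant fraction" are all assumptions of mine, not part of the definition itself.

```python
def is_serendhomophone(judgments, threshold=0.5):
    """Decide whether XB and XA count as serendhomophones.

    judgments: one boolean per surveyed speaker of language B,
               True if that speaker hears XB as a homophone of XA.
    threshold: assumed cutoff for "a significant fraction" (hypothetical).
    """
    if not judgments:
        raise ValueError("need at least one speaker judgment")
    fraction = sum(judgments) / len(judgments)
    return fraction >= threshold


# Two of three surveyed speakers hear the words as homophones.
print(is_serendhomophone([True, True, False]))
```

A stricter or looser reading of "significant" would simply change the threshold argument.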

Third Update (Nov 30, 4:34 pm): Note that my definition of 'serendhomophony' is general enough to cover the case where word XB in language B is a homophone of the word XA in language A for a particular performance, for instance, for a single Youtube video of a song. A different Youtube video of the same song could lead to a different word XB-1A in language B being a homophone of the word XA in language A.

You could think of the performance of a song you listen to on Youtube as a process. It is a process (cf. stochastic process) producing words XA1, XA2, ..., XAn in language A. Simultaneously, it is also a process producing the words XB1, XB2, ..., XBn in language B for speakers of language B, serendhomophonous with XA1, XA2, ..., XAn. And equally simultaneously, it is also a process producing the words XC1, XC2, ..., XCn in language C (again serendhomophonously). And so on. (You could cover every language and dialect on earth.)

A different performance of the same song on Youtube will be a process producing words XA1, XA2, ..., XAn in language A. But simultaneously, it will also be a process producing the words XB-1A, XB-2A, ..., XB-nA in language B for speakers of language B (serendhomophonously). And equally simultaneously, it is also a process producing the words XC-1A, XC-2A, ..., XC-nA in language C (again serendhomophonously). And so on.

Note that there are two levels of homophonousness here : per performance and per language.
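One way to picture the two levels is as a small data structure: each performance carries its own table of what listeners of each language hear. The sketch below is only an illustration. The class name Performance and its fields are my own invention, and the word labels are the placeholder symbols used above rather than real data.

```python
from dataclasses import dataclass, field


@dataclass
class Performance:
    """One rendition of a song (hypothetical model, not a real API)."""
    label: str
    # Maps a language name to the word sequence its speakers hear.
    heard_as: dict = field(default_factory=dict)

    def pairs(self, source, listener):
        """Pair each source-language word with what the listener hears."""
        return list(zip(self.heard_as[source], self.heard_as[listener]))


# Same song, same words in language A, but a different hearing in B.
first = Performance("video 1", {"A": ["XA1", "XA2"], "B": ["XB1", "XB2"]})
second = Performance("video 2", {"A": ["XA1", "XA2"], "B": ["XB-1A", "XB-2A"]})

for performance in (first, second):
    print(performance.label, performance.pairs("A", "B"))
```

The outer level (which Performance object) is the per-performance level, and the key into heard_as is the per-language level.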

More on serendhomophones and serendreduplication

So I am coining two new sets of terms in the field of linguistics. The first set is 'serendhomophony'/'serendhomophones' and the second one is 'serendreduplicates'/'serendreduplication'. Perhaps, people have been implicitly aware that this stuff was out there. But bringing something to explicit awareness by codifying it in academic language counts. And I don't think anyone has actually described it in the mathematical language of chance and probability. So here is a very simple but very precise description of these two terms.

First, 'serendhomophone'/'pseudohomophone'. A 'serendhomophone'/'pseudohomophone' is defined as follows. Sometimes, you may have a set of words/phrases in language A that are homophones with words/phrases in language B. This homophony is usually quite apparent for speakers of language B who don't know language A. These word/phrase pairs are called 'serendhomophones'.

Why does this happen? Pure chance. Every sound in language B maps to some sound in language A. So when words in language A are read out in close succession as in speech or song, the words in language A map to some arbitrary words in language B. (This mapping is not fixed. Based on intonation et cetera, the mapping can even change.) So essentially, words and phrases X1, X2 and X3 in language A happen to map to words and phrases Y1, Y2 and Y3 in language B when they are spoken in a particular way and in a particular sequence. Note that there is a lot of serendipity in terms of what the particular word Y1 will end up meaning in language B. (Sounds have a more or less arbitrary assignment to meaning, generally speaking.) Note also that languages often don't have the same set of sounds, and so the correspondence between pseudohomophones is often not very exact.

Here is a 'misheard lyrics' version of the Tamil song "Kalluri vaanil kaayndha nilaavo?" (now often referred to as "Benny Lava") featuring the Tamil dancer Prabhu Deva.

Note that Buffalax does not use actual pseudohomophones, but a subset of the sounds in the 'misheard lyrics' version of the song are pseudohomophones.

The first few pseudohomophones for the words in the song would be the ones below.

'Kalluri' <=> 'Cull lure E'
'vaanil' <=> 'Vaughn nil'
'kaayndha nilaavo' <=> 'coin the nil ah woe'

Now, for 'serendreduplicates'/'pseudoreduplicates'. 'Pseudoreduplicates' are words/stems of different origins that are thrown together to form 'duplication' within a word. (These words are also called 'serendreduplicates' in my terminology). 'Pseudoreduplication' is the phenomenon wherein a word looks like a reduplicate in that there is some apparent duplication within the word but the word is, in fact, not a reduplicate, and the reduplication has, in fact, arisen by chance.

In a nutshell, 'serendreduplicate'/'pseudoreduplicate' words have some 'duplication' in them that has arisen by sheer luck. It is worth emphasizing this point. Why does reduplication arise? Well, by pure chance. Or, you could call it serendipity.

Serendreduplication and serendhomophony

First, a note on the terms 'serendreduplication' and 'serendreduplicates'. A Google search for pseudoreduplication returns no fewer than 44 hits, out of which at least two (1, 2) point to academic papers in linguistics. So I have decided to avoid the term 'pseudoreduplication' since that term is already being used in linguistics.

Footnote 2 in the paper by Avram has this to say about pseudo-reduplication. 'Also called “quasi-reduplicated forms” (Bakker 2003: 40), “phonological reduplicated base form” (Miller 2003: 290), “fixed forms” or “fossilized forms” (Wellens 2003: 226).' The Comparative Austronesian Dictionary says this about 'Quasi-reduplicates' : "Quasi-reduplicates" are those words that lack the corresponding unreduplicated forms e.g. didi 'pig, swine', nisnis 'beard'. It seems that the original unreduplicated form may have been lost within the language leaving just the reduplicated form as a sort of fossil. Note that this is not at all what serendreduplication is. It is, well, something completely different.

Now, serendreduplication may be used to motivate the idea that many things in languages may have arisen purely out of chance. I can think of so many things like that. For instance, the (Chinese) name Yao (as in Dennis Yao) is the slang word in Tamil for 'sir' or 'man' (and comes from the word 'ayya'). It is pure chance that this is the case. (Think of what it would be like if a guy walked up to a Tamil speaker and told him his name is "Yao Yennaya".) And there is also the more unfortunate case involving the firm "Lund International". The name of the firm has a somewhat funny meaning in Hindi.

But the point is that this sort of serendipity arises. And this sort of serendipity is, in fact, exactly what Buffalax of misheard lyrics fame exploits all the time. So what should we call the pairs of words and phrases that sound the same in the 'misheard lyrics'? I say let us call them 'serendhomophones'. They are words or phrases that sound similar to each other, but it is pure serendipity that it is so. Note that the correspondence is not exact. It is just "close enough". And so, that leaves us with the minor matter of naming the phenomenon. And that is simple. This phenomenon we shall call 'serendhomophony'/'pseudohomophony'.

Update: Another example of 'pseudoreduplication' would be the word 'metametals', a word that I just coined to describe elements which occur next to metals in the Periodic Table. Of course, many metals are also metametals.

The term 'metametals' could also be used to describe artificial 'metals' engineered to have certain types of properties, that is, a metamaterial that is a metal. The word 'metametal' is a good example of 'serendreduplication', and is probably a better example of 'serendreduplication'/'pseudoreduplication' than 'metamathematics'.                          
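The surface 'duplication' in such words can be spotted mechanically. Here is a minimal Python sketch that finds the longest chunk of letters occurring twice in a word without overlapping itself. The function name longest_repeated_chunk is hypothetical, and the approach is plain spelling comparison. It cannot tell a true reduplicate from a serendreduplicate, since that depends on the origins of the parts, which the code knows nothing about.

```python
def longest_repeated_chunk(word):
    """Return the longest substring that appears twice without overlap.

    Works on spelling only; returns "" if no letter is repeated.
    """
    word = word.lower()
    n = len(word)
    # Try the longest possible chunk first, then shrink.
    for size in range(n // 2, 0, -1):
        for start in range(n - 2 * size + 1):
            chunk = word[start:start + size]
            # Look for a second copy strictly after the first one.
            if chunk in word[start + size:]:
                return chunk
    return ""


print(longest_repeated_chunk("metametals"))  # meta
```

For 'metametals' the repeated chunk is 'meta', even though the first 'meta' is the prefix and the second is the start of 'metals'.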

Second Update: Googling for 'metametals' returns plenty of hits.

Final Update: Here is a 'misheard lyrics' version of someone doing "Gangnam style". It is one funny video. But if you are a speaker of an Indian language as well as English, I absolutely insist that you watch it. That will give you an idea of what is going on in the brains of non-Tamil speakers when they hear the "Benny Lava" song. There is a remarkable correspondence between the lyrics for this version of the song and what you hear in your head if you are an English speaker who doesn't speak Korean. It is exactly the same thing that is going on with the "Benny Lava" song as well. Words that may not seem like homophones to you may, in fact, be homophones inside the brains of English speakers who don't speak Tamil.

Serendreduplication, a new term in linguistics

I would like to coin a new term in linguistics called 'Serendreduplication'. (I also sometimes refer to it as 'pseudoreduplication' since that seems to be a more natural way to describe what is going on.) Now, reduplication, as the linguists among you will know, is a morphological process in which, as SIL International's site puts it, "a root or stem or part of it is repeated."

Well, 'serendreduplication'/'pseudoreduplication' is the phenomenon in which although it seems like the root or stem of a word (or part of it) is repeated exactly, it is actually two different words with entirely separate origins that are combined serendipitously (or there is some other type of "happy accident") such that it looks like there is reduplication going on.

Two examples of this are 'metamathematics' and 'abracadabra' in English. First, 'metamathematics'. The prefix 'meta' comes from the Greek preposition μετά = "after", "beyond", "adjacent", "self" whereas the word 'mathematics' comes from the Greek word μάθημα (máthēma) = "knowledge, study, learning".

Note that the prefix 'meta' is always what is used to describe a concept that is an abstraction from another concept. It is just sheer chance that the prefix finds itself added in front of a word that sounds like it. Not my favorite example (and trust me, there is a far better one coming), but it gets us started on the right track.

Okay, onto the word 'abracadabra'. The word 'abracadabra' is a word that is not only fun to say but interesting in its own right. It is known to have origins in Aramaic. As Wikipedia puts it, 'Although at first glance "Abracadabra" appears to be an English rhyming reduplication it in fact is not; instead, it is derived from the Aramaic formula "Abəra kaDavəra" meaning "I would create as I spoke".' Princeton's Allison Chaney has a helpful introduction to the word on her site.

The first known mention of the word ABRACADABRA was in the 2nd century CE in a book called Liber Medicinalis [1] (sometimes known as De Medicina Praecepta Saluberrima) by Quintus Serenus Sammonicus, physician to the Roman emperor Caracalla, who prescribed that malaria[2] sufferers wear an amulet containing the word written in the form of a triangle:[3]

A - B - R - A - C - A - D - A - B - R - A
A - B - R - A - C - A - D - A - B - R
A - B - R - A - C - A - D - A - B
A - B - R - A - C - A - D - A
A - B - R - A - C - A - D
A - B - R - A - C - A
A - B - R - A - C
A - B - R - A
A - B - R
A - B

This, he explained, diminishes the hold over the patient of the spirit of the disease. Other Roman emperors, including Geta and Alexander Severus, were followers of the medical teachings of Serenus Sammonicus and are likely to have used the incantation as well.
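The triangle layout quoted above is easy to generate: write the word, then keep dropping the last letter. Below is a small Python sketch of that. The function name amulet_triangle and the min_len parameter are my own additions for illustration. With min_len=2 the output stops at "A - B", as the layout above does.

```python
def amulet_triangle(word, min_len=1):
    """Return the triangle rows, dropping one final letter per row."""
    letters = word.upper()
    return [
        " - ".join(letters[:length])
        for length in range(len(letters), min_len - 1, -1)
    ]


for row in amulet_triangle("abracadabra", min_len=2):
    print(row)
```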

I came to think about this idea due to a quiz question by my friend Govind Krishamurthi which went as follows :

QUESTION: In Tamil examples of this linguistic construction are:
Mada-Mada (faster)
In Hindi, 
Examples would be

There are also several examples in Telugu.. but I didn't post all of them.. Some of you can give examples in Telugu, Kannada and other languages too.

So the question is what is this linguistic construction called?


That gives you a good idea of what reduplication is. It also gives you enough examples to play around with so that you can infer what serendreduplication/pseudoreduplication is as well.

Monday, November 26, 2012

Gangnam Style

The latest news from the exciting world of technology is that "Gangnam Style" has now become Youtube's most viewed video. It has received more than 800 million views. That is right. 800 mil. Anyway, here is the video. Enjoy!

Tuesday, November 20, 2012

Draper University

The latest news on Tim Draper's Draper University:
Draper and his project team will host a neighborhood meeting tonight to discuss the temporary pilot program at the hotel, located at 44 E. Third Ave. The incubator program is designed to support the successful development of startups through an array of business support resources and collaborative mentoring with the university’s network of contacts. 
Recently, the city just relaxed its downtown retail requirements that mandated storefronts on ground floors be used solely for retail purposes. 
Before the city relaxed the rules, however, the former owners of the Collective Antiques building, the Musich family, asked the city for a variance to allow a startup, SnapLogic, to operate on the ground floor. 
Some residents, however, urged the City Council not to grant the variance to keep downtown pedestrian and retail oriented.

Sunday, November 18, 2012

Big Data to Create 1.9M IT Jobs in U.S. By 2015

From :
Big data, which refers to data collected and analyzed from every imaginable source, is becoming an engine of job creation as businesses discover ways to turn data into revenue, says Gartner. By 2015, it is expected to create 4.4 million IT jobs globally, of which 1.9 million will be in the U.S. 
Applying an economic multiplier to those jobs, Gartner expects that each big data IT job added to the economy will create employment for three more people outside the tech industry in the U.S., adding six million jobs to the economy. That's the kind of estimate that presidential candidates, if they focused on IT's impact on the economy instead of fossil fuel fracking and pipelines, might jump on. 
But Sondergaard's estimate included a caveat -- namely, that there's a shortage of skilled workers. Only a third of the big data jobs will be filled.

Friday, November 16, 2012

Why you can't vote online

From MIT's Technology Review:
A decade and a half into the Web revolution, we do much of our banking and shopping online.   So why can’t we vote over the Internet? The answer is that voting presents specific kinds of very hard problems. 
Even though some countries do it and there have been trial runs in some precincts in the United States, computer security experts at a Princeton symposium last week made clear that online voting cannot be verifiably secure, and invites disaster in a close, contentious race.

Saturday, November 10, 2012

Comment : Comment to Chris Langan, the "smartest man in the world"

Below are some comments I made to Chris Langan. Chris Langan who? Well, Chris Langan was dubbed "the smartest man in the world" in one Youtube video and "the smartest man in America" by Esquire magazine. He has an amazingly high IQ - estimated to be between 195 and 210.

For years, Langan has been developing a theory he calls the CTMU ('Cognitive-Theoretic Model of the Universe'). It took me less than three hours to find a mistake in his theory, one which basically dooms it. Here are the comments that I made on the blog 'Three Quarks Daily'. Note the times at which I made my comments. The first one came at 3:18:00 PM and the follow-up came at 5:37:00 PM.

Thursday, November 8, 2012

Comment : comment to Prof. Sieg Hecker

Here are some comments I sent to Prof. Sieg Hecker, a world expert on nuclear weapons.


In the case of India, I have been surprised that India continues to adopt what I would term an 'underdog' stance. India talks about nuclear 'have's and 'have-not's and justifies its own nuclear program on this basis. I believe this to be a mistake. I think it would be better for India to adopt a policy of 'persistent ambiguity' (or, if you prefer, 'calculated ambiguity').

Sunday, November 4, 2012

China is building a 100-petaflop supercomputer

From Infoworld :
As the U.S. launched what's expected to be the world's fastest supercomputer at 20 petaflops, China is building a machine that is intended to be five times faster when it is deployed in 2015.
China's Tianhe-2 supercomputer will run at 100 petaflops (quadrillion floating-point calculations per second), according to the Guangzhou Supercomputing Center, where the machine will be housed.

Friday, November 2, 2012

Predicting what topics will trend on Twitter

From MIT news :
Twitter’s home page features a regularly updated list of topics that are “trending,” meaning that tweets about them have suddenly exploded in volume. A position on the list is highly coveted as a source of free publicity, but the selection of topics is automatic, based on a proprietary algorithm that factors in both the number of tweets and recent increases in that number.
At the Interdisciplinary Workshop on Information and Decision in Social Networks at MIT in November, Associate Professor Devavrat Shah and his student Stanislav Nikolov will present a new algorithm that can, with 95 percent accuracy, predict which topics will trend an average of an hour and a half before Twitter’s algorithm puts them on the list — and sometimes as much as four or five hours before.