Tuesday, November 12, 2013


Some Records Ought Not To Be Broken

A photo of a tattoo reading "Nothing Last's Forever!" inspired the comment, "'Everything First's Forever!' though." Indeed, while the first assertion is quite debatable (and may simply be a grammar and punctuation error), ordinals are special in a couple of ways. Once something is 'first', nothing else will displace it from that position and become 'first' in its place. The same is true for 'second', 'third', and so on. But something which is 'last' may not always remain last, though it might. Likewise, something which is best may not always remain best: something better may come along.

"Records are made to be broken," is a commonplace. In November 1914, Crown Prince Wilhelm Hohenzollern of Germany reportedly complained, in part,
"Undoubtedly this is the most stupid, senseless and unnecessary war of modern times...."

Is it still, or has there been a more stupid, senseless and unnecessary war since? Some records ought not to be broken.


Sunday, November 10, 2013


Errata: Business Week Piece on Twitter

A piece of writing brought to my attention recently (The Hidden Technology that Makes Twitter Huge) led me to wonder, how does one say "noyer le poisson" in American? "Noyer le poisson" is an expression in French commonly used figuratively; literally translated it is "to drown the fish" (or "drowning the fish"). The figurative use means "to avoid a taboo topic or difficult subject, concealing it under a mound of details." "Beat around the bush" is probably a good approximation, but "snow job" might be better.

The Snow Job

The piece in question is hardly unique in attempting to snow readers, but much of the detail used to snow the reader is actually wrong and misleading, while a real technological innovation of Twitter's is not mentioned at all. The reader is provided a supposed "look under the hood" at Twitter, the better to marvel at its genius. The originality is possibly overstated, the execution performance is understated, and the look under the hood leaves the reader less well informed!

The omitted technology may not matter too much to shareholders as, valuable as it may be, it cannot explain much of Twitter's market value. Twitter uses Scala effectively: "Scala is one of the main application programming languages used at Twitter. Much of our infrastructure is written in Scala and we have several large libraries supporting our use....Our use of Scala is mainly for creating high volume services that form distributed systems...".

The Genius and Genesis of Twitter

There are lots and lots of prior practices and ideas that may have contributed to the Twitter service, including e-mail listservs (for broadcasting messages to subscribers using mail protocol) and pagers (and beepers). SMS/text messages sent from mobile phones may have had a role as a model. But the real inspiration was people's status messages on IM! From Twitter on Scala (April 2009):
Twitter started as a hack project at a company called ODEO, which was focused on podcasting. As ODEO was having some troubles in its latter days as a company, they started experimenting, to keep engineers involved by letting them play around with ideas they had on the side. One of the engineers, Jack Dorsey, had been really interested in status. He was looking at his AIM buddy list, and seeing that all of these guys were saying, “I’m walking the dog,” “I’m working on this,” “I’m going to that.” He wondered if there was some way to make it easier for people to share that status. So he and a couple other engineers started prototyping what became Twitter on Ruby on Rails, which was the stack that ODEO was built on. And Twitter continues today to be primarily a Rails application, with a bunch of Ruby daemons doing asynchronous processing on the backend.
So, using Twitter might be likened to connecting to a buddy list where nobody ever sent messages, just status updates, and the status updates were logged for later perusal. Or an alphapager system wherein anyone can broadcast short messages to everyone listening on their channel. Or simply, as one of the developers put it, a transport-independent micro-blogging system. But who will be willing to pay for that, and how much?

Message and Content Containers

The BusinessWeek piece notes that a Twitter message has lots of metadata or header information associated with it. This is true, but equally true of e-mail. The piece itself, saved as HTML, is over 200 kilobytes of stuff, very little of which is the text.
It is also questionable how reliable some of the information provided by senders is, particularly when their anonymity is of the essence. On a message sent via a VPN rather than a geo-tracking device (like a smartphone) the sender's whereabouts are not known any better than those of a mail sender. Some of the fields are optional, as well, and potentially of little analytical value. Among the "special" fields in Twitter messages is the profile location; a study made by an intern at PARC in 2011 was reported as "Only 66% Use Twitter Profile Location Field as Intended":
34% of Twitter users do not provide a valid geographic location on their Twitter user profiles. Instead, some of these users co-opt the field to make jokes, express their love for a particular celebrity or to shout back at Twitter that their location is "NON YA BUSINESS!" Others, meanwhile, provide no location information at all.


The factual errors concern the "snow" used to get the reader enthralled: API use and JSON. The use of technical terms was confusing, dazzling; it was not all wrong, really, but may have outstripped the author's understanding.
The first case is the explanation of JSON and its use. JSON is a set of syntax rules for representing labelled data, as is XML, and there are others. JSON's syntax is very much like that of object literals in ECMAScript, which served as its model. To call it "a simplified version of JavaScript" is not correct: leaving out all the control constructs (loops, logic, and such) is more than simplification, since what is left is not a programming language at all.
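To make the distinction concrete, here is a minimal sketch in Python. The field names are invented for illustration and are not Twitter's actual message format. A JSON text parses into plain labelled data, while a fragment of perfectly good JavaScript containing a loop is rejected by the JSON parser:

```python
import json

# Hypothetical message record; the field names are illustrative only.
text = '{"user": "example", "text": "walking the dog", "geo": null}'
record = json.loads(text)  # becomes an ordinary dictionary of labelled values


def is_valid_json(s):
    """Return True if s parses as JSON, False otherwise."""
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False


# Labelled data parses; a loop, although valid JavaScript, does not.
data_ok = is_valid_json(text)
loop_ok = is_valid_json("for (var i = 0; i < 3; i++) { i; }")
```

Nothing in the parsed record can be executed; it only holds values under labels, which is the point.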

Similarly, and in the next sentence, "API essentially means “speaks (and reads) JSON.”" is not generally true. API essentially means "accepts requests from application programs and responds as best it can." It is an interface program, which may "speak" XML, HTML or something else instead of JSON. The flickr.com API, for instance, accepts requests using REST, XML-RPC or SOAP (but not JSON) and replies using REST, XML-RPC, SOAP, JSON, or PHP.
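As a rough sketch of what "interface" means here, consider a toy request handler in Python that gives the same answer in whichever format the caller asks for. The method name, the fields, and the "rsp" element are all made up for the illustration; this is not the code or the exact response layout of any real service:

```python
import json
import xml.etree.ElementTree as ET


def handle_request(method, fmt="json"):
    """Toy API endpoint: one answer, serialized in the requested format."""
    # Hypothetical method name and payload, for illustration only.
    if method == "status.get":
        result = {"stat": "ok", "message": "walking the dog"}
    else:
        result = {"stat": "fail", "message": "unknown method"}

    if fmt == "json":
        return json.dumps(result, sort_keys=True)
    if fmt == "xml":
        root = ET.Element("rsp", attrib={"stat": result["stat"]})
        ET.SubElement(root, "message").text = result["message"]
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported format: " + fmt)


as_json = handle_request("status.get", "json")
as_xml = handle_request("status.get", "xml")
```

The API is the handler and its contract with callers; JSON is just one of the notations it may use to write down the answer.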

Another instance of confused reference to an API is in "For all the possibilities of APIs, there are also limits." A query result returned by a call to an API has flags set with warning messages, restricting publication; how is that in conflict with "all the possibilities of APIs"? It is not; "API" was worked in gratuitously.

Finally, and this may not bother all readers, writing of short text messages as if they were animals is stylistically inappropriate. Phrases such as "a tweet thrives" and "once born, they're alone and must find their own way" are nonsense.

Note on the French figure of speech: the definition given above follows the Wiktionary entry, which reads "(Figuré) Ne pas aborder un thème tabou ou un sujet difficile, le dissimuler sous un monceau de détails."


Wednesday, November 06, 2013


Abbreviating Lists: a Shorter Word for "Including" Would Be Nice to Have

When I first used INSEE statistical reports (long ago), one feature which caught my attention was the frequent use of "dont" entries. The French word "dont" can be translated to American as "including" or "of which". The statistical tables used this term to give a partial breakdown of totals and aggregates, showing components of interest without itemizing all the details. For instance, one might show the number of households with pets followed by an indented line "dont chiens" to indicate the number of those households (or the portion) which have dogs. Other categories, such as snakes, gerbils, and so on might not be mentioned, especially if their frequencies are too low to be significant (and reliably measured).

Semantically, "dont" is used here to indicate "here is a partial list" of an aggregate just cited, implying that other members of the list may remain unnamed. In this sense, it is an abbreviated list: "this" + unnamed "other". It is a highlighting mechanism, with rhetorical value.

Such highlighting of parts of aggregates is not uncommon. In more general usage than reporting of statistics, one might say "I received three birthday cards in the mail this morning, including one from my brother." Or, "I received three birthday cards in the mail this morning, one of which was from my brother." The French could say (mixing languages for purposes of illustration) "I received three birthday cards in the mail this morning, dont one from my brother." It works just like "including" only smaller.

The concept came up recently while I was entering transactions in GnuCash. It is possible to split transactions, in that system, into multiple component lines. For instance, I enter my grocery store purchases with a separate line for pet food, for which I have a separate expense category. Similarly, I itemize other non-grocery purchases made at the grocery store, assigned to their respective expense categories. The lines entered must sum to the total, so I calculate what to put on the "just grocery" line, with some help from the input form's discrepancy calculation. The process could be easier for the user if a "dont" logic were available:
  1. enter the total, which generates a single entry.
  2. partially itemize, adding lines whose amounts are automatically deducted from that first entry, conserving the sum.
In other words, split and compensate.
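A minimal sketch of that "dont" logic, in Python, might look like the following. It is not GnuCash code and does not use GnuCash's API; the function name, the account names, and the amounts are invented for the illustration:

```python
from decimal import Decimal


def split_and_compensate(total, default_account, itemized):
    """Return split lines that always sum to total.

    itemized is a list of (account, amount) pairs; whatever is left
    over after itemizing is assigned to default_account.
    """
    total = Decimal(total)
    lines = [(account, Decimal(amount)) for account, amount in itemized]
    itemized_sum = sum((amount for _, amount in lines), Decimal("0"))
    remainder = total - itemized_sum
    if remainder < 0:
        raise ValueError("itemized lines exceed the total")
    # The compensating line comes first, followed by the "dont" lines.
    return [(default_account, remainder)] + lines


# Hypothetical grocery receipt: a total, "dont" pet food and household items.
splits = split_and_compensate(
    "58.40",
    "Expenses:Groceries",
    [("Expenses:Pets", "12.15"), ("Expenses:Household", "6.25")],
)
```

The user supplies only the total and the lines of interest; the "just grocery" amount is derived, so the lines cannot fail to balance.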
Perhaps this is already possible, but I have not found how. Is the concept too "French" or generally easy to understand? If one said "it should have an 'including' capture feature" would anyone else understand? The wealth of other uses of "including", especially as a verb meaning aggregating or marshalling, makes it too ambiguous in this instance. But what other word would do better? Would "it should have an 'of which' capture feature" be better understood? I believe that "il devrait y avoir une fonction de saisie 'dont'" would be understood by French developers (at least some of them).

A possible solution is to use "incl.", as is surely done quite often on reports. But how should it be pronounced?

The American counterparts of "dont" are "including" and "of which", but what about "particularly" when used to achieve the highlighting? One might also have said in the example used above, "I received three birthday cards in the mail this morning, particularly one from my brother," but to achieve an equivalent degree of highlighting in French one might say "...et plus particulièrement une de mon frère", not just "dont une de mon frère."
