Thursday, December 29, 2005
Not-Too-Readable Text Files
Imagine my surprise: a downloaded zip file of 95 352 kb (or ko in French) contained one single text file (yes, ".txt") of 508 371 kb. Needless to say, I cannot browse it with Notepad.
I suppose I'll unzip it to a CD, and then... what? Can WordPad or OOo read and display a file without loading it into memory? I guess I'll find out.
I can't wait to see its contents and hopefully understand why it couldn't have been split into smaller bits.
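A short script is another way to peek inside such a file without loading the whole thing. This is a minimal sketch in Python, assuming the file is plain text in a known encoding; the function names, and the file name in the note below, are made up for illustration:

```python
from itertools import islice


def head(path, n=20, encoding="utf-8"):
    """Return the first n lines of a text file without reading the rest.

    A file object yields its lines lazily, so only a small buffer is held
    in memory however large the file is. The encoding is an assumption;
    undecodable bytes are replaced rather than raising an error.
    """
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def count_lines(path):
    """Count lines by streaming the file in binary mode, one line at a time."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)
```

Something like `head("bigfile.txt", 10)` would then show the first ten lines, where `bigfile.txt` is only a placeholder name.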
Tuesday, December 27, 2005
Could It Be?
Reading the methodology notes on the National Assessment of Adult Literacy recently, I noted that 13 % of the households they visited turned out to be vacant or not dwelling places (see my post on December 18). I have a theory as to which database they used to seek out adults, literate or not: voter registration. It has been said that in some places, "cemeteries vote." Perhaps not just cemeteries!
Chicago Rules of Election Fraud have other tips for voter generation, and it is not just Chicago anymore! Christian Science Monitor:
MIAMI - Nothing says election fraud like a ballot cast from a cemetery. On paper, Manuel Yip seemed an active and regular voter in Miami, casting ballots four times between 1993 and 1997. There was just one problem: He passed away five years ago.
Closed is not Open
Nevertheless, one can park at the picnic area and walk along the side of the road to get a glimpse of this "boat ride": to enable canal barges to change altitude without a long (and time-consuming) series of locks, the "inclined plane" at Arzviller-St. Louis floats them into the large bucket which we see at the upper position on its track (left center, below). Barges enter from the upper canal to our right, travel down to the basin (foreground) and exit to our left, or vice versa.
Monday, December 19, 2005
What Remarkable Creatures (we are)
Demographers (at the INED, in France) estimate that the world population of living humans reached six and a half billion on December 19, 2005, give or take a couple of years. Fifty years ago, when we were only two point seven billion, they (demographers) predicted we would be 6.5 billion in 2005. Well done, humans! Rarely are long-range targets and plans achieved, and this, involving billions of people over a span of fifty years! We also managed to heat the planet to its highest temperatures known or estimable over the past 650,000 years. We are unstoppable, the sky is the limit! Or maybe not, but that is what we appear to assume, by and large.
In a recent article, Joel Cohen, of Columbia University and Rockefeller University, pointed out three or five remarkable demographic phenomena of our times (for those of us born before about 1960).
- doubling of world population during one's lifetime: no-one born before about 1930 witnessed doubling of the world population, whereas anyone over 45 today lived in a world of only three billion in 1960 and over twice that number today.
- fastest population growth ever: between 1965 and 1970, population grew at 2.1 per cent per year. Such a rate was never achieved prior to the XXth century, and has since dropped substantially.
- voluntary reduction of population growth, from the 2.1 % p.a. noted above to the current 1.1-1.2 %. This necessarily includes the "Family Planning" effect in China, which may have debatable "voluntariness", but also includes a lot of developed, European countries where the reduction has been voluntary.
- More mature population: Globally, the number of persons aged 60 years or over is expected almost to triple, increasing from 672 million in 2005 to nearly 1.9 billion by 2050... In developed countries, 20 per cent of today’s population is aged 60 years or over, and by 2050 that proportion is projected to be 32 per cent. The elderly population in developed countries has already surpassed the number of children (persons aged 0-14), and by 2050 there will be 2 elderly persons for every child. [source U.N. poster]
- Malthus vindicated: in 1950, one third of the population was in "rich" countries, but from 2005 to 2050 95% of population growth will occur in "poor" countries, raising their populations to six times that of "rich" countries. Nine countries accounting for over half the expected growth are India, Pakistan, Bangladesh, Nigeria, Congo, Uganda, Ethiopia, China, and the U.S.A. (due to immigration).
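To put the growth rates in the list above in perspective, they can be turned into doubling times. The following is a rough sketch in Python, assuming a constant annual rate (which real populations never have); the function names are mine, not Cohen's:

```python
import math


def doubling_time(annual_rate):
    """Years for a population to double at a constant annual growth rate.

    annual_rate is a fraction (0.021 for 2.1 % per year). Constant growth
    is a simplifying assumption; real rates drift from year to year.
    """
    return math.log(2) / math.log(1 + annual_rate)


def average_annual_rate(start, end, years):
    """Constant annual rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Peak rate of 1965-1970 and the current 1.1-1.2 % range quoted above
for rate in (0.021, 0.012, 0.011):
    print(f"{rate:.1%} per year -> doubles in about {doubling_time(rate):.0f} years")

# 2.7 billion fifty years ago to 6.5 billion in 2005, as quoted above
print(f"implied average rate: {average_annual_rate(2.7, 6.5, 50):.2%}")
```

At 2.1 % a population doubles in roughly 33 years, against something like 60 years at the current rates, and the fifty-year climb from 2.7 to 6.5 billion implies an average of a little under 1.8 % a year.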
In 1980, the global workforce consisted of workers in the advanced countries, parts of Africa and most of Latin America. Approximately 960 million persons worked in these economies.
Population growth — largely in poorer countries — increased the number employed in these economies to about 1.46 billion workers by 2000.
But in the 1980s and 1990s, workers from China, India and the former Soviet bloc entered the global labor pool. Of course, these workers had existed before then. The difference, though, was that their economies suddenly joined the global system of production and consumption.
In 2000, those countries contributed 1.47 billion workers to the global labor pool — effectively doubling the size of the world's now connected workforce. read more
Sunday, December 18, 2005
Stalking Literacy
The recently published reports on the 2003 NAAL can be quite interesting. In particular, the "Web Documentation for NAAL General Audience Report" provides information on the sampling and statistical processing. It also demonstrates just how difficult to comprehend a combination of prose and quantitative information can become, and leads this writer to wonder about his own level; try this:
Based upon the screener data, 23,732 respondents aged 16 and older were selected to complete the background questionnaire and the assessment. Of the 23,732 household respondents selected, 18,186 completed the background questionnaire. Of the 5,546 respondents who did not complete the background questionnaire, 355 were unable to do so because of a literacy-related barrier; either the inability to communicate in English or Spanish (the two languages in which the background questionnaire was administered) or because of a mental disability. The final weighted response rate for the background questionnaire, including respondents who completed the background questionnaire and respondents who were unable to complete the background questionnaire because of language problems or a mental disability, was 75.6 percent.
I wonder what the weighting for the final response rate was. It appears to relate 18,541 (18,186 + 355) to something, but to what? At any rate, 21.9% of the respondents selected to complete the background questionnaire and the assessment did not complete it, for reasons other than inability to communicate with the interviewer.
This is neither the first nor the last difficulty for the administrators. The first was to find households. Beginning with a nationally representative probability sample of 35,365 households (chosen from what database?), they found that fully 13.2% (4,671) "were either vacant or not a dwelling unit". Subsequent screening (and representativity targets) then pared the group down to the 23,732 noted in the quoted paragraph above. Another five per cent of those tested did not properly complete all three scales. Amusingly, the prison inmates tested seem to have been more successful at staying in the sample and completing all three scales.
If I read their report correctly, the (non-prison) sample yield breakdown is this:
 | 18,186 | adults age 16 or older completed the background questionnaire, |
of whom | 17,178 | completed at least one literacy task on each of the three scales (prose, document, and quantitative) included on the assessment. |
That leaves | 1,008 | who completed the background questionnaire but not the full core test. |
Deducting | 84 | who were disqualified for language or mental reasons during the test (i.e., after the background questionnaire) |
leaves | 924 | who completed some, but not all three scales. |
One accounting given is | | |
 | 859 | respondents answered the background questionnaire but refused to complete the assessment for reasons other than language issues or a mental disability. For these respondents, answers to one assessment item on each scale were imputed based upon the answers from respondents with similar background characteristics. |
plus | 65 | a wrong response on each scale was imputed for respondents who started to answer the assessment but were unable to answer at least one question on each scale because of language issues or a mental disability. |
The aforementioned 859 would then have included | | |
an additional | 504 | were unable to answer at least one question on each of the three scales for literacy-related reasons. |
so we surmise that | 355 | just plain had no time to waste on such foolishness. |
Next in this series: why this test is better than the OECD PISA test, and how it correlates to SAT verbal scores.
18,102 + 439 = 18,541 : final reporting sample plus language and mental disqualifications.
18,186 + 355 = 18,541 : completed background questionnaire plus language and mental disqualifications.
Ergo, 84 language and mental disqualifications were manifest during the assessment but not during the background questionnaire.
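For anyone who wants to check my arithmetic, here is a sketch in Python that replays the figures quoted in this post. The variable names are my own labels, not the report's, and the unweighted rate at the end is only my guess at what 18,541 might be compared against:

```python
# Figures from the NAAL documentation, as transcribed in this post.
households_sampled = 35_365
vacant_or_not_dwelling = 4_671
selected = 23_732
completed_bq = 18_186          # completed the background questionnaire
bq_language_or_mental = 355    # could not, for language or mental reasons
completed_all_scales = 17_178
final_reporting_sample = 18_102
refused_assessment = 859
imputed_wrong = 65
literacy_related = 504

# Share of sampled addresses that were vacant or not dwellings
vacant_share = vacant_or_not_dwelling / households_sampled

# Background questionnaire not completed, for reasons other than
# inability to communicate with the interviewer
not_completed = selected - completed_bq
other_reasons_share = (not_completed - bq_language_or_mental) / selected

# Sample yield breakdown
incomplete_test = completed_bq - completed_all_scales
total_disqualified = completed_bq + bq_language_or_mental - final_reporting_sample
disqualified_during_test = total_disqualified - bq_language_or_mental
partial = incomplete_test - disqualified_during_test
no_time_to_waste = refused_assessment - literacy_related

# Unweighted counterpart of the reported weighted response rate
# (an assumption about the intended denominator, not from the report)
unweighted_rate = (completed_bq + bq_language_or_mental) / selected

print(f"vacant or not a dwelling: {vacant_share:.1%}")
print(f"other non-completion:     {other_reasons_share:.1%}")
print(f"incomplete test: {incomplete_test}, of which partial: {partial}")
print(f"disqualified during test: {disqualified_during_test}")
print(f"unweighted response rate: {unweighted_rate:.1%}")
```

If 23,732 is the right denominator, the unweighted ratio comes to about 78 %, so the weighting would account for the gap down to the reported 75.6 percent.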
Awfully said
According to CNN, reporting on the results of the National Assessment of Adult Literacy, "Eleven million people is an awful large number of folks who are not literate in English, and therefore are prevented access to what America offers," said Russ Whitehurst, director of the Institute of Education Sciences at the Education Department.
How disappointing that he did not use awfully, the adverb, rather than awful, the adjective which is informally used as a shortcut for the adverb.
Copyright: veritanerian
1) one defending and furthering the cause of verity (verita-n-erian);
2) doctor providing medical care to aminals
No, this word is not in the dictionary; Google finds no occurrences on the whole Internet! At least there were none found as of yesterday. Tomorrow the situation may be quite different.
Veritarian might be easier to say, and it could be argued that it would have the correct meaning (verita+rian) but it seems to already have been hijacked to name a dietary religion (like vegan, vegetarian).
Copyright: anthropophage
anthropophage: living organism which feeds on humans.
A number of similar or related words appear on Internet dictionary searches:
- anthropophagi,
- anthropophagic,
- anthropophagy,
- anthropophagite,
- anthropophagous.
Inasmuch as the cells which eat other cells are macrophages, not macrophagites nor macrophaginians, I recommend anthropophage as superior to the alternatives.
By the way, why macro+phage for a phago+cyte? A macro+phage is a "big+eater". Is it just a shortened form of macrophagocyte? I suspect so, but have not found documentary support.
It seems that a phago+cyte is a cell which eats, but not necessarily other cells: "eater+cell". The act of cell+eating is cyto+phagy. Is cytophage a word? Not as far as I can tell, so I hereby copyright it, too.
Be that as it may, "It was a one-eyed, one-horned flying purple anthropophage" probably would not have sold nearly as well as "It was a one-eyed, one-horned flying purple people eater"...
Tuesday, December 06, 2005
Climate Change?
Strolling around town yesterday (i.e. the day before St Nicholas) I observed several Santas hanging around. I don't know whether the arrival of Santas in town before St Nicholas should be attributed to global warming. It does seem to reflect some changes in productivity of labor and supply chain management, though: more than once I spotted a pair of smallish Santas, reflecting attempts to use more, but less qualified, labor. Meanwhile, a bunch of unemployed elves loitered in a garden.
Sunday, December 04, 2005
Fraud Cultures
Sen. Larry Craig, a Republican, told his home-state constituents, "Fraud is in the culture of Iraqis. I believe that is true in the state of Louisiana as well," as quoted Thursday (October 13, 2005) in the Lewiston (Idaho) Morning Tribune.
But why bring the Iraqis into this? Why such a slur on the people Americans are dying to liberate?
I realize that the point he was trying to make was the need to have good accountability and oversight of the spending to rebuild after Hurricane Katrina. That is certainly true, and not just because of the risk of *local* graft. Was it 90 000 tons of ice that crisscrossed the country and ended up in warehouses, under FEMA's management? How were the cruise ship leases negotiated again? And the house trailers? How often, how many times will the federal government pay to rebuild in areas that never should have been built? Will the Army Corps of Engineers get the specs for the levees right this time, and supervise the work?
Anyway, I certainly hope the Iraqi fraud culture isn't contagious, although there are some signs it may have corrupted a few Americans in Iraq. It would be horrible if all those Americans managing the billions of dollars for rebuilding in Iraq caught it and started diverting funds or taking kick-backs for awarding contracts. That there have been reports of poor accounting and poor (or no) negotiation, and that nine billion dollars has gone missing (probably someone just forgot to ask for a few receipts?), is bad enough!
Now, if the Iraqis would only stop denying their WMD stocks and hand them over they would be taking a big step toward overcoming their fraud culture, wouldn't they?
Thursday, December 01, 2005
We, the lightbulbs of France
We've all heard (not lately) the classic "how many [target group of your choice] does it take to change a lightbulb?" (Answer: Three, one to hold the lightbulb and two to turn the ladder). A colleague a few years ago had a variant: "How many social scientists does it take to replace a lightbulb? Social scientists don't replace lightbulbs, they try to find out why the last one burned out!"
In a recent edition of the magazine "Courrier Cadres", a reader asked "How many unemployed turn down valid job offers? Out of how many offers?". Jean-Pierre Fine, Secretary General of the APEC (association for the employment of "cadres", or managers) replied that the APEC doesn't generate offers, they provide advice and information so that the unemployed can search fruitfully.
They do publish some stats in their magazine though, which give an indication: by category, the average number of candidates per job posting. Not all applicants are necessarily unemployed, however, so this does not measure the odds of getting a job, but since I suspect there to be a bias among hiring firms in favor of the employed over the unemployed (both because that could mean they are more immediately operational, and more successful) the odds are probably even worse than these!
Category | Average candidates per posting |
2.1 Production Director (industry) | 80 |
3.2 Supply chain: logistics, purchasing | 90 |
3.3 Methods, Quality Control and Assurance | 80 |
4.3 R & D Project Mgmt. | 50 |
5.1 Marketing, Sales Director | 100 |
5.3 Marketing | 130 |
5.7 Sales | 70 |
8.2 Audit, Controller | 100 |
9.1 Informatics Director | 130 |
On the other hand, of course, if there are 154 offers per month in a category, and 100 candidates per offer, it could be that the same 100 people are applying, and there are 1.5 vacancies per candidate! Given the specificities required, however, it seems very unlikely that the same candidates could vie for all jobs.
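The two extremes can be put into numbers. This is only a sketch in Python; the `overlap` parameter and the straight-line blend between the extremes are my own illustrative assumptions, not anything published by the APEC:

```python
offers_per_month = 154       # figure used in the paragraph above
candidates_per_offer = 100   # figure used in the paragraph above


def vacancies_per_candidate(offers, applicants_per_offer, overlap):
    """Vacancies per distinct candidate for a given overlap between applicant pools.

    overlap = 1.0 means the same people apply to every offer;
    overlap = 0.0 means nobody applies to more than one offer.
    The linear blend between those extremes is an illustrative assumption.
    """
    fewest = applicants_per_offer            # one shared pool of applicants
    most = offers * applicants_per_offer     # entirely separate pools
    distinct_candidates = most - overlap * (most - fewest)
    return offers / distinct_candidates


for overlap in (1.0, 0.5, 0.0):
    ratio = vacancies_per_candidate(offers_per_month, candidates_per_offer, overlap)
    print(f"overlap {overlap:.0%}: {ratio:.3f} vacancies per candidate")
```

With complete overlap the ratio is the 1.5 or so mentioned above; with no overlap at all it falls to one vacancy per hundred candidates, and the truth presumably lies much nearer the second figure.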
Listserver Overload
It has been nearly a week since my last post here...what have I been doing instead? Among other things, trying to get caught up (ha! ha!) on my listserver reading. For some reason, once I subscribe to a listserver and read it for a while, I lose interest but don't unsubscribe. What's worse, I find it very, very hard to delete any of the messages I've received; as if I might need my own personal archive, off-line? And they don't take up anywhere near as much space as all the magazines, old clothes, broken appliances and so on that encumber my house.
One rationalisation is that editing the message threads down to something concise could be a contribution to some (free and open source) software projects, since documentation is often incomplete (why else would the topics be on the listserver?). Where, and in what form, my edited versions would be published is an open question...as is the appropriateness of inclusion or exclusion of the names of those who provided the answers and tips. Nevertheless, occasionally I go through some of the messages and try to edit them.
Actually, I'd rather not know how many, or how few, I've actually managed to process to something worth keeping. Most often, it is impossible, because:
- the question or plea received no useful reply
- the reply was *RTFM* or search the list archives
- the issue concerns a version of the software which is now obsolete
- I don't really understand the matter well enough to rewrite it
- copy-paste to a wiki or blog doesn't work properly because of code snippets, especially XML tags!
I should add that my yield is especially low because I subscribe to digests, and spend much time searching for other digests containing posts on the topic at hand, rather than relying on threads. This (subscribing to digests) is a really bad idea if you plan to try to follow threads.
Just for perspective, my current backlog is:
Total | Unread | List |
1622 | 1599 | FreeBSD Questions |
324 | 274 | FreeBSD Hardware and Firewire |
84 | 47 | PostgreSQL |
238 | 93 | COIN-OR open source operations research software project |
868 | 139 | Forrest (xml web publishing with Apache Cocoon) |
439 | 280 | xml-tech |
217 | 124 | Bull-I3 + jet (French university research and conference lists) |
92 | 68 | L'Internaute (leisure) |
I subscribed to another one or two this week, but they seem not to be active (nothing in my inbox, yet). There is also my college class list, but that is a completely different category, as far as I am concerned--usually the first thing I read after family and friends.
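For the record, the backlog table above can be totted up with a few lines of Python. This is just a sketch; the rows are copied from the table and nothing else is assumed:

```python
# (total, unread, list) rows copied from the backlog table above
backlog = [
    (1622, 1599, "FreeBSD Questions"),
    (324, 274, "FreeBSD Hardware and Firewire"),
    (84, 47, "PostgreSQL"),
    (238, 93, "COIN-OR"),
    (868, 139, "Forrest"),
    (439, 280, "xml-tech"),
    (217, 124, "Bull-I3 + jet"),
    (92, 68, "L'Internaute"),
]

total = sum(t for t, _, _ in backlog)
unread = sum(u for _, u, _ in backlog)
print(f"{unread} unread out of {total} messages ({unread / total:.0%})")

# Lists ranked by the share of messages still unread, worst first
for t, u, name in sorted(backlog, key=lambda row: row[1] / row[0], reverse=True):
    print(f"{u / t:6.1%}  {name}")
```

That makes 2,624 unread messages out of 3,884, or about two thirds of everything I have kept.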