re the big data explosion


this post is only to link to a couple of short articles discussing the recent view of big data as some kind of manna, a panacea, a means of showing what people are really thinking. but as kate crawford points out here, some data-crunching types seem to fall easily into the trap of thinking that correlation equals causation.

i was led to this article via twitter and cory doctorow’s boing boing piece – where he notes that he had also written about this phenomenon in the guardian a couple of years back.

just the other day i happened across another related piece in the sydney morning herald (link not presently available) discussing attacks on businesses that rely on being found on the first page of a google search: rival companies create code for a page that leads to too many 404s, causing google to remove the site from its search results and rendering it virtually invisible.

If Donna Reed Blogged

Palin Family

Mama Grizzly: A preeminent mommy archetype in post-modern America

The New York Times Sunday Magazine featured an article about Heather Armstrong, an extremely successful blogging mom.

She is one of the few bloggers who wield that kind of clout. Typically, there are 100,000 visitors daily to her site, where she writes about her kids, her husband, her pets, her treatment for depression and her life as a liberal ex-Mormon living in Utah. As she points out, a sizable number also follow her on Twitter (in the year and a half since she threatened Maytag, she has added a half-million more). She is the only blogger on the latest Forbes list of the Most Influential Women in Media, coming in at No. 26, which is 25 slots behind Oprah, but just one slot behind Tina Brown. Her site brings in an estimated $30,000 to $50,000 a month or more — and that’s not even counting the revenue from her two books, healthy speaking fees and the contracts she signed to promote Verizon and appear on HGTV. She won’t confirm her income (“We’re a privately held company and don’t reveal our financials”). But the sales rep for Federated Media, the agency that sells ads for Dooce, calls Armstrong “one of our most successful bloggers,” then notes a few beats later in our conversation that “our most successful bloggers can gross $1 million.”

By talking about poop and spit up. And stomach viruses and washing-machine repairs. And home design, and high-strung dogs, and reality television, and sewer-line disasters, and chiropractor visits. And countless other banalities of one mother’s eclectic life that, for some reason, hundreds of thousands of strangers tune in, regularly, to read. (Queen of the Mommy Bloggers, by Lisa Belkin, a contributing writer and the author of the Motherlode blog)


Among women who blog, Drummond and Armstrong are at the top. There are almost as many ways to measure reader traffic as there are blogs right now, but Nielsen estimates that Dooce sometimes has as many as six million visitors a month, and Pioneer Woman is in the same range. Both bloggers have best-selling books: “It Sucked and Then I Cried” is Armstrong’s story of postpartum depression; “The Pioneer Woman Cooks” is Drummond’s first book, a cookbook illustrated with photos of food and cowboys, including rear views of her husband, clad in Wranglers and chaps as he bucks broncos and brands calves.

Having a tale to tell is only the first step, of course. Still evolving is the art of making a living from that tale. Heather and Jon both worked in online marketing, yet they were hesitant about adding advertising to Dooce early on. More specifically, it was Heather who hesitated. She feared “selling out” and the reaction from readers. But after her postpartum breakdown, her therapist prescribed that she hire a baby-sitter to come every day. Ads became a way to pay for child care.

The Armstrongs started small at the end of 2004, with Google ads (the kind that appear on registered sites and pay anywhere from a few pennies to a few dollars, depending on Web traffic). Before long they had contracted with an agency that actively sought display advertisers, making Dooce the first personal Web site to accept significant advertising. When monthly income from the blog exceeded Jon’s paycheck for the same period, he quit his job to manage the business.

Armstrong’s readers responded as she’d feared. “They screamed, ‘Who do you think you are?’ ” she remembers. “ ‘What made you important enough to make money on your Web site?’ ”

Creatures of Information


Penelope Trunk - blog domo at Brazen Careerist

We walk the corridors, searching the shelves and rearranging them, looking for lines of meaning amid leagues of cacophony and incoherence, reading the history of the past and of the future, collecting our thoughts and collecting the thoughts of others, and every so often glimpsing mirrors, in which we may recognize creatures of the information. Jorge Luis Borges The Library of Babel


[Response: Is Tweeting Better Than Blogging?] I’ve had to think about the social ‘net as a marketing opportunity for my job. I approached this by going out and sifting through the resources about current best practices. Because I’ve long been a skimmer of the marketing world as it is situated by the internet, I have also long known that the most basic challenge is making it possible for your customers both to find your content and to spend a quality moment with ‘it.’

That said customer might proceed to a trial (marketing lingo for doing something that you, the provider, know he or she is doing) is almost the frosting on the cake of nailing down steps one and two.

Find and capture (attention.)

When I peruse the google analytics for ND2.0 or any of my own productions, I am impressed and dismayed in equal parts by their suggestive qualification of user behavior. They found us, and they spent an average of 2:02 minutes with us. (The realization of a trial here would be a comment.)

We are awash in information, yet somewhere in this ocean is content which may be found if time is invested. Stepping back from this opaque generality, there is a slightly more refined generality: an individual invests time in a manner distinctive to him or her, is motivated by an overt or tacit goal, and his or her success requires a successful act of retrieval and selection.

To give this description a finer grain, we would need to know something more detailed about the conjunction of: goals, time, tool, manner/regimen, medium, media, (and more.)

In this there would arise the positive question. For example, what characterizes the user most likely to read content of some specific length? There could be all sorts of ways to break down the previously mentioned descriptive elements.

Of course I am in possession of my own subject, myself. (Netdynamics was partly rooted in reflexive accounts.) I have a good idea about that which comprises the array of my own goals, what kinds of content focus both my time and attention, and, I also have a fairly rich terminology for establishing the baseline description concerned with characterizing what kind/type/disposition I possess.

If I integrate a rough and approximate sense of goal directed search-and-retrieval with this kind of baseline description, and, I then scale this conceptually to include all persons who could be differentiated in this way, I can then blast this downward to questions about Twitter and blogosphere. I reckon the devil would be in the details betwixt, for example, two extremes. One extreme is the person who meets their goals by exclusively spending not more than two minutes with any section of content, and the other is the person who only uses the internet to retrieve long-form content.

Following through with this sketch it seems we land in the interdisciplinary flux of psychology, anthropology, sociology, and, information science.

From this, there could be a folksy supposition: there are those users who tend to express attention deficit disorder. This user’s time is easily waylaid. What would a causal hypothesis be once we establish that some users operate like this?


ND2.0’s two-to-four hundred visits per month are somewhere on the continuum of quantifiable responsive agency and activity. Our blog is more active than all the dead and lesser blogs. We haven’t invested the time to elevate its activity, yet the default is not completely shabby at all. Yet, we’re not aiming to address the complex problem of how to make our content retrievable, vital, and, incidentally, formulate its distribution in ways which match the various ways users deploy to meet their goals.


In three different feedreaders I have subscribed to a total of over 2,000 blogs. I keyword search through the blogs using the RSS client. Another way to look at this is that I have created a subset of blogs and severely limited the base data set. This would be contrasted with searching via Google. In the case of using Google, I am looking through a humongous data set, but, I also have to invest the time in wading through the false positives. My experience is that there’s lots of gold deep in the pages of a Google search, yet the time investment is often too much.
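The keyword search over a limited set of subscribed blogs can be sketched roughly as below. This is only an illustration, not how any particular RSS client works: the entries, the blog names and the `keyword_search` function are all made up for the example, and a real feedreader would hold thousands of items rather than three.

```python
# Hypothetical in-memory stand-in for the items an RSS client has stored.
# Blog names, titles and bodies are invented for illustration only.
ENTRIES = [
    {"blog": "Blog A", "title": "Managing serendipity",
     "body": "Notes on search and retrieval in feed readers."},
    {"blog": "Blog B", "title": "WordPress plug-in troubles",
     "body": "A forum thread with no solution posted."},
    {"blog": "Blog C", "title": "Search habits",
     "body": "Why a small data set means fewer false positives."},
]


def keyword_search(entries, *keywords):
    """Return entries whose title or body contains every keyword.

    Matching is case-insensitive and all keywords must be present (AND).
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for entry in entries:
        text = (entry["title"] + " " + entry["body"]).lower()
        if all(k in text for k in wanted):
            hits.append(entry)
    return hits


# Searching the small subscribed subset rather than the whole web.
for hit in keyword_search(ENTRIES, "search", "retrieval"):
    print(hit["blog"], "-", hit["title"])
```

The trade-off is the one described above: the subset misses whatever sits outside the subscriptions, but every hit comes from a source already chosen.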

I don’t know what the actual figures are; however, for argument’s sake, say I spend 25% of my time ‘after retrieval’ using up, on average, 5 minutes or more per retrieved item. This is a somewhat complicated vector, right? This includes the twenty to sixty minutes–or so–I might spend reading a journal article or long magazine article. The other side of this measure is that I spend another 25% of my time at the rate of less than five minutes per retrieved item. That leaves the 50% I require to search and retrieve.

(No matter what the actual distribution of time is, it shifts were I to drop out, so-to-speak, “off screen,” dealing with content I print out, or listen to.)
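The for-argument’s-sake split above can be put into numbers with a small sketch. Only the 50/25/25 shares come from the paragraph; the eight-hour session and the per-item minutes are assumptions invented for the illustration, and reading the first 25% as the longer reads is my interpretation.

```python
# Shares taken from the paragraph above; everything else is hypothetical.
SHARES = {"search_and_retrieve": 0.50, "long_reads": 0.25, "short_reads": 0.25}


def split_minutes(total_minutes, shares=SHARES):
    """Divide a block of screen time according to the given shares."""
    return {name: total_minutes * share for name, share in shares.items()}


def items_read(minutes, minutes_per_item):
    """How many retrieved items fit into the time available."""
    return int(minutes // minutes_per_item)


budget = split_minutes(480)                          # assumed eight-hour day
long_items = items_read(budget["long_reads"], 30)    # assumed 30 min per long read
short_items = items_read(budget["short_reads"], 2)   # assumed 2 min per short item

print(budget)
print(long_items, "long reads,", short_items, "short items")
```

Under these made-up figures the same quarter of the day covers a handful of long articles or dozens of short items, which is why the vector is complicated.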


One last observation: when I look at my Twitter stream and/or at blogs, I’m impressed by the implicit time investment of other users. And, I can make distinctions, such as the difference between Twitter users who are mostly scattering links, and, Twitter users who mostly are interacting with each other. Likewise, on blogs, I’m fascinated by comment threads. Not for their content, but because of their group relations and social-psychological context.

I’m very impressed by the blog Crooked Timber, where something like three dozen people are expressing (day in and day out,) deeply thought responses to sophisticated thinking.

There are many extremely successful blogs produced by a single person. Take for example Brazen Careerist from Penelope Trunk. It gets hundreds of thousands of hits per month.

A blog requires users who possess the right combination of traits, motivation, goals–as long as the blog is oriented to users. Obviously, there is the other side of the equation: those who develop and produce and distribute content for users.

As for Twitter, comments will follow. I will say this: it’s not well matched with my disposition. I prefer to manage serendipity rather than simply be subjected to it!

‘Caused’ / Enabling


I get the ’caused’ as opposed to caused, but my tentative reflection is that a distinction without a difference is implied in your remark, Apurr.

We know that social networks don’t cause anything. This would be the cybernetic view against presumptions that an instrumentality is ever wholly/directly/primarily causal. The various instrumentalities are networks for information conveyance in a “minded” system, so the network enables information to flow between instances of human consciousness. In turn a piece of information is propagated via other channels simultaneously, and, propagated as a consequence of, for example, Twitter. It could be said, a piece of information is off-loaded from the network conveyor and set on the, for example, mouth-to-ear conveyor.

In the larger minded system there are various conveyances: Web sites (including forums, al-Jazeera, blogs), newspapers, telephones, cell phones, one-to-one verbal, meetings, etc. These sum to constitute an ecology. As Phillip Howard puts it:

Digital media didn’t oust Mubarak, but it did provide the medium by which soulful calls for freedom have cascaded across North Africa and the Middle East.

So: Digital media didn’t oust Mubarak, but it did provide a medium by which soulful calls for freedom have cascaded across North Africa and the Middle East.

Influence and effects and consequences and other social results can be measured and assessed. I remember several years ago discussing with a social network expert and graphic specialist what a social network diagram does and does not show. I suggested to him that the qualities of the relationships and their relational effects are not aspects of the network diagram we were talking about. Depictions of network relationships represent implicit schemata. These pictures include and exclude functional aspects, and often also represent slices rather than dynamics.

With respect to a system and the system of systems–and granting Batesonian mindedness–I suspect the question of causality can be addressed only at the point a lot more dimensionality is built into the analysis.


“There’s 80 million people in Egypt, and almost 40 percent are below the poverty line,” Sharma said. “Cell phone penetration is incredibly high, but the majority of the cell phones are not smartphones. A lot of the information that was getting out was from a very small critical mass of people that were able to tweet out of Egypt. Friends of mine in Cairo estimate that it’s less than 200 people who were tweeting from Cairo.”

“The reach of new media is spreading: as of December, 2009, there were over 2,300,000 Facebook users in Egypt. That’s 184 percent growth over the previous twelve months. While Twitter has yet to become the rage in Egypt that it is elsewhere, it has become a popular means for Egyptian activists to alert their friends and followers of arrests and intimidation by security forces.”
(Egypt, Twitter, and the rise of the watchdog crowd By Caroline McCarthy, CNET News on February 14, 2011)

According to a study released by the government-run Information and Decision Support Center in May 2008, blogging provides Egyptian youth “with a refuge where they [can] easily express themselves and their beliefs without restrictions.” The study also asserts that “from 2006 to 2008, a number of demonstrations and expressions of real political protest were associated in one way or another with cyber-protests on the Internet, tapping into the massive public mobilization of youth on political blogs.”

The study estimated that as of 2008 there were approximately 160,000 Egyptian blogs, which accounts for approximately one in four internet subscriptions in the country. The content of the blogs was broken down as follows: 30.7 percent covered a variety of topics, 18.9 percent were political, 15.5 percent personal, 14.4 percent business and culture, 7 percent religious, 4.8 percent social, and 4 percent focused on science and modern technology. (Social networking, political action and its real impact in Egypt By Sallie Pisch, Bikyamasr blog, March 21, 2010)

The U.K. government complained to Egypt after Vodafone Group Plc was ordered to send text messages seen to instigate violence as demonstrators demanded the ouster of President Hosni Mubarak.

U.K. Foreign Office Minister Alistair Burt contacted the Egyptian ambassador in London to discuss the order to Vodafone after the company reached out to the government, the Foreign Office said last night. British Foreign Secretary William Hague yesterday issued a statement calling the “abuse” of Internet and mobile-phone networks “unacceptable and disturbing.”

Egyptian authorities instructed the local mobile-network operators, which also include Etisalat and France Telecom SA’s Mobinil service, to send messages under emergency powers provisions. Vodafone, the world’s biggest mobile-phone operator, said yesterday that the messages were not written by the mobile-phone operators. (U.K. Complains to Egypt on Ordered Vodafone Messages By Jonathan Browning and Thomas Penny – Feb 4, 2011 Bloomberg)


On the Ground at Social Media Week: The Internet & Uprisings in the Arab World: Are We Already In A Post-Social Media World? By Faye Anderson on February 9, 2011

Egyptian Crisis: The Revolution Will Not Be Tweeted By Mark Evans – January 31st, 2011 Sysomos blog

I wonder if the critical mass–with respect to social media–for effective social instigation may be a matter of a confluence of early adopters along the spectrum of internet media in a context where there aren’t a lot of internet users overall. In Egypt’s case, there is huge mobile (but not smart) phone penetration. Also, there apparently are longstanding face-to-face ‘network’ regimes too.

Beta App Cosmos


Generated via Flame, a nifty time-waster and image generator driven by Flash. Found at Smashing Magazine, Bizarre Websites On Which You Can Kill Time With Style.

There’s no DESIGN link category here until now, so Smashing Magazine inaugurates it. Smashing Magazine is a terrific, really second-to-none web design resource. Among numerous strengths, there are three I find especially useful. First is its collection of external resources; second is its concentration on CSS and HTML5 and WordPress; third is the wealth of design tips and tutorials available in Smashing Magazine’s very deep archive.

Speaking of wasting time, sometimes it is useful to waste time contemplating potential time-wasters. I know of no better web site for this than The Museum of Modern Betas.

The MoMB is a site dedicated to collecting web-based applications on a beta trip. The MoMB recently turned five; currently about 9000 sites are listed.

…major, major sink. MoMB tagcloud.



Frank has entertained a comment on the previous post. Frank, I don’t have any idea what your commitments are, or, how we might find some common ground. However, in noting this, I have already followed the crumbs, put a little time in, allowed my fascination to be evoked.

I suppose I’m a fairly good boolean text searcher. At the same time, there are several basic problems I run into constantly. I’ll mention two and also note the problems are inflected with my own “one-sided” intuitive style.

(one) If I need to unravel some technical problem having to do with my computers or software, I will discover the answer eventually, but only after sifting through false positives. One of the most aggravating of such returns is when somebody on some forum somewhere has solved the same problem and doesn’t offer the solution. This is very common with respect to trouble-shooting WordPress plug-in conundrums.

(two) Sometimes I find out my search choices lack precision. This matters a great deal when this fault in precision basically dooms the search goal. For example, it took about 10+ hours of searching to learn that the technical term for serendipity–and this in a very small literature in social psychology–was fortuity. In retrospect I should have started with a search for synonyms.

There’s also a net positive which crops up: when I come upon a thick resource about which I was ignorant. In this case, there is the twin problem of a resource being hidden from me, and, the ‘how’ of its being revealed.

I’ll introduce a vector here. It would be neat if the search application could recognize along with me my growing frustration, and, recognize that I am making a common mistake, (as in case two.) Obviously it would be much more efficient if the false positive didn’t show up at all.
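As a rough sketch of what heading off the mistake in case two might look like, a search tool could expand each term with known synonyms before running the boolean query. The `expand_query` function and the synonym table are hypothetical; only the serendipity/fortuity pair comes from the experience described above.

```python
# Hypothetical synonym table; a real tool would draw on a thesaurus.
# Only the serendipity -> fortuity pair comes from the post itself.
SYNONYMS = {"serendipity": ["fortuity"]}


def expand_query(terms, synonyms=SYNONYMS):
    """Build a boolean query string from a list of search terms.

    Each term becomes an OR-group of itself plus any known synonyms,
    and the groups are joined with AND.
    """
    groups = []
    for term in terms:
        alternatives = [term] + synonyms.get(term.lower(), [])
        if len(alternatives) > 1:
            groups.append("(" + " OR ".join(alternatives) + ")")
        else:
            groups.append(term)
    return " AND ".join(groups)


print(expand_query(["serendipity", "psychology"]))
```

This only widens the net for terms the table already knows; it does nothing about recognizing frustration, which is the harder part of the wish.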

Okay, so: meanwhile. . . There is a lot of brainpower being spent on specific goals. We know these goals are spread out in various fever dreams, among which are optimizing consumer acquisition, predictive forensics, expert resource aggregation, and other such goals.

With Bayesian “reasoning engines” embedded in software to drive the recognition process, computers can begin to approach the everyday capabilities of the human mind for sifting through chaos and finding meaning. “Bayes gave us a key to a secret garden,” says Lynch. “A lot of people have opened up the gate, looked at the first row of roses, said, ‘That’s nice,’ and shut the gate. They don’t realize there’s a whole new country stretching out behind those roses. With the new, super powerful computers, we can explore that country.”

Leading to:

“The problem tells you how to solve the problem. That’s what the next generation of computing is going to be about: listening to the world.”

(Between Bateson and Heidegger,) I would suggest context isn’t likely to be anything probabilistic so we might dispense with A.I. in its strong form. I can, if asked, articulate why I’m guessing the way I’m guessing, and, I’d expect the intelligent machine to do the same and tell me why it’s choosing to make some subject (or whatever,) intelligible in the way it has, apparently, chosen.

Still, I like the idea of the machine intelligence listening to the problem behind the problem. And, it’s thin context in my world, yet I’d be grateful to receive the heads up that “you can’t get there from where you’re trying to be at.”


Okay, so I downloaded Chrome when it burst on the OSX scene. The situation is this: if it offers me something extra special, I haven’t put in the time to figure its benefits out. If it’s supposed to help me hook into the cloud then the same is true for the darn cloud. I use Google docs and Google reader infrequently. I really have to be hit over the head about a potential benefit and this means it has to be inescapable.

google in a library building

‘Search’ is quasi-democratic and available to anybody with a web browser and computer and connection. I am a big user of the search and retrieval interfaces provided by various libraries and think nothing of pulling materials, including books, across oceans, as-it-were.

google knowledge

Are more people accessing written information in the internet age? If so, to what extent? Somewhere I have the study from the NEA, I believe, that quantifies reading in the USA. I’ll have to dig it up, but I recall it states that 42%(ish) of American adults read their last book cover-to-cover in their last year of formal education. Also, nowadays, the eBook is poised to surpass the printed book in sales sometime in the next decade.


Google by the numbers. Zettabyte= 10^21

The game tree (i.e. Shannon) number for possible positions in chess is on the order of 10^46; the number of atoms is guesstimated (by Chapman) at 4×10^79. (The number is wrong on the Google By the Numbers page and is evidently a transcription error from Chapman’s grain-of-salt quantification.)


Copyright © NetDynam 2.0 2017 | NetDynam 2.0 is proudly powered by WordPress and Ani World.
