Friday evening – words

OK… so this may seem either inspired or pointless to you.. but let me tell you what I did last night instead of sleeping. At work yesterday, I had been experimenting with a new data visualization tool that IBM is making available to anyone. It’s called ManyEyes.. and it’s a pretty cool concept. Anyone can upload their own data to the site.. there are a ton of very interesting pre-created visualization tools which you can use to explore the data. You can also create your own visualizations and upload them. The idea is to allow folks to share data and share in its analysis. If that doesn’t sound like fun to you… then… you probably shouldn’t read any further…
Anyway.. I was exploring the site, looking at some of the very cool ways that folks had created to crunch and display data.. Many of the visualizations are dynamic.. that is, you can explore them interactively.. many of them are beautiful, too. While most of the analysis and stuff was for processing numerical data, I found a few visualizations for text analytics. When I saw that, I did what any red-blooded geek would do at 12:30 in the morning.. I thought of sources of large amounts of text data that I could analyze.. Of course! My blog.. I decided to run these analytics on a year’s worth of my log entries, starting from the first day, 3 days after Sam died. I found one of my blog archives which had each of the first year’s 365 entries in web page format (HTML). Then I figured out some processing to remove the pictures, the headers and the rest of the HTML formatting.. Then I concatenated it all into a single dataset and uploaded it to ManyEyes.
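
For anyone curious what that cleanup step might look like, here is a minimal sketch in Python. It is not the processing actually used for this post (that isn't described in any detail); it just assumes each archived entry is available as a string of HTML, and the names `TextOnly`, `html_to_text` and `build_dataset` are made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects visible text and ignores tags, pictures, and <head>/<script>/<style> content."""

    SKIP = {"head", "script", "style"}  # assumption: nothing in these is blog text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # <img> and other tags carry no text, so they simply drop out.
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the plain text of one HTML page."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_dataset(pages):
    """Concatenate the plain text of every page into one dataset, one entry per line."""
    return "\n".join(html_to_text(page) for page in pages)
```

Calling `build_dataset` on a list of 365 page strings would give a single block of text that could be saved to a file and uploaded.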

I then started looking at the data.. what would it tell me? I’m not even sure what I was looking for.. I felt like one of those Kabbalah mystics who looked through old religious texts looking for secrets in the patterns of letters and words. The first thing I learned was that in that first year, I had written exactly 708,576 words (including punctuation).. Now.. that alone was pretty interesting. I then started looking at word frequency… One tool shows you a ’tag cloud’ of word frequency. The larger the word, the more it occurred.. ’Sam’ was the word I typed most frequently last year… I wrote Sam’s name in my blog 3,394 times in 365 days.. and said his name to myself 1000 times that as I wrote. That makes sense… and is somehow.. I don’t know.. comforting? What else did I say frequently?

Diane. Max, Gabe, 
Love,  Great Friends,
House, Cool. Work,
Night, Today…

Sounds like Haiku.

Then I looked at the frequency of 2 word phrases.. Sam’s death, Sam died, Sam’s passing.. all hard things to say.. but I said them often.. Good friend, felt good, cell phone… happy birthday also had high counts.. I know it’s funny.. but the deeper I looked at this, the more I could see how it captured my year.. I can’t explain it.. but if you’re interested, you can look at the data yourself here (never mind the MySpace anti-phishing warning). Try entering any word or 2 word phrase and look at its frequency. It’s kinda fun.
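
The counting behind a tag cloud or a phrase lookup can be sketched in a few lines of Python. This is only an illustration of the general idea, not how ManyEyes does it; the tokenizing rule (lowercase, keep apostrophes so a possessive stays one word) and the function names are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, keeping straight and curly apostrophes."""
    return re.findall(r"[a-z0-9'’]+", text.lower())


def word_counts(text):
    """Count how often each single word occurs."""
    return Counter(tokenize(text))


def phrase_counts(text):
    """Count how often each pair of adjacent words occurs.

    Note: pairs are allowed to span sentence boundaries in this simple version.
    """
    words = tokenize(text)
    return Counter(zip(words, words[1:]))
```

With the whole year's text in a variable, `word_counts(text).most_common(10)` would list the ten most frequent words, and `phrase_counts(text)[("sam", "died")]` would give the count for one 2 word phrase.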

There’s also a word-tree analytics tool that I tried.. It generates a concordance tree… kinda like you’d find in a bible reference or quotations book. The idea is, you type in a word.. then it groups all the next words that occur after that word in your text.. and it groups all the next words that occur after that, etc.. for all the text.. The net is, you can look up any word or phrase.. and understand its context going forward or backward.. In some way, it’s a way of looking at all the stories I’ve told over the last year and seeing how they are threaded together. Maybe an illustration would help.. Say I type in ’Max’; it shows me all the possible words that follow the word ’Max’ in my entire blog..

Let’s say I choose the word ’and’.. I then see all the phrases that can come after ’Max and…’

Now I select Gabe and see all the things that Max and Gabe did together in my blog.

You can follow this all the way down to a single sentence.. At any point you can ’shift click’ to start the tree search over at your current word..

You can run the process in reverse and look at all the stories that end with a word or phrase.. like ’Friends of Sam’.. You can try this process yourself by clicking here.
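
One level of that word tree, going forward or in reverse, can be sketched in Python like this. It is a rough illustration under my own assumptions (simple lowercase tokenizing, made-up function names), not the actual ManyEyes implementation.

```python
import re
from collections import Counter


def _tokens(text):
    """Lowercase words, keeping straight and curly apostrophes."""
    return re.findall(r"[a-z0-9'’]+", text.lower())


def following_words(text, phrase):
    """Count the words that come right after each occurrence of `phrase`."""
    words = _tokens(text)
    target = _tokens(phrase)
    n = len(target)
    counts = Counter()
    for i in range(len(words) - n):
        if words[i:i + n] == target:
            counts[words[i + n]] += 1
    return counts


def preceding_words(text, phrase):
    """Count the words that come right before each occurrence of `phrase`."""
    words = _tokens(text)
    target = _tokens(phrase)
    n = len(target)
    counts = Counter()
    for i in range(1, len(words) - n + 1):
        if words[i:i + n] == target:
            counts[words[i - 1]] += 1
    return counts
```

Walking the tree is then just repeating the lookup with a longer phrase: `following_words(text, "Max")`, then `following_words(text, "Max and")`, then `following_words(text, "Max and Gabe")`, and so on down to a single sentence.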

So… I find this amazing.. if your brain works like mine.. If your brain is wired differently, you might be asking ’what’s the point?’ What did it tell me? That that first year without Sam was unimaginably hard? That we had wonderful friends and family? That stuff I already knew. What I know is that I got a great sense of peace thumbing through all this stuff.. I felt somehow that I’d captured an important and heartbreaking part of my life in day to day stories.. and that somehow woven into those stories was a deeper story about grieving, healing, growing, aging, loving, crying.. all that.. I’m sure this means a lot more to me than it will to anyone else.. but then.. whose blog is this anyway 🙂

Gotta run.. love to you all (the word ’love’ appears 1,396 times!) Gnite Sam
-me (which occurs 3,688 times)

ps.. The word ’I’ occurs 16,710 times!!!!!!! .. whoa.. what an ego!