
Twitterize Yourself makes visual sense of big data

The need for data analysis continues to grow, so look for new ways to analyze and visualize data to make it more consumable.

Dave Rosenberg Co-founder, MuleSource
Dave Rosenberg has more than 15 years of technology and marketing experience that spans from Bell Labs to startup IPOs to open-source and cloud software companies. He is CEO and founder of Nodeable, co-founder of MuleSoft, and managing director for Hardy Way. He is an adviser to DataStax, IT Database, and Puppet Labs.
@dr138 is a trailblazing bore on Twitter Visual.ly

A tool from Visual.ly that analyzes and visualizes users through their Twitter posts demonstrates one way to make easy-to-understand visualizations of big data.

One example of the increasingly important trend of combining analysis and visualization is Visual.ly Labs' Twitterize Yourself application, which shows how companies can use large data sets to quickly identify user characteristics in order to increase engagement or upsell goods and services.

I spend most of my time in my day job looking at data across the extended software development life cycle (design, develop, deploy, maintain, etc.), looking for the patterns that show how software goes from design into development and then into production, and observing how the data sources interact with each other.

We recently identified more than 60 sources in which developers and operations teams might be interested, not including e-mail, existing social networks, and other tools they personally use.

Of course, not everyone uses every data source, but the mixing and matching of data across boundaries has introduced a whole new world of complexity and a need for analytics that help make sense of data--regardless of the type of data, or the people or machines that generate it.

There is so much data from so many sources that companies increasingly need to settle on a data analysis strategy early in the product development process to take full advantage of the information available.

The important thing to keep in mind about all of the data you collect is that it should add intelligence to your processes. Whether you're trying to sell more product, make better decisions, or take actions faster, the data you gather is what should drive everything your business does. (Note: This is a great focus area for startups.)

Visual.ly's tool uses data from your tweets, combined with data from Infochimps, a data access provider, to help form a picture (literally and figuratively) of you as a Twitter user, including key statistics and a partial personality profile.

The "outfits" and "accessories" are chosen based on the keywords most associated with you. For example, if you tweet a lot about wineries, you will see a wine glass in your hand. Go to the ballpark often? Then your avatar will be dressed in a sports jersey, and so on. There are a total of 11 possible outfits that depict your overall Twitter personality and 29 different accessories that could be assigned to show your interests.
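
To make the keyword idea concrete, here is a minimal Python sketch of how tweets might be mapped to avatar items. This is not Visual.ly's actual code: the keyword table, the pick_items function, and the min_hits threshold are all hypothetical, invented only to illustrate the general approach of counting keywords and picking the matching item.

```python
import re
from collections import Counter

# Hypothetical keyword table: the real tool's keywords and items are not
# described in detail, so these entries are assumptions for illustration.
ITEM_KEYWORDS = {
    "wine glass": {"wine", "winery", "wineries", "vineyard"},
    "sports jersey": {"ballpark", "baseball", "stadium", "innings"},
}


def pick_items(tweets, min_hits=2):
    """Return the items whose keywords occur at least min_hits times in the tweets."""
    # Lowercase everything and count each word across all tweets.
    words = Counter(re.findall(r"[a-z']+", " ".join(tweets).lower()))
    # Sum the counts of every keyword tied to an item.
    hits = {
        item: sum(words[keyword] for keyword in keywords)
        for item, keywords in ITEM_KEYWORDS.items()
    }
    return sorted(item for item, count in hits.items() if count >= min_hits)
```

A real system would presumably weigh keywords against a much larger vocabulary rather than use a fixed threshold, but the shape of the problem is the same: count what someone talks about, then translate the counts into something visual.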

While it's certainly fun to have an infographic of yourself--especially one that dresses somewhat like you--the more interesting parts here are how much data is already out there about you and how much more analysis could be done if you aggregated a bit more data. The University of Vermont, for example, recently analyzed some 46 billion words tweeted by 63 million users since 2009 to reveal that we are less happy than we were two years ago.

Think about being able to correlate your Twitter profile score with Amazon purchases, Facebook updates, credit card charges, or other consumer-oriented transactions. The data trail is extensive, and over time, companies are going to get better and better at making sense of it all. (And yes, you should be at least a little bit afraid of that.)

The challenge for businesses is that the tooling for this type of analysis is still pretty difficult to use, if available at all. If you consider that each site mentioned above is a silo, you'd first have to aggregate all of the different data sources, then run analysis against the full set, and then against subsequent smaller data sets (by user, group, location, etc.) to reveal the most relevant findings.
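
The aggregate-then-drill-down sequence described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the record fields ("user", "location", "value"), the function names, and the choice of a simple count and mean as the "analysis." Real silos would each need their own extraction and cleanup before they could be merged this way.

```python
from collections import defaultdict
from statistics import mean


def aggregate(*sources):
    """Merge records from every silo into one list, tagging each with its source name."""
    merged = []
    for name, records in sources:
        for record in records:
            merged.append({**record, "source": name})
    return merged


def summarize(records):
    """Stand-in analysis: record count and mean of the hypothetical 'value' field."""
    return {"count": len(records), "mean_value": mean(r["value"] for r in records)}


def summarize_by(records, key):
    """Run the same analysis on smaller data sets, grouped by user, location, and so on."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return {group: summarize(rows) for group, rows in groups.items()}
```

In use, you would call aggregate(("twitter", tweets), ("store", purchases)) once, run summarize on the full set, and then call summarize_by with "user" or "location" to find the more specific patterns.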