Ask any Data Scientist and they will tell you that the process of ‘wrangling’ (loading, understanding and preparing) data represents the lion’s share of their workload – often as much as 80%. However, that number is not as alarming as it may at first seem. To understand why, let me tell you about my living room.
At the end of the nineteenth century, electricity was starting to have a profound effect on the world. As dramatized in the excellent novel The Last Days of Night, Thomas Edison battled with George Westinghouse (the latter aided by Croatian-born genius/madman Nikola Tesla) for control over the burgeoning market for electricity generation and supply. The popular symbol of the electrical revolution is of course Edison’s famous light bulb, but arguably more important was the humble electric motor.
Your company has a Marketing Strategy, right? It’s that set of 102 slides presented by the CMO at the offsite last quarter, immediately after lunch on the second day, the session you may have nodded off in (it’s ok, nobody noticed. Probably). It was the one that talked about customer personas and brand positioning and …
If you’re like me, and have succumbed to the unpardonably bourgeois luxury of hiring a cleaner, then you may also have found yourself running around your house before the cleaner comes, picking up stray items of laundry and frantically doing the dishes. Much of this is motivated by “cleaner guilt”, but there is a more …
Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve been given carte blanche to go hire …
With the final season of Mad Men having come to a close this weekend, one of my favorite memories from Season 7 is the appearance of the IBM 360 mainframe in the Sterling Cooper & Partners offices, much to the chagrin of the creative team (whose lounge was removed to make space for the beast), especially …
My eye was caught the other day by a question posed to the “Big Data, Low Latency” group on LinkedIn. The question was as follows: “I’ve customer looking for low latency data injection to hadoop . Customer wants to inject 1million records per/sec. Can someone guide me which tools or technology can be used for …