Positing particular personae (say that slowly) isn’t something new when it comes to website design: the FutureNow guys have been doing it for more than five years, and a number of other content engagement “experts” have their own ways of better segmenting and understanding your ultimate audience. Using particular personae can help you develop websites that deliver higher click-through rates and an improved customer experience. All well and good, but what about improving the internal data access experience too?
That was the subject of a session at the Teradata Users Conference in Washington DC in October. I heard about how you can use personae to segment and better target your data owners and data users. It is an intriguing concept, and one worth more exploration.
(An example of virtual data marts at eBay, more explanation below.)
The session was led by Gayatri Patel, who works in the Analytics Platform Delivery team at eBay and has been around the tech industry for many years. There aren’t too many places that have as much data as eBay has: each day they create 50 TB of it, and more than 100 PB per day is streamed back and forth from their servers. That is a lot of collectibles being traded at any given point. And something that I didn’t really understand before: eBay is a lot more than a marketplace. They have developed a large collection of their own mobile apps for specific audiences, such as people buying cars, fashion items, or concert tickets. In the past they have had difficulties in trusting their data, because two different metrics would come up with different numbers for the same process, so meetings were often consumed by different groups presenting conflicting views on what was actually going on across their network.
Patel has come up with mechanisms to focus her team’s energies on particular use cases, to better understand how her end users consume data, and to supply them with the right tools for their particular jobs. To get there, she has worked hard to develop a data-driven culture at eBay, identifying the data decision-makers and working out how to help them become more productive with the right kinds of data delivered at the right time to the right person.
Let’s look at how she partitions her company of data heavyweights:
- First are the business executives who are looking at top-line health and metrics of their particular units and have relatively simple needs. They want to drill down into particular areas, create operational metrics, and get narrower, more focused views of particular data sets. Let’s say they want to see how weather-caused shipping delays from sellers are impacting their business. These folks need dashboards and portals that are one-stop shops where they can see everything at a glance, post comments, and share their thoughts quickly with their business unit team. Patel and her group created personal pages with a “DataHub” portal called Harmony, which makes sure all of their metrics are current and correct, and where the executives can bookmark particular graphs and share them with others.
- Second are product managers who are looking to learn more about their customers, and want to do more modeling and find the right algorithms to improve their marketplace experience. “We followed some of our managers around, attended their meetings and tried to understand how they use and don’t use data,” Patel said. Her team came up with what they call the “happy path” or what others have called the “golden path” – the walk that someone takes during their daily job to find the particular dataset and report that will help them do their job and make the best decisions. “Each product team has a slightly different path in how they interact with their data,” she said. “Our search development teams are more technical and data-savvy than the teams who work on eBay Motors, for example.” Her team has to constantly refine their algorithms to make the happy paths more evident and useful and, well, happier for this group of users.
- Third are data researchers and data scientists. These folks want to go deep and understand how everything fits together, and are looking to make new discoveries about particular eBay data patterns. They want more analysis and are constantly creating ad hoc reports. Patel wanted to make this group more self-sufficient so they can concentrate on finding these new data relationships. Her team created better testing strategies, what she called “Test and Learn,” which has a collection of short behavioral tests that can be quickly deployed, as well as more longitudinal tests that can take place over the many days or weeks of a particular auction item on eBay. “We want to fail fast and early,” she said, which is in vogue now but still is something to consider when building the right data access programs. Patel and her team have developed a centralized testing platform to make it easier to track company-wide testing activities and implement best practices.
- Fourth are the product and engineering teams. They build prototypes of new services and want to measure their results. These teams are creating their own analytics and constantly changing their metrics using methods that aren’t yet in production. For this group, Patel made it easier for anyone to create a “virtual data mart” which can be set up within a few minutes, so that each engineer can build their own apps and create specific views pertinent to their own needs. (A sample screen is shown above.)
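If you want to try the exercise on your own organization, the four personae above can be written down as a simple lookup table. What follows is only a minimal sketch in Python: the `Persona` class, the `PERSONAE` list, and the `product_for` helper are hypothetical names of mine, not anything eBay has published, and the short descriptions are my paraphrases of Patel's talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """One kind of internal data user (hypothetical structure, for illustration)."""
    name: str
    needs: str
    data_product: str


# The four personae from the session, paraphrased from the list above.
PERSONAE = [
    Persona("business executive",
            "top-line health and unit metrics at a glance",
            "Harmony dashboards and personal pages"),
    Persona("product manager",
            "customer modeling and better marketplace algorithms",
            "happy-path guidance to the right dataset and report"),
    Persona("data scientist",
            "deep analysis and ad hoc reports",
            "Test and Learn centralized testing platform"),
    Persona("engineer",
            "measuring prototypes with fast-changing metrics",
            "virtual data mart"),
]


def product_for(persona_name: str) -> str:
    """Return the data product aimed at the named persona."""
    for persona in PERSONAE:
        if persona.name == persona_name:
            return persona.data_product
    raise KeyError(f"unknown persona: {persona_name}")


if __name__ == "__main__":
    for persona in PERSONAE:
        print(f"{persona.name}: {persona.needs} -> {persona.data_product}")
```

Filling in a table like this for your own company is a quick way to spot a group of data users that has no tool aimed at it.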
eBay has three different enterprise data efforts to support all of these different kinds of data users. First, they have a traditional data warehouse on Teradata, three of them in fact. Second, they have a fourth warehouse, called “singularity,” which is semi-structured and holds more behavioral data, for example. Finally, they use Hadoop for unstructured data that Java and C programs can access. The sizes of these things are staggering: each of the traditional data warehouses is 8 TB, and the other two are 42 and 50 PB respectively.
As you can see, the eBay data landscape is a rich and complex one, with a lot of different moving parts and specific large-scale implementations that meet a wide variety of needs. I liked the way that Patel views her data universe, and having these different personae is a great way to focus her team on the kinds of data products they need to deliver for each particular group of users. You may want to try her exercise and see if it works for you, too.