artificial intelligence, Semantics

What Netflix Could Do with Its Recommendation Engine to Excite Me as a Customer

You Might Like House of Cards, But I Couldn’t Possibly Comment

My colleague Peter Sweeney, the founder of Primal, and I were talking recently about Netflix using AI, specifically deep learning algorithms, as part of efforts to further improve its recommendation engine. I’ll admit that instead of being excited at the prospect of more insights being gleaned from my viewing history, my first reaction was concern about yet another bubble along the lines of, or even worse than, the infamous search engine filter bubble, only this time for recommendation engines.

True, we learned earlier this winter that Netflix has re-engineered Hollywood. Netflix has a very rich and extensive categorization scheme, the product of analyzing movies and TV shows from an amazing number of angles, and presumably also of following the trails of relationships that customers perceive among the films and TV shows they view. I think Netflix provides good recommendations, certainly better than it did a few years ago. But frankly, the recommendations still hit dead ends quite often, and it’s easy to get stuck in the rut of more of the same old thing. So my fear upon reading about what Netflix is doing was basically this: deeper mining equals an even deeper rut, with even more of the same old thing. And that could easily be the case. Just going deeper into analyses of the content itself, as well as my past preferences for it, might well add more categories to their classification scheme, but it doesn’t endear me more to Netflix if I still just get recommendations based largely on my past preferences, only now using more specialized categories. I’m still stuck in a rut.

What I want to experience is more of what I like to call ‘designed serendipity’. If Netflix or Amazon or one of their peers is truly uncovering deeper and more nuanced patterns, particularly within the content itself, but also about my viewing preferences, then they could use that new data to make the recommendation experience more interesting and more compelling for me, giving me something I could actually get excited about. How could they do that? They could start by proposing content from adjacent categories based on walking their classification scheme. Because there would presumably be more, finer-grained categories, exploring some of the neighboring ones could add some fun while keeping the risk of jarring me with an off-the-mark recommendation low. They could also take those lower-level elements and apply them in somewhat different contexts, preserving the elements that I like, but also mixing in some new twists. They could even try combinations of the lower-level elements, as they’ve done fairly successfully already at higher levels of their classification scheme.
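To make the idea of walking a classification scheme concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the category names, the parent/child layout and the way each title is filed are invented, and nothing here reflects how Netflix actually organizes its catalog. It simply treats categories that share a parent as neighbors and suggests titles from those neighbors.

```python
# Hypothetical fine-grained scheme: child category -> parent category.
PARENT = {
    "political thriller": "thriller",
    "espionage thriller": "thriller",
    "revenge drama": "drama",
    "workplace political drama": "drama",
    "thriller": "fiction",
    "drama": "fiction",
}

# Hypothetical catalog: title -> its most specific category.
CATALOG = {
    "House of Cards": "political thriller",
    "The Americans": "espionage thriller",
    "Revenge": "revenge drama",
    "The West Wing": "workplace political drama",
}


def siblings(category):
    """Return the categories that share a parent with `category`."""
    parent = PARENT.get(category)
    if parent is None:
        return []
    return sorted(c for c, p in PARENT.items() if p == parent and c != category)


def adjacent_recommendations(watched_title):
    """Suggest titles filed under neighboring categories, not the same one."""
    home = CATALOG[watched_title]
    neighbors = set(siblings(home))
    return sorted(t for t, c in CATALOG.items() if c in neighbors)


print(adjacent_recommendations("House of Cards"))
```

With these made-up categories, a House of Cards viewer is pointed one step sideways to The Americans. Widening the walk to "cousin" categories (two steps up the tree) would turn up the serendipity at the cost of more off-the-mark suggestions.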

Let me use some examples to illustrate the sorts of things I’d like to see. The West Wing and House of Cards are both political dramas. But at a deeper level, The West Wing is much more about the camaraderie of the White House staff as a team, with politics and political intrigue as more of a plot device for the personal interaction. House of Cards, on the other hand, is more of a psychological thriller set in a political context. The political maneuvering and bold back-stabbing are core to the show, and for me at least, that’s what makes it fun to watch. Those are fairly subtle but significant differences, and if deep learning can expose them, that would establish its value to customers. Put another way, just because I like House of Cards (the Netflix version, but even more so the original BBC version) does not mean The West Wing would be a good recommendation for me, since such a recommendation is based on more superficial similarities between the shows. I’m a friendly, collaborative, team-oriented person in real life, so I’d rather see egomaniacal scheming and back-stabbing as part of my diversionary viewing!

If Netflix could ‘context lift’ those elements of House of Cards that I do like and then reapply them in different contexts, that would excite me. For example, because I like House of Cards, I might like The Tudors better than The West Wing, even though The Tudors is a historical drama. The Tudors has more of that scheming and back-stabbing (or head chopping!) that I like. While it’s a political drama of sorts, it isn’t one in the same sense as The West Wing and House of Cards, so it might not come up as an obvious recommendation. Making that recommendation requires a deeper, subtler analysis. I also happen to like The Americans, a drama that is political only in an espionage context, but again also a thriller with lots of unexpected twists and turns. And I hate to admit it, but I also like Revenge. Revenge has almost nothing to do with politics, but shares the dark scheming and plotting of House of Cards. Would Netflix be able to recommend either of them to me based on House of Cards? If they get to that level, they’d have my customer loyalty.
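One way to picture ‘context lifting’ is as comparing shows on their lower-level elements after setting the context aside. The sketch below is only an illustration under loud assumptions: the element tags are my own guesses about each show, the `setting:` prefix is an invented convention for marking context, and Jaccard overlap is a stand-in for whatever similarity measure a real engine would use.

```python
# Hypothetical low-level elements per show; the tags are illustrative guesses.
ELEMENTS = {
    "House of Cards": {"setting:politics", "tone:dark", "plot:scheming", "plot:betrayal"},
    "The West Wing": {"setting:politics", "tone:warm", "plot:teamwork"},
    "The Tudors": {"setting:historical", "tone:dark", "plot:scheming", "plot:betrayal"},
    "Revenge": {"setting:high-society", "tone:dark", "plot:scheming"},
}


def lift(elements):
    """Drop the context (setting) tags and keep the elements to be reapplied."""
    return {e for e in elements if not e.startswith("setting:")}


def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_lifted_elements(liked):
    """Rank the other shows by similarity of their context-free elements."""
    core = lift(ELEMENTS[liked])
    scores = {t: jaccard(core, lift(e)) for t, e in ELEMENTS.items() if t != liked}
    return sorted(scores, key=lambda t: (-scores[t], t))


print(rank_by_lifted_elements("House of Cards"))
```

Under these assumed tags the ranking comes out as The Tudors, then Revenge, then The West Wing, even though The West Wing is the only one that shares the political setting.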

At that point, the only thing still missing for me would be even more pleasant surprises: turning down the ‘designed’ aspect and turning up the ‘serendipity’. What if I want something that’s conceptually related to a past viewing interest, but still quite different? If I watched Planet of the Apes (particularly the original), wouldn’t Jane Goodall’s documentary for Animal Planet, Almost Human, be an interesting recommendation? It would for me! Or what if I want to broaden my horizons and try something completely different from what I’ve been watching? Can Netflix put my past preferences in a blender and recommend something really novel and out of the ordinary? Or alternatively, can I just throw at Netflix some topics that I’ve been thinking about or have a point-in-time interest in, and get a recommendation made to order at that particular moment, based on what I just provided, or perhaps subtly influenced by my viewing history? Do those things, Netflix, and I might become a loyal customer for life!

Tony

Semantics

Teaching a Martian to Make a Martini

What Happens When a Martian Makes a Martini? (Photo credit: Wikipedia)

In my last blog post, I stated I felt expert systems were an important forerunner of today’s emerging digital personal assistants and any other software technologies that include an element of ‘agency’ — acting on behalf of others, in this case the humans who invoke them. For someone or something to act on your behalf effectively, they need to understand many specific things about the particular domain they are tasked with working in, along with some general knowledge of the type that cuts horizontally across many vertical domains, and of course they need to know some things about you.

Chuck Dement, the late founder of Ontek Corporation and one of the smartest people I’ve met, used to say that teaching software to understand and execute the everyday tasks that humans do was like teaching a Martian visiting here on Earth how to make a martini. His favorite Martian, George the Gasbag, like the empty shell of a computer program, didn’t know anything about our world or how it works, let alone the specifics of making a martini. Forgetting for a moment George’s physical limitations due to being a gasbag, imagine trying to explain to him (or to encode in software) the process of martini-making — starting with basically no existing knowledge.

First, George has to know something about the laws of physics. He doesn’t need to understand the full quantum model (does anyone actually understand it?), but he does need to be aware of some of the more practical aspects of physics from the standpoint of how it applies to everyday life on the surface of Earth. Much of martini-making involves combining liquid substances. Liquid substances need to be confined in a container of some sort, preferably a non-porous one. The container has to maintain a [relatively] stable and upright position during much of the process. The container holds certain quantities of the liquids. For a martini to be a martini and to taste ‘right’ to its human consumers, the liquids have to be particular substances. Their chemical properties have to meet certain criteria to be suitable (and legal) for use. The quantities of the liquids have to be measured in relative proportions to one another. The total combined quantity shouldn’t (or at least needn’t) exceed the total quantity that the container can hold.
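Even that small slice of everyday physics turns into explicit rules once you try to encode it. Here is a rough sketch of just two of the constraints above (a non-porous container, and relative proportions that must fit within its capacity). The class, the function, the 5:1 ratio and the volumes are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    capacity_ml: float
    porous: bool = False


def pour_amounts(ratios, total_ml, container):
    """Split total_ml among ingredients by relative proportion.

    Raises ValueError if the container cannot hold the liquid.
    """
    if container.porous:
        raise ValueError("liquids need a non-porous container")
    if total_ml > container.capacity_ml:
        raise ValueError("total quantity exceeds container capacity")
    parts = sum(ratios.values())
    return {name: total_ml * share / parts for name, share in ratios.items()}


# Assumed recipe: 5 parts gin to 1 part dry vermouth, 90 ml in a 150 ml glass.
glass = Container(capacity_ml=150)
print(pour_amounts({"gin": 5, "dry vermouth": 1}, 90, glass))
```

Note how much is still missing: nothing here says the container must stay upright, that the liquids must be legal to serve, or what ‘gin’ even is. Each of those is another thread George would have to pull on.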

You need some ice, which involves another substance, water, whose liquid form has been transformed into a solid at a certain temperature. If you are making the martini indoors (the usual case), or outside when the temperature is warm, producing ice from water requires special devices to create the required temperature conditions within some fixed space. And so on and so forth. You can pull on any of those threads and dive into the subject. Think of having a conversation with a 4- or 5-year-old child and answering all the “Why?” and “How?” questions.

Of course there are at least two major different processes that can be used to mix the liquids along with the ice. They involve different motions: stirring the liquid within the container versus shaking the container (after putting a lid or similar enclosure on the previously ‘open part’ of the container to keep the liquid from flying out). The latter raises the question: is the open ‘part’ of the container really even a part of it, or the absence of some part?

There are allowable variations in the substances (ingredients), both in terms of kinds and specific brands (gin versus vodka, Beefeater versus Tanqueray for gin). Both the process and the ingredients often come down to the specific preferences of the intended individual consumer (take James Bond, for example), but may also be influenced by availability, business criteria such as price or terms of supplier contracts, and whether the consumer has already consumed several martinis or similar alcoholic beverages within some relatively fixed timeframe (don’t forget here to factor in the person’s gender, body size, previous night’s sleep, history of alcohol consumption, altitude, etc.). The main point here is simply that if they’ve had several such drinks, their preferences may be more flexible than for the first one or two!

Whew!!! All that just to make a martini? That’s all to illustrate that encoding knowledge for everyday tasks is non-trivial. No one ever said developing intelligent agent software would be simple. But as previously mentioned, George doesn’t need to know everything about every aspect of the domains involved in martini-making. Going overboard is a sure recipe for failure. Knowing where to draw the line is the key, so a healthy serving of pragmatism is recommended. A good place to start, I think, is simply getting in the ballpark of knowledge about everyday things and applying that approximate knowledge to practical uses. Since you don’t always know beforehand how much knowledge you need, I’m a fan of the generative approach to semantic technologies (see my related blog post on approaches to semantic technologies). The generative approach allows agility and flexibility in the production of that knowledge, as well as providing ways to tailor it for individual differences.

And speaking of individual differences: how will George recognize when I’m ready for him to make me a martini? What are the triggers and any prerequisite conditions (like being of legal drinking age in the geo-location where the drink is being made and consumed)? Well, I could always ask George (or my personal, robotically-enabled, martini-making software assistant), but I trust that he knows me well enough to recognize that telling look that says, “I could sure use a drink, my friend,…especially after all the knowledge I had to encode to enable you to make one.”

Cheers!

Semantics

How To Get Semantified

First, let me say I’m pretty sure semantified isn’t even a ‘real word’ (yet), but I’ve seen it popping up lately, so I thought I’d help make it into a real word if it isn’t already.

Like anyone into semantics, I love defining categories and then classifying things into those categories. I guess it sort of goes with the job territory! So I’m going to share with you my category scheme for approaches to semantic technologies. I define four categories: constructive, inductive, blended or hybrid, and generative. In practice, specific instances of approaches falling within any given category may draw upon some of the elements of the other categories. In the case of the blended or hybrid approach, I’ll claim it involves a tight coupling of two of the approaches and it’s different enough to be its own category. Descriptions of each of the four categories follow.

Constructive

Constructive approaches essentially hand-craft their semantic models. As a knowledge engineer, I’ve been involved in several projects using this approach and I can tell you it can be really hard work with often slow progress. Some projects using the constructive approach are done by relatively small, dedicated teams of knowledge engineers and some are more community-based or crowd-sourced type efforts. Some produce proprietary or private models and some open or public models. A few are general purpose, like Cyc/OpenCyc, but most are targeted at specific vertical domains such as finance, travel or healthcare. I view the Semantic Web’s Linked Open Data (LOD) models as being constructive models. Some constructive models are developed for internal use and some for use by and/or sale to others. Some are explicitly exposed as models – conceptual schemas or ontologies, or at least taxonomies. Some are embedded behind applications and are never made visible to their consumers.

The constructive approach is a good fit if you want to produce a relatively-static semantic model for a well-bounded and relatively-static target domain. This approach has often been used when the resulting semantic model is shared and is intended to be consistent across the set of shared users. Although hand-crafting a large, complex model may not be a wise endeavor for the faint of heart or those with not a lot of time to spare, a constructive approach may be quite tractable if there’s a large, enthusiastic community contributing to the development (and maintenance!) of the model and if the problem space lends itself to ‘divide and conquer’ tactics.

Inductive

As the name implies, this approach involves inducing semantics through techniques such as topic clustering and other statistical [text] analysis applied against large volumes of corpora – think millions or hundreds of millions of documents. In other words, this approach can be described as machine learning or analytics performed over big data sets (or Big Data sets, to use buzz-worthy terminology).

Google is the star example here. Think about how Google’s PageRank works, using statistics based on the number of links between a given Web resource and other resources, down to the keyword level in many cases. With enough data, you can create indices and associated statistical models based on the relationships among those resources and then use those to retrieve search results, suggest related topics, etc. Simple text indexing works pretty much the same way, where you extract keywords, analyze statistics about the frequency of their occurrence within a document (using, for example, term frequency-inverse document frequency, or TF-IDF, algorithms), their co-occurrence with other keywords, etc. There are of course more complex algorithms for text analysis, as well as algorithms for images, voice and other multimedia types. In any case, it’s all about statistics and statistical patterns and relationships. This approach therefore works best when there are big data sets available to feed the analysis. Put another way, this approach makes sense if you have a really large amount of data and you want to be able to relate it (i.e., to index it) to other data in a relatively ad hoc, dynamic fashion. I would further describe this approach as being more actionable than reflective, so you should use this approach if you care primarily about operationally using the indices to provide results or answers, and not as much about creating and persisting specific, explicit semantic models behind those results or answers.
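For readers who haven’t seen TF-IDF spelled out, here is a small, self-contained sketch. It uses one common variant (term frequency normalized by document length, and idf = log(N / df)); real indexing libraries differ in smoothing and normalization, and the three tiny ‘documents’ are invented for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "gin vermouth olive",
    "vodka vermouth lemon",
    "gin tonic lime",
]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term that appears in only one document (like "olive") gets a higher weight than one shared across documents (like "gin"), and a term present in every document would score zero. That is the statistical intuition behind keyword indexing: distinctive words say more about a document than common ones.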

Blended

This is a blended or hybrid approach involving both constructive and inductive approaches. Typically it starts with a small-ish core or ‘upper’ ontology composed of quite broad classes or categories, which is then used to help classify the topics or concepts that are induced via algorithms like cluster analysis. Where the topics or concepts aren’t already in the starter model, the output of the [deeper] induction process can be used to extend the model with these new, more specialized concepts. Unlike the pure inductive method, here the model itself and richly-indexed content are both targeted outputs of the process. This process goes on in the standard “lather, rinse, repeat” fashion until you run out of compute power or money, or simply cannot statistically induce any more semantics.
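A toy version of one pass through that loop might look like the following. Treat it as a sketch under heavy assumptions: the two broad classes and their seed vocabularies are invented, the ‘induced’ topics are hard-coded where a real system would get them from cluster analysis over a large corpus, and simple vocabulary overlap stands in for a proper classifier.

```python
# Hypothetical upper ontology: broad class -> seed vocabulary.
UPPER = {
    "Beverage": {"drink", "liquid", "alcohol", "cocktail"},
    "Equipment": {"container", "shaker", "glass", "tool"},
}


def classify(topic_terms, ontology):
    """Return the broad class whose vocabulary overlaps the topic most, or None."""
    best, best_overlap = None, 0
    for cls in sorted(ontology):
        overlap = len(topic_terms & ontology[cls])
        if overlap > best_overlap:
            best, best_overlap = cls, overlap
    return best


def extend_model(induced_topics, ontology):
    """Attach each induced topic under a broad class and return child -> parent.

    The parent's vocabulary is enlarged in place, so the next pass
    ("lather, rinse, repeat") can classify more specialized topics.
    """
    subclass_of = {}
    for name, terms in induced_topics.items():
        parent = classify(terms, ontology)
        if parent is not None:
            subclass_of[name] = parent
            ontology[parent] |= terms
    return subclass_of


# Stand-in for the output of a clustering step (assumed, not computed here).
induced = {
    "martini": {"cocktail", "gin", "vermouth"},
    "cocktail shaker": {"shaker", "lid", "ice"},
    "weather": {"rain", "cloud"},
}
subclass_of = extend_model(induced, UPPER)
print(subclass_of)
```

Here "martini" lands under Beverage and "cocktail shaker" under Equipment, while "weather" finds no home in this starter model. Both outputs matter in the blended approach: the extended model (the child-to-parent map plus the richer vocabularies) and the content indexed against it.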

This approach is most appropriate if you want to create extensive, multi-dimensional, relatively-persistent semantic (i.e., concept) indices for large amounts of data and then use those indices to intelligently retrieve relatively-small numbers of highly-relevant results. Examples of such applications include information discovery within enterprise content management systems and question answering assistants, such as for customer care systems.  This approach may not be feasible for massive amounts of Web data that changes constantly. But for large sets of data that are relatively more persistent – like enterprise information – this approach can produce higher quality results over time. Of course given the additional processing, this approach can be slower than a pure inductive approach, so it may require the introduction of optimization techniques, particularly for real-time applications.

An example of a technology using this approach is a start-up under the umbrella of Frost Data Capital (formerly Frost Venture Partners) called MAANA, Inc. I had the opportunity to help them during the early, incubation stage of their life-cycle. Without getting into details, I can say they are doing leading-edge work in multi-dimensional semantic indexing, specifically over Hadoop/HDFS-based data stores, and that work includes innovative optimization techniques for large, enterprise-scale data sets.

Generative

The generative approach is a probabilistic approach that is essentially the opposite of the inductive approach. With this approach you start with a relatively small set of building block concepts. These are constructive primitives or atomic concepts rather than the broad or general concepts associated with the blended/hybrid approach. These get used with a set of generative rules to generate or synthesize candidate concepts that then get validated using a smaller set of reference corpora for evidential purposes.
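To give a feel for the generate-then-validate shape of this approach, here is a deliberately simplified sketch. It is not Primal’s method or anyone else’s actual implementation: the atomic concepts, the single pairing rule and the three-sentence reference corpus are all invented, and a plain count of supporting documents stands in for the probabilistic scoring a real system would use.

```python
from itertools import permutations

# Hypothetical atomic concepts (building blocks).
ATOMS = ["martini", "gin", "glass", "recipe"]


def generate_candidates(atoms):
    """Generative rule (assumed): any ordered pair of distinct atoms is a candidate."""
    return [f"{a} {b}" for a, b in permutations(atoms, 2)]


def validate(candidates, corpus, min_hits=1):
    """Keep candidates found as a phrase in at least `min_hits` reference documents."""
    kept = []
    for candidate in candidates:
        hits = sum(1 for doc in corpus if candidate in doc.lower())
        if hits >= min_hits:
            kept.append(candidate)
    return kept


# Small reference corpus used only as evidence, not as training data.
REFERENCE_CORPUS = [
    "A classic gin martini is stirred, not shaken.",
    "Chill the martini glass before pouring.",
    "This martini recipe uses dry vermouth.",
]

concepts = validate(generate_candidates(ATOMS), REFERENCE_CORPUS)
print(concepts)
```

Of the twelve generated pairs, only "martini glass", "martini recipe" and "gin martini" find evidence in the corpus. The point of the contrast with induction is the direction of travel: concepts are synthesized first from a small set of primitives, and the (much smaller) corpus is consulted afterwards to confirm them.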

The generative approach is applicable if you want to dynamically generate and utilize on-the-fly semantic models, particularly for highly-specialized or individualized topics that aren’t necessarily possible or feasible to model in advance. This covers two extremes: domains where the volume of data is too small to lend itself to induction (for example, new areas of data collection where there isn’t sufficient data yet) and extremely large domains (where the sheer number of possible combinations and the cost to model those in advance using any of the other approaches would be prohibitive). In addition to the value of generating the individualized models themselves on-demand, this approach is valuable for content discovery and filtering (e.g., for applications such as personalized research or news readers) and for contextual knowledge building for personal assistants and other forms of intelligent software agents.

So far as I know, there is only one example today of a semantic technology that uses this generative approach and that’s a company I have been associated with for the past few years called Primal (www.primal.com).

Previously, during the last generation of semantic technology in the 1990s, I worked for another company that pursued this approach, albeit somewhat differently. That company was called Ontological Technology (Ontek) Corporation. In that case, the technology, which was called the Platform for the Automated Construction of Intelligent Systems or PACIS, depended upon a very precise, formal, foundational ontology from which all the other ontologies were to be (in theory at least) automatically generated. The driver for that was this: after having tried to hand-build a complete ontology for the engineering and manufacturing domains, the visionary behind PACIS decided to define a foundational ontology from which the ontologies for engineering and manufacturing, and potentially many other domains as well, could be automatically generated. It failed for obvious reasons. Or at least they were obvious after it failed. Frankly, the precision expected of that foundational ontology, and of the ontologies to be generated from it, was unrealistic to achieve at that time and likely even still today. There were in any case many valuable results produced along the way, and from failure you learn.

That’s why I became associated years later with Primal. I felt Primal’s generative-based technology – which was a working prototype at the time I got involved – was much more pragmatic and practical, and scoped towards more achievable use cases. I spent a little over 2 years up in Canada working with the talented team at Primal to progress from prototype to Minimum Viable Product (MVP), through Alpha release, and now to the first commercial product built on that core technology — an intelligent automated content service. In other words, that technology, which dynamically generates a kind of taxonomy of user interests referred to as an Interest Graph, is now commercially in use.

Picking the right approach depends on the nature of the semantic modeling work you are doing and the resources available to do the work. Using the right approach, there will in any case be hard work ahead, but you should be able to achieve your goals. Choose the wrong approach, and as my son would say, it’s destined to end in an epic FAIL.

Semantics

Introducing Myself and N2Semantics

This is my introductory blog. It’s hard to describe what I’m trying to do with N2Semantics without giving you a little background on who I am and what I’ve done so far in my career. I would say I’m a computer scientist or technologist, but not because I particularly like computers or information technology. I like what you can do with such technologies, or even more importantly, what they can do for you!

I spent the early part of my career designing and developing applications, primarily in the fields of product definition (engineering), development (manufacturing) and delivery (packaging, distribution, transportation). I ended up focusing on the representational aspects of such systems — their architecture, logic and data structures. I did a lot of enterprise information modeling, data and process modeling, system and database design, and software development and implementation (and support — let’s not forget support!). I found a passion in the challenges of reflecting the real world inside computer systems. That led me to gaining knowledge and experience in the field of knowledge representation and doing some of the early, pioneering work in conceptual models or conceptual schemas — what came to be known as ‘ontologies’.

Representing human knowledge in a form that computers can make use of, and actually enabling them to make use of it, became a life-long pursuit. I’ve been pursuing it for over 20 years and I’m still pursuing it. Along the way, there have been lots of successes and also many failures. If you never make a mistake, you’re probably not pushing the edge of discovery. It’s from those failures that you learn (hopefully!) and they become the basis for progress and success. I’ll try to talk about some of the failures in future blog posts, as well as some successes.

I feel optimistic about pursuing intelligent systems today — more optimistic than I have ever been in my career. I feel the required technology components exist today, at least in sufficient form to put to productive, practical use. And that’s what I want to do. I’m not interested in doing fundamental research. I want to work with providers of leading-edge, innovative technologies and business people with real application ideas and challenges for which those technologies provide enabling solutions. While many technologies comprise pieces of this puzzle, in particular I focus on semantic technologies. I want to help companies use semantic technologies — along with mobile devices, content sources, social media, et al — to create intelligent software agents. In future posts I’ll talk about some of the [productive] ways software agents might assist us in going about our day-to-day lives. That’s what interests me and that’s what’s behind my starting up N2Semantics. The journey into semantic applications is going to be a fun, but challenging journey. Join me on that journey by following this blog.

Tony
