Designing a Scalable Gamification Engine — Part 3 — Event Processing

This is the third article in a series detailing my journey to create a robust, scalable, and performant platform that enables gamification for other applications.

Building upon my previous article, we have an overall architecture proposed and a suggested schema for the main components of our gamification engine. In this article we will dive into the event processing logic that will determine if received events are relevant to any configured goals.

For reference, here are the relevant schemas:

// Example goal definition
{
  id: "goal-1234",
  name: "Mobile Power User",
  desc: "Log in at least 5 times on a mobile device.",
  targetEntityId: "userId",
  criteria: [
    {
      id: "criterion-9999",
      qualifyingEvent: {
        action: "log-in",
        platform: "mobile"
      },
      aggregation: "count",
      threshold: 5
    }
  ]
}

// Example event that our system will need to process
{
  clientId: "client-app-1234",
  action: "log-in",
  platform: "mobile",
  userId: "john-doe-1234",
  foo: "bar" // could have irrelevant data
}

You can read the goal schema as “goal metadata and criteria which describe which events are relevant to someone achieving the goal”. In this case, we define the “Mobile Power User” goal, and show off a sample event that would result in user “john-doe-1234” making progress towards that goal.

Notice that goal.criteria[].qualifyingEvent is intentionally generic — it is just a map of fields and values. To avoid unnecessary coupling and to promote broad usage, our gamification engine cannot make assumptions about which fields will be present within rules or events. A client should be able to define a goal with event data which is relevant to its application.
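For example, a hypothetical e-commerce client could define a goal whose qualifying event uses fields our engine has never seen before (the goal and field names below are made up purely for illustration):

// Hypothetical goal from a different client application
{
  id: "goal-5678",
  name: "Bookworm",
  desc: "Purchase 3 items from the books category.",
  targetEntityId: "userId",
  criteria: [
    {
      id: "criterion-1111",
      qualifyingEvent: {
        action: "purchase",
        category: "books"
      },
      aggregation: "count",
      threshold: 3
    }
  ]
}

The engine stores and matches "category" exactly the same way it matches "platform", without knowing what either field means.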

The generic schema introduces performance complexity, as it prevents us from optimizing lookup queries based on known fields (e.g. creating indices for specific fields). We may have to scrutinize this aspect of our system if performance proves unsustainable.

So, given this schema and context, our task is to efficiently find relevant goals for every received event.

Let’s break our problem down into a simpler form. The most important construct of our goal schema is goal.criteria[].qualifyingEvent, as this is what we use to determine if a received event is relevant to a goal. Additionally, in our system, progress is actually made towards each individual criterion within a goal, not just the overall goal itself. These two facts suggest that the criterion should be a first-class citizen in our data model.

So, instead of focusing on goals, our event processing logic will hone in on the individual goal criteria. To facilitate this, we promote ‘criteria’ to a top-level construct in our data schema. Note that this requires a relationship between entities and may encourage a move to a relational database at some point.

// Example goal definition
{
  id: "goal-1234",
  name: "Mobile Power User",
  desc: "Log in at least 5 times on a mobile device.",
  criteria: ["criterion-9999"]
}

// Example criterion definition
{
  id: "criterion-9999",
  targetEntityId: "userId",
  qualifyingEvent: {
    action: "log-in",
    platform: "mobile"
  },
  aggregation: "count",
  threshold: 5
}

Upon inspecting the new schema, we notice that if the data points of a criterion.qualifyingEvent are a subset of the data points of a received event, then we know that the received event is relevant to the corresponding criterion. This is a handy observation that mathematically describes the relationship between goal criteria and relevant events.

Understanding this, we can allow for faster event processing by structuring our data to make it easier to identify subsets. We flatten our key-value criteria pairs into an array of encoded strings, as follows:

// Example criterion definition
{
  id: "criterion-9999",
  qualifyingEventFields: ["action=log-in", "platform=mobile"],
  aggregation: "count",
  threshold: 5
}

// Example (transformed) received event
["clientId=client-app-1234", "action=log-in", "platform=mobile", "userId=john-doe-1234", "foo=bar"];

We now have normalized string arrays in our criteria and events upon which we can detect subsets.
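To make the subset relationship concrete, here is a minimal in-application sketch (the function names are my own, not part of the engine):

// Flatten an event object into "field=value" strings.
function flattenEvent(event) {
  return Object.entries(event).map(([field, value]) => `${field}=${value}`);
}

// A criterion qualifies if every one of its qualifying fields
// appears among the flattened event fields.
function isSubset(qualifyingEventFields, eventFields) {
  const eventSet = new Set(eventFields);
  return qualifyingEventFields.every((field) => eventSet.has(field));
}

const eventFields = flattenEvent({
  clientId: "client-app-1234",
  action: "log-in",
  platform: "mobile",
  userId: "john-doe-1234",
  foo: "bar"
});

isSubset(["action=log-in", "platform=mobile"], eventFields); // true

Of course, looping over every configured criterion in application code would not scale, which is why we push the subset check down to the database.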

One approach for determining if a criterion’s conditions are a subset of a received event’s data would be to use Mongo set operators, like so:

// Transform received event into array of name/value pairs
let receivedEvent = ["clientId=client-app-1234", "action=log-in", "platform=mobile", "userId=john-doe-1234", "foo=bar"];

// Execute Mongo query to find qualifyingEvent subsets
db.coll.find({ qualifyingEventFields: { "$not": { "$elemMatch": { "$nin": receivedEvent } } } });

// criterion-9999 from the examples above would be returned

This command effectively says “give me criteria where it is NOT true that at least one qualifying event field is missing from the received event fields”. In other words: give me criteria that are subsets of the received event.

Alternatively, there is a Mongo $setIsSubset aggregation operator that we could use, but our command returns the criteria themselves via a direct query, which is more desirable.
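For comparison, a rough sketch of what the aggregation-based alternative might look like (reusing the receivedEvent array from above):

// Aggregation alternative: flag criteria whose qualifying fields are a
// subset of the received event fields, then keep only the matches.
db.coll.aggregate([
  {
    $addFields: {
      isMatch: { $setIsSubset: ["$qualifyingEventFields", receivedEvent] }
    }
  },
  { $match: { isMatch: true } },
  { $project: { isMatch: 0 } }
]);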

This approach will work, but I am concerned by a few things:

  1. Our received events may contain lots of irrelevant data unrelated to any configured goal criteria, yet we include those irrelevant fields in our subset calculations. This can be solved by stripping fields that are not present in a global list of known rule fields from received events before checking for subsets (see the sketch after this list).
  2. We will be needlessly repeating subset calculations for identical events, even though goal criteria almost never change. This can be addressed by introducing in-memory caching in front of the database query (also sketched below). Caching can be configured for a reasonable period of time, say 5 minutes. User goals rarely change, and immediate goal recognition is not critical, so this is OK to start with.
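Here is a minimal sketch of both mitigations, assuming a knownRuleFields set maintained elsewhere (e.g. rebuilt whenever criteria are created or updated) and that the criteria live in a Mongo collection named criteria; the names are illustrative, not part of the engine:

// Keep only fields that at least one configured criterion could reference.
function stripIrrelevantFields(eventFields, knownRuleFields) {
  return eventFields.filter((field) => knownRuleFields.has(field.split("=")[0]));
}

const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes
const cache = new Map(); // cache key -> { criteria, cachedAt }

// Find criteria matching the event, consulting the cache first
// (written in the Node.js Mongo driver style).
async function findMatchingCriteria(eventFields, knownRuleFields) {
  const relevantFields = stripIrrelevantFields(eventFields, knownRuleFields);
  const cacheKey = relevantFields.slice().sort().join("|");

  const cached = cache.get(cacheKey);
  if (cached && Date.now() - cached.cachedAt < CACHE_TTL_MS) {
    return cached.criteria;
  }

  // Same subset query as above, run against the stripped event fields.
  const criteria = await db
    .collection("criteria")
    .find({ qualifyingEventFields: { $not: { $elemMatch: { $nin: relevantFields } } } })
    .toArray();

  cache.set(cacheKey, { criteria, cachedAt: Date.now() });
  return criteria;
}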

Illustrated, our event processing looks like this:

Confirming Solution Appropriateness

We want to support ~1000 events per second and can’t let event processing become a bottleneck.

Luckily, the heavy lifting (the subset calculation) will be handled by Mongo, which should comfortably handle ~1000 requests/sec given a moderately sized cluster with secondary nodes using in-memory storage.

Additionally, our server-side in-memory caching will reduce the throughput reaching Mongo, and stripping irrelevant fields from incoming events will reduce processing time, too.
