How we designed a multi-profile system
How do we define a profile? One simple definition is, Profile is the user’s account who is currently logged in. But what if my family members are using my account, and their browsing behavior can be very different from mine. We have faced the challenge of personalizing their current browsing experience. This design is a step toward the ability to create and manage multiple profiles in the same user account and allow people to choose the respective profile and personalize their browsing experience.
Another goal of profile systems is to provide Size & Fit and Beauty profile as a feature on profile-level, where one account can have multiple profiles and a profile can have multiple folios like beauty folio, size & fit folio, and within the folio, there can be many attributes using which user browsing experience will be personalized. This project will also unlock many high potential stories to cater and provide deeper profile-level capabilities to bring uniqueness to the user experience, for example, beauty attribute, like skin attribute based product recommendation, ability to record skin profile over the time, and build skin profile graph etc.
What is a Profile system?
The profile system is the group of subsystems, services, and guidelines to govern and provide the capability to manage profiles and personalize user experience based on the currently chosen profile. We have abstracted the profile system into multiple subsystems, or micro-services, a very high level diagram of understanding the whole Profile system is below.
System Boundaries
Flock is the main service that would be responsible to create, manage and serve the profile and its attributes.
High level scope of Flock
- Flock can explicitly ask and serve the attributes to the user — Explicit attributes
- Flock can also implicitly store and serve the attributes — Derived attributes
- Unique profile attributes identified by other service, can be stored and served via Flock — 2nd hand attributes
What is out of scope of Flock?
- Predicting attributes values via Data science models is outside the scope of this system.
- Any analytics or batch processing of this data will be offloaded to The Myntra data platform.
- Size & Fit specific data and the recommendation based on Size and Fit is out of scope of this design.
- If we ask more domain specific questions to the users, that data will stay in that domain specific service, however if there is any unique attribute that can belong to the profile, it will go to Flock.
Requirements
- Create profiles under a user account.
- Provide relationships between profiles and accounts.
- Define new profile attributes dynamically.
- Support Implicit and explicit values in all the attributes.
- Support expiry of values at an attribute level.
- Native support of different types of attributes like boolean, number, date, Enum etc.
- Provide attribute update APIs for user path and nonuser path.
- Conflict resolution for an attribute of implicit or explicit sources and provide abstracted profile attribute values to clients.
- The support confidence score for implicitly driven attribute values with threshold values.
- Use case level attribute serving based upon dynamically based confidence threshold.
Throughput
Throughput expectation can vary from 300k RPM to 1M RPM, and the system should be horizontally scalable to support even higher throughput.
Latencies — Expected p99 percentile latency should be less than 100ms
Data Requirements — Our estimation was around 1.2 TB of data, and expected to be supported to 3–4 times so that the system is sustainable for the next 5 years at least.
High-level Design
Design Goals
On a high level, we had 2 goals to achieve, capability to manage profiles, capability to ask and serve multiple profile attributes, we should also be able to group the attributes, we should be able to dynamically adapt the product requirements to create new profile attributes for various use cases.
Design challenges
When we think of asking an attribute to the user, it’s very simple we’ll have one answer, but what if we want to derive the answer of that question/attribute, for example, gender also can be asked and can be derived, using data, using various models. Since attributes can become multiple, it becomes ambiguous to choose which one to serve because the answers can be multiple, and that when another concept kicks in, called conflict resolution. Now the challenge is if we also plan to resolve conflict in the time of serving attributes, service scaling will be a challenge.
The high level challenges were
- Hosting multiple attributes
- Conflict resolution
- Data modeling with the above 2 considerations
- System scaling
After doing a little brainstorming, we realized that solving all the problems using one system is not sustainable and then we started using the first design principle to segregate the responsibilities. There should be one service (S1), whose responsibility should be to capture the data from multiple sources, which will be read-heavy system, which can also take ownership to compute over huge amount of data collection and publish the final value of attribute. And there can be another service(S2), which will consume the final value published from S1 and store to serve via API, basically read-heavy path. If we think this way, those system becomes simple, segregated responsibility and scalable in it’s defined boundaries
S1 we called is Dragon
S2 we called is Flock.
Responsibility & Boundaries of Dragon
- Dragon will become the system which will allow multiple sources to send data, for example size & fit, MDP, Product finder etc. and store all the possible values of the attribute.
- Attribute values stored in Dragon is not gonna be served directly
- It will have one conflict resolution algorithm to compute the final value of the attribute
- The value of the attribute may have confidence also, because confidence of the value may not be 100%
- The attribute computed value can have expiration, and so it can also have a decay mechanism
- Dragon will not serve the data directly via API
- It will not do what MDP is doing, like store huge number of records and compute
- It will not store what is responsible to store in domain specific services like size n fit
- Dragon will play a better role when we’ll start using Data science to derive the various attributes and those attributes will be required to serve multiple use cases
Responsibility & Boundaries of Flock
- Flock should have capability to create/manage profiles of the user, via exposing CRUD API for the profiles
- Flock should persist Explicit, Derived and 2nd hand attributes, and serve
- In case of Derived, Flock will store the final computed unique values of the attribute, not multiple values of the attribute
- Flock will store the explicit attributes, the questions which are directly asked to the customer via profile flow since explicit attributes will have higher precedence over implicit, so that is not required to send to Dragon and go through conflict resolution, so Flock can directly store and serve it
- Flock has to be very careful to decide what is the profile attribute and what is not the profile attribute
- Flock should also store and serve those 2nd hand attributes, which can uniquely define user behavior, and can be used to serve for different use cases, for example waist size.
- Flock data should also replicate to MDP, so that we can run queries and build multiple product features like personalization, profile understanding etc.
- Attributes can be grouped together, we can call it bucket and we should be able to serve bucket level data via APIs
- We’ll never serve the attributes by taking dynamic input in the API as list of attributes, it can only be done via bucket
- Bucket will not have buckets. We had initially thought of it, but then it became technically complex to manage, which didn’t have any added advantage, so we stuck to just one level.
- Bucket defining and end-to-end management should be the responsibility of the Admin.
After the segregation of these responsibilities into 2 different services, they can be scaled independently.
High level design diagram
Flock is the service layer, which can host profiles and attributes, it has the persistence layer, where we are using MongoDB, and to scale the throughput it also has the Redis cache layer.
Dragon is a write-heavy system, which can receive the data from multiple other systems, perform conflict resolution and send the data to Flock for serving.
High level design — Integrated view of the profile system
Profile system is integrated understanding of the profile & attribute management, and the above diagram provides that view that how the entire life cycle of profile and attributes will be handled.
Flock is a central microservice that is responsible for hosting the profiles and attributes, but there are many other services that will play a vital role in building the overall profile system.
The survey service is going to own the asking question to the user and persisting its answer, questions can be designed in a form or Survey as well, and finally survey will send the profile attribute related questions and answers to Flock.
Size and fit service, will own the size and fit specific domain attributes, I’ve explained the difference in domain specific attributes and profile attributes in this document, check the section “Difference in profile attributes and non-profile attributes”.
Beauty service will be another service that will hold beauty specific domain attributes, same as Size and fit, but it may have different structures, and use cases, so we defined the different services.
Central data system will ingest the flock profile attributes, and will also be ingesting other profile attributes via different systems, and ultimately will be sending all the derived attributes to Dragon, and Dragon will finally resolve the conflict of attribute values and send to flock for serving purpose.
Low level design
Data modeling is one of the most complex problems of this project, let’s first look at the high level requirement on data modeling and the different approaches to solve it.
- For attributes, we don’t have a fixed type for all the attributes.
- We need secondary indexing on User_ID and Profile_ID on the attribute.
- Based on the source of data, the type of value can be explicit or implicit.
- For each attribute, we can have multiple sources. Dragon service will resolve the conflict. and store the final output.
- After resolving the conflict of the final output, we have to store the data in the same place which can be used in the Central data system as well.
- We don’t want to make any code changes every time a new attribute is onboarded to the system.
Approach 1 (profile and attribute segregation)
Collections are
Profile_attribute_meta — contains the metadata, folio relations etc.
User_profile — contains profile basic data, like name, relationship, default(yes/no)
User_profile_attribute — contains all attributes (explicit & implicit both) — will be used for personalization
- For each user there will be one record in following collections (b,c)
- Every field is nothing but an attribute, for example gender, affinity (except name, relationship and profile image)
- Confidence — There is a score associate with each attribute, each source
- Expiry — Every attribute has a field called created_at, that will be compared with meta_data for that attribute and evaluate if that attribute is expired or not
- In User_profile_attribute, explicit attributes and implicit attributes are different records — considering implicit data can auto expired using mongodb feature
- Explicit attribute will be an object, and Implicit attribute will be a list of objects — When we draft the first version, we were considering Explicit will always override the previous one, whereas implicit can be multiple.
Approach 2 (Two different collections for explicit and implicit)
Collections are
Profile_attribute_meta — contains the metadata, folio relations etc.
User_Profile_attribute_explicit — contains explicit attributes
User_Profile_attribute_implicit — contains implicit attributes
User_Profile — contains basic info associated with profile, like default profile
- For each user there will be one record in the following collections (b,c,d)
- If a user is new, then there won't be any data in Profile_attribute_implicit, it will be predicted by Data science team
- Every field is nothing but an attribute, for example name, gender, affinity, relationship
- Confidence — There is a score associated with each attribute, each source
- Expiry — Every attribute has a field called created_at, that will be compared with meta_data for that attribute and evaluate if that attribute is expired or not
- Explicit attribute will be an object, and Implicit attribute will be a list of object
Approach 3 (segregation of single and list attributes)
- Collections are
- Profile_attribute_meta — contains the metadata, folio relations etc.
- User_profile_attribute — contains single objects for one attribute
- User_profile_attribute_list — contains a list of objects for one attribute
- For each user there will be one record in the following collections (b,c)
- For read path — we’ll use User_profile_attribute
- Data science team will write to User_profile_attribute_all table
- There will be one process to resolve the conflict, which will read from User_profile_attribute_all and conclude one value and write in User_profile_attribute
- Every field is nothing but an attribute, for example name, gender, affinity
- Confidence — There is a score associated with each attribute, each source
- Expiry — Every attribute has a field called created_at, that will be compared with meta_data for that attribute and evaluate if that attribute is expired or not
- In both the collection, Different attributes will be kept in different documents
Final approach
Profile attribute meta
Profile bucket meta
Profile
Profile attributes
Profile implicit attributes
- Profile Bucket Meta is the collection of attributes, single attribute read-write both are discouraged, it will be possible via buckets only.
- Profile collection will contain only Name and Image, because they are fixed information, they are not used for behavior tracking, prediction, or personalization.
- Profile attributes is the collection for storing attribute name and their values, values can be single value or the list also, gender, affinity and all possible profile information will be part of this collection.
**How are we going to support the onboarding of new attributes?
**We have used two different collections attribute metadata and attribute bucket for managing metadata. Profile attribute metadata contains the definitions of attributes. If we want to introduce a new attribute, we can simply add it to this collection, and for that attribute read/write will work without any code.
collection_name: “Profile attribute meta” In order to organize the hierarchical structure of the attributes, we are maintaining the buckets.
collection_name : “Profile bucket meta”
A bucket can contain a set of attributes, so the bucket is a flat structure. Profile bucket metadata contains the bucket information, defining, un-definfing, re-defining of buckets will be happening in this collection. We have designed to not provide write/read attribute level data, we didn’t want to provide that flexibility so that we can control the traffic pattern and work well within the limitations to perform better. Maybe based on the most accessible bucket or bucket priority, we can choose a caching strategy, which won’t be possible without a bucket as a tight boundary.
Attribute details
Attributes data are persisted in 2 different collections for different purposes. Explicit data is something which will have one value as of now, whereas implicit data might have multiple values, so storing them in different collections.
Resolution — can happen based on score, time or any other logic also, so just mentioning the logic of resolution in this field
Profile implicit attribute is not needed at all in the Flock system, it would be required in the Dragon service, but we are discussing the schema to bring clarity of the data model to build the whole profile system. If you look at the schema below, explicit attributes will have one value, whereas implicit will always have a list of values.
collection: “Profile implicit attributes”
Definition of Source, Score, Threshold, and Expiry fields
There is a score associated with each attribute for different sources, sources can be Data science derived, in that case score can be the precision/recall kind of values, sources can be analytics based on styles the user has bought, again one score will be associated with, source can also be explicit input via some other product use case, like someone is triggering set of question and doing product suggestions, maybe they can also send the data to Flock, so this schema provides capability to persist the different values of same attributes via different sources, and each source can have it’s associated score.
Expiry — Every attribute has a field called created_at, that will be compared with meta_data for that attribute and evaluate if that attribute is expired or not
Why did we choose to keep gender as part of the attribute not profile?
Profile collection is just creation of profile, like Name and we have also considered Image as part of profile, image can be avatar or a real image also. Attributes of the profile can be gender, brand affinity, price affinity etc.. Gender is not a profile because gender means we need to start personalizing user experience. If we keep Gender in Attribute, it would be complicated to understand the data model, using which we are going to personalize the user experience. However this segregation is only on the data layer, on the front-end we have gender as it is also part of profile creation, and so handling profile and gender persistence is done on API level.
Types of attributes
We have classified the attributes into the following categories
- Explicit attributes — Flock can explicitly ask and serve the attributes to the user
- Implicit attributes
Derived attributes — Flock can also store and serve the implicit attributes, Dragon will own it, and resolve the conflict and send it to Flock for serving.
2nd hand attributes — Unique profile attributes identified by other services, that can be stored and served via Flock, for example Survey service is asking questions and finding that this is profile attribute, in that case, Survey sending directly to flock, is a 2nd hand attributes, it is not sent via Dragon.
Explicit attributes will be written and read directly via a Flock API, whereas there will be different write and read paths for implicit attributes.
Caching
In order to serve the high traffic use cases like gender, we are serving this via caching. For caching we have explored redis and aerospike in-memory cache, Both aerospike and Redis can fetch the data in small latency, so let’s look at their comparison views.
Pros and Cons for Redis vs Aerospike
- Redis is older technology, community support is much stronger in redis.
- Cluster size is limited in aerospike (communiy edition). In redis the cluster size can be much larger.
- Latency they are serving at almost the same time at the client side.
- As new versions are released, aerospike stops the support for older versions, and it does not support the cluster with different versions.
- Aerospike has the concept of bins. It will not be using any hash data structure.
- Resharding is always simple in aerospike. We have to add one host. It will take care of resharding, for redis it will take some downtime for adding a new node.
Redis looks like to suitable as we need exactly for caching and the recent data size may not grow to be huge, and keeping the above points in mind we decided to go with Redis.
What all is going to Redis
- Profile data (name, profile image)
- Attributes
- Default Profile
Redis schema
Redis Key :
:<User-UUID> Value : HSET with Profile UUID as hash key.
2. Attributes
Redis key:
:<Profile-UUID> Value: HSET with key as attribute name and value as an attribute value.
3. Default Profile — Redis key as User_ID, value is the payload for default profile include Profile_ID. This can cause a consistency issue in case we allow the change of default profile. Currently, default profile change is not part of the mocks we have.
The difference in profile attributes and non-profile attributes
Flock should store all the profile attributes, or some attributes or a few attributes, which attribute to store which not, and why, and this clarity is supercritical. If we do not define this, it will become a generic system to dump and serve any data, which eventually becomes difficult to manage, and will not scale.
Let’s consider user location, do we think it should be in Flock? Technically it looks simple, there are many personalization use cases which will demand location of a user or a profile and Flock should serve it, but if we think that way, why not store and serve user payment options also, maybe orders also, or maybe wishlist also, so we need to define this very clearly.
Any attribute which can uniquely identify the profile behavior and is not specific to a certain domain can be thought to store and serve via Flock, and those attributes can be called as profile attributes.
If we think of location, it should ideally go to location service, because on top of location data someone can ask to make more APIs, capabilities, like give me tier or pin code of the users, based on pin codes, give me top 1000 users, all such capabilities can not be build in Flock, isn’t it? so it becomes obvious that Location should be stored and served in location service.
Now consider, Size n Fit attributes, like we are asking the user to provide topwear size information, in that case we are asking via its own way, because of its product need, look at the below screenshot.
Now the question is, shall we store and serve all this data in Flock or not, or even Dragon or not, basically a profile system or not?
Applying the same logic which we applied in case of location data, since these are the information which are very specific Size n Fit domain, we should store them in SizeNFit service, not in profile service(Flock). Because based on domain specific information, the product would want to build its own capabilities, and building all of those capabilities won’t be the scalable in Flock, so that segregation is defined from now only, will give a better direction to the architecture.
So, Those attributes which are very specific to the business domain, should go to domain specific service, not in Flock.
There is one possibility that the domain specific service can have an attribute which will usually not change quite often, and which can help enrich the profile of the user can go to Flock also, we’ll call them 2nd hand attributes, for example waist size, bust size which are asked by SizeNFit service and stored also, but Flock can also store those attributes and serve it.
The purpose of storing 2nd hand attributes would be
- Profile completeness, we’ll know more about the profile
- Several product features can be built using Flock API like a recommendation, communication, etc. ( Note — If the product needs deeper data specific to that domain, we should ideally use domain-specific service, not Flock)
Have a look at the screenshot below
This is all specific to beauty folio, or beauty profile, so where should this data go?
Again applying the same principle, if the information is very specific to beauty, it should ideally go to beauty service (or BPC service), but those attributes which can be unique to profile can be stored/served in Flock also. For example skin type, that usually doesn’t changes so can go to Flock, but skin concern which can vary over a time, we can create a timeline view of skin data and create product features to give better experience to user about skin concern, skin issues etc., so that information should go to Beauty service. Since, we don’t have Beauty specific use cases as of now, meaning those attributes will be pretty much unique and will not have any other capability than just store and serve, we are choosing to store them temporarily in Flock only. But any next feature development specific to Beauty profile, we’ll create Beauty as a new service and store there. This create a beautiful thought process and roadmap when product will evolve technology will know exactly how to scale it, because we have thought about it.
Credits
Project Flock was developed by a bunch of engineers, and everyone's contribution is highly valued, as this was developed with lots of passion and technical depth.
Architect
Engineers Anshuman, Sarthuak, Shreya, Utkarsh, Parth, Tushar, Stuti, Suhas, Sushrita
Manager Venu Babu Nara, Prashant Kumar (Me)