How does the financial industry use big data to build accurate user portraits?

The core of a user portrait is labeling the user. A label is usually a concise, human-defined feature identifier, such as age, gender, region or user preference. Combining all of a user's labels then outlines a three-dimensional "portrait" of that user.

To describe users accurately, we can work through the following steps, from building the user's micro-portrait, to label modeling, to the data architecture behind the portrait, analyzing layer by layer from micro to macro.

First, from the micro perspective: how do we divide a user's micro-portrait into levels? As shown in the following figure.

General principle: start from the first-level categories and subdivide them step by step.

First-level categories: demographic attributes, asset characteristics, marketing characteristics, hobbies, shopping preferences, and demand characteristics.

There are many user portrait methods on the market, and many companies offer user portrait services, so the topic can look very sophisticated. The financial industry was among the first to build user portraits. Because their data is so rich, financial enterprises often do not know where to start and tend to pull in data from many dimensions, assuming that the more dimensions the portrait covers, the richer it is. Some even weight the input data and build models on it, turning the user portrait into a huge and complicated project. After a great deal of time and effort, however, they find that all they have is the portrait itself, far removed from the business. It cannot directly support operations, and the investment is large while the return is small. The effort costs more than it yields and is hard to justify to management.

In fact, the data dimensions used in a user portrait should be tied to the business scenario: few and effective, strongly related to the business, and easy to filter on and act on. User portraits should follow three principles: put demographic attributes and credit information first, use strongly correlated information, and turn data into qualitative categories. Each is explained below.

Much information can describe a user, and credit information is among the most important. Credit information describes a person's spending power in society. Any enterprise builds user portraits in order to find target customers, and target customers must be users with potential spending power. Credit information directly demonstrates that spending power, so it is the most important and most basic information in a user portrait. As the saying goes, half in jest, all information is credit information. It covers a consumer's work, income, education, property and so on.

Here we need to distinguish strongly correlated information from weakly correlated information. Strongly correlated information is directly related to the needs of the scenario; it may be causal, or simply highly correlated.

If the correlation coefficient is defined on a range of 0 to 1, information with a coefficient above 0.6 should be treated as strongly correlated. For example, other things being equal, people around 35 earn more on average than people around 30, computer science graduates earn more on average than philosophy graduates, people working in finance earn more on average than people working in textiles, and average salaries in Shanghai are higher than in Hainan Province. This shows that age, education, occupation and location have a large influence on income and are strongly correlated with income level. Put simply, information that has a large influence on credit information is strongly correlated information; otherwise it is weakly correlated information.
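
The 0.6 cutoff can be illustrated with a small sketch. It assumes the Pearson coefficient is the measure meant and takes its absolute value so that it falls in the 0 to 1 range; the data and the `classify` helper are made up for illustration.

```python
from math import sqrt

STRONG_THRESHOLD = 0.6  # cutoff named in the text


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def classify(feature_values, income):
    """Return 'strong' or 'weak' from the absolute correlation with income."""
    r = abs(pearson(feature_values, income))
    return "strong" if r > STRONG_THRESHOLD else "weak"


# Toy data, invented for illustration only.
income = [30, 42, 55, 61, 78, 90]
years_of_education = [9, 12, 12, 16, 16, 19]
height_cm = [172, 165, 180, 168, 175, 170]

print(classify(years_of_education, income))  # strong
print(classify(height_cm, income))           # weak
```

In practice the coefficient would be computed on real customer data, and categorical features such as occupation or city would need a different association measure.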

Other user information, such as height, weight, name or zodiac sign, has no influence on spending power that can be established statistically. Such weakly correlated information has little effect on a user's credit and spending power and little commercial value, so it should be left out of the user portrait.

Considering strongly correlated information and leaving out weakly correlated information is one principle of user portraits and user analysis.

For example, customers can be divided into age groups: 18-25 as young people, 25-35 as young-to-middle-aged people, and 36-45 as middle-aged people. Personal income information can be used to define high-income, middle-income and low-income groups, and asset information can likewise be used to define high, medium and low levels. Financial enterprises can derive the categories and methods for qualitative information from their own business; there is no fixed model.

Another principle of user portraits is to gather the various kinds of quantitative information a financial enterprise holds and classify it into qualitative information, which makes it quick to screen users and locate target customers.
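
Turning quantitative values into qualitative categories can be sketched as simple binning. The age boundaries follow the text, with 25 assigned to the younger group as an assumption because the stated ranges overlap. The income cutoffs and the group names are hypothetical.

```python
def age_group(age):
    """Map an age in years to a qualitative age group."""
    if 18 <= age <= 25:
        return "young"
    if 26 <= age <= 35:
        return "young-to-middle-aged"
    if 36 <= age <= 45:
        return "middle-aged"
    return "other"  # ages outside the ranges given in the text


def income_level(monthly_income, low_cutoff=5000, high_cutoff=20000):
    """Map a monthly income to low/middle/high; both cutoffs are hypothetical."""
    if monthly_income < low_cutoff:
        return "low"
    if monthly_income < high_cutoff:
        return "middle"
    return "high"


customer = {"age": 30, "monthly_income": 26000}
labels = {
    "age_group": age_group(customer["age"]),
    "income_level": income_level(customer["monthly_income"]),
}
print(labels)  # {'age_group': 'young-to-middle-aged', 'income_level': 'high'}
```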

The following explains how to build a model that outputs labels and weights from user behavior. An event model has three elements: time, place and person. Every user behavior is essentially a random event that can be described as: which user, at what time, at what place, did what.

Which user: the key is user identification, whose purpose is to tell users apart and pin down a single user.

The main user identification methods on the Internet are listed above, ordered from easiest to hardest to obtain. The identifying information an enterprise can obtain depends on how sticky its users are.

What time: time carries two pieces of information, the timestamp and the duration. The timestamp marks the point in time of a user behavior, for example 1395121950 (precision to seconds) or a fractional value with sub-second precision. Second-level precision is usually sufficient, because microsecond timestamps are unreliable; browser clocks are accurate to milliseconds at best. The duration records how long the user stays on a page.
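
As a small illustration, the second-precision timestamp from the text can be decoded with Python's standard library. The `dwell_seconds` helper and the leave time are hypothetical, and duration is assumed to be the difference between two timestamps.

```python
from datetime import datetime, timezone

# Second-precision Unix timestamp taken from the text.
ts = 1395121950
visited_at = datetime.fromtimestamp(ts, tz=timezone.utc)
print(visited_at.isoformat())  # 2014-03-18T05:52:30+00:00


def dwell_seconds(enter_ts, leave_ts):
    """Time spent on a page, as the difference of two timestamps in seconds."""
    return leave_ts - enter_ts


# Hypothetical leave time 45 seconds after the page was opened.
print(dwell_seconds(ts, ts + 45))  # 45
```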

What place: the user's contact point. Each contact point potentially carries two layers of information: URL + content. URL: each URL (page/screen) locates an Internet page address or a specific page of a product. It can be a page URL of an e-commerce website on a PC, a function page of an app such as Weibo or WeChat on a mobile phone, or a specific screen of a product app. Examples are the product page for a Great Wall wine, a WeChat subscription account page, or the level-complete page of a game.

Content: the content on each URL (page/screen). It can be information about a single item: category, brand, description, attributes, site information and so on. For example: red wine, Great Wall, dry red. For each Internet contact point, the URL determines the weight and the content determines the label.

Note: a contact point can be a website or a specific functional screen of a product. Take the same bottle of mineral water: a supermarket sells it for 1 yuan, a train for 3 yuan, and a scenic spot for 5 yuan. The selling price depends not on the cost but on the place of sale. The label is mineral water in every case, but different contact points carry different weights. The weight here can be understood as how much users need the mineral water, that is, how much they are willing to pay.

Label and weight:

Mineral water 1 // supermarket
Mineral water 3 // train
Mineral water 5 // scenic spot

Similarly, a user browsing wine information on JD.COM Mall and a user browsing it on Shangpin Wine Network show different degrees of preference for wine. The point is that the same label carries different weights on different websites, and the weight model has to be built according to each business's needs.

So the URL itself represents the weight of the user's label preference, while the content behind the URL supplies the label information.
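
The split between label and weight can be sketched with the mineral-water figures above. The dictionary and the `tag` function are illustrative names, not part of any existing system.

```python
# Weights from the mineral-water example: same label, different contact points.
CONTACT_POINT_WEIGHT = {
    "supermarket": 1,
    "train": 3,
    "scenic spot": 5,
}


def tag(label, contact_point):
    """Content supplies the label; the contact point supplies the weight."""
    return (label, CONTACT_POINT_WEIGHT[contact_point])


print(tag("mineral water", "supermarket"))  # ('mineral water', 1)
print(tag("mineral water", "scenic spot"))  # ('mineral water', 5)
```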

What was done: the type of user behavior. For e-commerce, typical behaviors include browsing, adding to cart, searching, commenting, buying, liking, adding to favorites and so on.

Different behavior types have different weights for the tag information generated by the content of the contact point. For example, the purchase weight is 5 and the browsing weight is 1.

Red wine 1 // browse red wine

Red wine 5 // buy red wine

Based on the above analysis, the data model of a user portrait can be summarized as the following formula: user identifier + time + behavior type + contact point (URL + content). A given user did something at a given time and place, and is therefore given a corresponding label.
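
One possible way to hold such an event is a small record type, sketched below. The field names, the example URL and the `BehaviorEvent` class are assumptions for illustration. Only the browse and buy weights come from the text.

```python
from dataclasses import dataclass

# Behavior weights given in the text; other behavior types would need their own.
BEHAVIOR_WEIGHT = {"browse": 1, "buy": 5}


@dataclass(frozen=True)
class BehaviorEvent:
    user_id: str           # which user
    timestamp: int         # what time (Unix seconds)
    behavior: str          # what was done
    url: str               # contact point: URL
    content_labels: tuple  # contact point: labels taken from the content

    def label_weights(self):
        """Attach the behavior weight to every label found in the content."""
        weight = BEHAVIOR_WEIGHT[self.behavior]
        return {label: weight for label in self.content_labels}


event = BehaviorEvent(
    user_id="user-A",
    timestamp=1395121950,
    behavior="buy",
    url="https://shop.example/wine/123",  # hypothetical URL
    content_labels=("red wine",),
)
print(event.label_weights())  # {'red wine': 5}
```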

For example, user A browsed a bottle of Great Wall dry red wine worth 238 yuan on Shangpin Wine Online yesterday.

Label: red wine, Great Wall

Time: because the behavior happened yesterday, assume a decay factor of r = 0.95.

Behavior type: browsing is recorded with weight 1.

Place: the site sub-weight of the Shangpin wine product page is 0.7 (in contrast, the sub-weight of the JD.COM wine product page is 0.9).

Assuming that users really like red wine, they will go to a professional wine network to buy it instead of buying it in a comprehensive mall.

Then the user preference label is red wine, with weight 0.95 * 0.7 * 1 = 0.665. That is, user A: red wine 0.665, Great Wall 0.665.
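
The same arithmetic as a minimal sketch, assuming the weight is simply the product of the three factors as in the example. The function and parameter names are illustrative.

```python
def label_weight(decay, site_weight, behavior_weight):
    """Weight of a label as the product of time decay, site and behavior weights."""
    return decay * site_weight * behavior_weight


# Example values from the text: yesterday's decay 0.95, site sub-weight 0.7, browsing weight 1.
w = label_weight(decay=0.95, site_weight=0.7, behavior_weight=1)

# Round to three decimals to hide floating-point noise.
profile = {label: round(w, 3) for label in ("red wine", "Great Wall")}
print(profile)  # {'red wine': 0.665, 'Great Wall': 0.665}
```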

The weights chosen above are only examples for reference; actual weights require secondary modeling according to business requirements. The emphasis here is on how to think about the user portrait model as a whole and then refine it step by step.

This article does not cover specific algorithms; it sets out a way of thinking that offers systematic, structured guidance when planning a user portrait.

The core is understanding the user's contact point: the content of the contact point directly determines the label information. The URL, behavior type and time decay determine the weight, and the weight model is the key; secondary modeling of the weight values themselves is a natural refinement. The example model focuses on e-commerce, but contact points can be redefined for different products.

For example, for film and television products: after a user watches the movie "True colors of heroes", the possible labels are Chow Yun Fat 0.6, gunfight 0.5, and Hong Kong and Taiwan 0.3. Finally, a contact point does not necessarily have content; it can also be reduced to a threshold, such as how many times a behavior has occurred or how long it has lasted.

For game products, for example, typical contact points may be key tasks, key metrics (scores) and so on. If a score exceeds 10000, the user is labeled a diamond user: diamond user 1.0.
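
A threshold-based label of this kind might look like the sketch below. The function name is illustrative, and "exceeds" is read as strictly greater than 10000.

```python
DIAMOND_SCORE_THRESHOLD = 10000  # threshold from the text


def threshold_labels(score):
    """Return the labels earned by a game score; empty if no threshold is passed."""
    if score > DIAMOND_SCORE_THRESHOLD:
        return {"diamond user": 1.0}
    return {}


print(threshold_labels(12000))  # {'diamond user': 1.0}
print(threshold_labels(8000))   # {}
```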

Percent has applied user portrait technology throughout its recommendation engine. In one e-commerce client's application, for new visitors to a campaign page, personalization based on user portraits significantly outperformed a best-seller list: the click-through rate of the recommendation column rose by 27% and the order conversion rate rose by 34%.

The internal information of financial enterprises is spread across different systems. Generally, demographic information is concentrated in the customer relationship management (CRM) system; credit information is concentrated in the transaction and product systems, and partly in the CRM system as well; consumption characteristics are concentrated in channel and product systems.

Hobbies and social information have to be brought in from outside. For example, customers' behavior traces can indicate their hobbies and brand preferences, and the location information of mobile devices can make hobby information more accurate. Social information can be collected and analyzed with the financial enterprise's own text mining capability, or obtained directly from social networking sites with the help of vendors' technology. Social information is often real-time, with high commercial value and a high conversion rate, and it is the main source of information for big data prediction. For example, a user asking on a social networking site where the fun places in Rome are may want to travel abroad soon; a customer comparing the pros and cons of two cars is likely to buy a car. Financial enterprises can step in at the right time and offer financial services.

Customer portrait data falls into five main categories: demographic attributes, credit information, consumption characteristics, hobbies and social information. These data are spread across different information systems, and financial enterprises generally already have a data warehouse (DW) online. All the strongly correlated information relevant to the portrait can be collected and consolidated from the data warehouse, and the raw data for user portraits can be produced by batch jobs that process it according to the portrait's business requirements.

The data warehouse thus becomes the main processing tool for user portrait data: it classifies, filters, summarizes and processes the raw data according to business scenarios and portrait requirements, producing the raw data needed for user portraits.

More dimensions is not always better for a user portrait. It is enough to find, within the five categories, the portrait information that is strongly related to the business scenario and to the product and its target customers. There is no such thing as a 360-degree user portrait, and no body of information rich enough to understand customers completely. In addition, the validity of the data should also be considered.

By the principles above, all portrait information should be strongly correlated information within the five categories. Strongly correlated information is information strongly related to the business scenario, which helps financial enterprises locate target customers, understand their potential needs and develop the products they need.

Only strongly correlated information can help financial enterprises connect with business needs and create business value. For example, name, mobile phone number and home address are the strongly correlated demographic information, since they allow customers to be reached; income, education, occupation and assets are the strongly correlated credit information. Tourists, overseas travelers, car users and the mother-and-baby segment are strongly correlated consumption characteristics. Photography enthusiasts, game enthusiasts, fitness enthusiasts, moviegoers and outdoor enthusiasts are strongly correlated hobby information. Information posted on social media, such as travel needs, travel guides, financial inquiries, car needs and real estate needs, reflects users' real needs and is the strongly correlated information for social information scenarios.

Financial enterprises hold a great deal of internal information. Not all of it is needed at the user portrait stage; only the information strongly related to the business scenario and target customers should be used. This helps improve product conversion rates and the return on investment (ROI), makes it easier to find business application scenarios, and is also easier to implement when the data is put to commercial use.

Do not make user portrait work so complicated that it has little to do with the business scenario. That causes many financial enterprises, and especially their leaders, to lose interest in user portraits, fail to see their business value, and become unwilling to invest in big data. Bringing business value to the enterprise is the main motivation and purpose of user portrait work.

After collecting all the information, financial enterprises need to classify and screen the quantitative information according to their business needs. It is suggested that this work be done in the data warehouse rather than in the big data management platform (DMP).

Classifying quantitative information into qualitative categories is an important part of user portraits. It depends heavily on the business scenario and tests how well the business requirements of the portrait have been translated. Its main purpose is to help enterprises simplify complex data, classify transaction data qualitatively, and incorporate the needs of business analysis so that the data can be used commercially.

For example, customers can be divided by life stage into students, youth, young-to-middle age, middle age, middle-to-old age and old age. Demand for financial services differs by life stage, so life stage can be used to target customers. Enterprises can divide customers into low-end, mid-range and high-end customers according to income, education and assets, and offer different financial services according to their needs. Financial consumption records, asset information, and products traded and purchased can be used to describe customers' consumption characteristics qualitatively, distinguishing e-commerce customers, wealth management customers, insurance customers, conservative investors, aggressive investors, dining customers, travel customers, high-end customers, civil servant customers and so on. External data can be used to describe customers' hobbies, such as outdoor enthusiasts, luxury enthusiasts, technology product enthusiasts, photography enthusiasts and prospective buyers of high-end cars.

Summarizing quantitative information into qualitative information and labeling it according to business requirements helps financial enterprises find target customers for their products, understand those customers' potential needs, market precisely, reduce marketing costs and improve product conversion rates. In addition, financial enterprises can recommend products at the right time, design products and optimize product processes according to customers' consumption characteristics, hobbies and social information, which raises sales activity and helps them design better products for customers.

The purpose of building portraits from data is to provide data support for business scenarios, including finding a product's target customers and reaching them. Financial enterprises' own data is not enough to understand customers' consumption characteristics, hobbies and social information.

Financial enterprises can introduce external information to enrich customer portrait information, such as information of UnionPay and e-commerce to enrich consumption characteristic information, location information of mobile big data to enrich customer interest information, and data of external manufacturers to enrich social information.

External information has many dimensions and rich content, but bringing it in is a challenging task. Several issues need to be considered: the coverage of the external data, how to link it with internal data, the match rate against internal information, the degree of correlation, and the freshness of the data. External data is of mixed quality, and data compliance is also an important consideration for financial enterprises. For sensitive information such as mobile phone number, home address and ID number, privacy must be taken into account when introducing or matching data. The basic principle is that data can be matched and verified without being exchanged.

External data is not concentrated in one company, so financial enterprises have to spend a lot of time looking for it. Linking external data with internal data is a very complicated problem. Matching on the MD5 values of mobile phone number/device number/ID number is a good method, which does not involve the exchange of private data and allows unique matching. Industry experience suggests that no single provider's external data can meet all of an enterprise's requirements, so several data sources have to be brought in. Generally, coverage above 70% is considered very high, and coverage above 20% is already enough for commercial use.
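
MD5-based matching can be sketched with Python's `hashlib`. It assumes both sides hash the same normalized string (here, the phone number with surrounding whitespace stripped). The phone numbers and the external attributes are invented.

```python
import hashlib


def md5_key(identifier):
    """MD5 hex digest of an identifier; both parties must normalize the same way."""
    normalized = identifier.strip()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Internal customer phone numbers (made up).
internal_phones = ["13800000001", "13800000002", "13800000003"]

# What an external provider might supply: attributes keyed by hashed phone number.
external_by_hash = {
    md5_key("13800000002"): {"interest": "travel"},
    md5_key("13900000009"): {"interest": "cars"},
}

# Match on the hash only; raw phone numbers are never exchanged.
matched = {
    phone: external_by_hash[md5_key(phone)]
    for phone in internal_phones
    if md5_key(phone) in external_by_hash
}
match_rate = len(matched) / len(internal_phones)

print(matched)               # {'13800000002': {'interest': 'travel'}}
print(round(match_rate, 2))  # 0.33
```

The match rate computed this way is one of the figures the text suggests checking when evaluating an external source.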

Good external data partners for the financial industry include UnionPay, Sesame Credit, operators, China Airlines, Yun Teng Tianxia, Tencent, Weibo, Qianhai Credit Information, and the major e-commerce platforms. There are many data providers on the market, and data quality varies, so financial enterprises need to evaluate them one by one or entrust a vendor with bringing the data in. Having an independent third party help the financial industry introduce external data is a good approach, which can reduce data transaction costs and data compliance risks. Big data trading platforms in major cities and regions are another good channel for external data.

The main purpose of user portraits is to let financial enterprises mine the value of the data they already have, using portrait techniques to find target customers and their potential needs, in order to promote products and to design and improve them.

User portraits are an important way to turn data into commercial value in business scenarios. They form an important closed loop in data-driven operations, helping financial enterprises use data for fine-grained operations, marketing and product design. User portraits are about putting data to commercial use: they focus on business scenarios and help financial enterprises analyze customers in depth and find target customers.

The DMP (big data management platform) is where data is turned into value in the user portrait process. Technically, the DMP labels the portrait data, uses machine learning algorithms to find lookalike audiences, and works closely with business scenarios to screen out valuable data and customers, locate target customers, reach them, and record and feed back marketing results. DMPs were used mainly in the advertising industry in the past and rarely in finance, but they will become the main platform for commercial data applications in the future.

A DMP can help a credit card company screen for customers likely to pay in installments in the coming month, customers who buy many electronic products, wealth management customers, high-end customers (few assets at this bank, many at other banks), customers for insurance products such as life, education and auto insurance, conservative investors and aggressive investors. It can then reach these customers and use data to improve product conversion rates. A DMP can also reveal customers' consumption habits, hobbies and recent needs, so that financial products and services can be tailored to them and crossover marketing carried out, using customers' consumption preferences to improve product conversion and user stickiness.

As a platform for bringing in external data, the DMP also feeds valuable external data into the financial enterprise, supplements the user portrait data, and supports different business application scenarios and requirements. Mobile big data, e-commerce data and social data in particular help financial enterprises realize the value of their data, bring user portraits closer to commercial use, and demonstrate their commercial value.

The key to user portraits is not analyzing customers from 360 degrees but bringing business value to the enterprise. Talking about user portraits without commercial value is empty posturing. A financial enterprise's user portrait project must start from business needs, strongly correlated data and business application scenarios. The essence of user portraits is to analyze customers in depth, hold valuable data, find target customers, tailor products to customer needs, and use data to realize value.

Banks have rich transaction data, personal attribute data, consumption data, credit data and customer data, and a strong need for user portraits, but they lack social information and interest information.

Customers who visit bank branches are getting older, and in the future consumers will mostly do their banking online. Banks no longer see their customers in person, cannot understand their needs, and lack the means to reach them. Analyzing customers, understanding customers, finding target customers and designing products for them have become the main purposes of user portraits at banks. Banks' main business needs center on consumer finance, wealth management and financing services, and user portraits should look for target customers from these angles.

Banks' customer data is very rich, with many types, large volumes and many systems, so they can follow the five steps of user portraits strictly. First, use the data warehouse to consolidate the data, screen out the strongly correlated information, turn quantitative information into qualitative categories, and generate the data the DMP needs. Then use the DMP to build basic labels and applications, and screen target customers or analyze users in depth according to the needs of the business scenario. At the same time, use the DMP to bring in external data, refine the design of data scenarios and improve the accuracy of targeting. Next, find ways to reach customers, market to them, feed back the marketing results, and measure the commercial value of the data products. Finally, use the feedback data to adjust marketing activities and improve return on investment, closing the marketing loop and the loop of realizing commercial value from data. In addition, the DMP can analyze customers in depth, support product design based on their consumption characteristics, hobbies, social needs and credit information, and provide data support for product development and scenario data for how products are sold.

Below is a brief introduction to some data scenarios that a DMP can implement.

Looking for installment customers

Using card-issuing data + the bank's own data + credit card data, find users whose credit card spending exceeds their monthly income and recommend installment payment to them.
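
The screening rule in this scenario can be sketched as a simple filter. The record layout and field names are hypothetical; in practice the fields would come from the joined card-issuing, internal and credit card data.

```python
# Made-up customer records after joining the data sources.
customers = [
    {"id": "c1", "monthly_income": 8000, "card_spend_this_month": 12500},
    {"id": "c2", "monthly_income": 15000, "card_spend_this_month": 4000},
    {"id": "c3", "monthly_income": 6000, "card_spend_this_month": 6000},
]


def installment_candidates(records):
    """IDs of customers whose credit card spending exceeds their monthly income."""
    return [
        r["id"]
        for r in records
        if r["card_spend_this_month"] > r["monthly_income"]
    ]


print(installment_candidates(customers))  # ['c1']
```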

Looking for high-end asset customers

Using card-issuing data + mobile location data (villas/upscale residential areas) + property fee direct-debit data + the bank's own data + vehicle model data, find users who hold few assets at this bank but many at other banks, and offer them high-end asset management services.

Looking for wealth management customers

Using the bank's own data (transactions + salary) + activity data from mobile finance apps/e-commerce, find customers who transfer their wages/assets out of the bank but are not active e-commerce spenders. They are likely managing their money through Internet channels; the bank can offer them wealth management services and keep their funds in the bank.

Looking for overseas travel customers

Using the bank's own card consumption data + mobile device location information + social data related to overseas travel (guides, routes, attractions, costs), find overseas travel customers and offer them financial services.

Looking for loan customers

Using the bank's own data (demographic attributes + credit information) + mobile device location information + social information strongly related to purchases/consumption, find target customers who are about to buy a car or a home and offer them financial services (mortgages/consumer loans).

Source: secondary compilation by Qiantang big data; original source: TalkingData, Bao Zhongtie.