How to build an industry knowledge graph?
A knowledge graph has three basic elements: entities, attributes, and relationships. Facts are stored as triples, either entity-relationship-entity or entity-attribute-value. Current knowledge graphs fall into two categories: open-domain knowledge graphs and vertical-domain (industry) knowledge graphs. For example, the knowledge graph Google built for its search engine is open-domain, while knowledge graphs for fields such as finance and e-commerce are vertical-domain.
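The two triple shapes can be sketched in a few lines of Python. This is only an illustration: the Triple type, the predicate names founder_of and nationality, and the facts_about helper are assumptions made for the example, not part of any particular knowledge graph system.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact in the graph: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


# Entity-relationship-entity triple: both ends are entities.
founded = Triple("Bill Gates", "founder_of", "Microsoft")

# Entity-attribute-value triple: the object is a plain value.
nationality = Triple("Bill Gates", "nationality", "United States")

# A very small "graph" is just a set of triples.
graph = {founded, nationality}


def facts_about(graph, subject):
    """Return every triple whose subject matches, in sorted order."""
    return sorted(t for t in graph if t.subject == subject)


for fact in facts_about(graph, "Bill Gates"):
    print(fact)
```

Storing both kinds of fact in the same shape keeps later steps (fusion, alignment, reasoning) simple, because they only ever deal with three-field records.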
The first step is to process the data. Data on the Internet is basically of three kinds: structured, semi-structured, and unstructured. Structured data is generally a company's business data. It is stored in a database and can be used after extracting it and doing some simple preprocessing. Semi-structured and unstructured data, such as a product description or a title, may be a piece of text or an image. It still carries information that corresponds to attributes in the knowledge graph, so that information has to be extracted. This is one of the more time-consuming and laborious tasks in building a knowledge graph.
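For the structured case, the "simple preprocessing" can be as small as turning each database row into entity-attribute-value triples. A minimal sketch, assuming the rows have already been read from the database as dictionaries; the product table, its column names, and the rows_to_triples helper are hypothetical.

```python
def rows_to_triples(rows, key):
    """Turn table rows into (entity, attribute, value) triples.

    `key` names the column that identifies the entity. Empty and
    missing values are skipped, and surrounding whitespace is trimmed.
    """
    triples = []
    for row in rows:
        subject = str(row[key]).strip()
        for column, value in row.items():
            if column == key or value is None:
                continue
            value = str(value).strip()
            if value:
                triples.append((subject, column, value))
    return triples


# Hypothetical rows, as they might come back from a product table.
rows = [
    {"product": " Laptop A ", "brand": "ExampleBrand", "color": None},
    {"product": "Phone B", "brand": "ExampleBrand", "color": "black "},
]

for triple in rows_to_triples(rows, key="product"):
    print(triple)
```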
What needs to be extracted from the data is the entities, attributes, and relationships mentioned above. Entity extraction is what NLP calls named entity recognition. The technology here is relatively mature, ranging from traditional hand-built dictionaries and rules to machine learning methods and, more recently, deep learning. For example, from a piece of text we extract the entity Bill Gates and the entity Microsoft, and then perform relationship extraction: Bill Gates is the founder of Microsoft, so the two entities are linked by that relationship. There is also attribute extraction, for example that Bill Gates' nationality is the United States. After these extractions are finished, the result is a set of relatively scattered facts. These are then fused with the facts obtained from structured data and from third-party knowledge bases before being added to the graph.
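The oldest of these approaches, the manual dictionary plus rules, is easy to sketch. The example below is a toy under stated assumptions: the entity dictionary, the single "is the founder of" rule, and the function names are all invented for illustration, and a real system would normally use a trained NER and relation extraction model instead.

```python
import re

# Hand-built entity dictionary (assumption: tiny and illustrative only).
ENTITY_DICT = {"Bill Gates": "Person", "Microsoft": "Company"}

# Alternation of known names, longest first so longer names win.
_NAMES = "|".join(
    re.escape(name) for name in sorted(ENTITY_DICT, key=len, reverse=True)
)

# One hand-written rule: "<entity> is the founder of <entity>".
FOUNDER_RULE = re.compile(rf"({_NAMES}) is the founder of ({_NAMES})")


def find_entities(text):
    """Dictionary lookup: return (name, type) for each known name in text."""
    return [(name, etype) for name, etype in ENTITY_DICT.items() if name in text]


def find_relations(text):
    """Rule matching: return (head, relation, tail) triples found in text."""
    return [
        (m.group(1), "founder_of", m.group(2))
        for m in FOUNDER_RULE.finditer(text)
    ]


text = "Bill Gates is the founder of Microsoft."
print(find_entities(text))
print(find_relations(text))
```

Rules like this are precise but brittle: any rephrasing of the sentence needs a new pattern, which is one reason learned models replaced them for most text.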
Two more things need to be done: entity alignment and entity disambiguation.
First, entity alignment. For example, the four characters 比尔盖茨 are Bill Gates' name in Chinese and "Bill Gates" is his name in English, but both refer to the same person. Because the text differs, they start out as two separate entities. Entity alignment merges them into a single entity.
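One simple way to do this is an alias table that maps every known surface form to a canonical entity name. The sketch below assumes such a table already exists; the table contents and the normalize, canonical, and align helpers are hypothetical. In practice the table is usually built from a third-party knowledge base or learned from data rather than written by hand.

```python
def normalize(name):
    """Lowercase and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(name.casefold().split())


# Alias table (assumption: hand-written for illustration).
ALIAS_TO_CANONICAL = {
    normalize("Bill Gates"): "Bill Gates",
    normalize("比尔盖茨"): "Bill Gates",
}


def canonical(name):
    """Return the canonical entity name, or the name itself if unknown."""
    return ALIAS_TO_CANONICAL.get(normalize(name), name)


def align(triples):
    """Rewrite subjects and objects to canonical names and drop duplicates."""
    return {(canonical(s), p, canonical(o)) for s, p, o in triples}


raw = [
    ("比尔盖茨", "founder_of", "Microsoft"),
    ("Bill Gates", "founder_of", "Microsoft"),
]
print(align(raw))
```

After alignment the two raw facts collapse into one, because both subjects resolve to the same canonical entity.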
The other is entity disambiguation. For example, "apple" is a fruit, but in some contexts it refers to the company Apple. This is entity ambiguity, and it has to be resolved based on the context.
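A very rough way to use context is to count how many cue words for each candidate meaning appear in the sentence. This is a sketch only: the cue word lists and the disambiguate function are invented for the example, and production systems typically compare learned context representations rather than keyword sets.

```python
import re

# Candidate senses and their cue words (assumption: hand-picked for the demo).
SENSES = {
    "Apple (fruit)": {"fruit", "eat", "juice", "tree", "sweet"},
    "Apple (company)": {"iphone", "company", "mac", "shares", "ceo"},
}


def disambiguate(sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears at all. On a tie, the sense
    listed first in SENSES wins.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("Apple released a new iPhone"))
print(disambiguate("I eat an apple every day"))
```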
After completing the above steps, the next step is to extract the ontology. For example, Microsoft and Apple, mentioned above, are both entities of the type company. The text may never state directly that they are companies, so some method is needed to derive the type. Then an ontology library is built. For example, a company is a kind of organization, which gives a parent-child (hypernym-hyponym) relationship between the two concepts. For concepts at the same level, their similarity also needs to be calculated. For example, at the entity level, Bill Gates and Steve Jobs are relatively similar: both belong to the type person, and both are quite different from a company.
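The hierarchy and the similarity calculation can be sketched together. The example below stores the ontology as a child-to-parent map and scores similarity by path length through the nearest shared ancestor. Both choices are assumptions for illustration: the root type "Thing" is invented here, and path length is only one of several possible similarity measures.

```python
# Child -> parent links (assumption: "Thing" is an invented root type).
PARENT = {
    "Bill Gates": "Person",
    "Steve Jobs": "Person",
    "Microsoft": "Company",
    "Apple": "Company",
    "Company": "Organization",
    "Person": "Thing",
    "Organization": "Thing",
}


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def similarity(a, b):
    """1 / (1 + number of edges between a and b via their nearest shared ancestor)."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(path_b)}
    for i, node in enumerate(path_a):
        if node in depth_in_b:
            return 1 / (1 + i + depth_in_b[node])
    return 0.0  # no shared ancestor


print(similarity("Bill Gates", "Steve Jobs"))
print(similarity("Bill Gates", "Microsoft"))
```

With this map, the two people meet at Person after one step each, while a person and a company only meet at the root, so the first score is higher than the second.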
After completing the above steps, the quality of the knowledge base needs to be evaluated. This step cannot avoid manual work. Once the quality assessment is done, the knowledge graph is finally formed. Even then, some relationships may not be directly obtainable, so knowledge reasoning is performed to expand the graph. For example, cats are members of the feline family, and felines are mammals, so it can be inferred that cats are mammals. But inferences cannot be drawn arbitrarily. For example, Bill Gates is American and Bill Gates founded a company, but that does not mean the company is American.
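The cat example is transitive reasoning over a single relation, and the Bill Gates example shows why the rule must be limited to relations where it is valid. A minimal sketch, assuming the relation is named is_a and that only is_a is treated as transitive; the predicate names and the infer_is_a function are illustrative.

```python
def infer_is_a(triples):
    """Add every fact implied by chaining is_a links; leave other relations alone."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        is_a = [(s, o) for s, p, o in facts if p == "is_a"]
        for a, b in is_a:
            for c, d in is_a:
                # a is_a b and b is_a d  =>  a is_a d
                if b == c and (a, "is_a", d) not in facts:
                    facts.add((a, "is_a", d))
                    changed = True
    return facts


known = [
    ("cat", "is_a", "feline"),
    ("feline", "is_a", "mammal"),
    ("Bill Gates", "nationality", "United States"),
    ("Bill Gates", "founder_of", "Microsoft"),
]

expanded = infer_is_a(known)
print(("cat", "is_a", "mammal") in expanded)
print(("Microsoft", "nationality", "United States") in expanded)
```

Because the rule only fires on is_a, the founder's nationality is never passed on to the company.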