Multi-round dialogues and word slots
Basic definition: What is multi-round dialogue? Multi-round dialogue (closed domain) is the way a human-machine dialogue, once the user's intent has been preliminarily identified, collects the necessary information and finally arrives at a clear user instruction. One multi-round dialogue corresponds to the handling of one task.
Supplementary note 1: Must the so-called "necessary information" be obtained by talking to the user? Not necessarily. Even in conversation between people, the words themselves carry only a small part of the information transmitted; much more comes from context such as the speaker's identity and the current time and place. The ways a multi-round dialogue acquires information should therefore not be limited to what the user says.
Supplementary note 2: Does a multi-round dialogue have to take the form of several exchanges with the user? Not necessarily. If the user's first utterance already provides enough information, or if information from other sources is enough to turn the user's initial intent into a clear user instruction, there will be no further exchanges with the user.
The above defines multi-round dialogue as a whole; the definitions of the individual modules follow.
Second, slots
1) slot
Basic definition: What is a slot? A slot is a piece of information that has to be completed during a multi-round dialogue in order to turn the initial user intent into a clear user instruction. Each slot corresponds to one kind of information needed to handle a task.
Supplementary note: Do all slots of a multi-round dialogue have to be filled? Not necessarily. Take the following conversation as an example:
Me: "How much is it to Xiaoshan Airport?"
Taxi driver: "70"
The "70" here should be understood as 70 yuan (RMB). There is no need to ask, "Do you mean RMB, USD, JPY or HKD?" This kind of information should exist as a default value. In other words, slots are divided into required and optional ones, which matches the earlier point that information does not necessarily have to be obtained through dialogue with the user.
2) word slot and interface slot
As mentioned repeatedly above, the content of the conversation is not the only source of information; the user's identity and the current scene also carry a lot of implicit information worth using. A complete multi-round dialogue system should therefore be able to obtain information both from what the user says and from outside the conversation.
Personally, I call "a slot filled with keywords from the user's words" a word slot, and "a slot filled with scene information such as the user profile" an interface slot.
For example, suppose I say, "I'm going to Shanghai by train the day after tomorrow." Here "the day after tomorrow" and "Shanghai" fill the word slots named "departure time" and "destination", while my current location fills the interface slot named "departure place".
3) Slot group and slot position
I wonder whether the flawed conclusion above caught your attention :)
If you read the example above carefully, you will find a serious contradiction: can't the "departure place" slot be specified by the user? The user could just as well say, "I'm going to Shanghai by train from Beijing the day after tomorrow." Is "departure place" then a word slot or an interface slot? Besides, is "my current location" the only thing that can fill the "departure place" slot? For example, the system could look at my schedule and find that I am going to Hangzhou tomorrow. Shouldn't "Hangzhou", rather than my current location, fill the "departure place" slot?
What does this tell us? The same slot may be filled in more than one way.
I call a slot that may have several filling methods a slot group. A slot group can contain any number of slot positions, that is, any number of filling methods, and each slot position is one of two types: word slot or interface slot.
In essence, a slot group (that is, the "slot" discussed above) corresponds to one kind of information, and there is almost no kind of information that can be obtained in only one way. It is therefore natural for one "slot" to have several filling methods at the same time.
As stated above, the same information can be obtained in several ways, that is, one slot group can have several filling methods (slot positions). There must therefore be a notion of priority among the filling methods.
As in the ticket-booking example above, the "departure place" slot group has three filling methods: one word slot and two interface slots. Naturally the word slot has the highest priority, followed by "the departure place implied by the schedule" and then "my current location".
Combined with the required/optional distinction above, the slot-filling process has to take into account both the priority among slot positions and whether the slot group is required.
Note that required/optional logically belongs to the slot group, not to the slot position. Only a kind of information can be classified as required or optional; a filling method carries no such distinction. Whether a slot group is required is also independent of its interface slots; it depends only on whether interaction with the user is needed.
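The interplay of priority and required/optional can be sketched as follows. This is a minimal illustration, not the author's implementation: the class names `SlotPosition` and `SlotGroup`, the `resolve` method, the context keys `scheduled_city` and `current_city`, and the naive "from <city>" matching are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A filler takes the utterance and a context dict and returns a value or None.
Filler = Callable[[str, Dict[str, str]], Optional[str]]


@dataclass
class SlotPosition:
    kind: str        # "word" or "interface"
    priority: int    # lower number = tried first
    fill: Filler


@dataclass
class SlotGroup:
    name: str
    required: bool               # belongs to the group, not to a position
    question: str = ""           # asked only if a required group stays empty
    default: Optional[str] = None
    positions: List[SlotPosition] = field(default_factory=list)

    def resolve(self, utterance: str,
                context: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
        """Return (value, question_to_ask); exactly one of them is normally set."""
        for position in sorted(self.positions, key=lambda p: p.priority):
            value = position.fill(utterance, context)
            if value is not None:
                return value, None
        if self.required:
            return None, self.question
        return self.default, None


def city_after_from(utterance: str, context: Dict[str, str]) -> Optional[str]:
    # Hypothetical word-slot extractor: looks for "from <city>" in the text.
    for city in ("Beijing", "Shanghai", "Hangzhou"):
        if f"from {city}" in utterance:
            return city
    return None


departure = SlotGroup(
    name="departure place",
    required=True,
    question="Where do you want to depart from?",
    positions=[
        SlotPosition("interface", 3, lambda u, c: c.get("current_city")),
        SlotPosition("interface", 2, lambda u, c: c.get("scheduled_city")),
        SlotPosition("word", 1, city_after_from),
    ],
)
currency = SlotGroup(name="currency", required=False, default="RMB")
```

With this sketch, an utterance containing "from Beijing" wins over the schedule, the schedule wins over the current location, and the optional "currency" group silently falls back to its default instead of triggering a question.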
4) Clarification prompt
Another concept sits at the same level as the slot group (that is, at the level of one kind of information): the clarification prompt.
A clarification prompt is the question the dialogue robot asks when it wants to obtain a piece of information. For example, the clarification prompt for "destination" is "Where do you want to go?" and the one for "departure time" is "When do you want to leave?"
Obviously, a clarification prompt belongs to the slot group, not to a slot position.
5) Filling slot positions
As mentioned above, a slot group can have several slot positions, each of which is either a word slot or an interface slot.
Let's start with the word slot.
In fact, extracting word-slot information is rather troublesome, but that is a parsing problem beyond the scope of this article, so it is only mentioned briefly here.
Synonym dictionaries, rules and bidirectional LSTM+CRF each offer a workable approach.
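As a rough illustration of the dictionary-and-rule approach only (not of LSTM+CRF), word-slot extraction might look like the sketch below. The dictionaries `CITY_SYNONYMS` and `RELATIVE_DAYS`, the function names and the "to <city>" rule are all hypothetical and far simpler than a real parser.

```python
import re
from typing import Dict, Optional

# Hypothetical synonym dictionary: lower-cased surface form -> canonical value.
CITY_SYNONYMS: Dict[str, str] = {
    "shanghai": "Shanghai",
    "beijing": "Beijing",
    "peking": "Beijing",
    "hangzhou": "Hangzhou",
}

# Hypothetical relative-date dictionary: phrase -> offset in days from today.
RELATIVE_DAYS: Dict[str, int] = {
    "today": 0,
    "tomorrow": 1,
    "the day after tomorrow": 2,
}


def extract_destination(utterance: str) -> Optional[str]:
    """Rule: the destination is a known city that directly follows 'to'."""
    for match in re.finditer(r"\bto\s+([A-Za-z]+)", utterance):
        city = CITY_SYNONYMS.get(match.group(1).lower())
        if city is not None:
            return city
    return None


def extract_day_offset(utterance: str) -> Optional[int]:
    """Dictionary lookup; longer phrases are tried first so that
    'the day after tomorrow' is not mistaken for 'tomorrow'."""
    text = utterance.lower()
    for phrase in sorted(RELATIVE_DAYS, key=len, reverse=True):
        if phrase in text:
            return RELATIVE_DAYS[phrase]
    return None
```

For "I'm going to Shanghai by train tomorrow" this yields the destination "Shanghai" and a day offset of 1; an utterance with no known city or date phrase yields `None` for both, leaving the slot position unfilled.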
Now for the interface slot.
Compared with the word slot, the interface slot raises one extra question: is the result returned by the interface really the result the user needs?
Two situations are discussed here. In the first, we know for certain that the return value of the interface can be filled directly into the slot position (not the slot/slot group) without confirming it with the user.
It should be stressed that even this situation does not mean the slot/slot group has only this one interface slot. There are two possibilities: either the slot group has a single slot position, in which case filling the interface's return value into the slot position amounts to filling the slot/slot group; or the slot group has several slot positions, in which case the value of the interface slot is not necessarily the value of the slot/slot group.
In the second situation, we know that the return value of the interface can only serve as a reference and that the user's help is needed to fill the slot position.
In this case the user must be offered options and makes the final decision on the filled value. As with word slots, single-valued versus multi-valued filling has to be handled here. Single/multi-valued logically belongs to the slot group.
Also pay attention to negative options here. For example, if I tell Ali Xiaomi that I forgot my password, it fetches my current account through an interface and then offers it to me as an option, asking "Which account's password did you forget?" But besides my current account there has to be one more option: "No, not this account".
This points to a class of problems: the user's intent is not necessarily among the values returned by the interface. So there must be an option such as "none of these", which I call the reject option.
When the user selects the reject option, the slot position has failed to fill, and a special value must be filled in to mark the failure. A user selecting the reject option can be handled together with other exceptions such as a failed interface call, because all of them mean that this slot position failed to fill, that is, this way of obtaining information failed to obtain it.
If the slot group has only this one slot position, the special failure value becomes the value of the whole slot group. If there are other slot positions, the value of the slot group is finally determined by the priority among the slot positions.
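The reject option and the failure value can be sketched like this. The names `FILL_FAILED`, `REJECT`, `fill_interface_slot` and `resolve_group` are assumptions for illustration; `fetch` stands in for whatever interface call returns candidate values, and `choose` stands in for showing options to the user.

```python
from typing import Callable, List

FILL_FAILED = object()      # special value marking a failed slot position
REJECT = "None of these"    # the reject option appended to every option list


def fill_interface_slot(fetch: Callable[[], List[str]],
                        choose: Callable[[List[str]], str]):
    """Ask the interface for candidates, let the user pick one,
    and treat both interface errors and rejection as the same failure."""
    try:
        candidates = fetch()
    except Exception:
        # Broad on purpose in this sketch: any interface failure counts.
        return FILL_FAILED
    choice = choose(candidates + [REJECT])
    if choice == REJECT or choice not in candidates:
        return FILL_FAILED
    return choice


def resolve_group(position_values: list):
    """position_values is ordered by priority; the first position that did
    not fail decides the value of the slot group."""
    for value in position_values:
        if value is not FILL_FAILED:
            return value
    return FILL_FAILED
```

If the only slot position fails, `resolve_group` returns the failure value for the whole group; if a lower-priority position succeeded, its value is used instead.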
6) Level slots and dependent slots
In the final analysis, everything above concerns filling a single slot group, that is, acquiring one kind of information. But the purpose of multi-round dialogue is to turn the initial user intent into a clear user instruction, which usually requires more than one kind of information.
Having covered the relationship between slot group and slot position, let's turn to the relationship between slot group and slot group, that is, between one kind of information and another.
To make this easier to understand, I will give two examples that represent the two extreme situations in multi-round dialogue.
The first: booking a ticket requires knowing the user's departure time, departure place, destination and seat type. There is no dependency among these four slot groups. In other words, you only need to decide the clarification order of the required slot groups among the four, and then, after receiving the user's utterance, clarify the unfilled required slot groups in turn. I call the relationship among these four slot groups the level slot relationship.
On the other hand, I wonder whether readers have played Orange Light or other story games with multiple endings. What characterizes them? Every choice affects how the plot develops, that is, the filling result of each slot group affects the filling of other slot groups. In other words, some slot groups depend on the filling result of an earlier slot group and cannot be filled until the slot group they depend on has been filled. I call this relationship between slot groups the dependent slot relationship.
In this case the whole multi-round dialogue forms a tree, in the extreme case a full tree. Each node of the tree holds one slot group, whose filling result determines the direction of the subsequent conversation.
Which slot relationship to use should be decided by the actual business scenario.
If level slots are wrongly managed with the dependent slot relationship, information is lost. For example, if three level slot groups A, B and C are managed as the dependency chain A -> B -> C, then even if the user's utterance contains the information for slot groups B and C, filling B and C may fail because slot group A has not been filled yet.
If dependent slots are managed with the level slot relationship, there is information redundancy. For example, suppose the relationship among A, B and C is A with A1 -> B and A2 -> C. Even if the user fills slot group A with the value A1, the system will still ask the user for the unnecessary slot group C.
The two cases above are the extremes of a purely level and a purely dependent slot relationship. In real business scenarios the two coexist: some slot groups are level with one another and others depend on one another.
In real business scenarios, a complete multi-round dialogue process usually takes the form of a tree. Each node holds one or more slot groups for obtaining one or more kinds of information. Slot groups in different nodes are in a dependent relationship; slot groups within the same node are level.
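One way to picture such a tree is sketched below, using the A, A1 -> B, A2 -> C case from the text. The `Node` class and the `next_slot_group` function are hypothetical names; the sketch only decides which slot group to clarify next and ignores slot positions and optional groups.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    # Level slot groups: no dependency among them inside one node.
    slot_groups: List[str]
    # Dependent relationship: (slot group, filled value) -> next node.
    children: Dict[Tuple[str, str], "Node"] = field(default_factory=dict)


def next_slot_group(root: Node, filled: Dict[str, str]) -> Optional[str]:
    """Walk the tree along the filled values and return the first
    slot group that still has to be clarified, or None when done."""
    node: Optional[Node] = root
    while node is not None:
        for group in node.slot_groups:
            if group not in filled:
                return group
        node = next(
            (child for (group, value), child in node.children.items()
             if filled.get(group) == value),
            None,
        )
    return None


# A decides the branch: A1 leads to B, A2 leads to C.
tree = Node(["A"], {("A", "A1"): Node(["B"]), ("A", "A2"): Node(["C"])})

# A purely level example: one node, four slot groups, no children.
booking = Node(["departure time", "departure place", "destination", "seat type"])
```

After A is filled with A1 only B is asked for, so the redundant question about C never arises, while in the `booking` node any already-filled group is skipped regardless of order.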
Above, a multi-round dialogue was defined as the handling of one task, a slot group/slot as the acquisition of one kind of information, and a slot position as one way of acquiring that information. Here I tend to define a node of the multi-round dialogue tree as one step in handling the task.
Handling one task involves several steps; each step requires one or more kinds of information; and each kind of information can be obtained in one or more ways.
These definitions differ somewhat from those used by the algorithm experts on my team, but this is my article after all :) so we will go with mine.
7) The significance of slot filling
Putting the above together, slot filling serves two purposes: as a condition for branching the multi-round dialogue, and as information for completing the user's intent. In other words, slot filling is not only the way the user's intent is completed; the filling of preceding slot positions also guides the direction in which later information is completed.
8) Access conditions
As noted above, a complete multi-round dialogue process usually takes the form of a tree with several nodes, each representing one step in handling the task.
Each node should have its own access conditions. The root node of the tree usually constrains the output of the NLU module, that is, it states which user intents this multi-round dialogue tree handles. Intermediate and leaf nodes are usually constrained by the filling results of preceding slot groups and by other background information. (If everything, including the NLU output and other background information, is treated as the filling result of preceding slot groups, we get a uniform slot group - condition - slot group - condition table, in which slot groups acquire information and conditions constrain it.)
I will try to describe a complete system of access conditions from two angles.
The first is how multiple conditions are organized: access conditions should logically support AND, OR and NOT between conditions. Baidu's UNIT platform offers a fairly mature form, which organizes access conditions into conditions and condition groups. Conditions are contained in condition groups; conditions within a group are ANDed and condition groups are ORed (of course, AND and OR can be swapped to suit your own business), and a condition itself supports NOT.
The second is the restricting power of a single condition: access conditions should support constraints on the filled value, the filling method and the filling state of preceding slot groups. In other words, value conditions, type conditions and state conditions are needed. Simply put, state is "whether it was filled", type is "who filled it" and value is "what was filled".
Different business scenarios need different constraints. As mentioned above, slot filling serves two purposes: branching the multi-round dialogue and completing the user's intent. If a slot group only completes information, we usually care only about whether it was filled: once it is filled, the subsequent steps go ahead, no matter who filled it or with what. If, however, the filled value affects the direction of the subsequent dialogue, we branch the dialogue on how the slot group was filled or on its filled value.
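A condition system with state, type and value checks might be sketched as below. It assumes AND inside a condition group and OR between groups; the names `Filled`, `Condition` and `admitted` are invented for this example and do not reflect the actual API of any platform.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Filled:
    value: str        # what was filled
    filled_by: str    # who filled it, e.g. "word" or "interface"


@dataclass
class Condition:
    slot_group: str
    check: str                      # "state", "type" or "value"
    expected: Optional[str] = None  # unused for "state"
    negate: bool = False            # NOT applied to this condition

    def holds(self, slots: Dict[str, Filled]) -> bool:
        filled = slots.get(self.slot_group)
        if self.check == "state":
            result = filled is not None
        elif self.check == "type":
            result = filled is not None and filled.filled_by == self.expected
        elif self.check == "value":
            result = filled is not None and filled.value == self.expected
        else:
            raise ValueError(f"unknown check: {self.check}")
        return result != self.negate


def admitted(groups: List[List[Condition]], slots: Dict[str, Filled]) -> bool:
    """A node is entered if any condition group holds (OR between groups)
    and a group holds if all of its conditions hold (AND inside a group)."""
    return any(all(c.holds(slots) for c in group) for group in groups)
```

A node that only completes information would typically use a single "state" condition, while a branching node would use "value" or "type" conditions on the slot group that decides the branch.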
Third, the answer system, topic switching and state switching
1) Answer system
First, it should be made clear that the nodes of the multi-round dialogue tree are dialogue nodes, not answer nodes; the same answer may appear at several dialogue nodes.
The answer system should be decoupled from the multi-round process, and each answer in the answer system should define its own trigger conditions. For example, given three slot groups A, B and C, the combination A=A1, B=B3, C=C1 gives the first answer, while A=A2, B=B1, C=C2 or A=A3, B=B2, C=C1 gives a second answer.
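A decoupled answer table with per-answer trigger conditions could look like the sketch below. The table `ANSWERS` and the function `select_answer` are illustrative names, and the assignment of the last two slot combinations to a second answer is an assumption made for the example.

```python
from typing import Dict, List, Optional, Tuple

Trigger = Dict[str, str]  # slot group -> required value

# Each answer lists the slot combinations that trigger it.
ANSWERS: List[Tuple[str, List[Trigger]]] = [
    ("answer 1", [{"A": "A1", "B": "B3", "C": "C1"}]),
    ("answer 2", [{"A": "A2", "B": "B1", "C": "C2"},
                  {"A": "A3", "B": "B2", "C": "C1"}]),
]


def select_answer(slots: Dict[str, str]) -> Optional[str]:
    """Return the first answer with a trigger fully matched by the slots."""
    for answer, triggers in ANSWERS:
        for trigger in triggers:
            if all(slots.get(group) == value for group, value in trigger.items()):
                return answer
    return None
```

Because the table knows nothing about dialogue nodes, the same answer can be reached from any node whose filled slot groups match one of its triggers.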
In addition, answers should not be limited to plain text; rich text, interface calls and topic switches can all be regarded as reasonable forms of answer.
2) Topic switching
Topic switching means moving the conversation with the user from one multi-round process to another. It can be divided into active switching and passive switching.
The topic switch mentioned above as a form of answer can be understood as active topic switching.
Passive topic switching means the system finds that it cannot extract from the user's utterance any information that continues the current multi-round dialogue, so it has to parse the utterance again as a brand-new question and identify its topic.
Topic switching, especially active topic switching, raises a new problem: slot inheritance. For example:
Me: "I will take the high-speed train from Hangzhou to Beijing tomorrow."
Me: "Forget it, let's fly."
In this case the robot should not ask again about the "departure place", "departure time" and "destination".
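Slot inheritance can be sketched as copying over the slot groups that both topics share. The topic names, the `TOPIC_SLOT_GROUPS` table and the `inherit_slots` function are hypothetical; a real system would also have to decide whether same-named slot groups really mean the same thing in both topics.

```python
from typing import Dict, Set

# Hypothetical table: which slot groups each multi-round process needs.
TOPIC_SLOT_GROUPS: Dict[str, Set[str]] = {
    "train ticket": {"departure place", "destination", "departure time", "seat type"},
    "plane ticket": {"departure place", "destination", "departure time", "cabin class"},
}


def inherit_slots(old_slots: Dict[str, str], new_topic: str) -> Dict[str, str]:
    """Keep only the filled values that the new topic also asks for."""
    wanted = TOPIC_SLOT_GROUPS[new_topic]
    return {name: value for name, value in old_slots.items() if name in wanted}
```

When the user switches from the train to the plane, "departure place", "destination" and "departure time" carry over, so only "cabin class" is left to clarify.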
Alongside slot inheritance there is a counterpart problem called slot memory, which usually applies to passive topic switching. Because of a parsing error or for some other reason, the user leaves the original topic. When the user returns to the original topic within a certain period of time, they should not have to fill the same slots again. This technique is already used in Ali Xiaomi, where it seems to be called "multi-round state memory".
For example:
Me: Help me book a plane ticket from Hangzhou to Beijing.
VPA: When do you want to leave?
Me: Will it rain in Hangzhou tomorrow?
VPA: There will be a thunderstorm in Hangzhou tomorrow.
Me: What about the day after tomorrow?
VPA: It will be sunny in Hangzhou the day after tomorrow.
Me: Book a plane ticket for the day after tomorrow.
VPA: OK, I've booked you a flight from Hangzhou to Beijing the day after tomorrow.
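Slot memory can be sketched as a store that keeps the slots of a suspended topic for a limited time. The `SlotMemory` class, its method names and the 300-second default are assumptions for illustration, not a description of how Ali Xiaomi implements it.

```python
import time
from typing import Dict, Optional, Tuple


class SlotMemory:
    """Remembers the filled slots of a topic the user has left."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._saved: Dict[str, Tuple[float, Dict[str, str]]] = {}

    def suspend(self, topic: str, slots: Dict[str, str],
                now: Optional[float] = None) -> None:
        """Store a copy of the slots when the user leaves the topic."""
        saved_at = time.monotonic() if now is None else now
        self._saved[topic] = (saved_at, dict(slots))

    def resume(self, topic: str, now: Optional[float] = None) -> Dict[str, str]:
        """Return the remembered slots, or {} if none or if they expired."""
        now = time.monotonic() if now is None else now
        saved_at, slots = self._saved.pop(topic, (now, {}))
        if now - saved_at > self.ttl_seconds:
            return {}
        return slots
```

In the dialogue above, the ticket-booking slots would be suspended when the weather question arrives and resumed when the user says "Book a plane ticket for the day after tomorrow", so only "departure time" has to be taken from the new utterance.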
3) State switching
There is one more question to consider. Since topics can be switched, that is, one multi-round process can switch to another, can the dialogue state inside a multi-round process be switched as well?
Here are two examples.
The first one:
Me: Help me book a plane ticket from Hangzhou.
VPA: Where do you want to go?
Me: (having found out there is a thunderstorm in Hangzhou tomorrow) Change the departure place.
VPA: Where do you want to depart from?
Me: Shanghai.
A multi-round dialogue should be allowed to return to a previous node.
The second one:
Me: I want to buy a cup.
VPA: The following cups are recommended for you. (Display result 1)
Me: Show me others.
VPA: The following cups are recommended for you. (Display result 2)
A multi-round dialogue should allow the same node to be visited repeatedly.
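Both behaviours can be sketched in a few lines. The `Dialogue` class with its `reopen` method and the `recommend` function are hypothetical names; the sketch treats each slot group as one node and pages through results on repeated visits.

```python
from typing import Dict, List, Optional


class Dialogue:
    """Slot groups are clarified in order; any of them can be reopened."""

    def __init__(self, order: List[str]):
        self.order = list(order)
        self.filled: Dict[str, str] = {}

    def current(self) -> Optional[str]:
        """The first slot group that is not filled yet, or None."""
        return next((g for g in self.order if g not in self.filled), None)

    def fill(self, value: str) -> None:
        group = self.current()
        if group is None:
            raise ValueError("nothing left to fill")
        self.filled[group] = value

    def reopen(self, group: str) -> None:
        """Return to a previous node by discarding its filled value."""
        self.filled.pop(group, None)


def recommend(items: List[str], visit: int, page_size: int = 2) -> List[str]:
    """Each repeated visit to the same node shows the next page of items,
    wrapping around when the list is exhausted."""
    if not items:
        return []
    start = (visit * page_size) % len(items)
    return items[start:start + page_size]
```

In the first example, `reopen("departure place")` makes the dialogue ask for the departure place again before moving on to the destination; in the second, each "show me others" simply calls `recommend` with the next visit count.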