This week, we will talk about how we can compute concepts and implications when we don't really have data, but instead have access to someone who has data, or at least knowledge about the domain we want to conceptualize. In this case, we will communicate with this, let's call him or her, expert through formalized interactive procedures. Our goal will be to get enough domain knowledge to be able to compute the concept lattice of the domain and to find out the implications valid in it. But we'll take a slightly broader perspective and present these interactive procedures in the context of what is called learning with queries, a rather well-known subfield of machine learning.

In classical supervised machine learning, our goal is to build a classifier that can classify objects into several predefined classes. For simplicity, assume that we have only two classes: objects that have a certain target property (positive examples) and those without this property (negative examples). For instance, we may want to classify pictures into two groups: those that contain an image of a pipe and those that don't. The picture on the left is then a positive example (it is not a pipe, but it is an image of a pipe), while the one on the right is not. In the classical setup, we are given a training dataset divided into positive and negative examples, and we want to learn to classify new, previously unseen examples.

Learning with queries originates from the work of Dana Angluin and her colleagues. Here, we still want to learn to classify new examples, but we no longer have a training dataset. Instead, we have access to an oracle capable of answering certain types of questions. Our goal is to design an algorithm that asks as few questions as possible, but enough to build a reliable classifier based on the answers provided. There may be different types of questions that the oracle might be willing to answer.
In a membership query, we ask the oracle to label an example as positive or negative. If, say, we want to learn to recognize dogs, we can show a picture of an animal and ask the oracle whether it's a dog. It's called a membership query because we essentially ask whether a certain object is a member of the class whose description we want to learn. In principle, we can use membership queries to build a training dataset, which can then be used with any standard supervised machine learning algorithm to build a classifier.

Another important type of query is the equivalence query. Here, we suggest a hypothesis H and ask the oracle whether it describes precisely the positive examples of the target property. If the answer is “yes”, our learning is complete: we can use the hypothesis H to classify any new example. If the example matches the hypothesis, it is classified as positive; otherwise, it is classified as negative. If, however, the answer is “no”, there may be two cases. It might be that our hypothesis is too restrictive and there are positive examples of the target property that fail to satisfy it. These are called positive counterexamples. Or it might happen that our hypothesis is too general and matches some objects that do not belong to the class being learned. Such objects are called negative counterexamples. If the oracle answers “no” to an equivalence query, it has to provide a counterexample, either positive or negative; it's up to the oracle to choose which counterexample it provides.

Again, suppose we want to learn to recognize dogs, and we've come up with the idea that dogs are precisely all animals with four legs and curly hair. Why would we have such a strange idea? Well, because some dogs are indeed like this, and, maybe, these are the only dogs we've seen so far.
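As a minimal illustrative sketch, the two kinds of queries can be simulated in Python against an oracle that privately knows the target class. The animals, attributes, and function names here are all made up for illustration; only the idea of the two queries comes from the lecture. A hypothesis is modeled as a set of attributes an animal must have.

```python
# Toy simulation of membership and equivalence queries.
# All data below is invented for illustration.
from typing import Optional, Tuple

ANIMALS = {
    "poodle":    {"four_legs", "curly_hair", "barks"},
    "dachshund": {"four_legs", "barks"},
    "lamb":      {"four_legs", "curly_hair"},
    "parrot":    {"two_legs", "talks"},
}
DOGS = {"poodle", "dachshund"}  # the hidden target class, known only to the oracle

def membership_query(example: str) -> bool:
    """Is this example a member of the target class?"""
    return example in DOGS

def equivalence_query(hypothesis: set) -> Optional[Tuple[str, str]]:
    """Does `hypothesis` (a set of required attributes) describe the target
    class exactly?  Returns None for 'yes', otherwise a labelled counterexample."""
    for animal, attrs in ANIMALS.items():
        matches, is_dog = hypothesis <= attrs, animal in DOGS
        if is_dog and not matches:
            return ("positive", animal)   # a dog the hypothesis misses
        if matches and not is_dog:
            return ("negative", animal)   # a non-dog the hypothesis matches
    return None

# "Dogs are exactly the four-legged, curly-haired animals":
print(equivalence_query({"four_legs", "curly_hair"}))  # → ('positive', 'dachshund')
```

Note that the oracle here simply returns the first mismatch it finds; a real oracle is free to pick any counterexample, positive or negative.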
But if we ask the oracle to verify this hypothesis, it may point out that other kinds of dogs do exist and show us one of them, which will be a positive counterexample to our hypothesis. Or it may show us someone with four legs and curly hair who is nevertheless not a dog: this will be a negative counterexample. When posing an equivalence query, we have no control over which kind of counterexample, positive or negative, we'll get in return.

If we are looking specifically for negative counterexamples, we can pose a subset query instead. That is, we ask if the set of objects satisfying our hypothesis is a subset of the class being learned. For example, we might want to ask if all animals with four legs and curly hair are dogs, that is, whether having four legs and curly hair is a sufficient condition for being a dog. In this case, we may get a lamb as a counterexample, but not, say, a dachshund.

If we are interested in positive counterexamples, we may pose a superset query. This amounts to asking if the class of objects satisfying our hypothesis includes the target class, the one being learned. In other words, in a superset query, we ask if H is a necessary condition for being an instance of the target class; if, say, it is necessary for a dog to have curly hair and four legs. Any counterexample must be positive: something that belongs to the class, but does not satisfy the hypothesis.
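The asymmetry between subset and superset queries can be seen in a small self-contained sketch, again with an invented toy domain (the animals and attributes are illustrative, not from the lecture). Each query checks one direction of mismatch, which is exactly why each can return only one kind of counterexample:

```python
# Toy simulation of subset and superset queries.
# All data below is invented for illustration.
from typing import Optional

ANIMALS = {
    "poodle":    {"four_legs", "curly_hair"},
    "dachshund": {"four_legs"},
    "lamb":      {"four_legs", "curly_hair"},
}
DOGS = {"poodle", "dachshund"}  # hidden target class

def subset_query(hypothesis: set) -> Optional[str]:
    """Is every animal matching `hypothesis` a dog?  (Is the hypothesis
    sufficient?)  Any counterexample is negative."""
    for animal, attrs in ANIMALS.items():
        if hypothesis <= attrs and animal not in DOGS:
            return animal            # matches the hypothesis, but not a dog
    return None                      # "yes"

def superset_query(hypothesis: set) -> Optional[str]:
    """Does every dog match `hypothesis`?  (Is the hypothesis necessary?)
    Any counterexample is positive."""
    for animal, attrs in ANIMALS.items():
        if animal in DOGS and not hypothesis <= attrs:
            return animal            # a dog that fails the hypothesis
    return None                      # "yes"

print(subset_query({"four_legs", "curly_hair"}))    # → 'lamb'
print(superset_query({"four_legs", "curly_hair"}))  # → 'dachshund'
```

With the hypothesis "four legs and curly hair", the subset query returns the lamb (a non-dog matching the hypothesis), while the superset query returns the dachshund (a dog failing it), mirroring the lecture's examples.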