What is lazy learning?
Lazy learning refers to machine learning processes in which generalization of the training data is delayed until a query is made to the system. This type of learning is also known as instance-based learning. Lazy classifiers are particularly useful when working with large datasets that have only a few frequently queried attributes.
Learning systems perform computation at two different times: training time and consultation time.
Training time is the period before consultation time. During training time, the system derives inferences from the training data to prepare for consultation.
Consultation time is the period between the moment an object is presented to the system and the moment the system finishes making an inference about it.
In a lazy learning algorithm, most of the computation is done during consultation time. Essentially, a lazy algorithm defers the processing of examples until it receives an explicit request for information.
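This division of work can be sketched in a few lines of Python. The class below is a minimal, illustrative 1-nearest-neighbor classifier (the class name and method signatures are ours, not from any particular library): "training" merely stores the examples, and all distance computation happens at consultation time.

```python
import math

class Lazy1NN:
    """Minimal lazy (1-nearest-neighbor) classifier sketch."""

    def fit(self, X, y):
        # Training time: no generalization at all, just store the data.
        self.X, self.y = X, y
        return self

    def predict(self, query):
        # Consultation time: compare the query against every stored example.
        distances = [math.dist(query, x) for x in self.X]
        nearest = distances.index(min(distances))
        return self.y[nearest]

clf = Lazy1NN().fit([[0, 0], [1, 1], [5, 5]], ["a", "a", "b"])
print(clf.predict([4, 4]))  # the closest stored point is [5, 5], labeled "b"
```

Note that `fit` runs in constant time while `predict` scans the whole dataset, which is exactly the lazy trade-off described above.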
Why is lazy learning important?
The main motivation for lazy learning is that the dataset is continuously updated with new entries, as in the K-nearest neighbors algorithm employed in online recommendation engines such as those of Netflix and Amazon. Because of these continuous updates, the training data would become obsolete in a short period of time, so there is little point in having an upfront training phase.
Lazy algorithms are therefore very beneficial when working with vast, perpetually changing datasets that have only a few commonly queried attributes.
Even if there is an enormous set of attributes available, recommendation queries tend to rely on a relatively smaller set of attributes.
What are some examples of lazy learning?
Instance-based learning, local regression, K-Nearest Neighbors (K-NN), and Lazy Bayesian Rules are some examples of lazy learning.
What is the difference between lazy learning and eager learning?
Lazy learning and eager learning are very different methods. Here are some of the differences:
Lazy learning systems simply store the training data, or perform only minor processing on it, and wait until test tuples are presented to them.
Eager learning systems, on the other hand, take the training data and construct a classification model before receiving any test data.
So while lazy learning systems have a low or non-existent training time and a high consultation time, eager learning systems have a high training time and a low consultation time.
In eager learning, the system must commit to a single hypothesis that covers the entire instance space, while lazy learning systems can make use of a richer hypothesis space because they employ many local approximations that together form an implicit global approximation to the target function.
Essentially, eager learning methods create a general, explicit description of the target function, based on the training examples provided. Lazy learning essentially postpones generalizing beyond the data stored until an explicit request is made.
In eager learning, a single approximation to the target function is learned from the training data before any input queries are observed. In lazy learning, by contrast, the algorithm constructs a different approximation to the target function for every query instance it encounters.
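One common lazy form of this per-query approximation is a kernel-weighted local average (a simplified cousin of locally weighted regression). The function below is an illustrative sketch; the name `locally_weighted_predict` and the `bandwidth` smoothing parameter are our own choices for the example.

```python
import math

def locally_weighted_predict(X, y, query, bandwidth=1.0):
    """Build a fresh, query-specific estimate from stored (X, y) samples."""
    # Weight each stored example by its closeness to this particular query...
    weights = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in X]
    # ...and form a local approximation that exists only for this query.
    return sum(w * t for w, t in zip(weights, y)) / sum(weights)

# Samples of the target t = x**2; no global model is ever fitted.
X = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 4.0, 9.0]
print(locally_weighted_predict(X, y, 2.0, bandwidth=0.5))
```

Each call recomputes the weights around the query point, which is why lazy methods can track complex target functions with a collection of simple local approximations.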
Lazy learning is rather beneficial for complex and incomplete problem domains in which complex target functions could be represented by a collection of less complex local approximations.
What are the advantages of a lazy learning algorithm?
Here are the most significant advantages of the lazy learning method:
- It is very useful when not all examples are available in advance but must be collected online. In such a situation, a newly observed example only requires an update to the database.
- In lazy learning, collecting examples about an operating regime does not degrade the modeling performance of other operating regimes. Essentially, lazy learning is not prone to suffering from data interference.
- The problem-solving capabilities of a lazy learning algorithm increase with every newly presented case.
- Lazy learning is easy to maintain because the learner will adapt automatically to changes in the problem domain.
- A lazy learner can be applied to multiple problems simultaneously.
What are the disadvantages of a lazy learning algorithm?
Here are the most significant disadvantages of lazy learning:
- A vast amount of memory may be needed to store the data, and every request for information requires the system to build a local model from scratch. In practice, however, this does not tend to be an issue, thanks to advances in hardware and the small number of attributes that need to be stored.
- Lazy learning methods tend to be slower to evaluate, although this can be offset by the quick training phase.
- If the data is noisy, the case base grows needlessly. This is because the algorithm performs no abstraction during the training phase, since (as mentioned earlier) anything learned in advance would soon become obsolete.
- Lazy learning tends to increase computational costs at consultation time: each query incurs its own processing cost, and the processor can only scan a limited number of training data points per query.
- In Case-Based Reasoning, handling very dynamic problem domains involves continuously reorganizing the case base, which can introduce errors into it. The set of previously encountered examples can also become outdated if there is a sudden large shift in the problem domain.
- A lazy learner will only be able to achieve fully automatic operation for complete problem domains. If the problem domain is not complete, there will be a requirement for user feedback for situations in which the learner has no solution.