Saturday, March 2, 2013

Machine learning and its impact on human learning

Machine learning is a technique for finding a model within a large dataset. IBM used it to create Watson, which can answer Jeopardy questions using knowledge harvested from the web. Google uses it to translate text from one natural language to another by finding patterns in already translated corpora. Financial firms use it to analyze the stock market and other markets from historical data, as well as to detect fraud. Insurance companies use it to calculate risk. It is also used in scientific research, such as analyzing data from the Large Hadron Collider. Machine learning is enabled by the massive computing power that big companies can assemble from the large quantities of computers that modern manufacturing churns out at incredible speed.

Machine learning, in short, is outsourcing research to machines. This is research whose results have practical uses, but where the process of analyzing data to find patterns and formulate a model that explains the phenomenon is mind-numbingly tedious. However, machines can't figure out the research methodology itself. Humans preprogram the machine with a generic model, and machine learning finds a substructure within that generic model that can be used to predict the phenomenon. If the generic model doesn't predict well, it is generalized even further. This becomes a process of trial and error on the human's part.

A generic model might not work well because every model has limits on its expressive power. For example, a linear model (in which variables relate to one another as a simple sum of coefficients multiplied by first-degree variables, e.g. \(x = 2y + 3z\)) cannot fit data that are quadratic in nature (e.g. \(x = y^2 + z^2\)), and a quadratic model cannot fit data that are cubic in nature (e.g. \(x = z^3\)). A generic polynomial model cannot exactly fit data that are exponential or transcendental in nature. Taylor series do show that a polynomial can approximate and interpolate such data within a bounded range, but a polynomial of fixed degree cannot extrapolate beyond that range.
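
To make the last point concrete, here is a minimal sketch in Python. It assumes a degree-5 Taylor polynomial of the exponential function around 0; the function names `taylor_exp` and `relative_error` are made up for this illustration. It compares the polynomial against the true value close to the expansion point and far from it.

```python
import math

def taylor_exp(x, degree):
    """Evaluate the Taylor polynomial of exp around 0, truncated at `degree`."""
    return sum(x ** k / math.factorial(k) for k in range(degree + 1))

def relative_error(x, degree=5):
    """Relative error of the truncated polynomial against math.exp at x."""
    exact = math.exp(x)
    return abs(taylor_exp(x, degree) - exact) / exact

# Close to the expansion point the polynomial tracks exp closely.
near = relative_error(0.5)

# Far from the expansion point the same polynomial misses most of the value.
far = relative_error(10.0)

print(f"relative error at x = 0.5: {near:.2e}")
print(f"relative error at x = 10:  {far:.2e}")
```

Raising the degree widens the range in which the approximation is usable, but for any fixed degree there is still a point beyond which the polynomial falls behind the exponential.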

The more general the model is, the more the computing power required to find a submodel grows, and it grows disproportionately, up to the point where the computation becomes intractable (e.g. integer linear programming is NP-complete) or simply unsolvable.

Also, even if the data is known to fit a simple computable model, finding the right submodel requires a stroke of luck. For example, simulated annealing is an optimization technique used in machine learning that starts the search with random model parameters and iteratively adjusts them until they converge on values that fit the data. Two annealing attempts can yield drastically different parameters that both fit the existing data but disagree in predicting future phenomena.
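
A toy illustration of this in Python follows. Everything in it is an assumption made for the sketch: the model \(y = ax + bx^2\), the two training points, the linear cooling schedule, and the step size. It is a bare-bones annealing loop, not a tuned implementation.

```python
import math
import random

# Hypothetical training data: only x = 0 and x = 1 were ever observed.
TRAINING = [(0.0, 0.0), (1.0, 1.0)]

def predict(params, x):
    """Toy model y = a*x + b*x^2 with params = [a, b]."""
    a, b = params
    return a * x + b * x * x

def loss(params):
    """Sum of squared errors on the training data."""
    return sum((predict(params, x) - y) ** 2 for x, y in TRAINING)

def anneal(seed, steps=20000):
    """Search for parameters that fit TRAINING, starting from a random point."""
    rng = random.Random(seed)
    current = [rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)]
    current_loss = loss(current)
    best, best_loss = current, current_loss
    for step in range(steps):
        # Linear cooling; the temperature stays positive because step < steps.
        temperature = 1.0 - step / steps
        candidate = [p + rng.gauss(0.0, 0.1) for p in current]
        candidate_loss = loss(candidate)
        delta = candidate_loss - current_loss
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_loss = candidate, candidate_loss
            if current_loss < best_loss:
                best, best_loss = current, current_loss
    return best

run_a = anneal(seed=1)
run_b = anneal(seed=2)
for name, params in (("run A", run_a), ("run B", run_b)):
    print(name, "training loss:", loss(params), "prediction at x = 2:", predict(params, 2.0))
```

Any pair of parameters with \(a + b = 1\) fits the two training points equally well, so both runs should reach a low training loss while generally disagreeing about the prediction at \(x = 2\), a point the training data never covered.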

Some of these models can have thousands or millions of parameters. After a model is found, it is extremely hard to explain why it fits the data. The effectiveness of the model is evaluated using two corpora, one for training and one for verification, where each corpus is a set of data that associates inputs with expected outputs. The training corpus is used to build the model, and the verification corpus is used to evaluate the model on inputs that it has never seen before.
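
The two-corpus evaluation can be sketched in a few lines of Python. The data here is synthetic (a noisy line, \(y = 2x + 1\)), and the 40/10 split and the helper names `fit_line` and `mean_squared_error` are assumptions for illustration only.

```python
import random

def fit_line(pairs):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(model, pairs):
    """Average squared difference between predictions and expected outputs."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in pairs) / len(pairs)

# Synthetic corpus: inputs paired with expected outputs, plus a little noise.
rng = random.Random(0)
corpus = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in range(50)]

# Shuffle, then hold back part of the corpus that the model never sees.
rng.shuffle(corpus)
training, verification = corpus[:40], corpus[40:]

model = fit_line(training)
verification_error = mean_squared_error(model, verification)
print("slope and intercept:", model)
print("error on unseen inputs:", verification_error)
```

A low error on the verification corpus says the model predicts well on inputs it has not seen. It says nothing about why the data behaves that way.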

There is very little understanding of how the model works, and even less of how to explain the phenomenon. For example, a stock market model can be used to predict stock trends, but it does not tell us that there are various actors in the market (sellers and buyers, short-term and long-term investors, actors with large capital and actors with small capital) or how their behaviors influence the market. Language translation using machine learning does not help us understand why natural languages evolved the way they did.

Machine learning is like using proprietary technology: a service whose inner workings we don't care about. The problem is solved, but we don't know how. Since no insight is gained, there is no understanding to build on in order to change the world. For example, a stock market model does not help shape financial regulation because it does not explain the behavior of the people controlling the capital. A machine translation model does not help us improve a language so that people can express ideas and communicate better.

In a way, machine learning takes thoughtful problem solving out of research. We become complacent in our thinking: when there is a hard problem, we throw more computing iron at it and let the computer find an answer for us. We should take inspiration from the story of the crow that drank from a pitcher by dropping pebbles into it until the water level rose high enough to reach. The moral of the story is the virtue of thoughtfulness over brute strength.

Let us not forget that thoughtful problem solving is part of what makes us human, and not lose that quality because of computers.
