In a data-driven business, each problem is unique, with its own objectives, constraints and aspirations. However, in order to solve these problems, the Data Scientist's strategy is to break a complex problem down into simpler sub-tasks that we already know how to solve with different Machine Learning algorithms.
How do people solve problems?
In a previous post on the LUCA blog we saw that the basic idea underlying Artificial Intelligence is getting a computer to solve a problem the same way a person would.
There are many ways to try to solve a problem from a human perspective, but two questions in particular are of great help when we are faced with a new situation.
The first is:
Does this look like any problem you’ve solved before?
And the second:
Can I break down this complex problem into several simpler sub-problems?
It is clear that drawing on experience, on what we have learned in previous circumstances, can be very useful. There’s no point in reinventing the wheel. And, on the other hand, when a problem is complex but can be broken down into different parts, it is very likely that we already know how to solve many of them. If we only have to work on the part of the problem that is really new to us, we will save time and be much more effective.
When we work with Machine Learning (ML), instead of programming rule-based code, we work with algorithms and train them with data. There are many algorithms, almost all of them complex, but it is not essential to know them one by one. One of the jobs of the Data Scientist is precisely to determine which one is the most suitable for each particular case, although each professional usually has their own “toolbox” of favorite algorithms, the ones that solve most problems.
However, the tasks that these algorithms allow us to solve, those that we can find as sub-problems within more complex problems, are not that many.
Let’s see what the main ones are:
- Classification: a classification task consists of determining, for a given individual, which class they belong to, based on what we have “learned” from other individuals. For example:
Which Telefónica customers will be interested in this offer?
Based on the customer’s history, summarized in a series of variables such as age, marital status, level of studies, seniority as a client, etc., classification algorithms build a model that allows us to assign a new customer the most appropriate of these two labels: “Will be interested” or “Will not be interested”. Scoring algorithms are very similar, but more specific: they give us the probability that a customer is interested or not (see the sketch below).
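As a rough illustration of the difference between classification and scoring, here is a minimal sketch using scikit-learn. The customer variables and the "interested" labels are invented placeholders, not real Telefónica data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical customer history: [age, years_as_client, monthly_spend]
X = np.array([
    [25, 1, 20.0],
    [40, 10, 55.0],
    [33, 4, 30.0],
    [58, 15, 70.0],
    [29, 2, 25.0],
    [47, 8, 60.0],
])
# 1 = "Will be interested", 0 = "Will not be interested" (invented labels)
y = np.array([0, 1, 0, 1, 0, 1])

model = LogisticRegression().fit(X, y)

new_customer = np.array([[35, 5, 40.0]])
# Classification: assign the most appropriate label
print(model.predict(new_customer))        # e.g. [1]
# Scoring: probability of each label instead of a hard decision
print(model.predict_proba(new_customer))  # e.g. [[0.3, 0.7]]
```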
- Regression: regression tasks are used when you want to estimate the numerical value of a continuous variable. Following the previous example, based on the customers' consumption history, parameterized with the previous variables (or others defined by the Data Scientist), they would allow us to answer questions like this one (see the sketch after the question):
What will be the consumption in … (voice, data, etc.) of this customer in a month?
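A minimal regression sketch, again with invented data: here we estimate next month's data consumption (in GB) from the same kind of customer variables.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical history: [age, years_as_client, current_monthly_spend]
X = np.array([
    [25, 1, 20.0],
    [40, 10, 55.0],
    [33, 4, 30.0],
    [58, 15, 70.0],
])
# Target: data consumption next month, in GB (invented values)
y = np.array([3.5, 12.0, 6.0, 15.5])

model = LinearRegression().fit(X, y)
print(model.predict([[35, 5, 40.0]]))  # estimated GB for a new customer
```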
- Identify similarities: this task consists of identifying “similar” individuals based on the information we have about them. It is the basis of recommendation systems, which offer you different products based on the ones you have previously browsed or purchased (see the sketch below).
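A minimal sketch of how “similar” individuals might be found, using cosine similarity over a toy purchase matrix; the customers and products are invented.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Rows = customers, columns = how many times they bought each (invented) product
purchases = np.array([
    [3, 0, 1, 0],   # customer A
    [2, 0, 2, 1],   # customer B
    [0, 4, 0, 3],   # customer C
])

# Pairwise similarity between customers: values close to 1 mean similar behavior
sim = cosine_similarity(purchases)
print(np.round(sim, 2))
# A and B come out as the most similar pair, so products bought by B
# but not yet by A are natural candidates to recommend to A.
```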
- Clustering: clustering tasks have to do with grouping individuals by their similarity, but without a specific purpose in mind. It is often used in the preliminary data exploration phases, to see whether there is any kind of natural grouping that can suggest the best way to analyze the data. For example, these tasks would give us answers to questions like the ones below (a sketch follows them):
Can our customers be classified into natural groups or segments?
What products should we develop?
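A minimal clustering sketch with k-means on invented customer features; asking for 3 segments is an arbitrary assumption that you would normally explore rather than fix in advance.

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented customer features: [monthly_spend, data_usage_gb]
X = np.array([
    [20, 2], [22, 3], [25, 2],      # low-usage customers
    [60, 15], [65, 14], [70, 18],   # heavy-usage customers
    [40, 8], [42, 7],               # mid-usage customers
])

# Look for 3 "natural" segments (an assumption, not a given)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # segment assigned to each customer
print(kmeans.cluster_centers_)  # typical profile of each segment
```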
- Group co-occurrences: this task looks for associations between “entities” based on how often they appear together in transactions. For example, it would answer the question:
What products are usually bought together?
While clustering techniques seek to group elements based on their attributes, grouping by co-occurrence is based on those elements appearing together in a transaction. For example, it is common for a person who buys a camera to also buy a camera case or a memory card. Therefore, it can be interesting to run promotions for both products at the same time. However, sometimes the coincidences are not so “obvious”, and that is why it is very interesting to analyze them (see the sketch below).
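A minimal sketch of counting co-occurrences in transactions, using only the standard library; the shopping baskets are invented.

```python
from collections import Counter
from itertools import combinations

# Invented transactions (shopping baskets)
transactions = [
    {"camera", "camera_case", "memory_card"},
    {"camera", "memory_card"},
    {"phone", "phone_case"},
    {"camera", "camera_case"},
]

# Count how often each pair of products appears in the same basket
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Pairs bought together most often are candidates for joint promotions
print(pair_counts.most_common(3))
```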
- Profiling: when we talk about profiling, we talk about typical behaviors. These techniques seek to characterize the expected behavior of an individual, group or population, answering questions such as:
What is the typical mobile consumption of this customer segment?
The description of these “typical” behaviors is often used as a reference to detect unusual behaviors or anomalies. Based on a certain customer’s typical purchases, we can detect whether a new charge on their credit card fits that pattern. We can create a “score” or degree of suspicion of fraud, and launch an alert when a certain threshold is exceeded (a minimal sketch follows below).
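A minimal sketch of the profiling idea: characterize a customer's typical charge amount and flag a new charge as suspicious when it deviates too much. The history, the scoring rule and the threshold of 3 standard deviations are all assumptions for illustration.

```python
import numpy as np

# Invented history of credit-card charges for one customer (in euros)
history = np.array([12.5, 30.0, 22.0, 18.0, 25.0, 15.0, 28.0])

# "Typical" behavior summarized as mean and standard deviation
mean, std = history.mean(), history.std()

def fraud_score(charge: float) -> float:
    """Degree of suspicion: how many standard deviations from the typical charge."""
    return abs(charge - mean) / std

new_charge = 450.0
score = fraud_score(new_charge)
if score > 3:  # arbitrary alert threshold
    print(f"Alert: charge of {new_charge} looks unusual (score={score:.1f})")
```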
- Link prediction: this task attempts to predict connections between elements, for example between members of a social or professional network. It powers suggestions like the following (a sketch comes after these examples):
“You and Mary have 10 friends in common. Shouldn’t you be friends?”
“People you probably know”
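A minimal sketch of one of the simplest link-prediction heuristics, counting friends in common between people who are not yet connected; the network is invented.

```python
from itertools import combinations

# Invented friendship graph: person -> set of friends
friends = {
    "you":  {"ana", "luis", "carlos"},
    "mary": {"ana", "luis", "elena"},
    "ana":  {"you", "mary"},
    "luis": {"you", "mary"},
}

# For every pair not yet connected, count how many friends they share
for a, b in combinations(friends, 2):
    if b not in friends[a]:
        common = friends[a] & friends[b]
        if common:
            print(f"{a} and {b} have {len(common)} friends in common: suggest connecting them")
```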
- Data reduction: sometimes it is necessary to reduce the volume of working data. For example, instead of working with a huge database of movie consumption preferences, we can work with a reduced version of it, such as the “genre” of the movie rather than the specific movie. Whenever data is reduced, information is lost; the important thing is to reach a compromise between the loss of information and the improvement of the insights (see the sketch below).
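A minimal sketch of data reduction with PCA: compressing several correlated consumption variables into fewer components, at the cost of some information. The viewing data and the choice of 2 components are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Invented matrix: rows = customers, columns = monthly viewing hours per movie genre
X = np.array([
    [10, 1, 0, 9, 2],
    [8, 2, 1, 7, 3],
    [0, 9, 8, 1, 7],
    [1, 8, 9, 0, 6],
])

# Keep only 2 components instead of 5 genre columns
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                       # (4, 2): a much smaller representation
print(pca.explained_variance_ratio_.sum())   # how much of the information we kept
```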
- Causal modeling: these tasks seek to detect the influence of some events on others. For example, if sales increase in a group of customers that we have targeted with a marketing campaign:
Did sales increase thanks to the campaign, or did the predictive model simply do a good job of selecting the customers who would have bought anyway?
In this type of task it is very important to clearly define the conditions that must hold before drawing a causal conclusion (a simplified sketch follows below).
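A deliberately simplified sketch of the causal question: if the campaign was sent to a randomly chosen treatment group, comparing purchase rates against a comparable control group estimates the campaign's effect. All numbers are invented, and the randomization condition is exactly the kind of requirement the previous paragraph warns about.

```python
# Invented outcome data: 1 = bought, 0 = did not buy
treatment = [1, 0, 1, 1, 0, 1, 1, 0]  # customers who received the campaign
control   = [0, 0, 1, 0, 0, 1, 0, 0]  # comparable customers who did not

rate_t = sum(treatment) / len(treatment)
rate_c = sum(control) / len(control)

# Only valid as a causal estimate if the groups were assigned at random
print(f"Purchase rate with campaign:    {rate_t:.2f}")
print(f"Purchase rate without campaign: {rate_c:.2f}")
print(f"Estimated campaign effect:      {rate_t - rate_c:+.2f}")
```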
Therefore, when we want to address a business problem with ML, such as the typical example of customer attrition (the famous churn), what we want to find out is which customers are more or less likely to stop being customers.
We could approach it as a classification problem, as a clustering problem, or even as a regression problem. Depending on how we define the problem, we will work with one family of algorithms or another.
If you want to keep learning about Machine Learning, don’t miss the post about error types.