4 Mistakes That Machine Learning Startups Make

October 2019
6 min
Software Development
AI/ML
Big Data

Over the past 25 years, I have seen people make errors thousands of times, but I have never seen a machine make a mistake. Today, a blunder in a machine learning project can cost a company millions and several years of wasted work. For this reason, the most common machine learning mistakes, related to data, metrics, validation, and technology, are collected here.

1. Data

The chances of making a mistake when working with data are rather high. It is easier to cross a minefield safely than to avoid mistakes while working with a data set. Several mistakes are especially common:

  • Unprocessed data. Raw, unprocessed data is rubbish that prevents you from trusting the model built on it. Therefore, only pre-processed data should be the basis of any AI project.
  • Anomalies. Check the data for deviations and anomalies and remove them. Getting rid of errors is a priority in every machine learning project, because data can be incomplete or incorrect, or some of it may be missing for a certain period.
  • Lack of data. The easiest way is to run 10 experiments and take the result, but that result is unlikely to be the correct one. A small or unbalanced data set leads to conclusions far from the truth. If you need to train a network to distinguish spectacled penguins from spectacled bears, a couple of bear photos won't do, even if you have thousands of penguin images.
  • Lots of data. Sometimes limiting the amount of data is the only correct solution. Restricting the data to a recent period, for example, can give the most objective picture of how people will act in the future. Our world and the human race are incredibly unpredictable. As a rule, predicting someone's response from their behavior in 1998 is like reading tea leaves: in both cases, the result will be far from reality.
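The checks in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `audit` function, the `width_px` field, the sample records, and the outlier threshold of 5 median absolute deviations are all hypothetical and not taken from any real project.

```python
from collections import Counter
from statistics import median

# Hypothetical records: image label plus one numeric attribute.
ROWS = [
    {"label": "penguin", "width_px": 640},
    {"label": "penguin", "width_px": 600},
    {"label": "penguin", "width_px": 620},
    {"label": "penguin", "width_px": None},   # missing value
    {"label": "penguin", "width_px": 650},
    {"label": "penguin", "width_px": 64000},  # likely a data-entry error
    {"label": "bear", "width_px": 630},
    {"label": "bear", "width_px": 610},
]


def audit(rows, label_key, value_key, threshold=5.0):
    """Report missing values, outliers, and class balance for a list of dicts."""
    values = [r[value_key] for r in rows if r.get(value_key) is not None]
    missing = len(rows) - len(values)

    # The median absolute deviation (MAD) is a robust measure of spread:
    # unlike the standard deviation, it is barely affected by the outliers
    # we are trying to find.
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        outliers = []  # no spread to measure against, so flag nothing
    else:
        outliers = [v for v in values if abs(v - center) > threshold * mad]

    counts = Counter(r[label_key] for r in rows)
    imbalance = max(counts.values()) / min(counts.values())

    return {
        "missing": missing,
        "outliers": outliers,
        "class_counts": dict(counts),
        "imbalance_ratio": imbalance,
    }


report = audit(ROWS, "label", "width_px")
for name, value in report.items():
    print(f"{name}: {value}")
```

A report like this does not fix anything by itself, but it shows where pre-processing is needed before any model is trained: which values to fill in or drop, and which class needs more examples.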

2. Metrics

Accuracy is an essential metric in machine learning. However, a senseless pursuit of absolute accuracy can become a problem for an AI project, particularly if the goal is to create a predictive recommendation system. Accuracy can obviously reach an incredible 99% if an online grocery store recommends milk. I bet the buyer will take it, and the recommendation will count as a success. But the buyer would have bought it anyway, so there is little sense in such a recommendation. For a city resident who buys milk daily, what matters in such a system is an individual approach and the promotion of goods that were not in the basket before.
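One possible way to put numbers on the milk example is to compare each recommendation's hit rate with what shoppers buy without any recommendation. The sketch below assumes that a control group of similar shoppers who did not see the recommendation is available; the product names and all counts are invented for illustration.

```python
def purchase_rate(bought, total):
    """Share of shoppers in a group who bought the product."""
    return bought / total


def uplift(shown_bought, shown_total, control_bought, control_total):
    """Extra purchase rate attributable to showing the recommendation."""
    return purchase_rate(shown_bought, shown_total) - purchase_rate(
        control_bought, control_total
    )


# Hypothetical counts: (buyers, shoppers) with and without the recommendation.
CAMPAIGNS = {
    "milk": {"shown": (990, 1000), "control": (985, 1000)},
    "herbal tea": {"shown": (120, 1000), "control": (20, 1000)},
}

report = {}
for product, groups in CAMPAIGNS.items():
    hit_rate = purchase_rate(*groups["shown"])
    gain = uplift(*groups["shown"], *groups["control"])
    report[product] = (hit_rate, gain)
    print(f"{product}: hit rate {hit_rate:.1%}, uplift {gain:+.1%}")
```

With these made-up numbers, milk has by far the higher hit rate, yet almost all of those purchases would have happened anyway. The less "accurate" recommendation is the one that actually changes what ends up in the basket.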

3. Validation

A child learning the alphabet gradually masters letters, simple words, and idioms, learning and processing information at a certain level. A scientific paper remains incomprehensible to that child, even though its words consist of the same letters the child has already learned.

The model in an AI project also learns from a specific data set. However, checking the quality of the model on that same data set does not work: it only shows how well the model remembers what it has already seen. To evaluate the model, use data specially set aside for verification that was not used in training. This gives the most accurate assessment of model quality.
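The effect is easy to reproduce with a deliberately naive model. The sketch below is a toy illustration, not a recommended workflow: `MemorizingModel` is a hypothetical class that simply stores its training answers, and every fifth record is held out so that the example stays deterministic (real projects usually hold out a random, often stratified, sample).

```python
from collections import Counter


class MemorizingModel:
    """A deliberately naive model that memorizes training answers verbatim."""

    def fit(self, examples):
        self.table = dict(examples)
        # For inputs it has never seen, fall back to the most common label.
        labels = Counter(label for _, label in examples)
        self.default = labels.most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


def accuracy(model, examples):
    """Fraction of examples for which the prediction matches the label."""
    return sum(model.predict(x) == y for x, y in examples) / len(examples)


# Toy task: decide whether a number is a multiple of 3.
data = [(n, "multiple_of_3" if n % 3 == 0 else "other") for n in range(30)]

# Hold out every fifth record; the model never sees these during training.
holdout = data[::5]
train = [example for i, example in enumerate(data) if i % 5 != 0]

model = MemorizingModel().fit(train)
train_accuracy = accuracy(model, train)
holdout_accuracy = accuracy(model, holdout)

print(f"accuracy on training data: {train_accuracy:.0%}")
print(f"accuracy on held-out data: {holdout_accuracy:.0%}")
```

The score on the training data looks perfect because the model is only repeating what it stored. The held-out score is the one that says something about how the model will behave on new inputs.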

4. Technology

Choosing the wrong technology is still a common mistake in AI projects. Its consequences may not be fatal, but they are serious and affect both the efficiency of the project and its deadline.

No wonder it is hard to find a more hyped topic in machine learning than neural networks, which are often seen as a universal algorithm suitable for any task. But this tool is not the most effective or the fastest for every task.

The clearest example is Kaggle competitions. Neural networks do not always take first place; on the contrary, tree-based methods such as random forests often have a better chance of winning, primarily on tabular data.

Neural networks are more often used to analyze images, voice, and other more complex data.
Reaching for a neural network may look like the simplest solution nowadays. But the project team should clearly understand which algorithms suit a particular task.

I truly believe the machine learning hype will not prove false, exaggerated, or ungrounded. Machine learning is another engineering tool that makes our lives simpler and more comfortable, gradually changing them for the better. For many large projects, this article may be just a nostalgic look back at mistakes they have already made, survived, and overcome on the way to becoming a product company.

But for those who are just starting their AI venture, this is an opportunity to understand why it isn’t the best idea to take a selfie with a wounded bear, and how not to join the endless list of “dead” startups.

https://readwrite.com/2019/10/18/4-mistakes-of-machine-learning-startups/
