I stumbled upon a great MIT Technology Review article (warning: regwall ahead) with a checklist you SHOULD use whenever considering a machine-learning-based product.
While the article focuses on machine learning, at least some of the steps on that list apply to any new product that claims to apply a brand-new technology to a particular problem domain, like overlay virtual networking with blockchain:
- What is the problem they are trying to solve? What are the more traditional ways of solving that problem, what are their shortcomings, and how would the new technology address those shortcomings? If the traditional solutions don't have shortcomings that could be (obviously) solved with the new technology, simply walk away; the risk is not worth the effort.
- How is the product using the technology? If you don’t get a very clear answer on this one, walk away. Also, find someone who understands the drawbacks of the proposed technology (potentially from another problem domain) and try to figure out how those drawbacks might apply to your particular use case (see also: RFC 1925, Rule 11).
- Should the new technology be used to solve the problem? This is really a recap of the previous two ;) In many cases the answer is NO.
The article contains two more checks you SHOULD apply to a machine learning solution:
- Where do they get the training data? Machine learning can use self-play, where you train one or more algorithms against each other (great in environments with fixed rules like Go or chess), or training sets that someone has to collect and label. While I'm positive the cloud-based security companies have tons of security-focused training sets, I wonder whether they exist for pure networking.
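To make the self-play idea concrete, here's a toy sketch (my own illustration, not anything from the article): because the rules of a game are fixed, an agent can generate its own training data just by playing against itself. This example plays random self-play games of 5-stone Nim (take 1 or 2 stones, taking the last stone wins) and estimates the value of each opening move from the outcomes — no externally collected or labeled dataset required.

```python
import random

def self_play_game(rng):
    """Play one game of 5-stone Nim with both sides moving at random.
    Returns the first player's opening move and whether that player won."""
    stones, player = 5, 0
    first_move, winner = None, None
    while stones > 0:
        move = rng.choice([1, 2]) if stones >= 2 else 1
        if first_move is None:
            first_move = move
        stones -= move
        if stones == 0:
            winner = player          # the player who took the last stone wins
        player = 1 - player
    return first_move, winner == 0

def estimate_opening_values(episodes=20000, seed=42):
    """Self-play Monte Carlo: the game itself generates all the training
    data, so there is nothing to collect or label by hand."""
    rng = random.Random(seed)
    wins = {1: 0, 2: 0}
    plays = {1: 0, 2: 0}
    for _ in range(episodes):
        move, first_player_won = self_play_game(rng)
        plays[move] += 1
        wins[move] += first_player_won
    return {move: wins[move] / plays[move] for move in (1, 2)}

values = estimate_opening_values()
best = max(values, key=values.get)   # taking 2 stones is the stronger opening
```

Contrast this with classifying real network traffic: there is no rulebook that tells you which flow is malicious, so someone has to collect and label the data — which is exactly where the "where do they get it" question bites.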
- How do they audit their products? The "beauty" of machine learning is that you never really know how it works, and if you train your product on biased training data, you get a biased product.
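Here's a minimal, entirely made-up illustration of that last point: a tiny naive Bayes classifier trained on a biased dataset in which every "malicious" flow happened to be captured on the 10.0.0.0/8 honeypot. The model learns the spurious subnet-label correlation and flags a perfectly ordinary HTTPS flow as malicious just because of its source subnet — and an audit against an equally biased holdout set would never notice.

```python
from collections import defaultdict

def train_naive_bayes(examples):
    """Count feature values per label for a tiny naive Bayes classifier."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    feature_values = defaultdict(set)
    for features, label in examples:
        label_counts[label] += 1
        for feat, val in features.items():
            feature_counts[(feat, label)][val] += 1
            feature_values[feat].add(val)
    return label_counts, feature_counts, feature_values

def predict(model, features):
    """Pick the label with the highest smoothed naive Bayes score."""
    label_counts, feature_counts, feature_values = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total
        for feat, val in features.items():
            # Laplace smoothing so unseen values don't zero out the score
            score *= (feature_counts[(feat, label)][val] + 1) / (n + len(feature_values[feat]))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical biased training set: all malicious flows came from the
# 10.x honeypot, all benign flows from the 192.x office network.
training = [
    ({"subnet": "10", "port": "23"}, "malicious"),
    ({"subnet": "10", "port": "445"}, "malicious"),
    ({"subnet": "10", "port": "22"}, "malicious"),
    ({"subnet": "10", "port": "23"}, "malicious"),
    ({"subnet": "192", "port": "80"}, "benign"),
    ({"subnet": "192", "port": "443"}, "benign"),
    ({"subnet": "192", "port": "443"}, "benign"),
    ({"subnet": "192", "port": "22"}, "benign"),
]

model = train_naive_bayes(training)
# Ordinary HTTPS traffic, flagged malicious purely because of the subnet:
print(predict(model, {"subnet": "10", "port": "443"}))   # malicious
print(predict(model, {"subnet": "192", "port": "443"}))  # benign
```

The fix isn't smarter math; it's auditing the model against data that doesn't share the training set's collection bias.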