Artificial Intelligence is the hottest buzzword in computing and business at the moment, and Machine Learning is the cutting edge. If you’re looking to expand your horizons as an IT professional or harness technology to move your business forward, an understanding of how it works will be a huge advantage in the next few years.
I’ve written a basic introduction to the terms AI and ML here, and this article is for those who want to look into the subject a little more deeply. There are already a large number of well-supported frameworks available which allow anyone to jump in at the deep end and, by a process of trial and error, learn how to use machine learning to solve real-world problems.
The platforms highlighted below vary in complexity and beginner-friendliness. Some of them are fully fledged “as a service” cloud offerings from big players, while others are extensions of existing toolkits like Spark and Python. One thing they all have in common is a huge amount of freely available help and guidance on the internet.
Amazon Machine Learning
Part of the Amazon Web Services package, Amazon Machine Learning (AML) is a standalone ML development environment, aimed at those who want to get to grips with the subject without getting their hands dirty with coding in Python or other languages. It uses wizards and visualization tools to enable ML operations on data stored in AWS. This may be a limitation if you have data elsewhere that you wish to analyze, but the simple-to-follow tutorials make it a great introductory tool.
Some great resources for Amazon Machine Learning:
- Amazon Machine Learning developer guide
- Amazon Machine Learning discussion forum
- Introduction to the Principles and Practices of AML – CloudAcademy
- Getting Started with Amazon Machine Learning on YouTube
TensorFlow
Google developed TensorFlow to build machine learning into its own systems, but has since released the framework as open source software. TensorFlow is said to have replaced earlier technologies at Google for developing the AI backbone of its core services, such as search, Gmail, and its speech recognition applications. At the moment it is accessed through Python or C++ interfaces, so coding knowledge is necessary to jump in and start using it, but there are plenty of resources online to help you get to grips with the fundamentals.
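To give a feel for the kind of coding involved, here is a minimal sketch of the training loop that frameworks of this kind automate: fitting a straight line to a handful of points by gradient descent. It is written in plain Python and deliberately does not use TensorFlow’s API; the function name `fit_line` and the sample data are assumptions made up for illustration.

```python
# Plain-Python illustration only; this is not TensorFlow's API.
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step both parameters a little way downhill.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical sample data, generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The fitted values should land very close to the slope and intercept used to generate the data. A real framework takes care of computing the gradients and running the arithmetic on faster hardware, which is why the same idea scales to models with millions of parameters.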
Some great resources for TensorFlow:
Azure Machine Learning Studio
Microsoft has open-sourced several machine learning libraries, such as the Distributed Machine Learning Toolkit (DMTK), but Azure Machine Learning Studio (AMLS) is its as-a-service framework, aimed at helping organizations get to grips with deploying machine learning solutions in the Azure cloud.
It offers users a free trial with 10GB of storage and full access to its algorithm libraries for eight hours – enough time to get your head around how it works.
Some great resources for Azure Machine Learning Studio:
- Azure Machine Learning Cheat Sheet
- Microsoft Azure Essentials: Machine Learning Free eBook from Microsoft
H2O
H2O is an open source platform and another “complete package” solution, offering a web-based user interface alongside access to a library of machine learning routines and algorithms, designed to simplify the process of getting started with ML. It supports working with Excel, RStudio and Tableau, and can read data from Hadoop systems and Amazon’s S3 as well as SQL and NoSQL databases.
Some great resources for H2O:
Caffe
Caffe is one of the original deep learning libraries, created by UC Berkeley’s AI research lab and released as open source. Because it is a library, you still need to connect to it through a programming language (C++, Python, and Matlab interfaces are supported). It includes pretrained models for recurrent and convolutional neural networks, which provide the foundation of a lot of today’s most exciting ML work. Like most of the frameworks here, it can run on either CPUs or more powerful and costly NVIDIA CUDA GPUs; using the latter, it is said to be able to process over 60 million images in a single day. Typically, training is carried out on GPU hardware before applications are deployed across user machines, which are more likely to use CPUs for less processor-intensive day-to-day use.
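To put the quoted GPU throughput in perspective, the short calculation below converts 60 million images a day into images per second and time per image. It is simple arithmetic on the figure given above, not a benchmark, and the variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

images_per_day = 60_000_000  # the throughput figure quoted above
images_per_second = images_per_day / SECONDS_PER_DAY
ms_per_image = 1000 / images_per_second

print(f"{images_per_second:.0f} images/s, {ms_per_image:.2f} ms per image")
# prints: 694 images/s, 1.44 ms per image
```

In other words, the claim works out to roughly 700 images every second, or about a millisecond and a half per image.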
A great resource for Caffe:
- NVidia – Getting started with Caffe on YouTube
MLlib
MLlib is a scalable machine learning library for Apache Spark, so if you have already got to grips with Spark, it could make the ideal starting point for an exploration of ML. Spark is one of the most widely used and supported Apache open source projects right now, which means there is a wealth of resources for its in-memory data processing framework, with new algorithms continuously in development. Well-used algorithms already exist for common ML techniques such as classification, clustering, decision trees, and regression modelling.
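For readers new to these techniques, here is a rough sketch of what clustering does, using the classic k-means procedure on a few one-dimensional points. It is plain Python and does not use Spark or MLlib, whose implementations are built to run across a cluster; the function name `k_means_1d`, the sample points and the starting centers are all assumptions for illustration.

```python
# Plain-Python illustration only; this is not MLlib's API.
def k_means_1d(points, centers, iterations=10):
    """Run k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster,
        # leaving a center where it is if no points were assigned to it.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


# Hypothetical sample data: two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers = k_means_1d(points, centers=[0.0, 5.0])
print(centers)  # prints: [1.5, 10.0]
```

The two centers settle on the middle of each group. A library like MLlib applies the same idea to far larger datasets by spreading the work over many machines.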
Some great resources for MLlib:
Torch
Torch is a widely used open source machine learning development framework which, among many other applications, was used in the development of much of the AI tech in use at Facebook and Twitter. It allows complex neural-net-based algorithms to be run across GPU hardware without the need for coding at the hardware level. Torch applications are scripted in the Lua programming language, making it a natural choice for those who prefer that language to Python.
Some great resources for Torch:
This list is simply an introduction to a few of the most popular toolkits, extensions and resources for learning and deploying machine learning solutions. Honorable mentions should also go to scikit-learn, Theano, Keras, Veles and Apache SINGA. With the huge growth of interest in AI, and specifically ML, it’s likely that in the near future we will see more packages and services aimed at lowering the barriers to getting started with this cutting-edge technology.