What is Azure Machine Learning? | Introduction to Azure ML – Part 1

Azure ML – What’s better than machine learning? Machine learning where coding is optional! Drag and drop machine learning with a visual interface! We’re going to introduce you to a new tool to add to your data science toolkit, Azure Machine Learning Studio. Azure ML is a cloud-based data science platform on the Azure cloud ecosystem. Azure ML Studio also supports coding in Python, SQL, and R.

Hello, internet.
Welcome to Data Science Dojo.
My name is Phuc Duong, and I’m going to bring you
another video tutorial series.
This time we’re going to expand your data science toolkit
by teaching you how to data mine using Azure Machine Learning
Studio.
This will be a multi-part video series,
and who is this video series intended for?
Well, it’s anyone who wants to learn Azure Machine Learning
Studio, and more importantly, Data Science Dojo
hosts a five-day boot camp on data science and data
engineering.
It is an in-person boot camp.
It lasts 50 hours, for an entire week.
We go 8 to 5 every day, and this is one of the five modules
that students have to learn before showing up on day 1.
So we teach a class mainly in R and in Azure ML Studio.
So students are expected to be very comfortable before even
showing up on day 1, so we can tackle more data science
problems during the course.
Now, if you are not attending the course,
don’t worry, because you will still
learn everything you need to know about Azure Machine
Learning Studio.
And by the time this series is over,
you should be very comfortable with Azure ML.
You should be able to import and export data.
You should be able to explore an unknown data set.
You should be able to manipulate and transform
data, mold data, preprocess data, and clean data
all within Azure ML.
You should be able to build and predict models in Azure ML.
You should be able to expose those predictive models
as a web service and then consume those APIs,
and then you should also be able to code within Azure ML itself.
This series assumes that you already have an introduction
to data mining.
If you do not have that introduction,
go ahead and watch the video series
that I’ve linked inside the description box,
and I’ll get you up to speed on what data mining is.
A quick introduction about myself.
I’ve been teaching data science and data engineering
for about three years.
I was the lead author of an 85-page lab manual on how
to data mine using Azure Machine Learning Studio.
I wrote three other books on various topics
in data engineering and data science, none of which
are available to the public.
You have to sign up for our five-day boot camp
to receive one of these manuals, and then I
created an 11-part Azure ML tutorial series
on YouTube three years ago, when Azure ML was still in beta.
And a lot has changed, which is why we’re going to redo it now.
And what are we going to cover in this video?
Well, we’re going to teach you what Azure Machine Learning
Studio is, what it means to be in the cloud,
what its benefits are,
the subscriptions that you need to get,
and the pricing of Azure ML.
Well, what is Azure Machine Learning?
Well, it’s also called Azure ML for short.
It is a data mining and data science and machine learning
tool in the cloud.
But it’s different than most data
science tools, because there’s no coding involved.
Traditionally, data science tools
involve R or Python or MATLAB or SAS, which are all coding based.
You had a terminal, and you had a command line,
and you had to learn syntax.
Not this one.
This one is a drag and drop approach to machine learning.
It has a visual interface, and it feels more
like Visio or PowerPoint than it does any other traditional data
science tool.
And I think this is the best tool for learning
machine learning and data mining, because you don’t have
to juggle learning the syntax of a programming language
at the same time as learning how to data mine:
the data mining theory, the data mining frameworks,
and also the theories of machine learning in general.
But this is not just a tool for beginners.
Advanced users will love this as well, because it
integrates seamlessly with SQL, R, or Python,
and you can mix and match.
So all of a sudden you can be using SQL,
drop SQL, use a module, and then all of a sudden switch
and switch again and then go into R. I do that all the time.
All right, and also you can deploy these models
automatically.
Meaning basically, they pack up your model,
throw it into the cloud somewhere, and then
you can contact those models via REST APIs.
And then you can connect to those APIs using auto-generated code
in C#, R, or Python.
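To make the idea of calling a deployed model over REST concrete, here is a minimal Python sketch. It is not the code that Azure ML generates for you. The endpoint URL, the API key, the column names, and the payload layout (an "Inputs" object holding column names and rows of values) are all assumptions for illustration, so check the sample code shown for your own web service. The sketch only builds the request and never sends it.

```python
import json
import urllib.request

# Hypothetical placeholders: use the URL and key shown for your own web service.
ENDPOINT_URL = "https://example.invalid/workspaces/WORKSPACE_ID/services/SERVICE_ID/execute"
API_KEY = "YOUR_API_KEY"


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) an HTTP POST request for a deployed model."""
    # Assumed payload layout: one input port with column names and rows of values.
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed auth scheme: the API key is sent as a bearer token.
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical feature columns and one row to score.
request = build_scoring_request(
    ENDPOINT_URL, API_KEY, ["age", "income"], [["34", "52000"]]
)
```

Against a live service, you would pass the request to urllib.request.urlopen and parse the JSON that comes back.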
Azure Machine Learning is a cloud-based machine learning
tool that only exists in the cloud.
Specifically, it is on the Azure stack,
so the Azure cloud platform.
And Azure ML is one of the services within the Azure ecosystem
itself.
So Azure is the cloud platform brought to you by Microsoft.
So that makes Azure ML by extension
a Microsoft technology.
Azure is also comparable to other cloud services,
like Amazon Web Services and Google Cloud
Services, all of which are infrastructure as a service.
Meaning, they are a platform that
can host robust IT services that can build entire software
platforms, like Netflix or Snapchat.
These cloud services are all software
as a service, where you use these services,
and they charge you by usage, like an electric bill
or a subscription service, like a cable bill.
So because Azure Machine Learning Studio
is a cloud-based tool, it brings with it
the strengths and weaknesses of cloud computing itself.
So let’s go over the pros and cons of cloud computing,
so that you know what you’re getting yourself into.
So the first thing is, I’ve noticed
that almost none of the cloud services will ever ask you how
much data you need to store.
They would just store it and not ask any questions about that,
and the reason for that is the price of storage space, especially
on the cloud, has been plummeting pretty consistently every year.
It’s gotten to the point where it’s only
going to cost you about $0.02 to store anything
online per gigabyte per month.
So it doesn’t matter if you’re on AWS or on Azure,
they will both charge you about $0.02 per gigabyte per month
to store something.
Which is really cool because it basically removes data size
from the equation when you have to deal with the capacity
of your machine learning tools and hardware.
The next thing is machine learning
does not exist in a vacuum.
Machine learning is extremely dependent on large data sets.
Large data sets, like big data, are
dependent on IT infrastructure.
So in order
to have an IT infrastructure big enough to support big data,
you have to have either some kind of robust service
like a data warehouse, or you can
rent that stuff from AWS or Azure
or Google Cloud, one of the cloud services.
So specifically in Azure, you have
all of that data infrastructure to back up
your machine learning.
So you have, for example, Azure SQL databases as your database.
You have Spark and Apache Hadoop in the form of HDInsight.
You have Blob storage.
You have Data Lake storage.
You have stream analytics in the form of ETL.
You have Azure Data Factory in the form of orchestration
of all your data pipelines.
And more importantly, because it’s a Microsoft technology,
it integrates with Excel, which is one of the most commonly
used BI tools in existence.
The next thing is, because you’re in the cloud,
you’re freed from hardware.
Right?
None of your guys have to have a beeper,
and you don’t have a data warehouse where you have
to keep upgrading hardware.
You don’t have to worry about that.
All you have to worry about is really the data
and how much it’s charging you on a monthly basis.
The next thing I find that’s really cool
is that it runs on someone else’s machine.
That doesn’t sound like it means much,
but the idea is you don’t need a very powerful computer anymore.
You don’t need a workstation desktop anymore.
What this means is you can get an iPad
and use any of these cloud-based tools,
hit the Run button or the Execution button,
and then close the tab.
If the device can open up a browser,
like Chrome, the idea is it can run the cloud service.
Because it’s in the cloud, it’s also
collaborative, which means you can
invite other people to your cloud spaces
and work together and share the same cloud space.
Another thing, it’s scalable.
You’re harnessing the entire power that is the cloud.
So the cloud is very good at distributing workloads
among multiple nodes, multiple servers,
calling upon extra help when it needs more processing
power or more storage.
So let’s go over the cons of cloud computing.
By using a cloud-based tool, you’re
committing to an unwavering internet connection.
You can never lose internet, or you lose basically access
to do your job.
The next con is the biggest con of them all,
which is compliance.
Can you even be in the cloud? That is what
compliance is trying to ask.
Does your industry, does the government
that oversees your industry, does your company,
do all of their policies allow your data to
be in the cloud?
This is all about data governance.
So before you even use this tool and start loading data
into it that is work related, you really
need to ask someone at your company,
is the cloud available, and specifically
is Azure an allowed technology at this company?
So I’ll show you how to make an Azure Machine Learning Studio
workspace in the next video, but for now, let
me explain how it all works.
There are two main ways to get Azure Machine Learning Studio.
The first is the free trial method,
and the second is the full workspace method.
So to get a free trial Azure ML workspace, what you do
is you just simply go to the Azure ML website,
sign in with your email, and you’ll
be given a limited-access workspace.
If you want a fully working workspace,
then you’ll need an actual Azure subscription.
Then, once you’re inside of that subscription, go ahead,
and you’ll have to create an Azure ML workspace
within that subscription.
So this is the full workspace.
Now, if you’ve never used Azure before,
then Azure will give you a free trial subscription
to start off.
So you’ll get $200 credit on your Azure subscription
or 30 days, whichever limit is reached first.
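The "whichever limit is reached first" rule can be sketched in a few lines of Python. The $200 credit and the 30 days are the figures quoted in this video. The function name and the assumption that you spend a constant amount per day are hypothetical, just for illustration.

```python
import math

TRIAL_CREDIT_USD = 200.0  # free trial credit quoted in the video
TRIAL_DAYS = 30           # free trial time limit quoted in the video


def trial_length_days(daily_spend_usd):
    """Days until the trial ends, assuming a constant daily spend."""
    if daily_spend_usd <= 0:
        # No spending: only the time limit can end the trial.
        return TRIAL_DAYS
    # Day on which the credit runs out at this spending rate.
    days_until_credit_is_gone = math.ceil(TRIAL_CREDIT_USD / daily_spend_usd)
    # The trial ends at whichever limit is reached first.
    return min(TRIAL_DAYS, days_until_credit_is_gone)
```

For example, at $10 a day the credit is gone after 20 days, so the credit limit ends the trial. At $5 a day the 30-day limit is reached first.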
Now, if you’ve used up your Azure free trial subscription,
because you’ve used Azure before,
then you’ll simply need to go the pay-as-you-go route,
where you’ll add your credit card to an autopay system,
and it will charge you for the fees that you’ve incurred based
upon just cloud usage for the month.
And for this web series, I’m going
to go ahead and assume that you have a full workspace,
and I’ll show you how to create that full workspace
in the next video.
All right, let’s talk about cost.
So there’s really two parts to the pricing of Azure Machine
Learning Studio.
First is a monthly subscription,
and second is usage.
So first is the subscription.
You’re charged $9.99 per month per seat.
So for each workspace you’ll be charged about $10 a month.
Second, you’re charged for usage.
So there’s three parts to the usage.
The first one is runtime.
So you’re charged $1 per hour of experiment run time,
so every time you hit the Run button on your experiment,
you’re going to see a timer tick on the top right-hand
corner of your screen.
That is what you’re being charged for.
The second part is you’re also charged for deployment calls.
So any time anyone calls your deployed web services,
basically your REST APIs, for every 1,000 API calls,
you’re charged about $0.10 to $0.50,
depending on which tier you’re going with for your web
service.
Now, the first 1,000 API calls are basically always free.
So unless you’re an e-commerce business,
I don’t foresee you actually getting hit with these charges
at all for deployment.
And then the third usage is you’re
charged for storage of data.
So basically whatever data you pull into Azure ML,
you’ll be charged for that as well.
So Azure Blob storage, which is the service we’re going to use,
so Azure Blob storage charges $0.02 per gigabyte per month.
It’s actually less than $0.02, but we’re going to round it up.
So it’s going to be about $0.02.
And then, if you’re going with the whole pay-as-you-go
for this web series, this whole series will probably cost you
$20 for the month.
So $10 for the seat and then another $10 for the usage.
Now, they will prorate it if you delete the workspace when
you’re done with it.
So that’s going to be nice.
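The pricing arithmetic above can be put into a small Python sketch. The rates are the ones quoted in this video and may have changed since. The function name, the parameter names, and the assumption that the 1,000 free API calls reset every month are hypothetical. Proration for a deleted workspace is not modeled.

```python
SEAT_USD_PER_MONTH = 9.99        # subscription, per seat
RUNTIME_USD_PER_HOUR = 1.00      # experiment run time
STORAGE_USD_PER_GB_MONTH = 0.02  # Blob storage, rounded up
FREE_API_CALLS = 1000            # assumed to reset each month


def estimate_monthly_cost(experiment_hours, storage_gb, api_calls=0,
                          usd_per_1000_calls=0.50, seats=1):
    """Rough monthly bill in USD, using the rates quoted in the video."""
    seat_cost = seats * SEAT_USD_PER_MONTH
    runtime_cost = experiment_hours * RUNTIME_USD_PER_HOUR
    # Only calls beyond the free allowance are billed, per 1,000 calls.
    billable_calls = max(0, api_calls - FREE_API_CALLS)
    api_cost = billable_calls / 1000 * usd_per_1000_calls
    storage_cost = storage_gb * STORAGE_USD_PER_GB_MONTH
    return round(seat_cost + runtime_cost + api_cost + storage_cost, 2)


# One seat, 10 hours of experiment runs, 1 GB stored, no billable API calls.
print(estimate_monthly_cost(experiment_hours=10, storage_gb=1))  # 20.01
```

That lines up with the estimate of about $20 for the month: roughly $10 for the seat and another $10 for usage.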
All right, and then that will conclude this video.
If you like these tutorials, and you
want to see more videos like this in the future,
go ahead and like and subscribe.
I’m also going to leave you with a question.
What is your favorite data mining tool?
Talk to me in the comment section.
I’m really interested in knowing what everyone is using.
All right, and I’ll see you in the next video, where
I’ll show you how to create an Azure ML workspace,
and I will look forward to seeing you at our boot camp.

In Part 1 we will cover:

  • What is Azure Machine Learning Studio
  • Being in the cloud
  • Subscriptions you need
  • Pricing of Azure

Part 2:
Subscriptions & Workspaces

Complete Series:
Introduction to Azure Machine Learning

More Data Science Learning Material:
[Video] Introduction to Data Mining Series
[Blog] Build a Predictive Model in Azure ML Studio
[Blog] University of New Mexico Data Science Certificate


Phuc H Duong
About The Author
- Phuc holds a bachelor’s degree in Business with a focus on Information Systems and Accounting from the University of Washington.
