Development environment for Machine Learning

- Pranay on Feb 13 '17 at 13:20

This post walks through setting up a development environment for Machine Learning. It uses the Anaconda distribution of Python, with Jupyter as the IDE.


Selecting development environment for ML – languages, IDE, tools

We have selected Python as the language for Machine Learning here. To run Python code, we need Python or one of its distributions installed on the development machine, and we will also want an IDE for Python development. So we shall install the Anaconda distribution of Python. It comes with Python, the commonly required data science packages and the Jupyter IDE, which covers our needs with a single installer.

Here are the reasons why I made this choice.

Why Python?

Python is widely used for data science tasks because of the wide range of libraries available, such as NumPy, SciPy, pandas, matplotlib and IPython.

Here is a meaningful discussion you can look at – Why is Python a language of choice for data scientists?

Why Jupyter?

Unlike most other IDEs, Jupyter runs in the browser and lets you include images, charts and other visual output alongside the code blocks. This makes the code and the logic behind it more descriptive and easier to understand. Also, because Jupyter is served over HTTP as a web application, it is well suited to being hosted as a cloud service. There are already some free cloud providers of Jupyter through which you can code without a local setup. All you need is a device (an iPad, for example) with an internet connection that can reach the hosted Jupyter service. ‘Microsoft Azure Notebook’ and ‘Anaconda Cloud’ are some good examples of this.

Here is a sample screenshot showing both code and a chart illustrating some data.


Why Anaconda distribution?

It is a free Python distribution, widely used because of the large number of data science packages it bundles. It also provides ‘Conda’, a package, dependency and environment manager. The Jupyter IDE is included in the Anaconda distribution as well. Hence it is enough to install just the Anaconda distribution, which gives you Python, the Jupyter IDE and the most used data science packages.
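
A quick way to confirm that a Python installation really has these packages is to ask Python whether each one can be found. The snippet below is only a minimal sketch, not an official Anaconda check: the helper name `check_packages` and the package list are my own choices for illustration, based on the libraries named in this post.

```python
import importlib.util
import sys


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Hypothetical list for illustration, taken from the libraries named in this post.
    packages = ["numpy", "scipy", "pandas", "matplotlib", "IPython"]
    print("Python", sys.version.split()[0])
    for name, found in check_packages(packages).items():
        print(f"{name:12s} {'found' if found else 'missing'}")
```

If any package is reported as missing, it can usually be added from the Anaconda prompt with `conda install` followed by the package name.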

Downloading and Installing – Anaconda distribution

Anaconda is available for all the major platforms. The page below lets you download Anaconda for your operating system; download the installer and run it as per the instructions on that page.

Download Page: Download Anaconda Now

Running Anaconda/Jupyter

Run the command below in the Anaconda prompt. This will open a new browser window or tab, with the URL normally being “http://localhost:8888/tree?token=*********”. In this example I am using Windows, and the UI may vary slightly on other platforms.

jupyter notebook

Click on the ‘New’ dropdown at the top right and then select the Python entry (the exact label, for example ‘Python 3’, depends on your installation). This opens a new notebook with an input cell where you can type Python code and run it.
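
As a first cell to try, here is a small sketch that uses only the Python standard library, so it should run even before you have checked the data science packages. The sample values and the function name `summarize` are made up for illustration.

```python
import statistics
import sys


def summarize(values):
    """Return the count, mean and sample standard deviation of a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# Made-up sample data, only for illustration.
heights_cm = [150, 160, 170, 180, 190]

print("Running on Python", sys.version.split()[0])
print(summarize(heights_cm))
```

Type or paste this into the cell and press Shift+Enter to run it; the output appears directly below the cell.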

Here is a demo URL provided by Jupyter, where you can try Jupyter and run some sample Python code.

Good luck!! Please do post your comments and thoughts.

About the author

Sitecore Certified Professional

A Software Engineer by profession, a part-time blogger and an enthusiastic programmer. You can find more about me here.
