AI News, Interactive Data Science with Jupyter Notebooks

Interactive Data Science with Jupyter Notebooks

Installation and Startup

The easiest way to install Jupyter is by running pip install jupyter, though if you use a packaged Python distribution such as Anaconda, you may already have it installed.

If you don’t already have a notebook you want to open, you can create one by clicking on ‘New’ and selecting Python 2 or 3, depending on which version of Python you have running in your environment.

Running Code in Jupyter Notebooks

Once you have a new notebook running, you can write some Python code in the empty cell and hit Ctrl+Enter to run it.

The brackets on the left side of the cell show an asterisk while a cell is running or queued to run, and then show a number once it's finished, representing the order in which cells were run during a given session, starting at '1'.

So for example, if I import tensorflow and then concatenate its version string with another string, the output is shown below the cell, even though I didn't use the print command.
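This auto-display behaviour works for any expression; here is a minimal sketch using the standard library's sys module (rather than tensorflow) so it runs anywhere:

```python
import sys

# In a notebook, the value of the last expression in a cell is shown as
# Out[n] automatically, with no print() needed.
message = "Python version: " + sys.version.split()[0]
message  # Jupyter displays this value; a plain script simply discards it
```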

Outputs

When you have a lot of output, you can reduce the amount of space it takes up by clicking on the left-hand panel of the output, which turns it into a scrolling window.

On the other hand, if you want to create a new cell immediately after a given cell, you can use alt-enter to execute the cell and then insert a new cell directly after it.

And perhaps most importantly, it allows past-you to tell future-you what a given code cell was supposed to do, in a way that can be much more expressive than using comment blocks!

For an easy way to time your code, start a cell with %%time and, once the cell finishes executing, it will print how long that cell took to run.
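Under the hood, the "Wall time" that %%time reports is just a wall-clock measurement; a rough plain-Python equivalent, with a made-up workload, looks like this:

```python
import time

start = time.perf_counter()  # wall-clock timer, like the "Wall time" %%time prints
total = sum(i * i for i in range(1_000_000))  # arbitrary example workload
elapsed = time.perf_counter() - start
print(f"Wall time: {elapsed:.3f} s")
```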

Running Code

Large Outputs

To better handle large outputs, the output area can be collapsed.

Run the following cell and then single- or double-click on the active area to the left of the output:

```python
for i in range(50):
    print(i)
```

Beyond a certain point, output will scroll automatically:

```python
for i in range(500):
    print(2**i - 1)
```

28 Jupyter Notebook tips, tricks, and shortcuts

Jupyter Notebook

Jupyter Notebook, formerly known as the IPython Notebook, is a flexible tool that helps you create readable analyses, as you can keep code, images, comments, formulae, and plots together.

When working with Python in Jupyter, the IPython kernel is used, which gives us handy access to IPython features from within our Jupyter notebooks (more on that later!). We're going to show you 28 tips and tricks to make your life working with Jupyter easier.

```
        LAT    LONG  DEPTH  MAG  STATIONS
996  -25.93  179.54    470  4.4        22
997  -12.28  167.06    248  4.7        35
998  -20.13  184.20    244  4.5        34
999  -17.40  187.80     40  4.5        14
1000 -21.59  170.56    165  6.0       119
```

If you want to set this behaviour for all instances of Jupyter (Notebook and Console), simply create a file ~/.ipython/profile_default/ipython_config.py with the relevant configuration line.
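The original configuration snippet is missing from this excerpt. Assuming the behaviour referred to is IPython's ast_node_interactivity setting (which displays the result of every statement in a cell, not just the last one), the file would look something like this; treat the exact lines as a reconstruction:

```python
# ~/.ipython/profile_default/ipython_config.py
# Assumption: this reconstructs the config snippet lost from the excerpt.
c = get_config()  # provided by IPython when it loads this config file
c.InteractiveShell.ast_node_interactivity = "all"
```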

Running %lsmagic will list all magic commands:

```
Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd
%clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs
%doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts
%ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon
%logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man
%matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef
%pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun
%psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref
%recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm
%rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time
%timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript
%%js  %%latex  %%perl  %%prun  %%pypy  %%python  %%python2  %%python3
%%ruby  %%script  %%sh  %%svg  %%sx  %%system  %%time  %%timeit  %%writefile
```

```
time.sleep(0.01)  # sleep for 0.01 seconds
CPU times: user 21.5 ms, sys: 14.8 ms, total: 36.3 ms
Wall time: 11.6 s
```

%%timeit uses the Python timeit module, which runs a statement 100,000 times (by default) and then provides the mean of the fastest three runs.
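Since %%timeit is a thin wrapper over the standard-library timeit module, the same measurement can be sketched in plain Python (the statement and loop counts below are arbitrary examples):

```python
import timeit

# Run the statement 10,000 times per trial, repeat 3 trials, keep the best,
# and report a per-loop time -- roughly what %timeit does.
per_loop = min(timeit.repeat('sum(range(100))', repeat=3, number=10_000)) / 10_000
print(f"{per_loop * 1e6:.2f} µs per loop")
```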

Running the cell prints "Writing pythoncode.py", and %pycat pythoncode.py then shows the file's contents back with syntax highlighting (the function body below is completed from its name, as the original was cut off):

```python
import numpy

def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)
```

Using `%prun statement_name` will give you an ordered table showing you the number of times each internal function was called within the statement, the time each call took as well as the cumulative time of all runs of the function.

```
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
 10000    0.527    0.000    0.528    0.000  <ipython-input-46-b52343f1a2d5>:2(append_if_not_exists)
 10000    0.022    0.000    0.022    0.000  {method 'randint' of 'mtrand.RandomState' objects}
     1    0.006    0.006    0.556    0.556  <ipython-input-46-b52343f1a2d5>:6(some_useless_slow_function)
  6320    0.001    0.000    0.001    0.000  {method 'append' of 'list' objects}
     1    0.000    0.000    0.556    0.556  <string>:1(<module>)
     1    0.000    0.000    0.556    0.556  {built-in method exec}
     1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
```
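The same kind of table can be produced outside the notebook with the standard-library cProfile module; the function bodies below are assumptions reconstructed from the names in the profile above:

```python
import cProfile
import random

def append_if_not_exists(arr, x):
    # Reconstructed from its name: append x only if it's not already present.
    if x not in arr:
        arr.append(x)

def some_useless_slow_function():
    # Repeatedly insert random ints; duplicates make this artificially slow.
    arr = []
    for _ in range(10000):
        append_if_not_exists(arr, random.randint(0, 999))
    return arr

# Roughly what %prun some_useless_slow_function() does in a notebook:
profiler = cProfile.Profile()
arr = profiler.runcall(some_useless_slow_function)
profiler.print_stats(sort='cumulative')
```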

```
pick_and_take()
Automatic pdb calling has been turned ON
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-24-0f6b26649b2e> in <module>()
      4     picked = numpy.random.randint(0, 1000)
----> 5     raise NotImplementedError()
```

Here you get the output of the function plt.hist(x), the bin counts followed by the bin edges:

```
(array([ 216., 126., 106., 95., 87., 81., 77., 73., 71., 68.]),
 array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ]), ...)
```
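For reference, the (counts, bin_edges) pair that plt.hist returns can be reproduced without matplotlib via numpy; the data here is random, so the counts will differ from those shown above:

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(1000)  # 1000 samples in [0, 1), a stand-in for the article's data

# Same (counts, bin_edges) pair plt.hist(x) returns as its first two values.
counts, edges = np.histogram(x, bins=10, range=(0.0, 1.0))
print(counts)
print(edges)
```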

You can use this to check which datasets are available in your working folder:

```
!ls *.csv
nba_2016.csv  titanic.csv  pixar_movies.csv  whitehouse_employees.csv
```

Or to check and manage packages.

The best solution to this is to install rpy2 (which also requires a working version of R), easily done with pip:

```
pip install rpy2
```

You can then use the two languages together, and even pass variables between them:

```python
%load_ext rpy2.ipython
%R require(ggplot2)
```

```python
import pandas as pd
df = pd.DataFrame({
    'Letter': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'X': [4, 3, 5, 2, 1, 7, 7, 5, 9],
    'Y': [0, 4, 3, 6, 7, 10, 11, 9, 13],
    'Z': [1, 2, 3, 1, 2, 3, 1, 2, 3]
})
```

```r
%%R -i df
ggplot(data = df) + geom_point(aes(x = X, y = Y, color = Letter, size = Z))
```

Example courtesy of the Revolutions Blog.

```
!pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
!pip install jupyter_nbextensions_configurator
!jupyter contrib nbextension install --user
!jupyter nbextensions_configurator enable --user
```

The nbextension configurator.

You can install RISE using conda:

```
conda install -c damianavila82 rise
```

Or alternatively pip:

```
pip install RISE
```

And then run the following commands to install and enable the extension:

```
jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix
```

In this example I scan the folder with images in my repository and show thumbnails of the first 5:

```python
import os
from IPython.display import display, Image

names = [f for f in os.listdir('../images/ml_demonstrations/') if f.endswith('.png')]
for name in names[:5]:
    display(Image('../images/ml_demonstrations/' + name, width=100))
```

We can create the same list with a bash command, because magics and bash calls return Python variables:

```python
names = !ls ../images/ml_demonstrations/*.png
names[:5]
```

Sharing notebooks The easiest way to share your notebook is simply using the notebook file (.ipynb), but for those who don't use Jupyter, you have a few options: Convert notebooks to html file using the File > Download as > HTML Menu option.

We also recommend the links below for further reading: IPython built-in magics; a nice interactive presentation about Jupyter by Ben Zaitlen; Advanced notebooks part 1: magics and part 2: widgets; Profiling in Python with Jupyter.

Advanced Jupyter Notebook Tricks — Part I

They're great for experimenting with new ideas or data sets, and although my notebook 'playgrounds' start out as a mess, I use them to crystallize a clear idea for building my final projects.

I wanted to write a blog post on some of the lesser known ways of using Jupyter — but there are so many that I broke the post into two parts.

%time will time whatever you evaluate.
%%latex renders cell contents as LaTeX.
%timeit will time whatever you evaluate multiple times and give you the best and the average times.
%prun, %lprun, and %mprun can give you a line-by-line breakdown of time and memory usage in a function or script.

As described in the rmagics documentation, you can use %Rpush and %Rpull to move values back and forth between R and Python: You can find other examples of language-magics online, including SQL magics and cython magics.

Getting familiar with magics gives you the power to use the most efficient solution per subtask and bind them together for your project.

When used this way, Jupyter notebooks become 'visual shell scripts' tailored for data science work.

Each cell can be a step in a pipeline that can use a high-level language directly (e.g., R, Python), or a lower-level shell command.

At the same time, your 'script' can also contain nicely formatted documentation and visual output from the steps in the process.

For a report, just schedule your notebook to run automatically on a recurring basis and update its contents or email its results to colleagues.

Or using the magics techniques described above, a notebook can implement a data pipeline or ETL task to run on an automatic schedule, as well.

```
ipython nbconvert --to html pipelinedashboard.ipynb
```

After scheduling this shell script, the result will be a regular HTML version of the last run of your notebook.
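One concrete way to schedule it is a cron entry; the schedule and the use of --execute below are illustrative assumptions, not part of the original article:

```shell
# Hypothetical crontab entry: re-run the notebook and regenerate the HTML
# dashboard every morning at 06:00.
0 6 * * * jupyter nbconvert --to html --execute pipelinedashboard.ipynb
```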

Python Code from Jupyter

This is how to use Python code from a .py file in your Jupyter notebook.

How to Execute python code on Jupyter Notebook First Time on Anaconda

This video will show you the steps to use Jupyter for executing Python code.

Jupyter Notebook Tutorial: Introduction, Setup, and Walkthrough

In this Python Tutorial, we will be learning how to install, setup, and use Jupyter Notebooks. Jupyter Notebooks have become very popular in the last few years, and for good reason. They allow...

Jupyter Notebook Tutorial / Ipython Notebook Tutorial

This tutorial will go over, 1) What is jupyter or ipython notebook? 2) Installation of jupyter notebook 3) Build first notebook using python pandas 4) Cover markdown and embedding video links...

Ipython/Jupyter notebook - Split Cell Tutorial.

Demo of split cells in the ipython/jupyter notebook.

Make Jupyter/IPython Notebook even more magical with cell magic extensions!

PyCon Canada 2015: Talk Description: My talk will start out with a brief explanation of what Jupyter is (the project formerly known as IPython Notebook).

Make Jupyter Even More Magical

Watch this tutorial to learn how to create text cells with Markdown and LaTeX. Also, you will learn how to add some of the built-in 'cell magic' extensions like running bash commands and displaying...

1. Introduction - Jupyter Tutorial (IPython 3)

This is a quick introduction to Jupyter, which is IPython version 3. I'll be covering some of the new and interesting features of Jupyter. We cover using multiple kernels like python...

Jupyter in Depth

Jupyter provides tools for interactive computing that are widely used in scientific computing, education, and data science, but they can benefit any Python developer. You will learn how to use IPython...

Randall J. LeVeque - Writing a Book in Jupyter Notebooks

Description: I will describe an on-going project to write a book exploring mathematical and computational aspects of wave propagation problems.