
Analyze web-hosted JSON data

This notebook reads and processes JSON-encoded data hosted on the web using a combination of Dask Bag and Dask Dataframe.

This data comes from mybinder.org, a web service for running Jupyter notebooks live on the web (you may be running this notebook there now). mybinder.org publishes a record every time someone launches a live notebook like this one, and stores those records in publicly accessible JSON files, one file per day.

Introduction to the dataset

This data is stored as JSON-encoded text files on the public web. Here are some example lines.

[1]:
import dask.bag as db
db.read_text('https://archive.analytics.mybinder.org/events-2018-11-03.jsonl').take(3)
[1]:
('{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "Qiskit/qiskit-tutorial/master", "status": "success"}\n',
 '{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "ipython/ipython-in-depth/master", "status": "success"}\n',
 '{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "QISKit/qiskit-tutorial/master", "status": "success"}\n')

Each line records one launch of a live notebook on the site. It includes the time the notebook was started, as well as the repository from which it was served.

In this notebook we’ll look at many such files, parse them from JSON to Python dictionaries, and then from there to Pandas dataframes. We’ll then do some simple analyses on this data.
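To make the parsing step concrete, here is a minimal sketch that decodes one of the sample lines shown above using only the standard library. The variable names are illustrative and not part of the notebook.

```python
import json
from datetime import datetime

# One raw line copied from the sample output above (trailing newline included).
line = (
    '{"timestamp": "2018-11-03T00:00:00+00:00", '
    '"schema": "binderhub.jupyter.org/launch", "version": 1, '
    '"provider": "GitHub", "spec": "ipython/ipython-in-depth/master", '
    '"status": "success"}\n'
)

# json.loads tolerates surrounding whitespace, so the newline is harmless.
record = json.loads(line)

# The timestamp is an ISO 8601 string; parse it when a real datetime is needed.
launched_at = datetime.fromisoformat(record["timestamp"])

print(record["provider"], record["spec"], launched_at.year)
```

Every later step in the notebook applies this same `json.loads` call to each line, only spread across many files.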

Start Dask Client for Dashboard

Starting the Dask Client is optional. It also starts the dashboard, which is useful for gaining insight into the computation.

[2]:
from dask.distributed import Client, progress
client = Client(threads_per_worker=1,
                n_workers=4,
                memory_limit='2GB')
client
[2]:

Client

Cluster

  • Workers: 4
  • Cores: 4
  • Memory: 8.00 GB

Get a list of files on the web

The mybinder.org team maintains an index file that points to all other available JSON files of data. Let’s convert this to a list of URLs that we’ll read in the next section.

[3]:
import dask.bag as db
import json
[4]:
db.read_text('https://archive.analytics.mybinder.org/index.jsonl').map(json.loads).compute()
[4]:
[{'name': 'events-2018-11-03.jsonl', 'date': '2018-11-03', 'count': '7057'},
 {'name': 'events-2018-11-04.jsonl', 'date': '2018-11-04', 'count': '7489'},
 {'name': 'events-2018-11-05.jsonl', 'date': '2018-11-05', 'count': '13590'},
 {'name': 'events-2018-11-06.jsonl', 'date': '2018-11-06', 'count': '13920'},
 {'name': 'events-2018-11-07.jsonl', 'date': '2018-11-07', 'count': '12766'},
 {'name': 'events-2018-11-08.jsonl', 'date': '2018-11-08', 'count': '14105'},
 {'name': 'events-2018-11-09.jsonl', 'date': '2018-11-09', 'count': '11843'},
 {'name': 'events-2018-11-10.jsonl', 'date': '2018-11-10', 'count': '7047'},
 {'name': 'events-2018-11-11.jsonl', 'date': '2018-11-11', 'count': '6940'},
 {'name': 'events-2018-11-12.jsonl', 'date': '2018-11-12', 'count': '16322'},
 {'name': 'events-2018-11-13.jsonl', 'date': '2018-11-13', 'count': '16530'},
 {'name': 'events-2018-11-14.jsonl', 'date': '2018-11-14', 'count': '14099'},
 {'name': 'events-2018-11-15.jsonl', 'date': '2018-11-15', 'count': '13182'},
 {'name': 'events-2018-11-16.jsonl', 'date': '2018-11-16', 'count': '12863'},
 {'name': 'events-2018-11-17.jsonl', 'date': '2018-11-17', 'count': '6490'},
 {'name': 'events-2018-11-18.jsonl', 'date': '2018-11-18', 'count': '7310'},
 {'name': 'events-2018-11-19.jsonl', 'date': '2018-11-19', 'count': '13348'},
 {'name': 'events-2018-11-20.jsonl', 'date': '2018-11-20', 'count': '13982'},
 {'name': 'events-2018-11-21.jsonl', 'date': '2018-11-21', 'count': '13165'},
 {'name': 'events-2018-11-22.jsonl', 'date': '2018-11-22', 'count': '12217'},
 {'name': 'events-2018-11-23.jsonl', 'date': '2018-11-23', 'count': '9070'},
 {'name': 'events-2018-11-24.jsonl', 'date': '2018-11-24', 'count': '6798'},
 {'name': 'events-2018-11-25.jsonl', 'date': '2018-11-25', 'count': '6796'},
 {'name': 'events-2018-11-26.jsonl', 'date': '2018-11-26', 'count': '13617'},
 {'name': 'events-2018-11-27.jsonl', 'date': '2018-11-27', 'count': '14964'},
 {'name': 'events-2018-11-28.jsonl', 'date': '2018-11-28', 'count': '14434'},
 {'name': 'events-2018-11-29.jsonl', 'date': '2018-11-29', 'count': '13845'},
 {'name': 'events-2018-11-30.jsonl', 'date': '2018-11-30', 'count': '12109'},
 {'name': 'events-2018-12-01.jsonl', 'date': '2018-12-01', 'count': '6785'},
 {'name': 'events-2018-12-02.jsonl', 'date': '2018-12-02', 'count': '7119'},
 {'name': 'events-2018-12-03.jsonl', 'date': '2018-12-03', 'count': '13946'},
 {'name': 'events-2018-12-04.jsonl', 'date': '2018-12-04', 'count': '13765'},
 {'name': 'events-2018-12-05.jsonl', 'date': '2018-12-05', 'count': '13106'},
 {'name': 'events-2018-12-06.jsonl', 'date': '2018-12-06', 'count': '12249'},
 {'name': 'events-2018-12-07.jsonl', 'date': '2018-12-07', 'count': '10687'},
 {'name': 'events-2018-12-08.jsonl', 'date': '2018-12-08', 'count': '6269'},
 {'name': 'events-2018-12-09.jsonl', 'date': '2018-12-09', 'count': '6639'},
 {'name': 'events-2018-12-10.jsonl', 'date': '2018-12-10', 'count': '12782'},
 {'name': 'events-2018-12-11.jsonl', 'date': '2018-12-11', 'count': '13442'},
 {'name': 'events-2018-12-12.jsonl', 'date': '2018-12-12', 'count': '13069'},
 {'name': 'events-2018-12-13.jsonl', 'date': '2018-12-13', 'count': '15279'},
 {'name': 'events-2018-12-14.jsonl', 'date': '2018-12-14', 'count': '9941'},
 {'name': 'events-2018-12-15.jsonl', 'date': '2018-12-15', 'count': '5358'},
 {'name': 'events-2018-12-16.jsonl', 'date': '2018-12-16', 'count': '6441'},
 {'name': 'events-2018-12-17.jsonl', 'date': '2018-12-17', 'count': '11332'},
 {'name': 'events-2018-12-18.jsonl', 'date': '2018-12-18', 'count': '11971'},
 {'name': 'events-2018-12-19.jsonl', 'date': '2018-12-19', 'count': '10818'},
 {'name': 'events-2018-12-20.jsonl', 'date': '2018-12-20', 'count': '9408'},
 {'name': 'events-2018-12-21.jsonl', 'date': '2018-12-21', 'count': '7741'},
 {'name': 'events-2018-12-22.jsonl', 'date': '2018-12-22', 'count': '4818'},
 {'name': 'events-2018-12-23.jsonl', 'date': '2018-12-23', 'count': '4870'},
 {'name': 'events-2018-12-24.jsonl', 'date': '2018-12-24', 'count': '5974'},
 {'name': 'events-2018-12-25.jsonl', 'date': '2018-12-25', 'count': '4737'},
 {'name': 'events-2018-12-26.jsonl', 'date': '2018-12-26', 'count': '6725'},
 {'name': 'events-2018-12-27.jsonl', 'date': '2018-12-27', 'count': '7998'},
 {'name': 'events-2018-12-28.jsonl', 'date': '2018-12-28', 'count': '8155'},
 {'name': 'events-2018-12-29.jsonl', 'date': '2018-12-29', 'count': '5108'},
 {'name': 'events-2018-12-30.jsonl', 'date': '2018-12-30', 'count': '4428'},
 {'name': 'events-2018-12-31.jsonl', 'date': '2018-12-31', 'count': '4561'},
 {'name': 'events-2019-01-01.jsonl', 'date': '2019-01-01', 'count': '4194'},
 {'name': 'events-2019-01-02.jsonl', 'date': '2019-01-02', 'count': '8559'},
 {'name': 'events-2019-01-03.jsonl', 'date': '2019-01-03', 'count': '9687'},
 {'name': 'events-2019-01-04.jsonl', 'date': '2019-01-04', 'count': '10048'},
 {'name': 'events-2019-01-05.jsonl', 'date': '2019-01-05', 'count': '6012'},
 {'name': 'events-2019-01-06.jsonl', 'date': '2019-01-06', 'count': '6019'},
 {'name': 'events-2019-01-07.jsonl', 'date': '2019-01-07', 'count': '11903'},
 {'name': 'events-2019-01-08.jsonl', 'date': '2019-01-08', 'count': '12777'},
 {'name': 'events-2019-01-09.jsonl', 'date': '2019-01-09', 'count': '13294'},
 {'name': 'events-2019-01-10.jsonl', 'date': '2019-01-10', 'count': '13112'},
 {'name': 'events-2019-01-11.jsonl', 'date': '2019-01-11', 'count': '10327'},
 {'name': 'events-2019-01-12.jsonl', 'date': '2019-01-12', 'count': '6434'},
 {'name': 'events-2019-01-13.jsonl', 'date': '2019-01-13', 'count': '7004'},
 {'name': 'events-2019-01-14.jsonl', 'date': '2019-01-14', 'count': '12898'},
 {'name': 'events-2019-01-15.jsonl', 'date': '2019-01-15', 'count': '12363'},
 {'name': 'events-2019-01-16.jsonl', 'date': '2019-01-16', 'count': '13444'},
 {'name': 'events-2019-01-17.jsonl', 'date': '2019-01-17', 'count': '14452'},
 {'name': 'events-2019-01-18.jsonl', 'date': '2019-01-18', 'count': '12056'},
 {'name': 'events-2019-01-19.jsonl', 'date': '2019-01-19', 'count': '7590'},
 {'name': 'events-2019-01-20.jsonl', 'date': '2019-01-20', 'count': '6740'},
 {'name': 'events-2019-01-21.jsonl', 'date': '2019-01-21', 'count': '12507'},
 {'name': 'events-2019-01-22.jsonl', 'date': '2019-01-22', 'count': '15355'},
 {'name': 'events-2019-01-23.jsonl', 'date': '2019-01-23', 'count': '16319'},
 {'name': 'events-2019-01-24.jsonl', 'date': '2019-01-24', 'count': '16732'},
 {'name': 'events-2019-01-25.jsonl', 'date': '2019-01-25', 'count': '13642'},
 {'name': 'events-2019-01-26.jsonl', 'date': '2019-01-26', 'count': '6976'},
 {'name': 'events-2019-01-27.jsonl', 'date': '2019-01-27', 'count': '7570'},
 {'name': 'events-2019-01-28.jsonl', 'date': '2019-01-28', 'count': '15906'},
 {'name': 'events-2019-01-29.jsonl', 'date': '2019-01-29', 'count': '15534'},
 {'name': 'events-2019-01-30.jsonl', 'date': '2019-01-30', 'count': '15183'},
 {'name': 'events-2019-01-31.jsonl', 'date': '2019-01-31', 'count': '14421'},
 {'name': 'events-2019-02-01.jsonl', 'date': '2019-02-01', 'count': '12352'},
 {'name': 'events-2019-02-02.jsonl', 'date': '2019-02-02', 'count': '7113'},
 {'name': 'events-2019-02-03.jsonl', 'date': '2019-02-03', 'count': '7331'},
 {'name': 'events-2019-02-04.jsonl', 'date': '2019-02-04', 'count': '14493'},
 {'name': 'events-2019-02-05.jsonl', 'date': '2019-02-05', 'count': '14053'},
 {'name': 'events-2019-02-06.jsonl', 'date': '2019-02-06', 'count': '15600'},
 {'name': 'events-2019-02-07.jsonl', 'date': '2019-02-07', 'count': '17158'},
 {'name': 'events-2019-02-08.jsonl', 'date': '2019-02-08', 'count': '14107'},
 {'name': 'events-2019-02-09.jsonl', 'date': '2019-02-09', 'count': '7209'},
 {'name': 'events-2019-02-10.jsonl', 'date': '2019-02-10', 'count': '7422'},
 {'name': 'events-2019-02-11.jsonl', 'date': '2019-02-11', 'count': '17085'},
 {'name': 'events-2019-02-12.jsonl', 'date': '2019-02-12', 'count': '17286'},
 {'name': 'events-2019-02-13.jsonl', 'date': '2019-02-13', 'count': '17181'},
 {'name': 'events-2019-02-14.jsonl', 'date': '2019-02-14', 'count': '19298'},
 {'name': 'events-2019-02-15.jsonl', 'date': '2019-02-15', 'count': '13387'},
 {'name': 'events-2019-02-16.jsonl', 'date': '2019-02-16', 'count': '8182'},
 {'name': 'events-2019-02-17.jsonl', 'date': '2019-02-17', 'count': '8142'},
 {'name': 'events-2019-02-18.jsonl', 'date': '2019-02-18', 'count': '16364'},
 {'name': 'events-2019-02-19.jsonl', 'date': '2019-02-19', 'count': '18090'},
 {'name': 'events-2019-02-20.jsonl', 'date': '2019-02-20', 'count': '17441'},
 {'name': 'events-2019-02-21.jsonl', 'date': '2019-02-21', 'count': '18844'},
 {'name': 'events-2019-02-22.jsonl', 'date': '2019-02-22', 'count': '15400'},
 {'name': 'events-2019-02-23.jsonl', 'date': '2019-02-23', 'count': '8879'},
 {'name': 'events-2019-02-24.jsonl', 'date': '2019-02-24', 'count': '9342'},
 {'name': 'events-2019-02-25.jsonl', 'date': '2019-02-25', 'count': '16999'},
 {'name': 'events-2019-02-26.jsonl', 'date': '2019-02-26', 'count': '18514'},
 {'name': 'events-2019-02-27.jsonl', 'date': '2019-02-27', 'count': '15799'},
 {'name': 'events-2019-02-28.jsonl', 'date': '2019-02-28', 'count': '18702'},
 {'name': 'events-2019-03-01.jsonl', 'date': '2019-03-01', 'count': '14222'},
 {'name': 'events-2019-03-02.jsonl', 'date': '2019-03-02', 'count': '8990'},
 {'name': 'events-2019-03-03.jsonl', 'date': '2019-03-03', 'count': '8503'},
 {'name': 'events-2019-03-04.jsonl', 'date': '2019-03-04', 'count': '17427'},
 {'name': 'events-2019-03-05.jsonl', 'date': '2019-03-05', 'count': '17732'},
 {'name': 'events-2019-03-06.jsonl', 'date': '2019-03-06', 'count': '17532'},
 {'name': 'events-2019-03-07.jsonl', 'date': '2019-03-07', 'count': '17622'},
 {'name': 'events-2019-03-08.jsonl', 'date': '2019-03-08', 'count': '13110'},
 {'name': 'events-2019-03-09.jsonl', 'date': '2019-03-09', 'count': '9132'},
 {'name': 'events-2019-03-10.jsonl', 'date': '2019-03-10', 'count': '8989'},
 {'name': 'events-2019-03-11.jsonl', 'date': '2019-03-11', 'count': '16334'},
 {'name': 'events-2019-03-12.jsonl', 'date': '2019-03-12', 'count': '18637'},
 {'name': 'events-2019-03-13.jsonl', 'date': '2019-03-13', 'count': '18355'},
 {'name': 'events-2019-03-14.jsonl', 'date': '2019-03-14', 'count': '18657'},
 {'name': 'events-2019-03-15.jsonl', 'date': '2019-03-15', 'count': '15206'},
 {'name': 'events-2019-03-16.jsonl', 'date': '2019-03-16', 'count': '8606'},
 {'name': 'events-2019-03-17.jsonl', 'date': '2019-03-17', 'count': '8110'},
 {'name': 'events-2019-03-18.jsonl', 'date': '2019-03-18', 'count': '15846'},
 {'name': 'events-2019-03-19.jsonl', 'date': '2019-03-19', 'count': '17909'},
 {'name': 'events-2019-03-20.jsonl', 'date': '2019-03-20', 'count': '15610'},
 {'name': 'events-2019-03-21.jsonl', 'date': '2019-03-21', 'count': '14671'},
 {'name': 'events-2019-03-22.jsonl', 'date': '2019-03-22', 'count': '12962'},
 {'name': 'events-2019-03-23.jsonl', 'date': '2019-03-23', 'count': '7941'},
 {'name': 'events-2019-03-24.jsonl', 'date': '2019-03-24', 'count': '7248'},
 {'name': 'events-2019-03-25.jsonl', 'date': '2019-03-25', 'count': '16775'},
 {'name': 'events-2019-03-26.jsonl', 'date': '2019-03-26', 'count': '18064'},
 {'name': 'events-2019-03-27.jsonl', 'date': '2019-03-27', 'count': '17773'},
 {'name': 'events-2019-03-28.jsonl', 'date': '2019-03-28', 'count': '17945'},
 {'name': 'events-2019-03-29.jsonl', 'date': '2019-03-29', 'count': '13126'},
 {'name': 'events-2019-03-30.jsonl', 'date': '2019-03-30', 'count': '7315'},
 {'name': 'events-2019-03-31.jsonl', 'date': '2019-03-31', 'count': '7750'},
 {'name': 'events-2019-04-01.jsonl', 'date': '2019-04-01', 'count': '16049'},
 {'name': 'events-2019-04-02.jsonl', 'date': '2019-04-02', 'count': '18909'},
 {'name': 'events-2019-04-03.jsonl', 'date': '2019-04-03', 'count': '17629'},
 {'name': 'events-2019-04-04.jsonl', 'date': '2019-04-04', 'count': '17635'},
 {'name': 'events-2019-04-05.jsonl', 'date': '2019-04-05', 'count': '14057'},
 {'name': 'events-2019-04-06.jsonl', 'date': '2019-04-06', 'count': '8297'},
 {'name': 'events-2019-04-07.jsonl', 'date': '2019-04-07', 'count': '8726'},
 {'name': 'events-2019-04-08.jsonl', 'date': '2019-04-08', 'count': '18217'},
 {'name': 'events-2019-04-09.jsonl', 'date': '2019-04-09', 'count': '17833'},
 {'name': 'events-2019-04-10.jsonl', 'date': '2019-04-10', 'count': '19018'},
 {'name': 'events-2019-04-11.jsonl', 'date': '2019-04-11', 'count': '19173'},
 {'name': 'events-2019-04-12.jsonl', 'date': '2019-04-12', 'count': '15502'},
 {'name': 'events-2019-04-13.jsonl', 'date': '2019-04-13', 'count': '7839'},
 {'name': 'events-2019-04-14.jsonl', 'date': '2019-04-14', 'count': '6907'}]
[5]:
filenames = (db.read_text('https://archive.analytics.mybinder.org/index.jsonl')
               .map(json.loads)
               .pluck('name')
               .compute())

filenames = ['https://archive.analytics.mybinder.org/' + fn for fn in filenames]
filenames[:5]
[5]:
['https://archive.analytics.mybinder.org/events-2018-11-03.jsonl',
 'https://archive.analytics.mybinder.org/events-2018-11-04.jsonl',
 'https://archive.analytics.mybinder.org/events-2018-11-05.jsonl',
 'https://archive.analytics.mybinder.org/events-2018-11-06.jsonl',
 'https://archive.analytics.mybinder.org/events-2018-11-07.jsonl']
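The index also carries a `count` field, which is stored as a string in the records above. As a rough sketch, the snippet below builds URLs and totals the counts for three index entries without touching the network. The raw lines are reconstructed from the output shown earlier, on the assumption that the index holds one JSON object per line.

```python
import json

# Three index lines reconstructed from the records displayed above
# (assumption: the raw file stores one JSON object per line).
index_lines = [
    '{"name": "events-2018-11-03.jsonl", "date": "2018-11-03", "count": "7057"}',
    '{"name": "events-2018-11-04.jsonl", "date": "2018-11-04", "count": "7489"}',
    '{"name": "events-2018-11-05.jsonl", "date": "2018-11-05", "count": "13590"}',
]

entries = [json.loads(line) for line in index_lines]

# Prefix each file name with the archive's base URL.
base_url = "https://archive.analytics.mybinder.org/"
urls = [base_url + entry["name"] for entry in entries]

# "count" is a string in the index, so convert it before summing.
total_events = sum(int(entry["count"]) for entry in entries)

print(urls[0])
print(total_events)
```

Summing the counts this way gives a quick estimate of how many events a set of files contains before any of them is downloaded.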

Create Bag of all events

We now create a Dask Bag around that list of URLs, and then call the json.loads function on every line to turn those lines of JSON-encoded text into Python dictionaries that can be more easily manipulated.

[6]:
events = db.read_text(filenames).map(json.loads)
events.take(2)
[6]:
({'timestamp': '2018-11-03T00:00:00+00:00',
  'schema': 'binderhub.jupyter.org/launch',
  'version': 1,
  'provider': 'GitHub',
  'spec': 'Qiskit/qiskit-tutorial/master',
  'status': 'success'},
 {'timestamp': '2018-11-03T00:00:00+00:00',
  'schema': 'binderhub.jupyter.org/launch',
  'version': 1,
  'provider': 'GitHub',
  'spec': 'ipython/ipython-in-depth/master',
  'status': 'success'})
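The same pattern can be tried offline. The sketch below is an illustration rather than the notebook's own cell: it swaps `db.read_text` for `db.from_sequence`, fed with the three sample lines shown earlier, and uses the synchronous scheduler so the tiny example stays in one process.

```python
import json

import dask.bag as db

# Raw lines copied from the sample output earlier in this notebook.
lines = [
    '{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "Qiskit/qiskit-tutorial/master", "status": "success"}\n',
    '{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "ipython/ipython-in-depth/master", "status": "success"}\n',
    '{"timestamp": "2018-11-03T00:00:00+00:00", "schema": "binderhub.jupyter.org/launch", "version": 1, "provider": "GitHub", "spec": "QISKit/qiskit-tutorial/master", "status": "success"}\n',
]

# from_sequence stands in for read_text so that no download is needed.
events = db.from_sequence(lines, npartitions=2).map(json.loads)

# Bags are lazy: nothing is parsed until compute() is called.
specs = events.pluck("spec").compute(scheduler="synchronous")
n_events = events.count().compute(scheduler="synchronous")

print(specs)
print(n_events)
```

Because the bag is lazy, chaining further operations such as `pluck` costs nothing until a result is requested.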

Convert to Dask Dataframe

Finally, we can convert our bag of Python dictionaries into a Dask Dataframe, and follow up with more Pandas-like computations.

We’ll count the most frequently launched binders, using Pandas syntax.

[8]:
df = events.to_dataframe()
df.head()
[8]:
   provider  schema                        spec                               status   timestamp                  version
0  GitHub    binderhub.jupyter.org/launch  Qiskit/qiskit-tutorial/master      success  2018-11-03T00:00:00+00:00  1
1  GitHub    binderhub.jupyter.org/launch  ipython/ipython-in-depth/master    success  2018-11-03T00:00:00+00:00  1
2  GitHub    binderhub.jupyter.org/launch  QISKit/qiskit-tutorial/master      success  2018-11-03T00:00:00+00:00  1
3  GitHub    binderhub.jupyter.org/launch  QISKit/qiskit-tutorial/master      success  2018-11-03T00:01:00+00:00  1
4  GitHub    binderhub.jupyter.org/launch  jupyterlab/jupyterlab-demo/master  success  2018-11-03T00:01:00+00:00  1
[9]:
df.spec.value_counts().nlargest(20).to_frame().compute()
[9]:
spec
ipython/ipython-in-depth/master 942326
jupyterlab/jupyterlab-demo/master 230753
ines/spacy-io-binder/live 79631
DS-100/textbook/master 65351
bokeh/bokeh-notebooks/master 42819
binder-examples/r/master 35876
rationalmatter/juno-demo-notebooks/master 28532
QuantStack/xeus-cling/stable 24895
RasaHQ/rasa_core/master 17247
QISKit/qiskit-tutorial/master 17246
binder-examples/julia-python/master 13858
numba/numba-examples/master 13744
nteract/examples/master 7670
stencila/examples/elife-30274-binder 7388
data-8/textbook/gh-pages 7181
dask/dask-examples/master 6281
mkozturk/CMPE140/master 6278
wshuyi/demo-spacy-text-processing/master 6199
uvg-cc2005/jupyter-notebooks-2019/master 5655
bethgelab/bwki-notebooks/master 5581

Persist in memory

This dataset fits nicely into memory. Let’s avoid downloading the data every time we run an operation and instead keep it local in memory.

[10]:
df = df.persist()

Honestly, at this point it makes more sense to just switch to Pandas, but this is a Dask example, so we’ll continue with Dask dataframe.

Investigate providers other than GitHub

Most binders are specified as git repositories on GitHub, but not all. Let’s investigate other providers.

[11]:
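import urllib.parse  # import the submodule explicitly; `import urllib` alone does not guarantee urllib.parse is loaded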
[12]:
df.provider.value_counts().compute()
[12]:
GitHub    1967967
GitLab       6419
Git          1560
Gist          194
Name: provider, dtype: int64
[13]:
(df[df.provider == 'GitLab']
 .spec
 .map(urllib.parse.unquote, meta=('spec', object))
 .value_counts()
 .to_frame()
 .compute())
[13]:
spec
rruizz/inforfis/R 1006
rruizz/inforfis/master 898
bhugueney/cxx-init-for-python-dev/master 459
dbernhard/PythonM1/master 397
shadaba/lss-handson/master 259
DGothrek/ipyaggrid/binder-demo 258
dbernhard/pythonm1s2/master 256
dbernhard/ProgHumaNumTAL/master 247
dbernhard/JavaM2/master 236
albert.van.breemen/masterclassdeeplearning/master 161
clemej/data601-clemens-fall18/master 132
wichit2s/programmingfundamentals/master 115
lfortran/web/lfortran-binder/master 97
kkmann/shortcourse-data-science-toolbox/master 84
andrey.kovalev/imagination/master 70
rruizz/inforfis/autin 60
PersonalDataIO/toronto-letter/master 55
kitsunix/pyHIBP/pyHIBP-binder/master 54
wichit2s/pythonintro/master 54
slloyd/python-introduction/master 47
sgmarkets/sgmarkets-api-notebooks/master 46
biehl/jscatter/master 44
g2lab/fossgis2019-geopython-vector/master 42
open-scientist/formation-data-reproductibilite/master 41
thoma.rey/FV_HipoDiff/master 39
brivadeneira/recursos-didacticos-telecomunicaciones/master 38
jhskone/intro2python/master 38
PersonalDataIO/gdpr-rights-notebook/master 36
rokm/informed_action/master 35
hkex/pyr2/master 35
... ...
vaulter82/Prion-stats/master 1
Arnall/PreciPyth/master 1
wallacmj/notebooks/master 1
s-boardman/peco/8d95e4fa08541d2521d2a3ecf43414a8aa6b79bc 1
nmg/704p2/master 1
adophobr/lightwavesystens/master 1
jibe-b/sabre/master 1
david-chen/test/master 1
elfua/ipython-notebooks/master 1
g_money/folio_track/master 1
cbkmephisto/ab2lab/no_julia 1
gtsaito/binder-test/master 1
hassakura/integracao_low_eans/master 1
hassakura/teste_exportcsv/master 1
hixi/colloqium-presentation/master 1
yerbby/scientific-python-lectures/master 1
jibe-b/crowdsource-science-improvement/dev 1
atomap/atomap_demos/master 1
agrumery/aGrUM/0.13.5 1
jibe-b/sens-de-la-vie-workflow-brouillon-tests-tuto/dev 1
kerel-fs/jupyter-notebooks/master 1
korzhimanov/kin-eq-num/master 1
anarcat/terms-benchmarks/master 1
krsna1/qml-mooc/PhaseEstimationUsingCirq 1
ktiwari9/gaussian-process/master 1
lokeller/matrix-bot/master 1
mfleck/test_mybinder_org/master 1
damonlh/puma/master 1
nmg/enee704h1/master 1
jdiep/master-thesis/master 1

204 rows × 1 columns
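The `unquote` step in the query above suggests that some specs are percent-encoded in the raw records. The sketch below shows what `urllib.parse.unquote` does to such a value. The encoded string is a hypothetical raw spec constructed for illustration, not a value copied from the data.

```python
from urllib.parse import unquote

# Hypothetical raw spec: the "/" inside the GitLab namespace is
# percent-encoded as %2F (an assumption about the stored form).
raw_spec = "rruizz%2Finforfis/master"

# unquote turns percent-escapes back into the characters they stand for.
decoded = unquote(raw_spec)
print(decoded)

# A string without percent-escapes passes through unchanged.
plain = unquote("rruizz/inforfis/master")
print(plain == decoded)
```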

[14]:
(df[df.provider == 'Git']
 .spec
 .apply(urllib.parse.unquote, meta=('spec', object))
 .value_counts()
 .to_frame()
 .compute())
[14]:
spec
https://bitbucket.org/gaur/1820/a4027afe0aa592d49f3989b2e9c8136c36322a77 75
https://gitlab.rc.uab.edu/mcbios19_single_cell/single_cell_rnaseq_hands-on_1.git/a20f707fc0b67f6eb4f9bf85a5daacc52c125df6 62
https://gricad-gitlab.univ-grenoble-alpes.fr/nonsmooth/siconos-tutorial.git/b08a0514b22b3927b58bddce3c4018f27ac0fc7d 57
https://collaborating.tuhh.de/cip3725/ib_base.git/0a1f4f66a1a3c29ff347b2abc79bb292b0be17ca 56
https://gricad-gitlab.univ-grenoble-alpes.fr/chatelaf/conference-ia/d98a199f66d603b0b1e7c25fbe1341d29a40cd39 41
https://gitlab.mech.kuleuven.be/rob-expressiongraphs/docker/etasl-binder.git/127d6c9f33a938a98607505ef30237b5223000e4 38
https://gitlab.oceantrack.org/otndc/fact-workshop/a174f5bc60cf9f0c5b86851204c7c85fcfb98131 35
https://risk-engineering.org/git/notebooks.git/527b4fdaa5fe43294ed5de196b96262b1c60522b 34
https://git.rcc.uchicago.edu/jhskone/multiproc_py.git/38f9bb6ce3602b73a8ddd1dbcad3f5f9a8d21f6a 34
https://risk-engineering.org/git/notebooks.git/a18f7f0e6a707ccfa4478d3fc73d53ada11605fb 34
https://gitlab.ethz.ch/darioce/sysbio_ss2019.git/4c383c1e118678cf26ad8b6d9c0be7f41b15ad69 29
https://bitbucket.org/qlouder/lumc-ml-caffe/acc2701f43f267361594e72980ec59602dab7fd7 27
https://api.jovian.ai/git/e5cfe043873f4f3c9287507016747ae5_5.git/fc3bc662bc8b8dd00a66c336081ee39e14df5f11 23
https://framagit.org/mfauvel/omp_machine_learning/77d246991186d758147fc280eeef0eed563d525f 23
https://bitbucket.org/ml_tsu/ml.git/1bb9f4b9b349e1771000219055e597287201bfb2 22
https://api.jovian.ai/git/5bc23520933b4cc187cfe18e5dd7e2ed_7.git/2558b3811d8a9759c89ef0afce1a7cc28f58568d 21
https://collaborating.tuhh.de/xldrkp/jupyter-notebook-beispiel/e895284be3c0a128ed97dc007fcd7497ef19d2bc 20
https://github.com/camilo912/research_rating/6fe3d52d8dce4cbc548cfd2a19299beafa0eeff2 20
https://risk-engineering.org/git/notebooks.git/eab9bdb1a9bbf59833b26d39b02c5a5e743807c4 20
https://risk-engineering.org/git/notebooks.git/cc040c44c37a2cd67b0f5c2b65b029b93d178b05 19
https://risk-engineering.org/git/notebooks.git/0af42a9c5987be368b6e1fef1ad11816703e0db9 19
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/2d76ce86722edbc43d58072115793038baf1b2bf 18
https://gitlab.tubit.tu-berlin.de/jraymond/domenum/80e0602898b759f9cd103cbc18c3e88ce830db06 14
https://code.ill.fr/panosc/data-analysis-services/hdf-viewer/c60798a9e7d618b9810462d38b85121774f2c00b 13
https://gitlab.ethz.ch/darioce/sysbio_ss2019.git/8bad4772a157e1f0e5cab8716bc140aa2ba29bda 12
https://[email protected]/boismenux/ml.git/0f140be32895f01d66392b4081e6589e42395f53 12
https://api.jovian.ai/git/e5cfe043873f4f3c9287507016747ae5_6.git/8144d421554e1119f1d20cd17d7830c2fdd4ff64 12
https://bitbucket.org/qlouder/lumc-ml-caffe/4e78f53a1b8aaf24e559be509b57197927818d19 12
https://api.jovian.ai/git/e556978bda9343f3b30b3a9fd2a25012_4.git/f63a3c19b0086a654c55d57a14b6963526866d6f 11
https://api.jovian.ai/git/a1b40b04f5174a18bd05b17e3dffb0f0_2.git/afcaca04064472c678f541198f9e0f03c1154c01 11
... ...
https://gitlab.oceantrack.org/otndc/fact-workshop/c3853384f803253544d9b7333cc01d4d7bfa72a6 1
https://gitlab.inria.fr/vdrevell/istic-robm/b45ab9de12dbf06c92d3dd5dd7b884e75dad2c35 1
https://gitlab.oceantrack.org/otndc/fact-workshop/c631ee43a05abd62569262f1e599a1e5ee7ec95d 1
https://gitlab.pasteur.fr/dbikard/badSeed_public.git/da655dd38fe444c918ac754206568ad39e03be45 1
https://code.ovgu.de/spannan/Lehre.git/cfa3af5edb425830f40da4447b58ad8f35f80109 1
https://gitlab.pasteur.fr/dbikard/crisprbrowser.git/a894255cd5055e9b541296ab3d9018d67c955156 1
https://gitlab.physik.uni-muenchen.de/Martin.Ritter/python-intro/6b2e01e55667d833e0028245e333144536d43e4b 1
https://gitlab.rc.uab.edu/mcbios19_single_cell/single_cell_rnaseq_hands-on_1.git/3421d4d1ba744bb079fa6233b38aa03e5844f46f 1
https://[email protected]/chaffra/ipydevedit.git/a5e9bb5d659b50068213ab0379ecade80f8e1e29 1
https://gitlab.inria.fr/vdrevell/istic-robm/e345fd81cb6f70ba1fa4646ee80c0a619dd4feac 1
https://gitlab.inria.fr/vdrevell/istic-robm/4b79f44a45ffbd2a52c12585d119901edd7c9b29 1
https://git.fmrib.ox.ac.uk/fsl/ukbparse.git/5404c4d45bd8b0b2003deeb57fc1d8e4904f8d9e 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/50be26bfca663a8993ace469cac667ce9a5393b9 1
https://gitlab.in2p3.fr/gregoire.henning/tiipp-invprob-2018/6662fe629ebb3ad62c6d9de8348c73cac781c7e3 1
https://gitlab.in2p3.fr/gregoire.henning/tiipp-invprob-2018/6f330c93098d605b1fc3edfe63b1382de4bc9be4 1
https://gitlab.in2p3.fr/gregoire.henning/tiipp-invprob-2018/d874f3b9dcfa20c04266ad852246cd43178d0f42 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/b9d8579a3728119d6bafa6fb84c206e1a775856e 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/a1918e89aaee1b4f3fd21fc4a8170bc95c5a1c30 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/73f4e1b132d4ff83c89eed286adc3c6661ee8d7f 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/5d6389f49fca34a7c88aaef4982b60204b1a5dc2 1
https://gitlab.in2p3.fr/julien.bey/tiipp-invprob-2018/7f529faef8e27d397d5c63d286cb251342c70655 1
https://gitlab.inria.fr/vdrevell/istic-robm/23b80d226348758b877263952f520d8264016d77 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/1da9a3d30d2e7850d53151b7f7d2f2227a5377ee 1
https://gist.github.com/a38d389a18af62de2943c10e2c367f8d.git/10b85a580f967771dc7eed070af5afd46b89137d 1
https://gitlab.in2p3.fr/julien.bey/tiipp-invprob-2018/a5296c250bd3e4ac0f6941cf5664fbca59c56a7c 1
https://gitlab.in2p3.fr/julien.bey/tiipp-invprob-2018/a6b016ac3fff072f339dfc3cc43c65709f95a357 1
https://gitlab.in2p3.fr/thomas.giraud/tiipp-invprob-2018/69a47d1ffdd94f376f3502e1c107f091e82f969a 1
https://gitlab.in2p3.fr/thomas.giraud/tiipp-invprob-2018/e079aa04a8d1aa5772e803bf088e14ff6d37c8db 1
https://framagit.org/gafhyb/gitCourse.git/c5314ee08cbb03dc35494f7fee3fc764ed151c96 1
http://code.datamode.com/datamode/binder.git/85bd134f3713f8a8cb3e22612929987b49b675a5 1

327 rows × 1 columns
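In the output above each spec ends with a commit hash, so one repository can appear on several rows. One possible follow-up, sketched here with the standard library on five rows copied from that output, is to drop the last path component and sum the counts per repository. The helper name `repo_of` is hypothetical.

```python
from collections import Counter

# (spec, count) pairs copied from the value_counts output above.
rows = [
    ("https://risk-engineering.org/git/notebooks.git/527b4fdaa5fe43294ed5de196b96262b1c60522b", 34),
    ("https://risk-engineering.org/git/notebooks.git/a18f7f0e6a707ccfa4478d3fc73d53ada11605fb", 34),
    ("https://bitbucket.org/qlouder/lumc-ml-caffe/acc2701f43f267361594e72980ec59602dab7fd7", 27),
    ("https://risk-engineering.org/git/notebooks.git/eab9bdb1a9bbf59833b26d39b02c5a5e743807c4", 20),
    ("https://bitbucket.org/qlouder/lumc-ml-caffe/4e78f53a1b8aaf24e559be509b57197927818d19", 12),
]


def repo_of(spec):
    """Return the spec without its final path component (the commit hash)."""
    return spec.rsplit("/", 1)[0]


# Accumulate launch counts per repository instead of per commit.
per_repo = Counter()
for spec, count in rows:
    per_repo[repo_of(spec)] += count

for repo, count in per_repo.most_common():
    print(repo, count)
```

On the Dask Dataframe the same idea would be a string operation on `spec` followed by a group-by, but that is left as an exercise here.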
