Welcome to the world, PyScript

One of the main reasons I joined Anaconda seven and a half years ago was the company’s commitment to the data science and Python communities by creating tools that enable people to do more with less.

Today I'm happy to announce a new project that we’ve been working on here at Anaconda, one that we hope will be another serious step towards making programming and data science available and accessible to everyone.

What is PyScript?

PyScript is a framework that allows users to run Python and create rich applications in the browser by simply using special HTML tags provided by the framework itself. Core features include:

  • Python in the browser: Enable drop-in content, external file hosting (made possible by the Pyodide project, thank you!), and application hosting without the reliance on server-side configuration
  • Python ecosystem: Run many popular packages of Python and the scientific stack (such as numpy, pandas, scikit-learn, and more)
  • Python with JavaScript: Bi-directional communication between Python and JavaScript objects and namespaces
  • Environment management: Allow users to define what packages and files to include for the page code to run
  • Visual application development: Use readily available curated UI components, such as buttons, containers, text boxes, and more
  • Flexible framework: A flexible framework that can be leveraged to create and share new pluggable and extensible components directly in Python

All that to say… PyScript is just HTML, only a bit (okay, maybe a lot) more powerful, thanks to the rich and accessible ecosystem of Python libraries.

Wait... what? Why?

tl;dr: As an industry, we have focussed on making the impossible possible, rather than focussing on making the possible accessible to all.

At some point, in the 80s, personal computers became cheaper, which led to them becoming more popular. Most of the HW (C64/ZX80/Apple II) gave the user direct access to BASIC. A programming interface ready to use and a language simple to learn. Later, while systems became more complex (and complicated), frameworks like Visual Basic and HyperCard made it easy to create and package/distribute visual applications. Even the web, when it started, was accessible! All you needed was a text editor and a way to upload your files somewhere, before we created CGI and heavier server-side logic/rendering, etc...

It's somewhat unfortunate that, in the last two or three decades, we created simpler programming languages and made things faster, more scalable, and bigger, but at the cost of an increasing amount of surrounding technology and infrastructure complexity needed to make things work. Today, in addition to the problem of packaging and distributing applications for different architectures and platforms, we have added the complexity of the server/client separation, which requires an additional networking layer and so on... This leads to having to learn about servers, cloud vendors, web stacks, how to test code in a simulated production environment, how to deploy applications... All of a sudden, instead of the one problem users were trying to solve when they started, they now have many problems!

Similarly, modern HTML/CSS and JS are very powerful and can be used to create really powerful and beautiful UIs, but they come with a significant learning curve before users become proficient. This is also true for native GUI applications. In fact, Python, the #1 most popular programming language in the world, doesn't have a straightforward story for building native GUI applications. Nor for making websites [entirely with Python, server + client]! Nor for packaging and distributing applications!

We believe users should be spending their time thinking about and writing their applications and solving real problems. Let's make programming more fun and simpler, while keeping the right technology advancements we made over the past 20/30 years. The more we do, the more users will come.

So, how does it work?

Warning, we are about to get a little technical.... :)

The core concept of PyScript, as a framework, is to provide a set of [opinionated] components and tools that allow users to quickly create and share their applications. We also don't want to reinvent the wheel and aim to reuse the great work that many others are already doing.

With that in mind, let's start from the foundation...

The platform

One of the hottest problems people are working to solve today is: how do we create an abstraction that allows users to ship their applications to multiple HW/SW platforms without having to rewrite and rebuild their code? Most of today's solutions fall into one of two buckets: virtual machines or containers. Both have strengths and limitations, depending on the type of application and how heavily you need to abstract a whole machine.

Instead of creating a whole new technology stack, we want to start from the best option the ecosystem provides today. So, what virtualization abstraction system is the most popular and ubiquitous today? With a little bit of flexibility, we can claim that the browser (browsers in general) is an excellent virtual machine that actually checks a lot of the boxes we are looking for. Browsers are everywhere (from laptops to tablets and phones), secure (browser vendors have been working on security and isolation from the underlying file system for decades), powerful (from HW acceleration to the maturity of WebAssembly), and stable.

The Stack

Keeping in mind one of the premises above, we want to provide a reliable and fun experience to PyScript users (whether they are authoring or consuming an application), ultimately making the web a friendly and hackable place for users. For this reason, we need something beyond the current state of web development. Something that can:

  • give users a first-class programming language that is less weird, more expressive, and easier to learn than JavaScript.
  • centralize: strip away most of the complexity of the client/server modern web by removing that distinction as much as possible.

Luckily for us, the ecosystem has been building the foundations of a very solid stack that we can build on top of:

  • WebAssembly/WASM: a portable binary-code format and text format for executable programs & software interfaces to enable high performance applications on web pages and other environments
  • Emscripten (https://emscripten.org/): an Open Source compiler toolchain to WebAssembly, practically allowing any portable C/C++ codebase to be compiled into WebAssembly
  • Pyodide (https://pyodide.org/) / python-wasm (https://github.com/ethanhs/python-wasm): Python implementations compiled to WebAssembly

As Python found its success standing on the shoulders of giants and building out of the excellent work of many people, we can do that too!

The Interface

One of our highest goals is to make programming and the web a friendly and hackable place where anyone can create interesting things and still have fun.

As hinted above, the presentation layer of the modern web is really powerful and actually not bad, if you know what to do. That means that either you've been doing this for some time or you'll have to spend a considerable amount of time learning. Even then, that ecosystem moves so fast that it is often hard even for experts to keep up.

Instead, we want a system that:

  • offers a clean and simple API
  • supports standard HTML
  • extends the HTML elements with custom components that are opinionated and predictable (do fewer things but do them "as you'd expect")
  • is extensible and offers an easy way for users to define their own new components

To do this, PyScript defines a series of new HTML tags (web components). For instance, to write a simple program, one can just use the <py-script> tag and write Python code inside the tag itself:

<py-script>
"Hello World"
</py-script>

or, alternatively, pass the source file directly:

<py-script src="/my_own_file.py"></py-script>

PyScript will read that code, run it in a Python interpreter, and handle the output accordingly.
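To make "read the code, run it, and handle the output" a bit more concrete, here is a minimal sketch in plain Python of how a snippet can be executed while capturing the value of its final expression, which is what lets a bare "Hello World" string produce visible output. This is an illustration under assumptions, not PyScript's actual implementation: the helper name run_and_capture and the "<py-script>" filename label are hypothetical.

```python
import ast


def run_and_capture(source, namespace=None):
    """Execute `source`; return the value of its final expression, or None.

    Hypothetical helper for illustration only (not part of PyScript).
    """
    namespace = {} if namespace is None else namespace
    tree = ast.parse(source, mode="exec")
    last_value = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so its value can be captured.
        last_expr = ast.Expression(body=tree.body.pop().value)
        exec(compile(tree, "<py-script>", "exec"), namespace)
        last_value = eval(compile(last_expr, "<py-script>", "eval"), namespace)
    else:
        # No trailing expression: just run the statements.
        exec(compile(tree, "<py-script>", "exec"), namespace)
    return last_value


print(run_and_capture('"Hello World"'))  # prints Hello World
```

Passing the same namespace dictionary to several calls keeps names defined by one snippet visible to the next, which is one simple way for separate pieces of code on a page to share state.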

To load (install) additional modules and packages that my application needs, I can use the <py-env> tag to specify my environment requirements:

<py-env>
- bokeh
- numpy
- paths:
  - /utils.py
</py-env>
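The environment block mixes two kinds of entries: package names to install, and local files listed under paths. As a rough sketch of that structure, the hypothetical helper below (parse_env is an assumed name, not a PyScript function) separates the two. It handles only the simple layout shown in this post and says nothing about how PyScript itself parses the tag.

```python
def parse_env(text):
    """Split a <py-env>-style list into (packages, paths).

    Hypothetical helper for illustration; it only understands top-level
    "- name" items and an indented list under "- paths:".
    """
    packages, paths = [], []
    in_paths = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("-"):
            continue  # skip blank lines and anything that is not a list item
        item = line[1:].strip()
        indented = raw[0] in " \t"
        if item == "paths:":
            in_paths = True
        elif in_paths and indented:
            paths.append(item)
        else:
            in_paths = False
            packages.append(item)
    return packages, paths


env = """\
- bokeh
- numpy
- paths:
  - /utils.py
"""
packages, paths = parse_env(env)
print(packages)  # ['bokeh', 'numpy']
print(paths)     # ['/utils.py']
```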

To add a REPL-like component to create an interactive experience, one can just use the <py-repl> tag:

<py-repl id="my-repl" auto-generate="true"> </py-repl>

and it'll create a widget like the one below, that can be used to access everything loaded and executed by the other tags we mentioned before, such as <py-script> and <py-env>.

Since we already loaded pandas and numpy for you, try copying, pasting and running (by hitting the green arrow) the code below:

import pandas as pd
import numpy as np

s = pd.Series([1, 3, 5, np.nan, 6, 8])
s

Voilà!

The point is that, by registering new web components that are simple and very expressive, PyScript spares users from spending their time learning CSS and other web-development-specific technologies.

Where is PyScript today?

Today, April 30th, 2022, PyScript is just at its beginning and is very limited compared to the vision we have for the project. It's a demonstration that we can build the vision and the technology is mature enough for us to create a new way of programming, building, sharing, and deploying applications. Be advised that it's very unstable and limited, but it works and can be used to hack with and build experimental applications.

We hope to make progress fast and that in a few weeks/months, this post will be outdated :).

For more information about the available features and how to get started, visit the project documentation.

Where is PyScript going?

One of the ways I like to think of PyScript is "the Minecraft for software development". A framework that provides basic blocks for users to create their own worlds [applications] or new blocks [PyScript components and widgets] that others can use. In that sense we want to build a framework that is:

  • extremely simple and expressive
  • familiar to users
  • extensible:
      • so users can create new widgets and share them with others
      • so we can support multiple runtimes...
      • ... and multiple languages ...
      • ... that can interop with each other ...
      • ... and yet be controlled to also create secure namespaces
  • able to run on both the browser and the server/native side

In addition to all that, it's worth mentioning that with this project, we are exploring new horizons, and a lot of the old paradigms that are at the roots of "standard" server-side programming are not that untouchable anymore. For instance, I/O, network, and storage on the browser/client side are not the same as in traditional native systems. We'll save that topic for another post, but the point here is that we have the opportunity to innovate and explore, and that's what we want to do.

It's also worth mentioning that a lot of the core technology used to build PyScript is itself recent and very vibrant. As these technologies mature and expose new functionality, we want to extend PyScript to take full advantage of them.

Thanks

PyScript wouldn’t be here without the help of some incredible people. We’d really like to thank:

  • Peter Wang, Kevin Goldsmith, Philipp Rudiger, Antonio Cuni, Russell Keith-Magee, Mateusz Paprocki, Princiya Sequeira, Jannis Leidel, David Mason, Anna NG, Maria Genovese, Katherine Kinnaman, Kent Pribbernow, Albert DeFusco, Michael Verhulst and Chris Leonard for their contributions to the project and for helping spin it up
  • Special thanks to the Pyodide maintainers (Roman Yurchak, Hood Chatham and all the contributors)