Data to Insight is not a software supplier in the traditional sense -- we're here to help local authority (LA) colleagues get work done and share it with each other. The best thing about Data to Insight is the community of analysts who are doing and sharing great work through this network.
We recognise that different LAs want different levels of engagement with our work. As such, we offer four key collaboration approaches to suit different users’ information needs.
PATCh
PATCh is the Python Analysis Toolkit for Children's services. It lets users run data analysis and visualisation apps, coded by analysts and Data to Insight, entirely in the browser and without a server. Users get the power of Python for data work with no need for a local Python install, no need to worry about the safety of servers, and no need to know any Python themselves.
Contributing to PATCh
Whilst anyone can use apps on PATCh, the tool itself is made possible by LA analysts, who code the apps with support and teaching from Data to Insight. Data to Insight provides teaching and consulting for any LA analyst with any level of Python, even none, to help them get to the point where they can code Python and write apps.
For those who already know a little Python, we have two videos that will be useful:
The workflow for writing PATCh apps on GitHub
How to write a basic PATCh app for people who already know a little Python
To get involved you'll need a GitHub account, and to contact us with your username so we can add you to the PATCh repository and give you read/write permissions.
How it works
Below are three summaries of how D2I's Python web tools work, each intended for a different audience, followed by a brief note about the potential to deploy versions of the tools locally.
The shortest version:
- Browsers don’t normally understand Python, so apps that use Python normally need a server. To avoid a server, we need a way for the browser itself to understand Python.
- Our tools run on Pyodide. Pyodide allows your browser to translate Python code into something it understands, so it can show you an output.
- When you load a page with Pyodide, it looks at all the Python code, tells the browser what it means and how to understand it, and stores that in the same way that information about what to do on a page is normally stored on your computer when you visit any other site.
- This means that your browser can use Python code to analyse data without needing to send the data to a server somewhere, because by the time you see the page, your local browser already knows what to do if it sees some data.
- The code for our apps is all stored on and served from GitHub, open source, so you can have a look at what it’s doing if you want.
- So, the tools use Python, and an interpreter must be loaded for that to happen, but really it’s no different from any web page running normally (as it would if it were using, say, JavaScript), except that it is now able to understand an additional language.
- This means our apps can analyse data without installing Python, and without sending data off your local system.
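As a rough illustration of the points above, here is a minimal sketch of the kind of analysis function that could run inside the browser. It is not taken from any PATCh app: the function name `count_by_column`, the column name `placement_type`, and the sample data are all hypothetical. It uses only the Python standard library and works on bytes already held in memory, which is how a file chosen by the user would typically reach the Python code.

```python
import csv
import io
from collections import Counter


def count_by_column(file_bytes: bytes, column: str) -> dict:
    """Count rows per distinct value of `column` in an in-memory CSV file."""
    # Decode the bytes in memory; nothing is written to disk or sent anywhere.
    text = file_bytes.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text))
    counts = Counter(row[column] for row in reader)
    return dict(counts)


# Hypothetical data standing in for a file chosen by the user.
sample = b"child_id,placement_type\n1,Foster\n2,Residential\n3,Foster\n"
print(count_by_column(sample, "placement_type"))
# prints {'Foster': 2, 'Residential': 1}
```

Nothing in the function opens a network connection: the data comes in as an argument and the result goes back as a return value.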
The short version:
- When you visit a website, you load code telling your browser what to do. Assuming you don’t do anything that makes any more outside connections (such as clicking a link to a new page or loading a video hosted elsewhere), you’re able to use and interact with all the features on the page you’re on without sending information anywhere: your browser already knows what to do when you tell it to perform an action, from when it initially loaded the page.
- One of the things your browser can run is WebAssembly, a compact, low-level code format. WebAssembly is like a step up from binary, but it’s hard for humans to understand and write, so it is normally produced from code written in other languages. Browsers also run JavaScript locally, without running the app on a server, and it’s the common language for apps and browser tools.
- JavaScript is not good for data work; whilst it would be possible to code D2I apps using JavaScript, it’s needlessly complicated. Python, however, is good for data work.
- Normally, as browsers can’t understand Python on their own, if you want to run Python on a web page, you need a server or cloud system behind the web page running Python and sending outputs back to your computer for display in the browser.
- For data governance reasons, we can’t send Children’s Services data off local networks and out to servers or cloud systems running Python, so we need a way for the browser to understand and run Python code locally, the same way it already runs and understands things like JavaScript.
- Luckily, Pyodide exists. Pyodide is a Python interpreter built to run on WebAssembly. So, once Pyodide is loaded, you can run Python apps on websites without a server, just as you can run JavaScript ones.
- The exciting bit about this is that once you’ve loaded the page, your browser already knows what to do with any data, without the need to send it anywhere. You can just point it to where the data is and off it goes. Simply: you don’t install Python on your computer, and your data stays on your computer.
- Our apps do all the processing locally, and don’t have an upload button! You can test this yourself by loading an app, turning off the internet connection, and then putting data in and seeing that it processes it and gives outputs without even being able to send the data anywhere.
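The "turn off the internet and try it" test can be mimicked in plain Python. The sketch below is only an analogy and is not how PATCh itself is tested: `network_blocked` and `summarise_ages` are hypothetical names, and replacing `socket.socket` is a simple stand-in for unplugging the network. If the analysis tried to open a connection while blocked, it would raise an error instead of returning a result.

```python
import socket
from contextlib import contextmanager


@contextmanager
def network_blocked():
    """Temporarily make any attempt to open a network socket raise an error."""
    original_socket = socket.socket

    def refuse(*args, **kwargs):
        raise RuntimeError("network access attempted")

    socket.socket = refuse
    try:
        yield
    finally:
        # Always restore normal networking afterwards.
        socket.socket = original_socket


def summarise_ages(ages):
    """Purely local calculation: no files, no network."""
    return {"count": len(ages), "mean": sum(ages) / len(ages), "max": max(ages)}


with network_blocked():
    result = summarise_ages([4, 9, 14])
print(result)
# prints {'count': 3, 'mean': 9.0, 'max': 14}
```

The calculation completes with networking disabled, which is the same property the in-browser test demonstrates for a loaded app.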
The long version:
Our tools run like any other website. There is code that tells your browser what to do and what to show, and your browser runs it. For most websites, the code is written in JavaScript (JS), and the JS tells your browser what to show, what to do when you click a button or move a slider, etc. Your browser then organises and displays that with HTML. The only difference between D2I tools and other websites is that D2I tools use Python to do data analysis, which makes a much wider range of analysis far easier than it would be in JS. In almost every other case where a website runs Python to do data/database/back-end work, the website connects to a server/cloud to run that Python and send the outputs back to the user. We need to avoid that, as there are data governance concerns whenever data leaves a local environment. D2I tools use a very clever package called Pyodide that allows the browser itself, rather than an external server, to understand Python code, and then lets JS take the outputs and show them in the browser.
An important thing to note here is that the Python ISN’T doing anything the JS couldn’t already do. We opted to code in Python because the pool of analysts who know or want to know Python is larger, because it’s far easier to write data analysis code in Python than in JS, and because Python is easier to teach and more useful to analysts outside of our tools. Another note is that huge numbers of websites DO already use Python, just server side rather than client side. So: Python is being interpreted locally but not installed locally; it is just run and understood by your browser, and your data does not go anywhere.
In a bit more depth: Pyodide uses WebAssembly, a low-level code format that browsers can already run, to understand and run Python code (which browsers don’t understand as standard) and pass the outputs to JS to display. This means that the JS can be used to tell the web page what to display once the Python has done the calculations. So, Pyodide acts as the interpreter that lets the browser, through WebAssembly, make sense of the Python code. This is just the same as when you load any other web page, except you’re also loading Pyodide so your browser knows how to understand and talk to the Python. The Python gets no more access to your computer than any other website does; it’s just novel that the websites use Python rather than a more usual client-side language. So, to sum up that bit: our tools use Pyodide, running on WebAssembly alongside the JS the browser already runs, so that the browser can run the Python and show the outputs.
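To make the hand-off from Python to JS a little more concrete, here is a minimal sketch of the Python side only. It is an assumption about one convenient way to do it, not a description of how PATCh apps are written: the function name `analyse` and the `status` field are hypothetical. The idea is that the Python does the calculation and returns something simple, such as a JSON string, which page code can then display.

```python
import json


def analyse(rows):
    """Return a JSON string summarising `rows`; a plain string is easy for page code to display."""
    open_cases = [row for row in rows if row["status"] == "open"]
    summary = {"total": len(rows), "open": len(open_cases)}
    return json.dumps(summary)


# Hypothetical records standing in for data loaded in the browser.
rows = [{"status": "open"}, {"status": "closed"}, {"status": "open"}]
print(analyse(rows))
# prints {"total": 3, "open": 2}
```

The JS side, which would receive this string and put it on the page, is deliberately left out here.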
We could have made tools that use JS to do exactly the same thing as our tools do. They would then be no different to any other website, but the code would be more complex and harder to maintain, and it would be harder to find volunteer analysts to get involved, given how poorly suited JS is to data work. What all this means is that visiting our tools, setting aside the data governance context, is no more dangerous than visiting any other site; the fact that some of the tools run on Python has no impact on this. So, hopefully that clears up worries about the fact that the tools use Python, or at least provides some context to get going on.
Now, on to the data issue. What’s exciting about Pyodide is that it is a serverless solution to getting browsers to run Python. Many websites use Python to do data/database work, but they are running the Python on a server somewhere, which is exactly what we are using Pyodide to avoid. The way this works normally (and a reason why browsers don’t normally need to be able to interpret Python) is that whilst you have a web page up, every time you do something you’re making a connection to a server. That server has Python installed, runs code to work out what to do, and then sends some output back to your browser to display. So, if we ran our tools on a server, we’d provide an interface to upload data, the data would be sent to our server where it would be processed, and then the outputs would be sent back for display in your browser. With data governance concerns, that’s exactly what we need to avoid. We can’t make tools that need a connection to a server to run the Python, because then data is leaving a local system to be processed, and this gives rise to problems associated with cloud/server security.
The real magic of Pyodide is that everything your browser needs in order to interpret the Python, both the interpreter and the Python code itself, is loaded along with the page. This means that when you see the page, your browser already has the set of processes it has to carry out if it’s passed some data, and that ‘knowledge’ of what to do is stored in the same way your browser temporarily stores any information about what to do on any given website once it’s loaded; it just happens to have started off in Python. This means that it ‘knows’ what to do when it sees data, without needing to ask a server what to do, and that data doesn’t need to go anywhere.
By using Pyodide to allow browsers to understand Python, you are running all the code locally, using your browser. No data is being sent anywhere at all. All our tools do, when you open the web page, is tell your browser how to understand Python (by loading Pyodide) and tell the browser what to do with data that gets put in it. Once you’ve loaded the tool, our tools don’t make any outside connections at all. You can even test this by loading a tool, unplugging the internet, and then running it. Importantly, lots of people take a browser to be the ‘box where the internet is’, and that’s obviously true, but Pyodide doesn’t use the internet (except to know what code to run in the first place). It just uses the resources your computer would normally use for accessing the internet to understand and do whatever the Python code would be doing.
The code to run the tools is all open source and available to see on GitHub. Apart from meaning that anyone can have a look at it, this is no different to the code to run the site being stored anywhere else. A website always needs to look somewhere to know what to do when you open it!
A summary:
- Our tools run on Pyodide.
- Pyodide allows your browser to translate Python code into something it understands, so it can show you an output.
- When you load a page with Pyodide, it looks at all the Python code, tells the browser what it means and how to understand it, and stores that in the same way any information about what to do on a page is stored temporarily on your computer (e.g. sliders/buttons).
- This means that your browser can use Python code to analyse data without needing to send the data to a server somewhere, because by the time you see the page, it already knows what to do if it sees some data.
- So, the tools use Python, and an interpreter has to be loaded for that to happen, but really it’s no different from a web page running JS; the only difference is that the browser already knows how to interpret JS without loading anything extra.
Running the tool locally:
The applications may run faster if deployed locally, and could then be adapted to point directly at local datasets without user prompting. At present, that's simply not an approach that all LAs are able to follow, but it is theoretically possible, using a little Git knowledge, to clone a D2I repo and get it working locally via the command line.
At that point you’re taking the code and giving it command line access which, whilst not dangerous in the case of our tool, should require prior approval from your organisation's IT administrator. We're happy to talk through this process, but we won't be able to demonstrate it: our own host LA's network security policy currently prohibits this. We know this is a common limitation for LAs, and this is why our tools all run in a web browser: it's a safe and easy way to deploy complex analysis code.