Posting From Jupyter


Photo by RetroSupply on Unsplash

This post describes some of the tools I used for my recent posts introducing data analysis.

This is a very opinionated selection of tools, based on my personal journey in the Open Source Software world.

Jupyter Lab

Jupyter Lab is a multi-language environment that started as a Python shell on steroids and evolved into a really cool tool.

You can find more information about how to install and use it, as well as how to transform a notebook into a stand-alone script.

Its interactivity makes it a go-to tool for data scientists and beginners, but I also find it convenient for teaching programming languages.

Besides Python, dozens of programming languages are supported.

Markdown cells, including LaTeX equations, are available to comment the code, and notebooks can be exported to many document formats.

Its ability to launch remote shells and its dozens of integrations (e.g. Git) make it the tool of choice for cloud prototyping in many major environments, including Google and IBM.

Emacs and Org-Mode

Let’s address the elephant in the room.

I’ve been using Emacs since 1995: for almost 30 years this piece of libre software has been helping me in my journey as a developer and content creator, and it has also become my personal project tracking tool.

Org Mode is one of the default Emacs extensions; it enables rich structured note taking, project management, etc., all in plain text. One of its submodules, Org Babel, has a lot of features in common with Jupyter Lab, but it is less popular nowadays because it has very little support outside of Emacs.

Org Mode has a lot of extensions: one of my favorites is org2blog, which lets me transform an org-mode file into a WordPress post… more on that below.

That said, I can’t do without it. It is not a tool I recommend to beginners, but it enables my workflow in many ways that Visual Studio Code still cannot. Moreover, it is lighter than VS Code, can run in a terminal, and respects my privacy.

Just for those who are interested, I’m a Doom Emacs user (with vi keybindings, of course).

Pandoc

This tool is really great: it converts between a lot of markup languages! In this case it helps me bridge the gap between ipynb (the notebook format) and org-mode; it works perfectly out of the box.

A simple command like

pandoc Part1.ipynb -f ipynb -t org -o Part1.org --extract-media=images

already does what I need:

  • converts the format
  • extracts all the images in a directory
  • creates the org-babel environments properly

I just need to

  • clean up the content type of the org-babel environments (more details below)
  • add the org2blog heading

And I’m good to go.

Edit Iterations

What if I change something after publishing it with org2blog?

I need to

  • copy the generated heading in a different file
  • copy the footer (generated when uploading images) in a different file
  • extend the conversion command to look like this one:

pandoc Part_5.ipynb -f ipynb -t org --extract-media=images/ -o Part_5.org -H Part_5_head.org -A Part_5_foot.org
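
Both conversion commands shown so far can be wrapped in a small Python helper, so the same call works for the first export and for later edit iterations. This is only a sketch: `pandoc_command` and `convert` are hypothetical names, the output file is assumed to sit next to the notebook with an .org suffix, and only the pandoc options already used in this post appear.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def pandoc_command(
    notebook: str,
    media_dir: str = "images",
    header: Optional[str] = None,
    footer: Optional[str] = None,
) -> List[str]:
    """Build the pandoc argument list for an ipynb -> org conversion."""
    nb = Path(notebook)
    cmd = [
        "pandoc", str(nb),
        "-f", "ipynb",
        "-t", "org",
        f"--extract-media={media_dir}",
        "-o", str(nb.with_suffix(".org")),  # assumption: Part1.ipynb -> Part1.org
    ]
    if header:
        cmd += ["-H", header]  # saved org2blog heading
    if footer:
        cmd += ["-A", footer]  # saved footer generated when uploading images
    return cmd


def convert(notebook: str, **options: Optional[str]) -> None:
    """Run the conversion; requires pandoc to be available on PATH."""
    subprocess.run(pandoc_command(notebook, **options), check=True)
```

For an edit iteration the call would look like `convert("Part_5.ipynb", media_dir="images/", header="Part_5_head.org", footer="Part_5_foot.org")`.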

Exporting LaTeX from a notebook

The only viable way to include formulae is to embed them in a Markdown cell; the `%%latex` magic command does not work when translating the files to the org format.

Luckily, Jupyter Lab Markdown supports both inline formulae, using pairs of $ signs as delimiters, and \begin{equation} \end{equation} environments.
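
As an illustration, a Markdown cell using both forms could look like the following. The formula itself is just a placeholder chosen for this example, not one taken from the notebooks.

```latex
The rest energy is $E = mc^2$ when written inline, and as a display equation:

\begin{equation}
E = mc^2
\end{equation}
```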

WordPress and Org-Mode

While I never chose PHP as a development language for my projects, WordPress remains one of the most convenient content management solutions out there:

  • there are tons of cheap hosting options
  • there are also tons of plug-ins

Org Mode has a

Fixing language

We need to clean up the result a little:

  • replace all occurrences of #+begin_src python with #+begin_src python :noeval :exports code; this preserves the examples in the Markdown
  • replace all occurrences of jupyter-python with python so that they are properly colorized
  • add a header for org-babel execution: this prevents org2blog from re-executing all snippets unless you really want it to

    #+PROPERTY: header-args:python :noeval
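
The three cleanup steps above can be scripted. The following is a minimal sketch, not part of org2blog or pandoc: `clean_org` is a hypothetical helper, and placing the property line at the very top of the file is an assumption made for this example. The order of the first two substitutions matters, because after the second one the former jupyter-python blocks look like bare python blocks.

```python
import re

NOEVAL_PROPERTY = "#+PROPERTY: header-args:python :noeval"


def clean_org(text: str) -> str:
    """Apply the three cleanup steps to org text freshly produced by pandoc."""
    # Step 1: bare python blocks become export-only blocks.
    text = re.sub(
        r"^([ \t]*#\+begin_src) python[ \t]*$",
        r"\1 python :noeval :exports code",
        text,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    # Step 2: jupyter-python blocks are renamed to python for colorization.
    text = re.sub(
        r"^([ \t]*#\+begin_src) jupyter-python\b",
        r"\1 python",
        text,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    # Step 3: add the org-babel header once (assumed position: top of file).
    if NOEVAL_PROPERTY not in text:
        text = NOEVAL_PROPERTY + "\n" + text
    return text
```

Run it once on the fresh pandoc output, for example `Path("Part1.org").write_text(clean_org(Path("Part1.org").read_text()))` with `pathlib.Path`; a second pass would also mark the renamed blocks as export-only.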
    

However, you may want to execute your code in Emacs anyway… in this case:

  • create a virtual environment which contains all of the packages your code needs
  • set the python-shell-virtualenv-path Emacs variable to your virtual environment path
  • add a header for org-babel execution: you may need to tweak your output to get the expected result

    #+PROPERTY: header-args:python :session *Python* :exports both :results table
    
  • some output may not fit in a table; for those blocks use #+begin_src python :results verbatim drawer
  • execute org-babel-execute-buffer to get the actual result of the export before org2blog does
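
To confirm that the org-babel session really runs inside the virtual environment, a snippet like this can be evaluated in a python block. It is a sketch: `in_virtualenv` is a hypothetical helper name, and the check assumes an environment created with venv or a recent virtualenv.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs from a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtualenv active:", in_virtualenv())
```

If the printed interpreter path is not inside your environment, the Emacs variable is probably not set for that session.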

Unsplash, Gimp and optimizing images for publications

I like to have a splash image to share on social media: Unsplash is my go-to site for free quality images.

With GIMP I scale the image to a width of 1200px and reduce the JPEG quality to 20-30%; this gives me a good image which is not too big in size.
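
The scaling step is simple arithmetic: the height follows from the original aspect ratio. A minimal sketch of that calculation, where `scaled_size` is a hypothetical helper and not something GIMP provides:

```python
from typing import Tuple


def scaled_size(width: int, height: int, target_width: int = 1200) -> Tuple[int, int]:
    """Return (width, height) after scaling to target_width, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Same ratio as the original, rounded to whole pixels (at least 1).
    return target_width, max(1, round(height * target_width / width))
```

For example, a 6000x4000 photo scales to 1200x800.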

There are many places on the Internet where you can find advice on optimizing images for WordPress: consider looking for what fits your case.

Conclusions

While this setup is not a good choice for everyone, I hope you can find some useful tips for your case.

marco.p.v.vezzoli

Self taught assembler programming at 11 on my C64 (1983). Never stopped since then -- always looking up for curious things in the software development, data science and AI. Linux and FOSS user since 1994. MSc in physics in 1996. Working in large semiconductor companies since 1997 (STM, Micron) developing analytics and full stack web infrastructures, microservices, ML solutions
