IIS - Annoying error with .NET Core

I deployed a .NET Core application recently into an environment which previously had only run .NET Framework apps, but got a tricky HTTP Error 500.19 - Internal Server Error suggesting there was a problem with my Web.config:

What was also peculiar was that when I tried to enable stdout logging using the IIS UI I got an error "There was an error while performing this operation" with blank "Details" and "Error" but with the "Filename" set to the path to my Web.config. This was true for any of the configuration sections:

However after a lot of head scratching and googling turned up nothing my colleague realised that this environment was missing the Windows Server Hosting bundle, available from the .NET Core downloads section under the "Other Windows Downloads" section. Once this was installed everything worked as expected.

Czech - Telling The Time

I've been learning Czech on and off for the last few years, but now and again I discover that there are some basic things that I never quite learned properly. The most recent of these was my ability to tell the time. When I wanted to learn this properly I found that there are very few places that teach this at the level I wanted - they were either too simplistic, misleading or written in a way that is much too hard to quickly digest. I ended up writing the below for myself and for anyone else who wants to learn.

To simplify each explanation I'm assuming you're familiar with the following:

what I mean by "Nominative", "Genitive" and "Accusative" case
the verb "být"
the noun "hodina"
numbers from 1 - 60

Additionally plurals behave kinda funny in Czech. My friends and I all first learned this when counting beer so I'm gonna use "beer plurals" as shorthand for the following behaviour when you have ...

one of something you give the noun in singular Nominative (jedno pivo, jedna hodina, jedna minuta)
two, three or four of something you give the noun in plural Nominative (dvě piva, dvě hodiny, dvě minuty)
five or more of something you give the noun in plural Genitive (pět piv, pět hodin, pět minut)

Quick Reference

Firstly in case you want to just quickly reference, here's a table that demonstrates most of the cases:

time	Czech	time	Czech
1:00	je jedna hodina	4:00	jsou čtyři hodiny
1:10	je deset minut po jedné	4:10	je deset minut po čtvrté
1:15	je čtvrt na dvě	4:15	je čtvrt na pět
1:30	je půl druhé	4:30	je půl paté
1:45	je tři čtvrtě na dvě	4:45	je tři čtvrtě na pět
1:50	je za deset minut dvě	4:50	je za deset minut pět
2:00	jsou dvě hodiny	5:00	je pět hodin
2:10	je deset minut po druhé	5:10	je deset minut po paté
2:15	je čtvrt na tři	5:15	je čtvrt na sest
2:30	je půl třetí	5:30	je půl sesté
2:45	je tri čtvrtě na tři	5:45	je tři čtvrtě na sest
2:50	je za deset minut tři	5:50	je za deset minut šest hodin
3:00	jsou tři hodiny
3:10	je deset minut po třetí
3:15	je čtvrt na čtyři
3:30	je půl čtvrté
3:45	je tři čtvrtě na čtyři
3:50	je za deset minut čtyři

Whole Hours

If it's exactly H o'clock, we say something like "it is H hours". The word for "hours" is "hodina" and this is the first example of the beer plurals I described above. There are only 12 possibilities here so it's not too hard to just memorise the below:

Half Hours

Times that in English are exactly half past an hour in Czech are said to be a half of the next hour, so instead of "half past one" we say something like "a half of two". There's a minor complication - the hour is given in the genitive singular feminine. They behave like adjectives because ultimately they are adjectives modifying the noun "hodina". So je půl jedné means "it is half of the first (hour)"

Again there are only 12 combinations so if my description doesn't make much sense you can just memorise the below without too much trouble:

Quarter Hours

Similar to half-hours when we talk about quarter past or quarter to an hour in Czech we talk about quarters of an hour. So 12:15 is "a quarter of one" or čtvrt na jednu (the hour part is in the Accusative case) and 12:45 is "three quarters of one" or tři čtvrtě na jednu (with čtvrtě being the plural of čtvrt)

Minutes After The Hour

If we have a time up to half past the hour, we write it really similar to the English - so 1:10 is "je jedna minuta po jedné". It's tricky:

"minuta" behaves like our "beer plurals" - 1 minuta, 2/3/4 minuty, 5/6/7/etc minut
the hour is the genitive singular, and since "hodina" is feminine we have jedné, druhé, třetí, etc

Minutes Before The Hour

Finally for minutes before an hour, we write something like "in M minutes it is H o'clock" - so 1:50 this is "je za deset minut dve". Again we have the "beer plural" for minut - minuta, minuty, minut.

24 Hour Time.

When reading from a watch, computer of phone it seems many Czechs will just say the hour and limit component separately. For example the other day when we were leaving the Maximus spa my girlfriend asked the lady at the counter what time the shuttle bus to the nearest tram stop left, she pulled up a schedule and said “šestnáct dvacet” - 16:20.

The only quirk is 1-9 minutes past the hour where you say the minutes with a leading nula - so 16:05 would be šestnáct nula pět.

Firefox Developer - Gnome and Debian 9 quickstart

Using Firefox on Debian 9 is a little frustrating as the packages available from default APT sources are older “ESR” releases (firefox-esr, 52.0 at the time of writing). Getting the Dev or Nightly builds is pretty straight forward but if you use Gnome you probably want a launcher, and it might not be very obvious how to do this.

First grab the sources or prebuilt binaries:

    $ curl -LO "https://download.mozilla.org/?product=firefox-devedition-latest-ssl&os=linux64&lang=en-US"

Extract them into /opt/firefox-dev:

    $ tar -xjf firefox-57.0b12.tar.bz2 && sudo mv firefox /opt/firefox-dev

Open up a text editor and create /use/share/applications/firefox-dev.desktop as follows

    [Desktop Entry]
    Name=Firefox Dev
    Comment=Browse the World Wide Web
    GenericName=Web Browser
    X-GNOME-FullName=Firefox Dev Web Browser
    Exec=/opt/firefox-dev/firefox %u
    Terminal=false
    X-MultipleArgs=false
    Type=Application
    Icon=firefox-dev
    Categories=Network;WebBrowser;
    MimeType=text/html;text/xml;application/xhtml+xml;application/xml;application/vnd.mozilla.xul+xml;application/rss+xml;application/rdf+xml;image/gif;image/jpeg;image/png;x-scheme-handler/http;x-scheme-handler/https;
    StartupNotify=true

Copy the icon and run gtk-update-icon-cache so that the icon appears as expected.

    $ sudo cp /opt/firefox-dev/browser/icons/mozicon128.png /usr/share/icons/hicolor/128x128/apps/firefox-dev.png
    $ sudo gtk-update-icon-cache -f /usr/share/icons/hicolor

And that's it! You should have a nice desktop icon for Firefox Developer Edition you can use. I also did the same for Firefox Nightly:

For updates you can clean out the directory and repeat the process with the latest tar.bz2 file ... or you can change the permissions in the firefox-dev directory so you have write access, and auto-updates will work.

Arduino - 5x8 ISO 8859-2 font

A while ago I took an existing 3x8 font, converted it by hand for use with AdaFruit's graphics library and subsequently modified it to use ISO 8859-2 characters (in a presumably innocent coincidence this popped up in the Adafruit GFX Library a few months later). Anyway, I recently received a request to perform the same modification to the standard 5x8 fonts and only recently got round to looking at this. As a quick reminder, ISO 8859-2 covers ~128 extra characters for alphabets used by Central and Eastern and Southern European languages - these characters look like this:

After a little bit of experimentation I realised that the extra 2 columns aren't really all that useful for adding the extra ligatures (čarky, hačky, umlauts etc) - we're really constrained by vertical space. Many of the existing letters need to be modified or reworked entirely, since they consume the almost entire 8 rows and we need 2 rows for some ligatures. For example Ä is currently implemented as the following:

    0x7D, 0x12, 0x11, 0x12, 0x7D

Which looks like this:

This is visually a little confusing, but more importantly we cannot really re-use it for ISO 8859-2 since some of the ligatures we need to add to the "A" require at least two rows. Instead of having an "A" which jumps around depending on the ligature, I've created a single A for when a ligature is used and left the ungarnished original letter alone.

Just as another example of why this can be tricky the existing ä looks really weird to me, the umlauts are skewed off to the left and look like they're joined to the letter itself.

I've moved this up into a central position which is the same on all letters involving umlauts. This is purely based on personal taste, but I think it looks better - below is the original style compared to my modified version:

There are similar considerations in some of the other letters that are left as an exercise for the reader - see if you can devise a neat system to fit all the letters below into 3x8 grid in a way that is consistent and legible, it's pretty tricky. I've made an initial stab at this (see GitHub gist below), but after revisiting this I've realised how flakey and error-prone this process of creating fonts is.

F# - some Project Euler patterns

To help me get used to F# and relearn the ways of functional programming I've been working through Project Euler in a Jupyter IfSharp notebook and keeping my solutions on GitHub at https://github.com/smcl/ProjectEulerJupyter

After around 50 problems so far I've spotted a handful of patterns which had either a number of possible solutions or were a pain to type out (or copy/paste) each time I used them. I explored them each in a little more detail to find the most optimal implementation of each pattern. The reason I wrote this up is that even though the problems are pretty simple, some of the results were pretty surprising.

For each of the patterns I've got a simple definition, some solutions and a set of benchmark results in a table. In each results table I've highlighted the most optimal solution that fully expands the result set (so the lazily evaluated solutions that "complete" in a millisecond don't count) so that we can have a realistic idea of what the best solution is.

Combinations of two lists

The first item is the simplest of the four problems - if we have two lists foo and bar, produce a list of pairs featuring all combinations of elements of foo with bar. So given something like ...

let foo = [ 'a'; 'b'; 'c' ]
let bar = [ 1; 2; 3 ]

We expect to see a list like this ...

[ 
    ('a', 1); ('a', 2); ('a', 3)
    ('b', 1); ('b', 2); ('b', 3)
    ('c', 1); ('c', 2); ('c', 3)
]

I came up with only three solutions - I don't feel like I have an ideal solution to this, just the least hacky variant of the first solution that popped up in my mind.

pair_list: The first solution, call List.map for every member of one list, then in the function argument call List.map again for every member of the second - flatten the resulting list using List.concat

let pair_list l1 l2 =
    List.map (fun x -> l2 |> List.map (fun y -> (x,y))) l1
    |> List.concat

pair_seq: as above, but assume we can have sequences as input, so we can produce the (fairly large) output array lazily:

let pair_seq s1 s2 =
    Seq.map (fun x -> s2 |> Seq.map (fun y -> (x,y))) s1
    |> Seq.concat

pair_seq_expanded: as above, expand fully to a List for an idea of how long it takes to operate on the whole output:

let pair_seq_expanded s1 s2 =
    Seq.map (fun x -> s2 |> Seq.map (fun y -> (x,y))) s1
    |> Seq.concat
    |> Seq.toList

pair_seq_for: it's pretty clear that using a sequence is preferable here, especially if you need to work with 1000 x 1000 collections, so the final variant is a slight rewrite of the second, using a for loop and yield-ing the resulting tuple.

let pair_seq_for s1 s2 = 
    [ for x in s1 do 
        for y in s2 do 
            yield (x,y) ]

To compare the performance of these three I've defined 100 and 1000 element lists/sequences and measured how long it takes to iterate through each sequence pair performing a simple operation (accumulating the difference between the pairs).

time to create n-element collection of pairs in milliseconds

method	n=10000	n=100000	n=1000000	n=10000000
pair_list	1.5096	14.7937	226.0501	2927.2477
pair_seq	0.8690	0.8690	0.8846	0.9028
pair_seq_expanded	3.3952	21.5028	184.3353	2264.2805
pair_seq_for	3.2361	12.5183	180.1352	1997.3700

So thankfully the cleanest looking pair_seq_for solution is actually the fastest when we get to larger data sets. This isn't quite where the story ends though. There's a nice discussion here on Stack Overflow about a similar but slightly different problem - finding n element combinations of a single list - so for

let someList = [ 1; 2; 3 ]

... we wanted a function combs (n:'a) (lst:'a list) which would produce something like the below for combs 2 someList

[
    ( 1, 2 ); ( 1, 3 )
    ( 2, 1 ); ( 2, 3 )
    ( 3, 1 ); ( 3, 2 )
]

This is different from the problem I posed, but I've got a GitHub gist here where I've turned them all loose on the same set of data, and performed some simple measurements.

Pairing elements of Collections with their indexes

In a couple of places I found myself wondering if F# collections had an equivalent of python's enumerate - which is a function which wraps a list and returns an index/element pair for each loop iteration:

letters = [ "a", "b", "c", "d" ]
for i, c in enumerate(letters):
    print "%d: %s" % (i, c)

# output:
#     0: a
#     1: b
#     2: c
#     3: d

It took a little while before I spotted Array.mapi so I ended up working through and measuring a handful of different ways first - some are obviously pretty poor (particularly those using recursion) but I left them in nonetheless:

enumerate_by_for_seq - using a Seq to generate the index and return a pair

let enumerate_by_for_seq (a:string []) =
    seq { for i in 0 .. (a.Length-1) -> (i, a.[i]) }

enumerate_by_for_seq_expanded - as previous, but returning a List to fully expand the sequence

let enumerate_by_for_seq_expanded (a:string []) =
    seq { for i in 0 .. (a.Length-1) -> (i, a.[i]) }
    |> Seq.toList

enumerate_by_for_list - iterating over each index using a for loop, returning a (int * string) list

let enumerate_by_for_list (a:string []) =
    [ for i in 0 .. (a.Length-1) -> (i, a.[i]) ]

enumerate_by_for_array - as above but returning (int * string) [], note that this seems ridiculously similar, but differs surprisingly in performance (I discovered this by accident and included it in this experiment because of the difference!)

let enumerate_by_for_array (a:string []) =
    [| for i in 0 .. (a.Length-1) -> (i, a.[i]) |]

enumerate_by_map - generating a list of indexes and then using |> and List.map to create the index/element pair (i.e. the same as the first approach, but using List)

let enumerate_by_map (a:string []) =
    [0..(a.Length-1)]
    |> List.map (fun i -> (i, a.[i]))

enumerate_by_recursion_array - bonkers approach, abusing Array.append and recursing. Just don't do this...

let rec enumerate_by_recursion_array' i (a:string[]) =
    match i with
    | 0 -> [||]
    | _ -> Array.append [| (i, a.[i]) |] (enumerate_by_recursion_array' (i-1) (a.[1..]))

let enumerate_by_recursion_array (a:string[]) =
    enumerate_by_recursion_array' (a.Length-1) a

enumerate_by_recursion_list - List variant of the above. Don't do this either

let rec enumerate_by_recursion_list' i (a:string[]) =
    match i with
    | 0 -> []
    | _ -> [ (i, a.[i]) ] @ (enumerate_by_recursion_list' (i-1) (a.[1..]))
    
let enumerate_by_recursion_list (a:string[]) =
    enumerate_by_recursion_list' (a.Length-1) a

enumerate_by_for_zip - Using Array.zip - shortest solution, the best until I spotted Array.mapi

let enumerate_by_zip (a:string[]) =
    Array.zip a [|0..(a.Length-1)|]

enumerate_by_for_mapi - Probably the most "correct" solution, using Array.mapi

let enumerate_by_mapi (a:string[]) =   
    Array.mapi (fun i x -> (i,x)) a

enumerate_by_for_parallel_mapi - As above but naively switching in Array.Parallel.mapi without any other changes

let enumerate_by_parallel_mapi (a:string[]) =
    Array.Parallel.mapi (fun i x -> (i,x)) a

time taken to enumerate n element collection (milliseconds)

method	n=10000	n=100000	n=1000000	n=10000000
enumerate_by_for_seq	0.3385	0.3496	0.3471	0.3540
enumerate_by_for_seq_expanded	2.6177	18.8341	205.4403	3610.3913
enumerate_by_for_list	1.3487	22.1703	248.5039	4200.8530
enumerate_by_for_array	2.1619	12.8186	192.3148	3178.5893
enumerate_by_map	2.0391	26.2468	287.2852	4179.3407
enumerate_by_recursion_array	7760.3141	n/a*	n/a*	n/a*
enumerate_by_recursion_list	5368.5472	n/a*	n/a*	n/a*
enumerate_by_zip	7.1136	9.4388	170.0941	1917.8617
enumerate_by_mapi	2.6911	13.0303	116.5348	1268.8625
enumerate_by_parallel_mapi	8.1293	17.7548	102.2350	1379.0431

* = this took way too long so I killed it

Obviously Array.mapi was the fastest overall. However it wasn't as much faster than Array.zip as I would have imagined, and I suspect that I'm doing something wrong with Array.Parallel.mapi. Also interesting is that while the super-fast performance of the enumerate_by_for_seq method dissipates somewhat when fully evaluated, it is still faster than the equivalent enumerate_by_for_list version.

Pandigitals

"Pandigital" numbers feature relatively frequently in Project Euler. An n-digit pandigital number will contain all digits from 0..n or 1..(n-1) once in some order. For example 41523 is a 1-5 pandigital, and 43210 is 0-4 pandigital. These numbers are mentioned in 32, 38, 41, 104, 118, 170 (and perhaps more) so a relatively efficient way to recognise them is pretty useful to have at hand.

Again there's a few ways we can do this - in each case I can think of we start with taking the string representation of the input number and splitting it up using ToCharArray() and with this we can do a number of different things.

pandigital_strcmp - sort array, map each element to string, sort, create string + compare to "123..n"

let pandigital_strcmp (n:int) = 
    let sorted = new string (string(n).ToCharArray() |> Array.sort)
    sorted = pandigitalString

pandigital_intcmp - sort array, map each element to string, sort, create string, cast to int + compare to 123..n

let pandigital_intcmp (n:int) = 
    let sorted = new string (string(n).ToCharArray() |> Array.sort)
    int(sorted) = pandigitalInt

pandigital_arrcmp - sort array, string, cast to int + compare to existing array [| '1'; '2'; .. n |]

let pandigital_arrcmp (n:int) = 
    pandigitalArray = (string(n).ToCharArray() |> Array.sort)

pandigital_set_difference - convert to Set and compute difference from precalc'd set, pandigital if empty

let pandigital_set_difference (n:int) = 
    string(n).ToCharArray()
    |> Set.ofArray
    |> Set.difference pandigitalSet
    |> Set.isEmpty

pandigital_array_contains - for each element in precalculated pandigital array, check it's present in array and use List.fold to ensure all true

let pandigital_array_contains (n:int) = 
    let a = string(n).ToCharArray()
    pandigitalArray 
    |> Array.map (fun c -> Array.contains c a) 
    |> Array.fold (fun e acc -> e && acc) true

So I tested these against using the code to measure how quickly each method was in applying

// where panDigitalInt is the upper limit ("n" in the table)
let testNumbers = [ 0 .. pandigitalInt ]
let bench name f =
    let sw = Stopwatch.StartNew()
    let res = testNumbers |> List.filter f |> List.length
    sw.Stop()
    printfn "%s: %f ms" name sw.Elapsed.TotalMilliseconds

time taken to filter pandigitals in [0..n] in milliseconds

method	n=1234	n=12345	n=123456	n=1234567
pandigital_strcmp	2.1081	11.2639	113.2086	1356.1985
pandigital_intcmp	0.9716	9.7646	89.3238	947.0513
pandigital_arrcmp	2.4441	6.1932	59.7014	618.0665
pandigital_set_difference	2.5024	17.2115	199.2863	1986.9592
pandigital_array_contains	0.9790	4.8161	50.447	565.6698

So it seems Array.contains wins overall. The Set.difference approach was pretty dismal which was disappointing - it came to me when I was out walking my dog and I rushed back to write it and benchmark it. I think Set.ofArray is perhaps a little slow, but I haven't done any investigation into it.

It's worth noting that you probably shouldn't do something like [0..bigNumber] |> List.filter pandigital_array_contains to start with - maybe it's worth approaching the problem from a different angle in some cases.

Sorting a 3-element tuple

OK this only came up once and was part of a pretty poor solution I had for problem 39 (original, solution) - I generated thousands of tuple featuring 3 of integers and then tested whether they could represent the sides of right-angled triangles using Pythagoras' theorem. However since they were in no particular order I thought I needed identify the hypotenuse. I wrote this out long-form since there's only a handful of cases and solved the problem relatively quickly.

Regardless of whether this was a suitable solution for the problem, I was curious as to what approach works best for sorting these tiny collections of 3 elements.

I had only three approaches:

sort_nested_if - use nested if statements to reduce the number of comparisons needed while introducing branches

let sort_nested_if (a,b,c) =
    if a >= b then
        if b >= c then (a,b,c) else (a,c,b)
    elif b >= a then
        if a >= c then (b,a,c) else (b,c,a)
    else
        if a >= b then (c,a,b) else (c,b,a)

sort_flat_if - have a separate if for each result at the top level

let sort_flat_if (a,b,c) =
    if a >= b && b >= c then       (a,b,c)
    elif a >= b && b >= c then     (a,c,b)
    elif b >= a && a >= c then     (b,a,c)
    elif b >= a && c >= a then     (b,c,a)
    elif (*c >= b &&*) a >= b then (c,a,b)
    else (*c >= b && b >= a then*) (c,b,a)

sort_array - create an array, use Array.sort and map the results back into a tuple when returning the result

let sort_array (a,b,c) =
    let sorted = Array.sort [| a;b;c |]
    (sorted.[0], sorted.[1], sorted.[2])

To test these I generated large arrays of size 4000, 40000 and 400000 3-element tuples and timed how long each method took to sort them.

time taken to sort n 3-element tuples, in milliseconds

method	n=4000	40000	400000
sort_nested_if	1.2626	13.9014	193.3619
sort_flat_if	1.7864	23.4633	258.2538
sort_array	1.2424	11.9907	132.4312

OK now it's probably obvious why I didn't just bin this little experiment - sort_array is surprisingly the clear winner. I would have guessed that the overhead of building an array and calling Array.sort function on a list way smaller than you'd normally need to sort would be insane. But apparently it's not!

Conclusion

It's surprising how many different ways you can write some relatively simple algorithms. Some of them are obviously pretty awful, like the recursive enumerate function (though I'm sure I can rearrange it to take advantage of tail call elimination) - and some were surprisingly performant, like the sort_array function in the final example. I've noticed that some other Project Euler people maintain a sort of library of functions they can re-use. Eventually I'd like to do something like this, but until it becomes unmanageable I'll just keep a Gist on GitHub:

Fedora - Jupyter on a Linux remote using systemd

When you want to do some experimentation or put together a simple code-based presentation Jupyter notebooks are a powerful tool to have at your disposal. But if you use a number of devices over a few locations it can be useful to have a single instance hosted somewhere central (Linode, Digital Ocean, wherever) that you can access from any device wherever you are. There are a handful of ways that you can achieve this:

log in to your remote machine, set Jupyter up and run jupyter notebook (perhaps in a tmux session) then log out - do this whenever your machine reboots
as above but using an existing docker image
spin up an Azure notebook
... or we could do something like #1 - but have it setup under a separate user and administered via a systemd service

All four of the above are fine for different reasons and use-cases but here I'll talk about how I put #4 together in a little Linode instance running Fedora 25 - it's relatively simple, you can control over the kernels installed, and it's another excuse to get a bit more in-depth with another Linux subsystem (systemd).

Requirements

All you need is a Linux system which uses systemd (Fedora 15.0 or newer, Debian 8.x or newer, Ubuntu 15.04 or newer, for example) which you have sudoer level access on, and Python 3.x. It's probably pretty straight-forward to set this up on systems using the SysV init but I won't cover them here.

Install and Set Up Jupyter

First thing we need to do is install Jupyter and set up the user context which the Jupyter will be run under - which is a user called "jupyter":

$ sudo python3 -m ensurepip
$ sudo pip install jupyter
$ sudo useradd jupyter
$ sudo passwd jupyter

Next we should switch to the new jupyter user, create the directory our notebooks will live in and generate the Jupyter config we'll mess around with:

$ su - jupyter
$ mkdir notebooks
$ jupyter notebook --generate-config

The last command will create a new file ~/.jupyter/jupyter_notebook_config.py which we'll do a little messing around with shortly, but before this we'll set up a password

$ python3
Python 3.5.2 (default, Sep 14 2016, 11:28:32) 
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from notebook.auth import passwd
>>> passwd() # below I enter "password123"
Enter password: 
Verify password: 
'sha1:2eff88aac285:385c87867bd18fe852ee1d56b1010d4beed96969'

This will be used to log in to the application when its running. Open up the ~/.jupyter/jupyter_notebook_config.py file in a text editor and add/modify the following lines (using the SHA1 hash returned by the above):

c.NotebookApp.port = 8888
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.password = 'sha1:2eff88aac285:385c87867bd18fe852ee1d56b1010d4beed96969'

Setting up a Jupyter systemd service

Now we want to create a new systemd service so we can make sure our Jupyter notebook runs on startup, handles logging nicely and has all the other bells-and-whistles afforded to us by systemd. This is surprisingly simple - we want to create a new file jupyter.service in /usr/lib/systemd/system - this will tie together our newly installed Jupyter software and our newly setup jupyter user - using your favourite text editor create it so it looks like the below:

$ sudo cat /usr/lib/systemd/system/jupyter.service
[Unit]
Description=Jupyter

[Service]
Type=simple
PIDFile=/var/run/jupyter.pid
ExecStart=/usr/bin/jupyter notebook --no-browser
WorkingDirectory=/home/jupyter/notebooks
User=jupyter
Group=jupyter

[Install]
WantedBy=multi-user.target%

Now all that's left to do is cross our fingers, enable our services, kick them off and browse to our remote box and login with our password:

$ sudo systemctl daemon-reload
$ sudo systemctl enable jupyter
$ sudo systemctl start jupyter

And if you want you can stop here - bookmark your http://www.xxx.yyy.zzz:port address and you're all set!

Conclusion

This was initially just an experiment - an excuse to test out my ability to put together a systemd .service file and do something more with a mostly-idle linux server sitting in a facility somewhere in Amsterdam. However I have found that I really like using this setup. When I was first shown Jupyter (née IPython) I was unimpressed and didn't see the point. However over the last few days I've been working through Project Euler problems again while teaching myself F# (using the IfSharp kernel) and I have found that it lends itself very well to my problem solving workflow on Project Euler.

Fedora - getting Tor and SELinux to play nice

tl;dr

If you have weird SELinux permissions issues using Tor on Fedora, skip to "The Solution" we're basically gonna add a couple of custom SELinux policies and update the permissions on the /var/lib/tor directory.

Intro

I'm trying to get a little bit out of my cosy Debian comfort zone, and since I have a few friends working at Red Hat figured I'd try out Fedora. While I was teaching myself about systemd however I ran into an issue - starting up a Tor hidden service using systemd was fine immediately after it was installed, but after a reboot it'd repeatedly fail - the following is what is displayed when I ran systemctl status tor.service:

    tor.service - Anonymizing overlay network for TCP
       Loaded: loaded (/usr/lib/systemd/system/tor.service; enabled; vendor preset: disabled)
       Active: failed (Result: start-limit-hit) since Mon 2017-02-20 11:16:44 CET; 2min 42s ago
      Process: 1150 ExecStartPre=/usr/bin/tor --runasdaemon 0 --defaults-torrc /usr/share/tor/defaults-torrc -f /etc
    
    Feb 20 11:16:44 localhost.localdomain systemd[1]: tor.service: Service hold-off time over, scheduling restart.
    Feb 20 11:16:44 localhost.localdomain systemd[1]: Stopped Anonymizing overlay network for TCP.
    Feb 20 11:16:44 localhost.localdomain systemd[1]: tor.service: Start request repeated too quickly.
    Feb 20 11:16:44 localhost.localdomain systemd[1]: Failed to start Anonymizing overlay network for TCP.
    Feb 20 11:16:44 localhost.localdomain systemd[1]: tor.service: Unit entered failed state.
    Feb 20 11:16:44 localhost.localdomain systemd[1]: tor.service: Failed with result 'start-limit-hit'.

The Problem

Looking a little closer at the logs in journalctl it seems that the tor process is not able to access the directory structure under /var/lib/tor - the toranon user's home directory.

    Feb 20 11:16:43 localhost.localdomain tor[1150]: Feb 20 11:16:43.033 [warn] Directory /var/lib/tor/ssh/ cannot be read: Permission denied
    Feb 20 11:16:43 localhost.localdomain tor[1150]: Feb 20 11:16:43.033 [warn] Failed to parse/validate config: Failed to configure rendezvous options. See logs for details.
    Feb 20 11:16:43 localhost.localdomain tor[1150]: Feb 20 11:16:43.034 [err] Reading config failed--see warnings above.

This appears to be down to SELinux kicking in and telling us the process is trying to do something it's not explicitly permitted to do according the the SELinux policies currently loaded. A quick google search for this error turns up a handful of results from other Fedora users:

In each of these a couple of workarounds are proposed, like disabling SELinux and setting the hidden service directory to be the /var/lib/tor directory. Disabling SELinux would probably work fine, but I'd rather not do that - I'd rather understand what's going on and eventually fix it properly. I also don't want to use the other workaround - since that would prevent me from running two separate hidden services, and what if I want to run ssh and operate a cryptocurrency-based online drugs supermarket[1]?

After a bit more digging I found a bug report on Red Hat's Bugzilla which described exactly the problem I saw (working, ten failing after reboot). However it also confirmed that as-at 14th February 2017 this was still an open issue (poor Kyle Marek spent his Valentines Day debugging Tor) - https://bugzilla.redhat.com/show_bug.cgi?id=1375369 so there's no "proper" fix yet.

The Solution

Until there's an "official" solution there is a semi-smart workaround proposed by the SELinux Alert Browser - to generate a local policy module to permit the access that SELinux restricted. The following steps assume that you've setup a hidden service in your /etc/tor/torrc and that it's failing to start with some weird permissions error.

Firstly let's sort out the permissions for the toranon user's home directory - some people reported that the root user owned some folders in this directory which isn't really desirable:

So let's do this, and sort out the permissions for the toranon user's home directory too.

    $ sudo find /var/lib/tor ! -user toranon
    $ sudo chown toranon /var/lib/tor/some/folder
    $ sudo find /var/lib/tor ! -group toranon
    $ sudo chown :toranon /var/lib/tor/some/folder

In my case /var/lib/tor itself was owned by root - I moved it to toranon just in case. Next let's add an SELinux policy to give the Tor service the permissions it wants:

    $ sudo ausearch -c 'tor' --raw | audit2allow -M tor-workaround
    ******************** IMPORTANT ***********************
    To make this policy package active, execute:
    
    semodule -i tor-workaround.pp
    $ sudo semodule -i tor-workaround.pp

Now after a reboot we should see that the service has successfully started up without any errors

    $ sudo systemctl reboot
    (later...)
    $ sudo systemctl status tor.service
     tor.service - Anonymizing overlay network for TCP
       Loaded: loaded (/usr/lib/systemd/system/tor.service; enabled; vendor preset: 
       Active: active (running) since Sun 2017-02-19 15:49:42 CET; 18min ago
      Process: 768 ExecStartPre=/usr/bin/tor --runasdaemon 0 --defaults-torrc /usr/s
     Main PID: 825 (tor)
        Tasks: 1 (limit: 4915)
       CGroup: /system.slice/tor.service
               └─825 /usr/bin/tor --runasdaemon 0 --defaults-torrc /usr/share/tor/de
    
    Feb 19 15:49:42 localhost.localdomain Tor[825]: Parsing GEOIP IPv6 file /usr/sha
    Feb 19 15:49:42 localhost.localdomain Tor[825]: Bootstrapped 0%: Starting
    Feb 19 15:49:42 localhost.localdomain Tor[825]: Bootstrapped 80%: Connecting to 
    Feb 19 15:49:42 localhost.localdomain systemd[1]: Started Anonymizing overlay ne
    Feb 19 15:49:42 localhost.localdomain Tor[825]: Signaled readiness to systemd
    Feb 19 15:49:43 localhost.localdomain Tor[825]: Opening Control listener on /run
    Feb 19 15:49:43 localhost.localdomain Tor[825]: Bootstrapped 85%: Finishing hand
    Feb 19 15:49:43 localhost.localdomain Tor[825]: Bootstrapped 90%: Establishing a
    Feb 19 15:49:44 localhost.localdomain Tor[825]: Tor has successfully opened a ci
    Feb 19 15:49:44 localhost.localdomain Tor[825]: Bootstrapped 100%: Done

Conclusion

It was a little bit unfortunate that I bumped into this when I was trying to familiarise myself with systemd, but it was good to have it sorted and I think that the next thing I should explore is SELinux. Perhaps I could understand and contribute the proper fix since this is just a little temporary workaround.

[1] - note: I do not want to run a cryptocurrency-based online drugs supermarket

XMonad - Issues after dist-upgrade

Every other time I run apt-get dist-upgrade my XMonad fails to rebuild with the error "Failed to load interface for âXMonad.Actions.Volumeâ"

The annoying thing is happens infrequently enough that each time I forget what the resolution is, google a bunch of things, then eventually stumble on the answer and move on. Anyway the reason it's a little tricky is that most pages or posts suggest that the correct resolution is to issue the following command

    $ sudo cabal update && sudo cabal install xmonad-extras

However for some reason I have a bunch of these Haskell packages installed in ~/.ghc and running this command using sudo doesn't update these, so instead I have to run it normally:

    $ cabal update && cabal install xmonad-extras

And that's it!

IronPython - Integer performance improvements

When I was puddling around in IronPython ahead of an upcoming project I spotted something interesting - when we want to deal with integers within the IronPython interpreter we frequently call a function in Microsoft.Scripting.Runtime.ScriptingRuntimeHelpers called Int32ToObject:

    public static object Int32ToObject(Int32 value) {
        // caches improves pystone by ~5-10% on MS .Net 1.1, this is a very integer intense app
        // TODO: investigate if this still helps perf. There's evidence that it's harmful on
        // .NET 3.5 and 4.0

        if (value < MAX_CACHE && value >= MIN_CACHE) {
            return cache[value - MIN_CACHE];
        }
        return (object)value;
    }

For integers in the range -100 and 1,000 we maintain a cached array of objects already converted from Int32. This is called a lot even in relatively innocuous looking programs - to see this we can a simple Console.Out.WriteLine("Int32ToObject") in there and count the number of times it occurs in a program that simply prints "Hello, world!" we can see this:

    $ cat hello.py
    print "Hello, world!"
    $ ./bin/Debug/ipy.exe hello.py | grep -c "Int32ToObject"
    1317

The code itself specifically references the pystone benchmark (which I found here) in a comment suggesting that we could see a performance improvement on pystone with versions of .NET newer than 3.5 - which appears to be the minimum version later versions of IronPython supports.

I built the Release configuration of IronPython both before and after removing this cache functionality, and tested the default pystone benchmark on my work computer (a pretty hefty 8-core Xeon E3-1275 @ 3.6 GHz, with 32GB RAM) - the results are below where I ran the test 10 times and took the average. The values output by the benchmark are "pystones per second" - where one "pystone" is an iteration through the main loop inside the Proc0() function which performs a number of integer operations and function calls:

	Before	After
1.	238262	234663
2.	239115	234595
3.	245149	245931
4.	237845	302562
5.	228906	295027
6.	248294	275535
7.	258694	297271
8.	246650	282791
9.	235741	296104
10.	233604	274396
Average	241226	273887

So with the fix we see 32,661 more of these iterations per-second, which is roughly a 13.5% improvement. This makes sense - presumably casting each int to object has been improved so that it's nearly free, leaving the overhead being a simple function call.

2016 and 2017

Reflecting on 2016

It's easy to lose track of achievements and get bogged down in stress in the short term so at the end of the year I like to look back at what I've done and think about what I liked or what went well.

Debian/Linux

First off not a huge one but when I replaced my ailing MacBook Air with a Thinkpad X250 I moved to using Linux and Xmonad as my daily driver (posts one, two and three on this). This relatively minor switch actually kicked off a relatively productive year in terms of professional development - I ended up having to re-learn a lot about Linux that I had forgotten, and bit-by-bit ended up learning and creating more and more. Many times on macOS I'd find some library or utility wouldn't work without a lot of config, and I'd often just give up - once I got past the initial "setup my environment" step Debian ended up smoother than macOS.

Python

I created and released a handful of Python packages on PyPI (em73xx, rbcz, xnm, xsms) - none of them have huge widespread appeal but they're fairly well written and let me work through the process of releasing on PyPI, producing documentation. I also wrote a fair amount about Python, and worked through Philip Guo's excellent CPython Internals course which I thoroughly enjoyed.

Blog

I wrote a total of 28 blog posts in 2016, which is slightly more than one every fortnight. They've only had a few thousand views in total but I've thoroughly enjoyed learning and creating something new, then writing it up. It's good to practice any sort of technical writing (even if it's an informal blog post) and it's pretty rewarding to look back on what you've done. Here's a handful of the ones I enjoyed writing most

Homepage

I also put together a new homepage @ www.mclemon.org. My company use ASP.NET MVC for all new UIs, and since I'm not always involved in the front-end development I felt like I was falling behind. Instead of shying away from this and resolving to look into it "later" I tackled it head on and set about creating a simple home page with MVC and deploy it to Azure. What started off as a simple static "Contact" page grew into a neat little aggregator of my Posthaven blogs, which I can extend pretty easily to pull in other stuff (I've tried out GitHub and Twitter, but my commits push everything out of the way, and I rarely post anything useful on Twitter!). It should degrade reasonably well on mobile, but as you can see from the screenshot there's a couple of whitespace issues I clearly need to tackle!

2017

I don't like resolutions, but I do think it's good to think ahead about some things I'd like to work on - even if I ultimately get sidetracked on something else.

IronPython

I've already written a couple of articles about IronPython, but I'd like to keep diving deeper. I have a really nice series of posts in the works which could end up being useful to the community at large so I'm pretty stoked about getting them completed. I'd ultimately like to start contributing to the project, hopefully by the time I've finished these I'll be knowledgeable enough to be of use to Alex Earl et al.

Czech

My Czech has stagnated - I've got some friends I write in Czech to but I'd like to read and write far more regularly. I've found that if I sit down I can break down most sentences with a bit of patience but my vocabulary and sentence construction is still pretty poor.

Books

There are so many books I'd like to read - but in terms of professional development I'd like to revisit some fundamentals - so working my way through the classic SICP is top of the list, but I'd also like to work through Grokking Algorithms In Python as it looked like a nice read.

Alfie

My little dog is adorable, but he's way too boisterous - I'd like him to:

return on command every time I call him (without resorting to tricking him with biscuits and the frisbee!)
walk to "heel" better
not jump on strangers
chill out in the pub

Of course #4 is the most important one :)

Clock

I mostly cracked the Brno astronomical clock (aka the cock clock) at summer, and started writing a blog post about it - but I needed to create a couple more timelapses and visualisations to complete the picture and never found time to do it. This clock is kinda famous for being unintelligible so it'd be nice to share the knowledge with the world!

General Focus

In general I'd like to focus more on a couple of disciplines or topics and become more of an expert in them. Looking back at my posts I've covered a fairly broad spectrum of things, but I never really went into much detail on any of them. I'm spreading myself a little thinly and it doesn't help that I've got the attention span of a goldfish!