A Little More R in R

A couple of things were bothering me:

  • the key wasn’t really a key, it was a subtitle
  • the title needed a date
  • data wasn’t getting downloaded automatically
  • column naming was a mess
  • structure / formatting of the ggplot was inconsistent

Sunday morning is the obvious time to fix such issues! Below is the new plot, with a key, built from data pulled straight from RKI.


And here is my updated script:


library(readxl)   # read_excel
library(ggplot2)  # ggplot, geom_line, geom_ribbon, ...

# fetch the latest nowcasting spreadsheet from RKI (mode = "wb" keeps the binary file intact on Windows)
download.file("https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Projekte_RKI/Nowcasting_Zahlen.xlsx?__blob=publicationFile", "Nowcasting_Zahlen.xlsx", mode = "wb")
nz_data <- read_excel("Nowcasting_Zahlen.xlsx", sheet = "Nowcast_R")
# replace the German column headers with short English names
names(nz_data) <- c("date","new","new_under","new_over","new2", "new2_under", "new2_over", "R", "R_under", "R_over", "R7", "R7_under", "R7_over")

# one line per series, each with a ribbon for its lower / upper estimate
g <- ggplot(data = nz_data)
g <- g + geom_line(mapping = aes(x = date, y = R, color = "4 day"))
g <- g + geom_ribbon(mapping = aes(x = date, ymin = R_under, ymax = R_over), alpha = 0.3)
g <- g + geom_line(mapping = aes(x = date, y = R7, color = "7 day"))
g <- g + geom_ribbon(mapping = aes(x = date, ymin = R7_under, ymax = R7_over), alpha = 0.5)
g <- g + ggtitle(label = sprintf("R-effective for Germany (%s)", format(Sys.Date(), format = "%b %d %Y")))
g <- g + ylab("R") + xlab("Date")
g <- g + scale_color_manual(values = c('4 day' = 'firebrick', '7 day' = 'darkblue'))
g <- g + labs(color = 'Average')
print(g)  # draw the plot

Now might be a good time to go outside and not think about this!

A Little R in R

As usually happens when learning something new, it gets pointed out that while i might have achieved my goal, there is a far cooler way of doing the same thing! In this case that would be R.


Now it seems a shame to have got this far into a career without ever having played around in R. The above is pretty close to the simplest plot possible, but hey, it was fun!


R Numbers

A few days ago there was a post on /r/de showing a nice graph of the now famous R number for Germany:


Exciting for a couple of reasons: the data-set was available (i’d been looking for it for a while), and the reported number was over 1 … which isn’t good.

The post linked to a .ipynb script that had been used to produce the image. The code was obviously Python and somehow related to Jupyter Notebook – time to learn something new!

Since the original script was published, the Excel spreadsheet download has gained a cover sheet, which obviously broke things. Having patched things up a little (and translated the labels to English), here is the updated plot:


One of the labels has gone missing… oops.

Update: the missing label was important! The above graph is for a new RKI dataset that tracks R on a 7 day average; the original series was too sensitive (see below). For “reasons” that series doesn’t include error estimates for the last data points, which broke the calculation done on the final point. Below is a new plot of the 4 day averaged series – now really an update of the original image:


As you can see, the R value for the original chart has been revised down, and the current value remains below 1. The brief bump up was attributed to the infection of slaughterhouse slave labour, housed in cramped shared dormitories. Come on Germany – be better than that!

Update: weekends are obviously good for tracking down datasets and visualizations! There is a GitLab project running model simulations on the regional data; below is the plot for Hamburg:


Having no background in epidemiology (or statistical analysis…) all i can do is accept it as presented, and note that it correlates well with the recent decline in reported new infections. The RKI made an interesting observation on the sensitivity of R the other day, noting that as the number of active cases (infections) fell, any new hotspots (such as the slaughterhouse outbreaks) would have a larger impact on the reported number.

Concatenation with ffmpeg

This is something that comes up regularly. Never remember how to do it…

Create a file that contains each file to be concatenated:

% for i in *.mp4; do
echo "file '$i'" >> files.txt
done
jje@wretched cat % cat files.txt
file '1.mp4'
file '2.mp4'
file '3.mp4'

and now let ffmpeg do its magic:

% ffmpeg -f concat -safe 0 -i files.txt -c copy out.mp4
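
For next time, the list-building step could be wrapped in a small shell function. This is only a rough sketch: make_concat_list is a made-up name (nothing to do with ffmpeg itself), and it assumes the pieces are .mp4 files in the current directory, that their names contain no single quotes, and that the shell's glob order matches the playback order (so 10.mp4 would sort before 2.mp4 unless the names are zero-padded).

```shell
#!/bin/sh
# make_concat_list (hypothetical helper): write one "file '<name>'" line per
# .mp4 in the current directory, in the order the shell expands the glob.
# Assumption: file names contain no single quotes.
make_concat_list() {
    list=${1:-files.txt}
    : > "$list"                      # start from an empty list so re-runs don't append duplicates
    for f in *.mp4; do
        [ -e "$f" ] || continue      # no matches: the pattern stays literal, so skip it
        printf "file '%s'\n" "$f" >> "$list"
    done
}

# Usage (same ffmpeg invocation as above):
#   make_concat_list files.txt && ffmpeg -f concat -safe 0 -i files.txt -c copy out.mp4
```

Truncating the list first is the main difference from the one-off loop: running it twice in the same directory leaves the same three lines rather than six.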

This works well for sites that make youtube-dl download files in pieces… and having written this, i’ll never need to look it up again.

The Idiot

I’m reading Will, by Will Self. There is a teenage moment that is just perfect:

Simon’s a mole of a young man, with a thick pile brown hair and blinking myopic eyes. He’s an epileptic, and wears a chunky silver bracelet engraved with the rather alarming instructions that need to be followed if he succumbs to grand mal seizure. Will – by way of letting everyone know he’s reading the Russians – calls Simon Rogozhin, and asks often if, in the run-up to his attacks, he experiences the same wild epiphanies as Dostoevsky’s antihero.

The reference is to The Idiot. But the antihero Rogozhin is not the epileptic – that would be Prince Myshkin, the eponymous idiot, who is definitely not the antihero.

I’m ashamed (not enough to stop me posting this!) to report being quite thrilled to have identified this error (having just finished reading The Idiot)… but maybe it’s not what it seems?

My own recollection of reading classics as a teenager has left me with many unreliable memories of books that were beyond my life experience. Perhaps that’s what’s happening here… and i’m (still) the idiot?

Burpee Death

Death by Burpee! These bastards are killing me.

A few years ago i got into the habit of doing Burpees every day… and then, like most good intentions, it fell away. Have now started again. It is going to take a while to build back up to any reasonable level of fitness. Twenty repetitions makes me wonder if i’m having a heart attack. Still, a few days in, my recovery time is already improving. Hopefully it will only take a few weeks to get back to some level of fitness.