Initial conversion to nikola

This commit is contained in:
Jez Cope 2017-10-26 16:54:51 +01:00
parent d5efca80a0
commit 96a5c83ba4
197 changed files with 4170 additions and 1160 deletions

2
.gitignore vendored
View File

@@ -2,3 +2,5 @@
/tmp/
/output/
output.diff
/cache/
.doit.db

2
.gitmodules vendored
View File

@@ -1,3 +1,3 @@
[submodule "themes/sidmouth"]
path = themes/sidmouth
url = https://github.com/jezcope/theme-sidmouth-hugo.git
url = https://github.com/jezcope/theme-sidmouth-nikola.git

10
Pipfile Normal file
View File

@@ -0,0 +1,10 @@
[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
[packages]
ws4py = "*"
watchdog = "*"
typogrify = "*"
Nikola = "*"
Jinja2 = "*"

122
Pipfile.lock generated Normal file
View File

@@ -0,0 +1,122 @@
{
"_meta": {
"hash": {
"sha256": "0ebd7fc92d49105fb7beb509bab6e4ce801aedbba368a82e0334d7200207ae63"
},
"requires": {},
"sources": [
{
"url": "https://pypi.python.org/simple",
"verify_ssl": true
}
]
},
"default": {
"argh": {
"version": "==0.26.2"
},
"blinker": {
"version": "==1.4"
},
"certifi": {
"version": "==2017.4.17"
},
"chardet": {
"version": "==3.0.4"
},
"cloudpickle": {
"version": "==0.3.1"
},
"docutils": {
"version": "==0.13.1"
},
"doit": {
"version": "==0.30.3"
},
"idna": {
"version": "==2.5"
},
"jinja2": {
"version": "==2.9.6"
},
"logbook": {
"version": "==1.0.0"
},
"lxml": {
"version": "==3.8.0"
},
"mako": {
"version": "==1.0.6"
},
"markdown": {
"version": "==2.6.8"
},
"markupsafe": {
"version": "==1.0"
},
"natsort": {
"version": "==5.0.3"
},
"nikola": {
"version": "==7.8.9"
},
"olefile": {
"version": "==0.44"
},
"pathtools": {
"version": "==0.1.2"
},
"piexif": {
"version": "==1.0.12"
},
"pillow": {
"version": "==4.2.1"
},
"pygments": {
"version": "==2.2.0"
},
"pyinotify": {
"version": "==0.9.6"
},
"pyrss2gen": {
"version": "==1.1"
},
"python-dateutil": {
"version": "==2.6.1"
},
"pyyaml": {
"version": "==3.12"
},
"requests": {
"version": "==2.18.1"
},
"setuptools": {
"version": "==36.0.1"
},
"six": {
"version": "==1.10.0"
},
"smartypants": {
"version": "==2.0.0"
},
"typogrify": {
"version": "==2.0.7"
},
"unidecode": {
"version": "==0.04.21"
},
"urllib3": {
"version": "==1.21.1"
},
"watchdog": {
"version": "==0.8.3"
},
"ws4py": {
"version": "==0.4.2"
},
"yapsy": {
"version": "==1.11.223"
}
},
"develop": {}
}

View File

@@ -1,7 +0,0 @@
---
type: "page"
draft: true
author: "Jez"
description: "description"
tags: ["one", "two"]
---

View File

@@ -1,6 +0,0 @@
---
type: "page"
draft: true
author: "Jez"
tags: ["one", "two"]
---

View File

@@ -1,8 +0,0 @@
---
type: "post"
draft: true
author: "Jez"
description: "description"
topics: ["Research communication", "Higher Education", "Technology", "Stuff"]
tags: ["one", "two"]
---

1329
conf.py Normal file

File diff suppressed because it is too large

View File

@@ -1,48 +0,0 @@
baseurl: https://erambler.co.uk/
title: eRambler
copyright: "(c) 2016 Jez Cope — [CC-BY-SA](/license/)"
languageCode: en-GB
metaDataFormat: yaml
rssuri: feed.xml
ignoreFiles: ["^\\.#"]
theme: sidmouth
disqusShortname: erambler
permalinks:
post: '/blog/:slug/'
page: '/:slug/'
taxonomies:
tag: tags
topic: topics
author:
name: Jez Cope
params:
home: home
brand: a blog about
topline: |
[research communication](/topics/research-communication/)
<span class="amp">&</span> [higher education](/topics/higher-education)
<span class="amp">&</span> [open culture](/topics/open-culture)
<span class="amp">&</span> [technology](/topics/technology/)
<span class="amp">&</span> [librarianship](/topics/librarianship/)
<span class="amp">&</span> [stuff](/topics/stuff/)
footline: 'brought to you by [hugo](http://gohugo.io/)'
googleAnalytics: UA-10201101-1
github: jezcope
twitter: jezcope
linkedin: jezcope
wercker: eabedc12f9bc4c2e496bb85791b4207b
sidebar: left
highlight: solarized_light
...

View File

@@ -1,28 +0,0 @@
---
title: "#IDCC16 Day 0: business models for research data management"
date: 2016-02-22T18:20:55+01:00
slug: idcc16-day-0
draft: false
topics:
- Research communication
tags:
- IDCC16
- Research data management
- Conference
- Service planning
---
I'm at the [International Digital Curation Conference 2016][IDCC16] (#IDCC16) in Amsterdam this week. It's always a good opportunity to pick up some new ideas and catch up with colleagues from around the world, and I always come back full of new possibilities. I'll try and do some more reflective posts after the conference but I thought I'd do some quick reactions while everything is still fresh.
Monday and Thursday are pre- and post-conference workshop days, and today I attended [*Developing Research Data Management Services*][workshop]. Joy Davidson and Jonathan Rans from the [Digital Curation Centre (DCC)][] introduced us to the [Business Model Canvas][BMC], a template for designing a business model on a single sheet of paper. The model prompts you to think about all of the key facets of a sustainable, profitable business, and can easily be adapted to the task of building a service model within a larger institution. The DCC used it as part of the [Collaboration to Clarify Curation Costs (4C) project][4C], whose output, the [Curation Costs Exchange][CCEx], is also worth a look.
It was a really useful exercise to be able to work through the whole process for an aspect of research data management (my table focused on training & guidance provision), both because of the ideas that came up and also the experience of putting the framework into practice. It seems like a really valuable tool and I look forward to seeing how it might help us with our RDM service development.
Tomorrow the conference proper begins, with a range of keynotes, panel sessions and birds-of-a-feather meetings so hopefully more then!
[IDCC16]: http://www.dcc.ac.uk/events/idcc16
[workshop]: http://www.dcc.ac.uk/events/idcc16/workshops#Workshop%201
[Digital Curation Centre (DCC)]: http://www.dcc.ac.uk/
[BMC]: http://www.businessmodelgeneration.com/canvas/bmc
[4C]: http://www.curationexchange.org/about#4cproject
[CCEx]: http://www.curationexchange.org/

View File

@@ -1,40 +0,0 @@
---
comments: true
date: 2016-02-23T19:43:57+01:00
draft: false
image: ""
menu: ""
share: true
topics:
- Research communication
tags:
- IDCC16
- Research data management
- Conference
- Open data
slug: idcc16-day-1
title: "#IDCC16 Day 1: Open Data"
---
The main conference opened today with an inspiring keynote by Barend Mons, Professor in Biosemantics, Leiden University Medical Center. The talk had plenty of great stuff, but two points stood out for me.
First, Prof Mons described a newly discovered link between Huntington's Disease and a previously unconsidered gene. No-one had previously recognised this link, but on mining the literature, an indirect link was identified in more than 10% of the roughly 1 million scientific claims analysed. This is knowledge for which we already had more than enough evidence, but **which could never have been discovered without such a wide-ranging computational study**.
Second, he described a number of behaviours which **should be considered "malpractice" in science**:
- Relying on supplementary data in articles for data sharing: the majority of this is trash (paywalled, embedded in bitmap images, missing)
- Using the Journal Impact Factor to evaluate science and ignoring altmetrics
- Not writing data stewardship plans for projects (he prefers this term to "data management plan")
- Obstructing tenure for data experts by assuming that all highly-skilled scientists must have a long publication record
A second plenary talk from Andrew Sallons of the [Centre for Open Science](http://cos.io) introduced a number of interesting-looking bits and bobs, including the [Transparency & Openness Promotion (TOP) Guidelines][TOP] which set out a pathway to help funders, publishers and institutions move towards more open science.
[TOP]: https://osf.io/9f6gx/wiki/Guidelines/
The rest of the day was taken up with a panel on open data, a poster session, some demos and a birds-of-a-feather session on sharing sensitive/confidential data. There was a great range of posters, but a few that stood out to me were:
- Lessons learned about ISO 16363 ("Audit and certification of trustworthy digital repositories") certification from the British Library
- Two separate posters (from the Universities of Toronto and Colorado) about disciplinary RDM information & training for liaison librarians
- A template for sharing psychology data developed by a psychologist-turned-information researcher from Carnegie Mellon University
More to follow, but for now it's time for the conference dinner!

View File

@@ -1,56 +0,0 @@
---
title: '#IDCC16 day 2: new ideas'
# description: 'Lots of new ideas from #IDCC16 day 2!'
slug: idcc16-day-2
date: 2016-03-16T07:44:14+01:00
type: post
topics:
- Research communication
tags:
- IDCC16
- Conference
- Open data
- Research data management
---
*Well, I did a great job of blogging the conference for a couple of days, but then I was hit by the bug that's been going round and didn't have a lot of energy for anything other than paying attention and making notes during the day! I've now got round to reviewing my notes so here are a few reflections on day 2.*
Day 2 was the day of many parallel talks! So many great and inspiring ideas to take in! Here are a few of my take-home points.
## Big science and the long tail ##
The first parallel session had examples of practical data management in the real world. Jian Qin & Brian Dobreski (School of Information Studies, Syracuse University) worked on reproducibility with one of the research groups involved with the recent gravitational wave discovery. "Reproducibility" for this work (as with much of physics) mostly equates to computational reproducibility: tracking the provenance of the code and its input and output is key. They also found that in practice the scientists' focus was on making the big discovery, and ensuring reproducibility was seen as secondary. This goes some way to explaining why current workflows and tools don't really capture enough metadata.
Milena Golshan & Ashley Sands (Center for Knowledge Infrastructures, UCLA) investigated the use of Software-as-a-Service (SaaS, such as Google Drive, Dropbox or more specialised tools) as a way of meeting the needs of long-tail science research such as ocean science. This research is characterised by small teams, diverse data, dynamic local development of tools, local practices and difficulty disseminating data. This results in a need for researchers to be generalists, as opposed to "big science" research areas, where they can afford to specialise much more deeply. Such generalists tend to develop their own isolated workflows, which can differ greatly even within a single lab. Long-tail research also often struggles from a lack of dedicated IT support. They found that use of SaaS could help to meet these challenges, but with a high cost required to cover the needed guarantees of security and stability.
## Education & training ##
This session focussed on the professional development of library staff. Eleanor Mattern (University of Pittsburgh) described the immersive training introduced to improve librarians' understanding of the data needs of their subject areas as part of delivering their [RDM service model][UPitt model]. The participants each conducted a "disciplinary deep dive", shadowing researchers and then reporting back to the group on their discoveries with a presentation and discussion.
Liz Lyon (also University of Pittsburgh, formerly UKOLN/DCC) gave a systematic breakdown of the skills, knowledge and experience required in different data-related roles, obtained from an analysis of job adverts. She identified the distinct roles of data analyst, data engineer and data journalist and, as well as mapping each role's distinctive skills, pinpointed requirements common to all three: Python, R, SQL and Excel. This work follows on from an earlier phase which identified an allied set of roles: data archivist, data librarian and data steward.
[UPitt model]: http://d-scholarship.pitt.edu/26738/
## Data sharing and reuse ##
This session gave an overview of several specific workflow tools designed for researchers. Marisa Strong (University of California Curation Centre/California Digital Libraries) presented *[Dash](https://dash.cdlib.org/)*, a highly modular tool for manual data curation and deposit by researchers. It's built on their flexible backend, *Stash*, and though it's currently optimised to deposit in their Merritt data repository it could easily be hooked up to other repositories. It captures DataCite metadata and a few other fields, and is integrated with ORCID to uniquely identify people.
In a different vein, Eleni Castro (Institute for Quantitative Social Science, Harvard University) discussed some of the ways that [Harvard's Dataverse](http://dataverse.org/) repository is streamlining deposit by enabling automation. It provides a number of standardised endpoints such as [OAI-PMH](https://www.openarchives.org/pmh/) for metadata harvest and [SWORD](http://swordapp.org/) for deposit, as well as custom APIs for discovery and deposit. Interesting use cases include:
- An addon for the [Open Science Framework](https://osf.io/) to deposit in Dataverse via SWORD
- An [R package](https://cran.r-project.org/web/packages/dvn/README.html) to enable automatic deposit of simulation and analysis results
- Integration with publisher workflows such as Open Journal Systems
- A growing set of visualisations for deposited data
In the future they're also looking to integrate with [DMPtool](https://dmptool.org/) to capture data management plans and with Archivematica for digital preservation.
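As a rough illustration of the kind of automation these standard endpoints enable, here's a minimal sketch of harvesting Dataverse metadata over OAI-PMH with Python and the `requests` library; the endpoint URL is an assumption for illustration, so check the Dataverse documentation for the real path on any given installation.

```python
# Minimal OAI-PMH harvesting sketch; the endpoint URL is hypothetical.
import requests
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "https://demo.dataverse.org/oai"  # assumption, not the documented path

# Ask the repository for all records in simple Dublin Core format
response = requests.get(OAI_ENDPOINT,
                        params={"verb": "ListRecords",
                                "metadataPrefix": "oai_dc"})
response.raise_for_status()

# Print the identifier of every record in this page of results
ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
root = ET.fromstring(response.content)
for header in root.findall(".//oai:header", ns):
    print(header.findtext("oai:identifier", namespaces=ns))
```

A real harvester would also follow the `resumptionToken` that OAI-PMH returns with each page of results.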
Andrew Treloar ([Australian National Data Service](http://ands.org.au/)) gave us some reflections on the ANDS "applications programme", a series of 25 small funded projects intended to address the fourth of their strategic transformations, from *single use* to *reusable*. He observed that essentially these projects worked because they were able to throw money at a problem until they found a solution: not very sustainable. Some of them stuck to a [traditional "waterfall" approach to project management](https://en.m.wikipedia.org/wiki/Waterfall_model), resulting in "the right solution 2 years late". Every researcher's needs are "special" and communities are still constrained by old ways of working. The conclusions from this programme were that:
- "Good enough" is fine most of the time
- Adopt/Adapt/Augment is better than Build
- Existing toolkits let you focus on the 10% functionality that's missing
- Successful projects involved research champions who can: 1) articulate their community's requirements; and 2) promote project outcomes
## Summary ##
All in all, it was a really exciting conference, and I've come home with loads of new ideas and plans to develop our services at Sheffield. I noticed a continuation of some of the trends I spotted at last year's IDCC, especially an increasing focus on "second-order" problems: we're no longer spending most of our energy just convincing researchers to take data management seriously and are able to spend more time helping them to do it *better* and get value out of it. There's also a shift in emphasis (identified by closing speaker Cliff Lynch) from sharing to reuse, and making sure that data is not just available but valuable.

View File

@@ -1,35 +0,0 @@
---
title: "Data is like water, and language is like clothing"
teaser: "Data is like information in more ways than one, and it's like water too"
date: 2016-03-31T17:40:00+01:00
slug: language-is-like-clothing
draft: false
topics:
- Stuff
tags:
- Language
- Grammar
- Data
---
I admit it: I'm a grammar nerd. I know the difference between 'who' and 'whom', and I'm proud.
I used to be pretty militant, but these days I'm more relaxed. I still take joy in the mechanics of the language, but I also believe that English is defined by its usage, not by a set of arbitrary rules. I'm just as happy to abuse it as to use it, although I still think it's important to know what rules you're breaking and why.
My approach now boils down to this: **language is like clothing**. You (probably) wouldn't show up to a job interview in your pyjamas[^2], but neither are you going to wear a tuxedo or ballgown to the pub.
Getting commas and semicolons in the right place is like getting your shirt buttons done up right. Getting it wrong doesn't mean you're an idiot. Everyone will know what you meant. It will affect how you're perceived, though, and that will affect how your *message* is perceived.
And there are former rules[^1] that some still enforce that are nonetheless dropping out of regular usage. There was a time when everyone in an office job wore formal clothing. Then it became acceptable just to have a blouse, or a shirt and tie. Then the tie became optional and now there are many professions where perfectly well-respected and competent people are expected to show up wearing nothing smarter than jeans and a t-shirt.
[^1]: Like not starting a sentence with a conjunction...
One such rule IMHO is that 'data' is a plural and should take pronouns like 'they' and 'these'. The origin of the word 'data' is in the Latin plural of 'datum', and that idea has clung on for a considerable period. But we don't speak Latin and the English language continues to evolve: 'agenda' also began life as a Latin plural, but we don't use the word 'agendum' any more. It's common everyday usage to refer to data with singular pronouns like 'it' and 'this', and it's very rare to see someone referring to a single datum (as opposed to 'data point' or something).
If you want to get technical, I tend to think of data as a mass noun, like 'water' or 'information'. It's uncountable: talking about 'a water' or 'an information' doesn't make much sense, but it uses singular pronouns, as in 'this information'. If you're interested, the Oxford English Dictionary also takes this position, while Chambers leaves the choice of singular or plural noun up to you.
There is absolutely nothing wrong, in my book, with referring to data in the plural as many people still do. But it's no longer a rule and for me it's weakened further from guideline to preference.
It's like wearing a bow-tie to work. There's nothing wrong with it and some people really make it work, but it's increasingly outdated and even a little eccentric.
[^2]: or maybe you'd totally rock it.

View File

@@ -1,35 +0,0 @@
---
title: 'Wiring my web'
slug: wiring-my-web
date: 2016-04-01T17:37:00+01:00
type: post
topics:
- Technology
tags:
- APIs
- Web
- Automation
- IFTTT
---
<!-- [![XKCD: automation](http://imgs.xkcd.com/comics/automation.png){:.main-illustration}](https://xkcd.com/1319/) -->
{{< figure alt="XKCD: automation" src="http://imgs.xkcd.com/comics/automation.png" class="main-illustration fr" link="https://xkcd.com/1319/" >}}
I'm a nut for automating repetitive tasks, so I was dead pleased a few years ago when I discovered that [IFTTT](https://ifttt.com) let me plug different bits of the web together. I now use it for tasks such as:
- Syndicating blog posts to social media
- Creating scheduled/repeating todo items from a Google Calendar
- Making a note to revisit an article I've starred in Feedly
I'd probably only be half-joking if I said that I spend more time automating things than I save not having to do said things manually. Thankfully it's also a great opportunity to learn, and recently I've been thinking about reimplementing some of my IFTTT workflows myself to get to grips with how it all works.
There are some interesting open source projects designed to offer a lot of this functionality, such as [Huginn](https://github.com/cantino/huginn), but I decided to go for a simpler option for two reasons:
1. I want to spend my time learning about the APIs of the services I use and how to wire them together, rather than learning how to use another big framework; and
2. I only have a small Amazon EC2 server to play with and a heavy Ruby on Rails app like Huginn (plus web server) needs more memory than I have.
Instead I've gone old-school with a little collection of individual scripts to do particular jobs. I'm using the built-in scheduling functionality of systemd, which is already part of a modern Linux operating system, to get them to run periodically. It also means I can vary the language I use to write each one depending on the needs of the job at hand and what I want to learn/feel like at the time. Currently it's all done in Python, but I want to have a go at Lisp sometime, and there are some interesting new languages like Go and Julia that I'd like to get my teeth into as well.
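To give a flavour of the approach, here's a toy sketch of a single-purpose syndication script (not the actual code in that repository; the feed URL and webhook address are placeholders), which a systemd timer could then run every few minutes:

```python
# Toy syndication script: post new blog entries to a webhook.
# The feed URL and webhook address are placeholders, not real services.
import json
import pathlib

import feedparser
import requests

FEED_URL = "https://example.org/feed.xml"
WEBHOOK_URL = "https://example.org/webhook"
SEEN_FILE = pathlib.Path.home() / ".web-plumbing-seen.json"

# Remember which entries have already been posted between runs
seen = set(json.loads(SEEN_FILE.read_text())) if SEEN_FILE.exists() else set()

feed = feedparser.parse(FEED_URL)
for entry in feed.entries:
    entry_id = entry.get("id", entry.link)
    if entry_id not in seen:
        requests.post(WEBHOOK_URL, json={"title": entry.title, "link": entry.link})
        seen.add(entry_id)

SEEN_FILE.write_text(json.dumps(sorted(seen)))
```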
You can see my code on github as it develops: <https://github.com/jezcope/web-plumbing>. Comments and contributions are welcome (if not expected) and let me know if you find any of the code useful.
*Image credit: [xkcd #1319, Automation](https://xkcd.com/1319/)*

View File

@@ -1,39 +0,0 @@
---
title: 'Fairphone 2: initial thoughts on the original ethical smartphone'
slug: fairphone-first-thoughts
date: 2016-05-07T16:56:29+01:00
type: post
topics:
- Technology
tags:
- Gadgets
- Fairphone
- Smartphone
- Ethics
---
{{< figure alt="Naked Fairphone" src="/assets/images/posts/2016-05-07-fairphone.jpg" class="main-illustration fr" >}}
I've had my eye on the [Fairphone 2](https://www.fairphone.com/) for a while now, and when my current phone, an aging Samsung Galaxy S4, started playing up I decided it was time to take the plunge. A few people have asked for my thoughts on the Fairphone so here are a few notes.
## Why I bought it
The thing that sparked my interest, and the main reason for buying the phone really, was the ethical stance of the manufacturer. The small Dutch company have gone to great lengths to ensure that both labour and materials are sourced as responsibly as possible. They regularly inspect the factories where the parts are made and assembled to ensure fair treatment of the workers and they source all the raw materials carefully to minimise the environmental impact and the use of conflict minerals.
Another side to this ethical stance is a focus on longevity of the phone itself. This is not a product with an intentionally limited lifespan. Instead, it's designed to be modular and as repairable as possible, by the owner themselves. Spares are available for all of the parts that commonly fail in phones (including screen and camera), and at the time of writing the [Fairphone 2 is the only phone to receive 10/10 for reparability from iFixit](https://www.ifixit.com/Teardown/Fairphone+2+Teardown/52523). There are plans to allow hardware upgrades, including an expansion port on the back so that NFC or wireless charging could be added with a new case, for example.
## What I like
So far, the killer feature for me is the dual SIM card slots. I have both a personal and a work phone, and the latter was always getting left at home or in the office or running out of charge. Now I have both SIMs in the one phone: I can receive calls on either number, turn them on and off independently and choose which account to use when sending a text or making a call.
The OS is very close to "standard" Android, which is nice, and I really don't miss all the extra bloatware that came with the Galaxy S4. It also has twice the storage of that phone, which is hardly unique but is still nice to have.
Overall, it seems like a solid, reliable phone, though it's not going to outperform anything else at the same price point. It certainly feels nice and snappy for everything I want to use it for. I'm no mobile gamer, but there is that distant promise of upgradability on the horizon if you are.
## What I don't like
I only have two bugbears so far. Once or twice it's locked up and become unresponsive, requiring a "manual reset" (removing and replacing the battery) to get going again. It also lacks NFC, which isn't really a deal breaker, but I was just starting to make occasional use of it on the S4 (mostly experimenting with my [Yubikey NEO](https://www.yubico.com/products/yubikey-hardware/yubikey-neo/)) and it would have been nice to try out Android Pay when it finally arrives in the UK.
## Overall
It's definitely a serious contender if you're looking for a new smartphone and aren't bothered about serious mobile gaming. You do pay a premium for the ethical sourcing and modularity, but I feel that's worth it for me. I'm looking forward to seeing how it works out as a phone.

View File

@@ -1,55 +0,0 @@
---
title: "Changing static site generators: Nanoc → Hugo"
author: Jez Cope
date: 2016-08-12T13:18:28+01:00
slug: changing-static-site-generators-nanoc-hugo
description: |
In which I continue to avoid actually writing
by moving to a new static site generator
draft: false
topics:
- Stuff
tags:
- Meta
- Web
type: post
---
I've decided to move the site over to a different static site generator,
[Hugo](http://gohugo.io/).
I've been using [Nanoc](http://nanoc.ws) for a long time and it's worked very well,
but lately it's been taking longer and longer to compile the site
and throwing weird errors that I can't get to the bottom of.
At the time I started using Nanoc, static site generators were in their infancy.
There weren't the huge number of feature-loaded options that there are now,
so I chose one and I built a whole load of blogging-related functionality myself.
I did it in ways that made sense at the time
but no longer work well with Nanoc's latest versions.
So it's time to move to something that has blogging baked-in from the beginning
and I'm taking the opportunity to overhaul the look and feel too.
Again, when I started there weren't many pre-existing themes
so I built the whole thing myself
and though I'm happy with the work I did on it
it never quite felt polished enough.
Now I've got the opportunity
to adapt one of the many well-designed themes already out there,
so I've taken one from the [Hugo themes gallery](http://themes.gohugo.io)
and tweaked the colours to my satisfaction.
Hugo also has various features that I've wanted to implement in Nanoc
but never quite got round to it.
The nicest one is proper handling of draft posts and future dates,
but I keep finding others.
There's a lot of old content that isn't *quite* compatible with the way Hugo does things
so I've taken the old Nanoc-compiled content and frozen it
to make sure that old links should still work.
I could probably fiddle with it for years without doing much
so it's probably time to go ahead and publish it.
I'm still not completely happy with my choice of theme
but one of the joys of Hugo is that I can change that whenever I want.
Let me know what you think!

View File

@@ -1,95 +0,0 @@
---
title: 'Software Carpentry: SC Build; or making a better make'
description: |
Entrants to the SC Build category were invited
to create a replacement for the venerable make tool.
How did they do?
slug: sc-build
date: 2016-08-19T19:30:13+01:00
draft: false
type: post
topics:
- Technology
tags:
- Software Carpentry
- Web archaeology
- SCons
- Python
- Make
series: swc-archaeology
---
> Software tools often grow incrementally from small beginnings into elaborate artefacts. Each increment makes sense, but the final edifice is a mess. make is an excellent example: a simple tool that has grown into a complex domain-specific programming language. I look forward to seeing the improvements we will get from designing the tool afresh, as a whole...
> --- *Simon Peyton-Jones, Microsoft Research (quote taken from [SC Build page][SC Build])*
Most people who have had to compile an existing software tool
will have come across the venerable [`make`][make] tool
(which usually these days means [GNU Make][]).
It allows the developer to write a declarative set of rules
specifying how the final software should be built
from its component parts,
mostly source code,
allowing the build itself to be carried out
by simply typing `make` at the command line and hitting `Enter`.
Given a set of rules,
`make` will work out all the dependencies between components
and ensure everything is built in the right order
and nothing that is up-to-date is rebuilt.
Great in principle
but `make` is notoriously difficult for beginners to learn,
as much of the logic for how builds are actually carried out
is hidden beneath the surface.
This also makes it difficult to debug problems
when building large projects.
For these reasons,
the [*SC Build* category][SC Build] called for a replacement build tool
engineered from the ground up to solve these problems.
The second round winner, ScCons,
is a [Python-based make-like build tool][SCons]
written by Steven Knight.
While I could find no evidence of any of the other shortlisted entries,
this project (now renamed SCons)
continues in active use and development to this day.
I actually use this one myself from time to time
and to be honest I prefer it in many cases
to trendy new tools like [rake][] or [grunt][]
and the behemoth that is [Apache Ant][].
Its Python-based `SConstruct` file syntax is remarkably intuitive
and scales nicely from very simple builds
up to big and complicated projects,
with good dependency tracking to avoid unnecessary recompiling.
It has a lot of built-in rules for performing common build & compile tasks,
but it's trivial to add your own,
either by combining existing building blocks
or by writing a new builder with the full power of Python.
A minimal `SConstruct` file looks like this:
```python
Program('hello.c')
```
Couldn't be simpler!
And you have the full power of Python syntax
to keep your build file simple and readable.
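For a slightly less trivial project,
a sketch along the following lines
(file and target names invented purely for illustration)
shows how the same declarative style scales up:
```python
# SConstruct: a hypothetical two-target build, just to illustrate the style
env = Environment(CCFLAGS=["-O2", "-Wall"])   # shared compiler settings

# Build a static library from everything under src/,
# then a program that compiles against and links it
lib = env.StaticLibrary("mylib", Glob("src/*.c"))
env.Program("hello", ["hello.c"], LIBS=[lib], CPPPATH=["src"])
```
SCons works out for itself which pieces need rebuilding
when any of those sources change.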
It's interesting that all the entries in this category apart from one
chose to use a Python-derived syntax for describing build steps.
Python was clearly already a language of choice
for flexible multi-purpose computing.
The exception is the entry that chose to use XML instead,
which I think is a horrible idea
(oh how I used to love XML!)
but has been used to great effect in the Java world
by tools like Ant and Maven.
[make]: https://en.wikipedia.org/wiki/Make_(software)
[GNU make]: https://www.gnu.org/software/make/
[SC Build]: https://web.archive.org/web/20061116215358/http://www.software-carpentry.com/sc_build/index.html
[SCons]: http://scons.org/
[rake]: http://rake.rubyforge.org/
[grunt]: http://gruntjs.com/
[Apache Ant]: https://ant.apache.org/

View File

@@ -1,102 +0,0 @@
---
title: "Software Carpentry: SC Config; write once, compile anywhere"
description: |
The SC Config category
asked competitors to make it easy to make software
that runs on any platform.
How did they get on?
slug: sc-config
date: 2016-08-26T19:47:40+01:00
draft: false
type: post
topics:
- Technology
tags:
- Software Carpentry
- Web archaeology
- autoconf
series: swc-archaeology
---
> Nine years ago, when I first released Python to the world, I distributed it with a Makefile for BSD Unix. The most frequent questions and suggestions I received in response to these early distributions were about building it on different Unix platforms. Someone pointed me to autoconf, which allowed me to create a configure script that figured out platform idiosyncrasies. Unfortunately, autoconf is painful to use -- its grouping, quoting and commenting conventions don't match those of the target language, which makes scripts hard to write and even harder to debug. I hope that this competition comes up with a better solution --- it would make porting Python to new platforms a lot easier!
> --- *Guido van Rossum, Technical Director, Python Consortium (quote taken from [SC Config page][SC Config])*
On to the next Software Carpentry competition category, then.
One of the challenges of writing open source software
is that you have to make it run on a wide range of systems
over which you have no control.
You don't know what operating system any given user might be using
or what libraries they have installed,
or even what versions of those libraries.
This means that whatever build system you use,
you can't just send the Makefile (or whatever) to someone else
and expect everything to go off without a hitch.
For a very long time,
it's been common practice for source packages to include a `configure` script
that, when executed, runs a bunch of tests to see what it has to work with
and sets up the Makefile accordingly.
Writing these scripts by hand is a nightmare,
so tools like [`autoconf`][autoconf] and [`automake`][automake] evolved
to make things a little easier.
They did, and if the tests you want to use are already implemented
they work very well indeed.
Unfortunately they're built on an unholy combination of
shell scripting and the archaic Gnu M4 macro language.
That means if you want to write new tests
you need to understand both of these
as well as the architecture of the tools themselves
--- not an easy task for the average self-taught research programmer.
[SC Conf][SC Config], then, called for a re-engineering of the autoconf concept,
to make it easier for researchers to make their code available
in a portable, platform-independent format.
The second round configuration tool winner was SapCat,
"a tool to help make software portable".
Unfortunately, this one seems not to have gone anywhere,
and I could only find [the original proposal][SapCat] on the Internet Archive.
[SapCat]: https://web.archive.org/web/20131130123139/http://homepages.rpi.edu/~toddr/Archives/2000/a04g-sapcat-final/SapCat/index.html
There were a lot of good ideas in this category
about making catalogues and databases of system quirks
to avoid having to rerun the same expensive tests again
the way a standard `./configure` script does.
I think one reason none of these ideas survived
is that they were overly ambitious,
imagining a grand architecture
where their tool would provide some overarching source of truth.
This is in stark contrast to the way most Unix-like systems work,
where each tool does one very specific job well
and tools are easy to combine in various ways.
In the end though, I think Moore's Law won out here,
making it easier to do the brute-force checks each time
than to try anything clever to save time
--- a good example of avoiding unnecessary optimisation.
Add to that the evolution of the generic [`pkg-config`][pkg-config] tool
from earlier package-specific tools like `gtk-config`,
and it's now much easier to check for
particular versions and features of common packages.
On top of that,
much of the day-to-day coding of a modern researcher
happens in interpreted languages like Python and R,
which give you a fully-functioning pre-configured environment
with a lot less compiling to do.
As a side note, [Tom Tromey][],
another of the shortlisted entrants in this category,
is still a major contributor to the open source world.
He still seems to be involved in the automake project,
contributes a lot of code to the emacs community too
and blogs sporadically at [The Cliffs of Inanity][].
[SC Config]: https://web.archive.org/web/20071014042737/http://software-carpentry.com/sc_config/index.html
[autoconf]: https://www.gnu.org/software/autoconf/autoconf.html
[automake]: https://www.gnu.org/software/automake/
[pkg-config]: https://www.freedesktop.org/wiki/Software/pkg-config/
[Tom Tromey]: https://github.com/tromey
[The Cliffs of Inanity]: http://tromey.com/blog/

View File

@@ -1,42 +0,0 @@
---
title: "Semantic linefeeds: one clause per line"
slug: semantic-linefeeds
date: 2016-08-22T21:05:45+01:00
description: |
Still letting your text editor
break lines for you at 80 characters?
Take back control!
draft: false
tags:
- Writing
topics:
- Stuff
type: post
---
I've started using ["semantic linefeeds", a concept I discovered on Brandon Rhodes' blog][source],
when writing content,
an idea described in that article far better than I could.
I turns out this is a very old idea,
promoted way back in the day by Brian W Kernighan,
contributor to the original Unix system,
co-creator of the AWK and AMPL programming languages
and co-author of a lot of seminal programming textbooks
including "The C Programming Language".
The basic idea is
that you break lines at natural gaps between clauses and phrases,
rather than simply after the last word before you hit 80 characters.
Keeping line lengths strictly to 80 characters
isn't really necessary
in these days of wide aspect ratios for screens.
Breaking lines at points that make semantic sense in the sentence
is really helpful for editing,
especially in the context of version control,
because it isolates changes to the clause in which they occur
rather than just the nearest 80-character block.
I also like it because it makes my crappy prose
feel just a little bit more like poetry. ☺
[source]: http://rhodesmill.org/brandon/2012/one-sentence-per-line/

View File

@@ -1,56 +0,0 @@
---
title: 'What happened to the original Software Carpentry?'
description: |
Before @softwarecarpentry, there was Software Carpentry.
What happened to it?
slug: swc-the-competition
date: 2016-08-16T18:03:13+01:00
draft: false
type: post
topics:
- Technology
tags:
- Software Carpentry
- Web archaeology
series: swc-archaeology
---
> "Software Carpentry was originally a competition to design new software tools,
> not a training course.
> The fact that you didn't know that tells you how well it worked."
When I read this in a [recent post on Greg Wilson's blog][gvwilson blog],
I took it as a challenge.
I actually do remember the competition,
although looking at the dates it was long over by the time I found it.
I believe it did have impact;
in fact, I still occasionally use one of the tools it produced,
so Greg's comment got me thinking:
what happened to the other competition entries?
Working out what happened will need a bit of digging,
as most of the relevant information
is now [only available on the Internet Archive][SWC Archive].
It certainly seems that by November 2008
the domain name had been allowed to lapse
and had been replaced with a holding page by the registrar.
There were four categories in the competition,
each representing a category of tool
that the organisers thought could be improved:
- SC Build: a build tool to replace [make][]
- SC Conf: a configuration management tool to replace [autoconf][] and [automake][]
- SC Track: a bug tracking tool
- SC Test: an easy to use testing framework
I'm hoping to be able to show that this work
had a lot more impact than Greg is admitting here.
I'll keep you posted on what I find!
[gvwilson blog]: http://third-bit.com/2015/12/06/just-keep-swimming.html
[SWC Archive]: https://web.archive.org/web/20071014042716/http://software-carpentry.com/index.html
[make]: https://www.gnu.org/software/make/
[autoconf]: https://www.gnu.org/software/autoconf/autoconf.html
[automake]: https://www.gnu.org/software/automake/

View File

@@ -1,68 +0,0 @@
---
title: Tools for collaborative markdown editing
slug: collaborative-markdown-editing
author: Jez
date: 2016-09-15T20:52:35+01:00
description: |
I can't believe it's taken this long
to create tools that allow simultaneous editing
of markdown documents, but they're finally here.
draft: false
topics:
- Research communication
- Technology
tags:
- Markdown
- Collaboration
type: post
---
{{< figure alt="Discount signs in a shop window"
src="https://upload.wikimedia.org/wikipedia/en/thumb/b/ba/Half_off_original_price.jpg/640px-Half_off_original_price.jpg"
class="main-illustration fr"
attr="Photo by Alan Cleaver"
attrlink="https://en.wikipedia.org/wiki/File:Half_off_original_price.jpg" >}}
I really love [Markdown](https://en.wikipedia.org/wiki/Markdown)[^1]. I love its simplicity; its readability; its plain-text nature. I love that it can be written and read with nothing more complicated than a text-editor. I love how nicely it plays with version control systems. I love how easy it is to convert to different formats with Pandoc and how it's become effectively the native text format for a wide range of blogging platforms.
[^1]: Other plain-text formats are available. I'm also a big fan of [org-mode](http://orgmode.org/).
One frustration I've had recently, then, is that it's surprisingly difficult to collaborate on a Markdown document. There are various solutions that *almost* work but at best feel somehow inelegant, especially when compared with rock solid products like Google Docs. Finally, though, we're starting to see some real possibilities. Here are some of the things I've tried, but I'd be keen to hear about other options.
## 1. Just suck it up
To be honest, [Google Docs](https://docs.google.com/) isn't *that* bad. In fact it works really well, and has almost no learning curve for anyone who's ever used Word (i.e. practically anyone who's used a computer since the 90s). When I'm working with non-technical colleagues there's nothing I'd rather use.
It still feels a bit uncomfortable though, especially the vendor lock-in. You can export a Google Doc to Word, ODT or PDF, but you need to use Google Docs to do that. Plus as soon as I start working in a word processor I get tempted to muck around with formatting.
## 2. Git(hub)
The obvious solution to most techies is to set up a [GitHub](https://github.com/) repo, commit the document and go from there. This works very well for bigger documents written over a longer time, but seems a bit heavyweight for a simple one-page proposal, especially over short timescales.
Who wants to muck around with pull requests and merging changes for a document that's going to take 2 days to write tops? This type of project doesn't need a bug tracker or a wiki or a public homepage anyway. Even without GitHub in the equation, using git for such a trivial use case seems clunky.
## 3. Markdown in Etherpad/Google Docs
[Etherpad](http://etherpad.org/) is a great tool for collaborative editing, but suffers from two key problems: no syntax highlighting or preview for markdown (it's just treated as simple text); and you need to find a server to host it or do it yourself.
However, there's nothing to stop you editing markdown with it. You can do the same thing in Google Docs, in fact, and I have. Editing a fundamentally plain-text format in a word processor just feels weird though.
## 4. Overleaf/Authorea
[Overleaf](http://overleaf.com/) and [Authorea](http://authorea.com/) are two products developed to support academic editing. Authorea has built-in markdown support but lacks proper simultaneous editing. Overleaf has great simultaneous editing but only supports markdown by wrapping a bunch of LaTeX boilerplate around it. Both OK but unsatisfactory.
## 5. StackEdit
Now we're starting to get somewhere. [StackEdit](https://stackedit.io/) has both Markdown syntax highlighting and near-realtime preview, as well as integrating with Google Drive and Dropbox for file synchronisation.
## 6. HackMD
[HackMD](https://hackmd.io/) is one that I only came across recently, but it looks like it does exactly what I'm after: a simple markdown-aware editor with live preview that also permits simultaneous editing. I'm a little circumspect simply because I know simultaneous editing is difficult to get right, but it certainly shows promise.
## 7. Classeur
I discovered [Classeur](https://classeur.io/) literally today: it's developed by the same team as StackEdit (which is now apparently no longer in development), and is currently in beta, but it looks to offer two killer features: real-time collaboration, including commenting, and pandoc-powered export to loads of different formats.
## Anything else?
Those are the options I've come up with so far, but they can't be the only ones. Is there anything I've missed?

View File

@@ -1,94 +0,0 @@
---
title: "Software Carpentry: SC Track; hunt those bugs!"
description: |
For the SC Track competition
entrants were asked to design a better bug-tracker.
slug: sc-track
date: 2016-09-12T08:50:15+01:00
draft: false
type: post
topics:
- Technology
tags:
- Software Carpentry
- Web archaeology
- Bug trackers
- GitHub
series: swc-archaeology
---
> This competition will be an opportunity for the next wave of developers to show their skills to the world --- and to companies like ours.
> --- *Dick Hardt, ActiveState (quote taken from [SC Track page][SC Track])*
[SC Track]: https://web.archive.org/web/20071014042747/http://software-carpentry.com/sc_track/index.html
All code contains bugs,
and all projects have features that users would like
but which aren't yet implemented.
Open source projects tend to get more of these
as their user communities grow and start requesting improvements to the product.
As your open source project grows,
it becomes harder and harder to keep track of and prioritise
all of these potential chunks of work.
What do you do?
The answer, as ever,
is to make a to-do list.
Different projects have used different solutions,
including mailing lists, forums and wikis,
but fairly quickly a whole separate class of software evolved:
the [bug tracker][],
which includes such well-known examples as
[Bugzilla](https://www.bugzilla.org/),
[Redmine](http://www.redmine.org/)
and the mighty [JIRA](https://www.atlassian.com/software/jira).
[bug tracker]: https://en.wikipedia.org/wiki/Bug_tracking_system
Bug trackers are built entirely around such requests for improvement,
and typically track them through workflow stages
(planning, in progress, fixed, etc.)
with scope for the community to discuss and add various bits of metadata.
In this way,
it becomes easier both to prioritise problems against each other
and to use the hive mind to find solutions.
Unfortunately most bug trackers are big, complicated beasts,
more suited to large projects with dozens of developers and hundreds or thousands of users.
Clearly a project of this size
is more difficult to manage and requires a certain feature set,
but the result is that the average bug tracker
is non-trivial to set up for a small single-developer project.
The [SC Track][] category asked entrants to propose a better bug tracking system.
In particular,
the judges were looking for something
easy to set up and configure
without compromising on functionality.
The winning entry was a [bug-tracker called Roundup][Roundup],
proposed by Ka-Ping Yee.
Here we have another tool which is still in active use and development today.
Given that there is now a huge range of options available in this area,
including the mighty [github][],
this is no small achievement.
[Roundup]: http://roundup.sourceforge.net/index.html
[github]: https://github.com/
These days, of course,
github has become something of a *de facto* standard
for open source project management.
Although ostensibly a version control hosting platform,
each github repository also comes with
a built-in issue tracker,
which is also well-integrated with the "pull request" workflow system
that allows contributors to submit bug fixes and features themselves.
Github's competitors,
such as GitLab and Bitbucket,
also include similar features.
Not everyone wants to work in this way though,
so it's good to see that there is still a healthy ecosystem
of open source bug trackers,
and that Software Carpentry is still having an impact.

View File

@@ -1,51 +0,0 @@
---
title: "Rewarding good practice in research"
description: |
Are promotion criteria really the only way to build good practice?
slug: rewarding-good-practice-in-research
date: 2016-10-13T08:32:50+01:00
draft: false
type: post
topics:
- Higher education
tags:
- Research data management
- Open Access
- Research
- Good practice
---
{{< figure alt="Carrot + Stick < Love from opensource.com" src="https://farm6.staticflickr.com/5292/5537457133_dd19bca843_o_d.png" attr="From opensource.com on Flickr" attrlink="http://www.flickr.com/photos/opensourceway/5537457133/" class="main-illustration fr" >}}
Whenever I'm involved in a discussion about how to encourage researchers to adopt new practices, eventually someone will come out with some variant of the following phrase:
> "That's all very well, but researchers will never do *XYZ* until it's made a criterion in hiring and promotion decisions."
With all the discussion of carrots and sticks I can see where this attitude comes from, and strongly empathise with it, but it raises two main problems:
1. It's unfair and more than a little insulting to anyone to be lumped into one homogeneous group; and
2. Taking all the different possible *XYZs* into account, that's an awful lot of hoops to expect anyone to jump through.
Firstly, "researchers" are as diverse as the rest of us in terms of what gets them out of bed in the morning. Some of us want prestige; some want to contribute to a greater good; some want to create new things; some just enjoy the work.
One thing I'd argue we all have in common is this: nothing is more offputting than feeling like you're being strongarmed into something you don't want to do.
If we rely on simplistic metrics, people will focus on those and miss the point. At best people will disengage and at worst they will actively game the system. I've got to do these ten things to get my next pay rise, and still retain my sanity? OK, what's the least I can get away with and still tick them off? You see it with students taking poorly-designed assessments, and grown-ups are no different.
We do need to wield carrots as well as sticks, but the whole point is that these practices are beneficial in and of themselves. The carrots are already there if we articulate them properly and clear the roadblocks (don't you enjoy mixed metaphors?). Creating artificial benefits will just dilute the value of the real ones.
Secondly, I've heard a similar argument made for all of the following practices and more:
- Research data management
- Open Access publishing
- Public engagement
- New media (e.g. blogging)
- Software management and sharing
Some researchers devote every waking hour to their work, whether it's in the lab, writing grant applications, attending conferences, authoring papers, teaching, and so on and so on. It's hard to see how someone with all this in their schedule can find time to exercise any of these new skills, let alone learn them in the first place. And what about the people who sensibly restrict the hours taken by work to spend more time doing things they enjoy?
Yes, all of the above practices are valuable, both for the individual and the community, but they're all new (to most) and hence require more effort up front to learn. We have to accept that it's inevitably going to take time for all of them to become "business as usual".
I think if the hiring/promotion/tenure process has any role in this, it's in asking whether the researcher can build a coherent narrative as to why they've chosen to focus their efforts in this area or that. You're not on Twitter but your data is being used by 200 research groups across the world? Great! You didn't have time to tidy up your source code for github but your work is directly impacting government policy? Brilliant!
We still need to convince more people to do more of these beneficial things, so how? Call me naïve, but maybe we should stick to making rational arguments, calming fears and providing low-risk opportunities to learn new skills. Acting (compassionately) like a stuck record can help. And maybe we'll need to scale back our expectations in other areas (journal impact factors, anyone?) to make space for the new stuff.

View File

@@ -1,100 +0,0 @@
---
title: "Software Carpentry: SC Test; does your software do what you meant?"
description: |
The SC Test competition invited
proposals for better testing tools.
slug: sc-test
date: 2016-10-06T18:51:42+01:00
draft: false
type: post
tags:
- Software Carpentry
- Web archaeology
- Testing
topics:
- Technology
series: swc-archaeology
---
> "The single most important rule of testing is to **do it**."
> --- *Brian Kernighan and Rob Pike, The Practice of Programming (quote taken from [SC Test page][SC Test])*
[SC Test]: https://web.archive.org/web/20071014042742/http://software-carpentry.com/sc_test/index.html
One of the trickiest aspects of developing software
is making sure that it actually does what it's supposed to.
Sometimes failures are obvious:
you get completely unreasonable output
or even (shock!) a comprehensible error message.
But failures are often more subtle.
Would you notice if your result was out by a few percent,
or consistently ignored the first row of your input data?
The solution to this is testing:
take some simple example input with a *known* output,
run the code and compare the actual output with the expected one.
Implement a new feature, test and repeat.
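As a concrete (if contrived) illustration,
here's what that looks like with Python's built-in `unittest` module;
`word_count` here is a made-up function
standing in for whatever your code actually does:
```python
# A minimal sketch of testing a known input against a known output.
# word_count is a made-up example function, not from any real project.
import unittest


def word_count(text):
    """Count the whitespace-separated words in a string."""
    return len(text.split())


class TestWordCount(unittest.TestCase):
    def test_simple_sentence(self):
        # simple example input with a known expected output
        self.assertEqual(word_count("the cat sat on the mat"), 6)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)


if __name__ == "__main__":
    unittest.main()
```
Run the file and every test either passes
or tells you exactly what broke.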
Sounds easy, doesn't it?
But then you implement a new bit of code.
You test it and everything seems to work fine,
except that your new feature required changes to existing code
and those changes broke something else.
So in fact you need to test *everything*,
and do it *every time you make a change*.
Further than that,
you probably want to test
that all your separate bits of code work together properly (*integration testing*)
as well as testing the individual bits separately (*unit testing*).
In fact, splitting your tests up like that is a good way of holding on to your sanity.
This is actually a lot less scary than it sounds,
because there are plenty of tools now to automate that testing:
you just type a simple `test` command and everything is verified.
There are even tools that enable you to have tests run automatically
when you check the code into version control,
and even automatically deploy code that passes the tests,
a process known as *continuous integration* or CI.
The big problems with testing are that
it's tedious, your code seems to work without it
and *no-one tells you off for not doing it*.
At the time when the Software Carpentry competition was being run,
the idea of testing wasn't new,
but the tools to help were in their infancy.
> "Existing tools are obscure, hard to use, expensive, don't actually provide much help, or all three."
The [SC Test category][SC Test] asked entrants
"to design a tool, or set of tools, which will help programmers
construct and maintain black box and glass box tests of software
components at all levels, including functions, modules, and classes,
and whole programs."
The SC Test category is interesting
in that the competition administrators clearly found it difficult
to specify what they wanted to see in an entry.
In fact,
the whole category was reopened
with a refined set of rules and expectations.
Ultimately, it's difficult to tell whether this category
made a significant difference.
Where the tools to write tests used to be very sparse and difficult to use,
they are now plentiful, with several options for most programming languages.
With this proliferation,
several tried-and-tested methodologies have emerged
which are consistent across many different tools,
so while things still aren't perfect they are much better.
In recent years there has been a culture shift
in the wider software development community towards
both testing in general and test-first development,
where the tests for a new feature are written *first*,
and then the implementation is coded incrementally
until all tests pass.
The current challenge is to transfer this culture shift
to the academic research community!

View File

@@ -1,85 +0,0 @@
---
title: "Implementing Yesterbox in emacs with mu4e"
description: |
"Yesterbox" is a clever system for managing email
that involves hiding today's new email.
Turns out that's trivial to implement in emacs.
slug: yesterbox-emacs-mu4e
date: 2016-10-27T08:30:33+01:00
draft: false
type: post
topics:
- Technology
tags:
- Emacs
- Productivity
- Email
---
I've been meaning to give [Yesterbox][] a try for a while.
The general idea is
that each day you only deal with email that arrived
yesterday or earlier.
This forms your inbox for the day,
hence "yesterbox".
[Yesterbox]: http://www.yesterbox.com/
Once you've emptied your yesterbox,
or at least got through some minimum number
(10 is recommended)
*then* you can look at emails from today.
Even then you only really want to be dealing with
things that are absolutely urgent.
Anything else can wait til tomorrow.
The motivation for doing this
is to get away from the feeling that we are King Canute,
trying to hold back the tide.
I find that when I'm processing my inbox toward zero
there's always a temptation to keep skipping to the new stuff that's just come in.
Hiding away the new email
until I've dealt with the old
is a very interesting idea.
I use [mu4e][] in emacs for reading my email,
and handily the mu search syntax is very flexible
so you'd think it would be easy
to create a yesterbox filter:
```
maildir:"/INBOX" date:..1d
```
[mu4e]: http://www.djcbsoftware.nl/code/mu/mu4e.html
Unfortunately,
`1d` is interpreted as "24 hours ago from right now"
so this filter misses everything that was sent
yesterday but *less than* 24 hours ago.
There was a feature request raised on the mu github repository
to [implement an additional date filter syntax](https://github.com/djcb/mu/issues/582)
but it seems to have died a death for now.
In the meantime,
the answer to this is to remember that
my workplace observes fairly standard office hours,
so that anything sent more than 9 hours ago
is unlikely to have been sent today.
The following does the trick:
```
maildir:"/INBOX" date:..9h
```
In my mu4e bookmarks list,
that looks like this:
```emacs-lisp
(setq mu4e-bookmarks
      '(("flag:unread AND NOT flag:trashed" "Unread messages" ?u)
        ("flag:flagged maildir:/archive" "Starred messages" ?s)
        ("date:today..now" "Today's messages" ?t)
        ("date:7d..now" "Last 7 days" ?w)
        ("maildir:\"/Mailing lists.*\" (flag:unread OR flag:flagged)" "Unread in mailing lists" ?M)
        ("maildir:\"/INBOX\" date:..9h" "Yesterbox" ?y))) ;; <- this is the new one
```

View File

@ -1,41 +0,0 @@
---
title: "IDCC 2017 reflection"
description: |
  Every spring, the world's research data management and digital
  curation communities congregate for the International Digital Curation
  Conference. This year's conference was in picturesque Edinburgh.
slug: idcc17-summary
date: 2017-04-06T07:26:22+01:00
draft: false
type: post
topics:
- Research communication
tags:
- IDCC
- Digital curation
- Conference
- Edinburgh
- Research data management
---
For most of the last few years I've been lucky enough to attend the [International Digital Curation Conference (IDCC)](http://www.dcc.ac.uk/events/idcc). One of the main audiences attending is people who, like me, work on research data management at universities around the world, and it's begun to feel like a sort of "home" conference to me. This year, IDCC was held at the Royal College of Surgeons in the beautiful city of Edinburgh.
For the last couple of years, my overall impression has been that, as a community, we're moving away from the "first-order" problem of trying to convince people (from PhD students to senior academics) to take RDM seriously and into a rich set of "second-order" problems around how to do things better and widen support to more people. This year has been no exception. Here are a few of my observations and takeaway points.
- **Everyone has a repository now**: Only last year, the most common question you'd get asked by strangers in the coffee break would be "Do you have a data repository?" Now the question is more likely to be "What are you using for your data repository?", along with more subtle questions about specific components of systems and how they interact.
- **Integrating active storage and archival systems**: Now that more institutions have data worth preserving, there is more interest in (and in many cases experience of) setting up more seamless integrations between active and archival storage. There are lessons here we can learn.
- **Freezing in amber vs actively maintaining assets**: There seemed to be an interesting debate running throughout the conference around the aim of preservation: should we faithfully preserve the bits and bytes provided, without trying to interpret them, or should we take a more active approach by, for example, migrating obsolete formats to newer alternatives? If the former, should we attempt to preserve the software required to access the data as well? If the latter, how much effort do we invest, and how do we ensure nothing is lost or altered in the migration?
- **Demonstrating Data Science instead of debating what it is**: The phrase "Data Science" was once again one of the most commonly uttered phrases of the conference. However, there is now less abstract discussion about what, exactly, is meant by this "data science" thing; it has largely been replaced by concrete demonstrations. This change was exemplified perfectly by the keynote from data scientist [Alice Daish](https://twitter.com/alice_daish), who spent a riveting 40 minutes or so enthusing about all the cool stuff she does with data at the British Museum.
- **Recognition of software as an issue**: Even as recently as last year, I struggled to drum up much interest in discussing software sustainability and preservation at events like this; the interest was there, but there were higher priorities. So I was completely taken by surprise when we ended up with 30+ people in the Software Preservation Birds of a Feather (BoF) session, and when very little input was needed from me as chair to keep a productive discussion going for a full 90 minutes.
- **Unashamed promotion of openness**: As a community we seem to have nearly overthrown our collective embarrassment about the phrase "open data" (although maybe this is just me). We've always known it was a good thing, but I know I've been a bit of an apologist in the past, feeling that I had to "soften the blow" when asking researchers to be more open. Now I feel more confident in leading with the benefits of openness, and it felt like that's a change reflected in the community more widely.
- **Becoming more involved in the conference**: This year, I took a decision to try and do more to contribute to the conference itself, and I felt this was pretty successful both in making that contribution and in building up my own profile a bit. I presented a paper on one of my current passions, [Library Carpentry](http://librarycarpentry.github.io); it felt really good to be able to share my enthusiasm. I presented a poster on our work integrating our data repository and digital preservation platform; this gave me more of a structure for networking during breaks, as I was able to stand by the poster and start discussions with anyone who seemed interested. I chaired a parallel session; a first for me, and a different challenge from presenting or simply attending the talks. And finally, I proposed and chaired the Software Preservation BoF session (blog post forthcoming).
- **Renewed excitement**: It's weird, and possibly all in my imagination, but there seemed to be more energy at this conference than at the previous couple I've been to. More people seemed to be excited about the work we're all doing, recent achievements, and the possibilities for the future.

View File

@ -1,34 +0,0 @@
---
title: "Lean Libraries: applying agile practices to library services"
description: |
  I've had call to read quite a bit about the lean and agile movements lately, which got me thinking: how would all this work in an academic library context?
slug: lean-libraries-intro
date: 2017-07-19T17:42:16+01:00
draft: false
type: post
topics:
- Librarianship
tags:
- Lean
- Agile
- Management
---
{{< figure alt="A Scrum board suggesting to use Kanban"
caption="Kanban board"
src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Simple-kanban-board-.jpg/640px-Simple-kanban-board-.jpg"
attr="Jeff Lasovski (via Wikimedia Commons)"
attrlink="https://en.wikipedia.org/wiki/File:Simple-kanban-board-.jpg"
class="main-illustration fr" >}}
I've been working with our IT services at work quite closely for the last year as product owner for our new research data portal, ORDA. That's been a fascinating process for me as I've been able to see first-hand some of the agile techniques that I've been reading about from time to time on the web over the last few years.
They're in the process of adopting a specific set of practices going under the name "Scrum", which is fun because it uses some novel terminology that sounds pretty weird to non-IT folks, like "scrum master", "sprint" and "product backlog". On my small project we've had great success with the short cycle times and been able to build trust with our stakeholders by showing concrete progress on a regular basis.
Modern librarianship is increasingly fluid, particularly in research services, and I think that to handle that fluidity it's absolutely vital that we are able to work in a more agile way. I'm excited about the possibilities of some of these ideas. However, Scrum as implemented by our IT services doesn't seem something that transfers directly to the work that we do: it's too specialised for software development to adapt directly.
What I intend to try is to steal some of the individual practices on an experimental basis and simply see what works and what doesn't. The Lean concepts currently popular in IT were originally developed in manufacturing: if they can be translated from the production of physical goods to IT, I don't see why we can't make the ostensibly smaller step of translating them to a different type of knowledge work.
I've therefore started reading around this subject to try and get as many ideas as possible. I'm generally pretty rubbish at taking notes from books, so I'm going to try and record and reflect on any insights I make on this blog. The framework for trying some of these out is clearly a Plan-Do-Check-Act continuous improvement cycle, so I'll aim to reflect on that process too.
I'm sure there will have been people implementing Lean in libraries already, so I'm hoping to be able to discover and learn from them instead of starting from scratch. Wish me luck!

View File

@ -1,12 +0,0 @@
10 about:
  Name: about
  URL: "/about/"
20 tags:
  Name: tags
  URL: "/tags/"
50 rdm:
  Name: rdm resources
  URL: "/rdm-resources/"

View File

@ -1,2 +0,0 @@
swc-archaeology: |
  the origins of [Software Carpentry](http://software-carpentry.org)

View File

(Three binary image files of 856 KiB, 77 KiB and 157 KiB, shown before and after with no change in size.)

Some files were not shown because too many files have changed in this diff.