Council spending – open data
This article is originally published at https://scottishsnow.wordpress.com
My local authority recently decided to publish all spending over £500 in an effort to be more transparent. Here’s a post taking an overview of what they’ve published. I’ve used R for the analysis. The dataset doesn’t contain much detail, but if you’ve analysis suggestions, please add them in the comments!
You can download the spending data here. It’s available in pdf (why?!) and xlsx (plain text would be more open).
First off, some packages:
library(tidyverse) library(readxl) library(janitor) library(lubridate) library(formattable)
Read in the dataset:
df = read_excel("~/Downloads/midlothian_payments_over_500_01042019_to_15092019.xlsx") %>% clean_names()
We’ve got six columns:
- type
- date_paid
- supplier
- amount
- our_ref
- financial_year
Busiest day:
df %>% mutate(day = weekdays(date_paid)) %>% group_by(day) %>% summarise(transactions = n(), thousands_pounds_spent = sum(amount) / 1000) %>% mutate(day = fct_relevel(day, rev(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")))) %>% gather(var, value, -day) %>% ggplot(aes(day, value)) + geom_col() + facet_wrap(~var, scales = "free_x") + coord_flip() + scale_y_continuous(labels = scales::comma) + labs(title = "Busiest day of the week", x = "", y = "")
Busiest time of year:
df %>% mutate(dow = weekdays(date_paid), dow = if_else(dow == "Tuesday" | dow == "Friday", "Tue/Fri", "Other")) %>% group_by(date_paid, dow) %>% summarise(transactions = n(), pounds_spent = sum(amount)) %>% gather(var, value, -date_paid, -dow) %>% ggplot(aes(date_paid, value, colour = dow)) + geom_point() + facet_wrap(~var, scales = "free_y") + scale_y_log10(labels = scales::comma) + scale_colour_brewer(type = "qual", palette = "Set2") + labs(title = "Busiest day of the year", x = "", y = "")
Top 10 payees by value:
df %>% group_by(supplier) %>% summarise(pounds_spent = sum(amount), transactions = n()) %>% arrange(desc(pounds_spent)) %>% top_n(n = 10, wt = pounds_spent) %>% mutate(pounds_spent = currency(pounds_spent, "£", digits = 0L)) %>% formattable(list(`pounds_spent` = color_bar("#FA614B"), `transactions` = color_bar("lightpink")))
In Scotland local authorities collect water charges on behalf of the water authority, which they then pass on. It’s not surprise that Scottish Water is the biggest supplier.
Top 10 payees by frequency:
df %>% group_by(supplier) %>% summarise(pounds_spent = sum(amount), transactions = n()) %>% arrange(desc(transactions)) %>% top_n(n = 10, wt = transactions) %>% mutate(pounds_spent = currency(pounds_spent, "£", digits = 0L)) %>% formattable(list(`pounds_spent` = color_bar("lightpink"), `transactions` = color_bar("#FA614B")))
As a final note, writing this post is reminding me again I should be moving away from wordpress because incorporating code and output would be much easier with mark/blog down! As always, legacy is holding me back.
Thanks for visiting r-craft.org
This article is originally published at https://scottishsnow.wordpress.com
Please visit source website for post related comments.