Overview

This dashboard shows the protection rate of refugees in Europe. It highlights how much the number of decisions on asylum applications varies by country of asylum, and how much protection rates vary by country of origin. It also highlights the surprising differences in protection rates between countries of asylum in Europe, particularly for Iraqis and Afghans. This is surprising because, although each country has its own asylum laws and processes, there have been huge efforts to standardise them within the European Union. As the data shows, more still needs to be done on this!

Here is the output and the next sections go through how it was produced step-by-step. All the source code and data are available on GitHub.

Background - what is a refugee?

When someone flees their country to escape, for example, persecution or war and arrives in another country, they can apply for asylum. If their case is well grounded, they become a refugee in that country. The protection rate is then the percentage of asylum applicants who are successful. People fleeing countries like Syria tend to have high protection rates, while for other countries, e.g. Venezuela, the protection rate tends to be very low as most applicants are economic migrants rather than refugees.

There are three different statuses for refugees in Europe, ranging from Geneva convention through to humanitarian and subsidiary statuses, and refugees’ rights vary considerably with each type of status. Here is more information about what is a refugee from the United Nations Refugee Agency.
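
As a rough sketch of that definition, assuming the protection rate is calculated as positive decisions divided by all decisions, the arithmetic looks like this in R. The counts are made up for illustration and are not Eurostat figures; the status codes are the Eurostat keys used later in this post.

```r
# Hypothetical decision counts for one country of origin (not real data)
decisions <- c(GENCONV = 500, SUB_PROT = 300, HUMSTAT = 100, REJECTED = 1100)

# The three protection statuses count as positive decisions
positive <- sum(decisions[c("GENCONV", "SUB_PROT", "HUMSTAT")])

# Protection rate as a percentage of all decisions: 900 / 2000 = 45
protectionRate <- 100 * positive / sum(decisions)
print(protectionRate)
```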

Data sources

Four data sources were used to compile this dashboard:

  1. Statistical data - from Eurostat, available via the Eurostat API. If you can’t wait to download the data, use this already downloaded raw data. The asylum decision data is available on a quarterly basis.
  2. Country spatial data - in this case the centroid (the geographical centre) of each country was scraped from a Google documentation page.
  3. Basemap - the tiles were produced using OpenStreetMap data, styled in MapBox Studio and then published. MapBox Studio gives you a lot of flexibility to easily create your ideal base map.
  4. Most common countries of origin - based on this factsheet https://data2.unhcr.org/en/documents/details/64846

The project idea

During the design phase of this project, the initial drafts looked a lot like the following image - pie charts on a map with the size of the pie representing the number of decisions and the segments of the pie representing the different protection statuses.

As you can see above, while there are differences between the pie charts, those differences are difficult to interpret. It would be almost impossible to quickly determine the five areas with the most Hispanics, for example. Pretty, but verging on chart junk?

Instead, in this project, the visualisation was decomposed into three charts and a map. A simple map and summary chart show the total decisions by country. Bar charts are perhaps more effective at showing the absolute figures and for easily comparing values. The map usefully shows the areas in Europe where the number of cases is more significant.

For the comparison of the protection rates (based on the four relevant statuses in Europe), two groups of chart multiples were produced to show the number of decisions and the percentage for each status by the five most common countries of origin in Europe. Only the dozen or so countries with the most asylum applications were shown to keep the data displayed as simple as possible. All the data is ordered so that the most significant data is shown at the top of the charts. The 100% stacked charts help to compare the variance in protection rates by country of asylum (y-axis) and country of origin (x-axis).

The simple dashboard frame, using gridExtra to do the arrangement, was just a couple of lines of code at the end - a relief after all the painfully fiddly tweaking to make the charts look OK!

R Libraries

Check you have all the following installed and note the phantomJS dependency for MapView and the process of installing the true type fonts:

# for the joins and lots of other cool things
library(dplyr)      
# for loading Json
library(jsonlite)
# for the string manipulation
library(stringr)
# For formatting pretty numbers
library(scales)
# for reordering arrays
library(forcats)

# for pretty fonts - see https://cran.r-project.org/web/packages/extrafont/README.html
#install.packages('extrafont')
library(extrafont)
# Run this once to load in all the available fonts (this will take a few mins) and restart R afterwards 
#font_import()
#fonts() # Check which fonts are now available
# Lets try to use Trebuchet for the charts
fontsForCharts <- c( "Trebuchet MS" )

# for visualisation on maps and in plots
library(leaflet)
library(ggplot2)

# for taking screenshots of the map - note that you also need to install phantomjs
#install.packages("mapview")
#webshot::install_phantomjs()
library(mapview)
# to read the png files
library(png)

# to render the images as graphical object to display in an arrangement
library(grid)
# for the final arrangements
library(gridExtra)
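
If you are not sure which of these are already installed, a small check along these lines may help. It is only a sketch: the package list is copied from the library calls above, and the variable names are just for illustration.

```r
# Packages loaded in this post
needed <- c("dplyr", "jsonlite", "stringr", "scales", "forcats", "extrafont",
            "leaflet", "ggplot2", "mapview", "png", "grid", "gridExtra")

# requireNamespace returns FALSE (rather than stopping) when a package is not installed
isInstalled <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
notInstalled <- needed[!isInstalled]

if (length(notInstalled) > 0) {
  message("Missing packages: ", paste(notInstalled, collapse = ", "))
  # install.packages(notInstalled)  # uncomment to install them
} else {
  message("All packages are installed")
}
```

Note that this does not cover the phantomJS install or the font import, which still need to be done separately as described in the comments above.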

Loading the data

Let's use tribble to build the table of country spatial data from this Google documentation page.

countryCentroids <- tibble::tribble(
  ~country,  ~latitude,  ~longitude,                                          ~name,
      "AD",  42.546245,    1.601554,                                      "Andorra",
      "AE",  23.424076,   53.847818,                         "United Arab Emirates",
      "AF",   33.93911,   67.709953,                                  "Afghanistan",
      "And Lots more countries!!!", 0, 0, ""
)

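# Europe tweaks - use the codes Eurostat uses for the UK and Greece (UK and EL rather than GB and GR)
# Only the country code column is touched, so the numeric columns are left alone
countryCentroids$country[countryCentroids$country=="GB"] <- "UK"
countryCentroids$country[countryCentroids$country=="GR"] <- "EL"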

Then let's use tribble again to get the full list of European countries and their two character country codes from Eurostat.

# So these are the EU 28 countries with the four additional EU+ countries (Norway, Iceland, Liechtenstein and Switzerland)
euPlusCountries <- tibble::tribble(
  ~Iso,    ~Name,
  "BE",    "Belgium",
  "BG",    "Bulgaria",
  "CZ",    "Czech Republic",
  "DK",    "Denmark",
  "DE",    "Germany",
  "EE",    "Estonia",
  "IE",    "Ireland",
  "EL",    "Greece",
  "ES",    "Spain",
  "FR",    "France",
  "HR",    "Croatia",
  "IT",    "Italy",
  "CY",    "Cyprus",
  "LV",    "Latvia",
  "LT",    "Lithuania",
  "LU",    "Luxembourg",
  "HU",    "Hungary",
  "MT",    "Malta",
  "NL",    "Netherlands",
  "AT",    "Austria",
  "PL",    "Poland",
  "PT",    "Portugal",
  "RO",    "Romania",
  "SI",    "Slovenia",
  "SK",    "Slovakia",
  "FI",    "Finland",
  "SE",    "Sweden",
  "UK",    "United Kingdom",
  "IS",    "Iceland",
  "LI",    "Liechtenstein",
  "NO",    "Norway",
  "CH",    "Switzerland"
  )

The Eurostat API returns JSON, and the request URL is structured like this: https://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/migr_asydcfstq?freq=Q&unit=PER&citizen=VE&sex=T&age=TOTAL&decision=TOTAL&time=2016Q4. The service has fairly tight limits on the volume of data you can download in each request, so we need to iterate through the citizen (country of origin) and decision (protection status) options and then compile our dataset by combining the data downloaded from each request. Here is more information on the structure of the Eurostat JSON API requests and the data returned.

#-------------------------------------------------------------------------------------------------------
#  We need to go through and get data by building this URL multiple times using specific parameters
# https://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/migr_asydcfstq
#   ?freq=Q&unit=PER&citizen=VE&sex=T&age=TOTAL&decision=TOTAL&time=2016Q4

jsonURLStub <-"https://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/"
jsonDataset <-"migr_asydcfstq"
# There is a bug at the moment in the Eurostat data, with data from this year and last year currently not available - so let's get the latest data available.
jsonTimePeriod <- "2016Q4" # "2018Q1" 

#-------------------------------------------------------------------------------------------------------
# According to the mid year data factsheet for Europe, refugees and migrants from these five countries
# have submitted the most asylum applications this year in Europe: Syrians, Iraqis, Afghans, Nigerians and Pakistanis
citizenList <- list()
citizenList[[ "TOTAL" ]] <- "Total"
citizenList[[ "SY" ]] <- "Syrian Arab Republic"
citizenList[[ "IQ" ]] <- "Iraq"
citizenList[[ "AF" ]] <- "Afghanistan"
citizenList[[ "NG" ]] <- "Nigeria"
citizenList[[ "PK" ]] <- "Pakistan"


#-------------------------------------------------------------------------------------------------------
# And here is the list of decisions we want to collect
decisionList <- data.frame(
  # Remember to set the levels as shown in this guide.  This is what enforces the order of the elements
  # https://stackoverflow.com/questions/31638771/r-reorder-levels-of-a-factor-alphabetically-but-one
  # Titles
  DecisionTitle=factor(c("Total", "Rejected", "Subsidiary", "Humanitarian", "Geneva convention"),
    levels=c("Total", "Rejected", "Subsidiary", "Humanitarian", "Geneva convention")),
  # Eurostat keys
  DecisionKey=factor(c("TOTAL", "REJECTED", "SUB_PROT", "HUMSTAT", "GENCONV"),
    levels=c("TOTAL", "REJECTED", "SUB_PROT", "HUMSTAT", "GENCONV")), 
  # Pretty legend colours
  DecisionLegend=factor(c("#505050", "#d23f67", "#f7bb16", "#e77b37", "#2c8ac1"),
    levels=c("#505050", "#d23f67", "#f7bb16", "#e77b37", "#2c8ac1"))

)



#-------------------------------------------------------------------------------------------------------
# declare our data cube as an empty data frame with the relevant column types
dataCube <- data.frame(
    GeoIso=character(),    # Geo
    Citizen=character(),   # Citizen
    Decision=character(),  # Decision
    Count=integer())       # Count
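
Before running any downloads, it can be useful to enumerate the request URLs offline. The sketch below repeats the citizen and decision codes from the setup code above so that it runs on its own, and uses expand.grid to build every combination; the variable names are only for illustration.

```r
# Codes copied from the setup code above so this sketch is self-contained
urlStub <- "https://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/"
dataset <- "migr_asydcfstq"
timePeriod <- "2016Q4"
citizens <- c("TOTAL", "SY", "IQ", "AF", "NG", "PK")
decisions <- c("TOTAL", "REJECTED", "SUB_PROT", "HUMSTAT", "GENCONV")

# One row per citizen / decision combination (6 x 5 = 30 rows)
combos <- expand.grid(citizen = citizens, decision = decisions,
                      stringsAsFactors = FALSE)

# paste0 is vectorised, so this builds all of the URLs in one go
urls <- paste0(urlStub, dataset,
               "?freq=Q&unit=PER&citizen=", combos$citizen,
               "&sex=T&age=TOTAL&decision=", combos$decision,
               "&time=", timePeriod)

print(length(urls))
print(urls[1])
```

Nothing is downloaded here, so it is a cheap way to check the URLs by eye before hitting the service.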

Then here is the code that iterates through each of the thirty combinations of citizen and decision and downloads and compiles our raw data.

# Set this to true to run the loader code (it requires an internet connection and takes a few seconds)
doRun <- FALSE
if(doRun==TRUE) {
  # Set the counters
  counter <- 1
  countTotal <- length(citizenList) * length(decisionList$DecisionKey)
  # Now loop through each citizen / nationality option
  for( citizen in names(citizenList)) {
    
    # And an inner loop on the decisions
    for( decisionType in decisionList$DecisionKey) {
  
      # Write a message to the console so that the user can see that something is happening ...
      message(str_c(
        "Downloading ", counter, " of ", countTotal, 
        " JSON data from Eurostat for citizen ", citizen , " and decision type ", decisionType))
      
      jsonURL <- str_c(jsonURLStub, jsonDataset, 
          "?freq=Q&unit=PER&citizen=", citizen, 
          "&sex=T&age=TOTAL&decision=", decisionType, 
          "&time=", jsonTimePeriod )
      
      dataWrapper <- fromJSON(jsonURL)
  
      if(length(dataWrapper$dimension$geo$category$index) != length(dataWrapper$value) ) {
        warning( 
          str_c("Length of categories: ", length(dataWrapper$dimension$geo$category$index), 
              " is not the same as the length of values: ", length(dataWrapper$value), ".  Should be able to clean this up ..."))
      }
      
      # OK - now this is fiddly because of the way the JSON is structured
      # In order to reduce the download volume, missing data is not supplied 
      # This means that, not all GEO labels will be provided; and even not all of those will have values!!
      values <- c()
      
      i <- 1
      while(i <= length(euPlusCountries$Iso)) {
        # set the currentVal to NA
        currentVal <- as.integer(NaN)
        
        # See if the label exists
        cLab <- dataWrapper$dimension$geo$category$index[[ euPlusCountries$Iso[i] ]]
        
        # Check for NULL first - is.na(NULL) has length zero, which && cannot handle
        if(!is.null(cLab) && !is.na(cLab)) {
          # get the index as a character
          currentVal <- dataWrapper$value[ as.character(cLab) ]
          # Check for missing or bad data - the last clause seems to be the most useful one ...
          if(is.na(currentVal) || is.null(currentVal) || is.null(as.character(currentVal)) || as.character(currentVal) == "NULL") {
            message( "Found bad data and fixing it")
            currentVal <- as.integer(NaN)
          }
        }
  
        values[i] <- currentVal
        i <- i + 1
      }
  
      # now combine these two... This is a bit hacky at the moment, as if there is missing 
      # data the two arrays will be different lengths and every country will be wonky
      dataCubeTemp <- do.call(rbind.data.frame, 
          Map('c', euPlusCountries$Iso, citizen, decisionType, values))
      
      colnames(dataCubeTemp) = c("GeoIso","Citizen","Decision","Count")
      
      # make the count list numeric ...
      dataCubeTemp  <- dataCubeTemp %>%
        mutate(Count=as.numeric(as.character(Count)))
  
      # remove the total counts (GeoIso == EU28 and TOTAL) - this should now be redundant due to the code changes above, but there is no harm in trying!
      dataCubeTemp <- filter(dataCubeTemp, GeoIso != "EU28" & GeoIso != "TOTAL")
      
      # Append to the global data cube ...
      # (nrow rather than length, as the length of a data frame is its number of columns)
      if (nrow(dataCube) == 0) {
        dataCube <- dataCubeTemp
      } else {
        dataCube <- rbind(dataCube, dataCubeTemp)
      }
      
      # Increment our process counter
      counter <- counter + 1
    }
  }
} else {
  # And read the data cube back out again
  dataCube <- read.csv2(str_c("../01_RawData/EuroStatsData_", jsonTimePeriod, ".csv"))
}
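
The fiddly part of the loader is the sparse lookup: the geo index maps each country code to a position, and the value list only contains the positions that have data. The sketch below illustrates that idea offline with a hand-built list standing in for the parsed response. The structure is an assumption based on the loader code above, and the lookupCount helper is hypothetical.

```r
# Hand-built stand-in for the parsed JSON (assumed structure, illustrative numbers)
geoIndex <- list(BE = 0, BG = 1, CZ = 2)
sparseValues <- list("0" = 6110, "2" = 245)   # position 1 (BG) has no value

# LI does not appear in the index at all
wanted <- c("BE", "BG", "CZ", "LI")

# Hypothetical helper: returns NA when either the label or the value is missing
lookupCount <- function(iso) {
  pos <- geoIndex[[iso]]
  if (is.null(pos)) return(NA_real_)
  val <- sparseValues[[as.character(pos)]]
  if (is.null(val)) NA_real_ else as.numeric(val)
}

counts <- vapply(wanted, lookupCount, numeric(1))
print(counts)
```

Both kinds of gap end up as NA, which keeps the result the same length as the list of countries.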

Then we want to join the Eurostat statistical data with the country spatial data

if(doRun==TRUE) {
  # Then join the the country data to the data
  dataCube <- left_join(dataCube, countryCentroids, by=c("GeoIso"="country") )
}

And then let's save the raw data

if(doRun==TRUE) {
  # This is a good point to save this data cube - so we can get back to it if needed ...
  write.csv2(dataCube, str_c("../01_RawData/EuroStatsData_", jsonTimePeriod, ".csv"))
}

And lastly, let's have a quick look at what the raw data looks like!

library(knitr)
kable(dataCube %>% head(10))
 X  GeoIso  Citizen  Decision   Count  latitude  longitude  name
 1  BE      TOTAL    TOTAL       6110  50.50389   4.469936  Belgium
 2  BG      TOTAL    TOTAL       1365  42.73388  25.485830  Bulgaria
 3  CZ      TOTAL    TOTAL        245  49.81749  15.472962  Czech Republic
 4  DK      TOTAL    TOTAL       2570  56.26392   9.501785  Denmark
 5  DE      TOTAL    TOTAL     211325  51.16569  10.451526  Germany
 6  EE      TOTAL    TOTAL         45  58.59527  25.013607  Estonia
 7  IE      TOTAL    TOTAL        565  53.41291  -8.243890  Ireland
 8  EL      TOTAL    TOTAL       3845  39.07421  21.824312  Greece
 9  ES      TOTAL    TOTAL       2135  40.46367  -3.749220  Spain
10  FR      TOTAL    TOTAL      24465  46.22764   2.213749  France

We got there!

Transforming the data

That was a lot of work! Now we have a few more steps to shape our data to make it easier to render graphically. To make the code more readable here are a couple of helper functions to produce percentages with fairly decent error checking.

#-------------------------------------------------------------------------------------------------------
# Worker functions ...
#-------------------------------------------------------------------------------------------------------

#----- Lookup for the country of origin name
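LabelGetCoOName <- 
  function(value) {
    # Return a plain character vector (as_labeller expects one), falling back to the code itself
    vapply(as.character(value), function(code) {
      label <- citizenList[[code]]
      if (is.null(label)) code else label
    }, character(1), USE.NAMES=FALSE)
}  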

#----- Generates a percentage
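GetPercent <- 
  function(enum, denom, rounding) {
    rounding <- as.numeric(rounding)
    enum <- as.numeric(enum)     # the numerator
    denom <- as.numeric(denom)   # the denominator
    
    # Vectorised: 0 unless both the numerator and the denominator are positive
    pc <- ifelse(
      !is.na(enum) & !is.na(denom) & enum > 0 & denom > 0,
      round(enum/denom*100,rounding),
      0
    )
    
    as.numeric(pc)
}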

#----- Percent label creator for the stacked bar chart
GetPercentLabel <- 
  function(pc, threshold) {

    pc <- as.numeric(pc)
    threshold <- as.numeric(threshold)

    # Vectorised: only label the percentages at or above the threshold
    percStr <- ifelse(
      (!is.na(pc) & pc >= threshold),
      str_c(as.character(pc), "%"),
      ""
    )
    
    percStr
}  

And then let's produce three slices through the data to produce our charts.

#-------------------------------------------------------------------------------------------------------
# Lets build the dataCube that will support our map and charts ... note that we want to remove the NA values
dataCubeVis <- dataCube
dataCubeVis[is.na(dataCubeVis)] <- 0

# lets filter down to the 12 most common countries and then group all the others into an other category...
allCountries <- dataCubeVis %>% 
  filter(Citizen == "TOTAL" ) %>%
  filter(Decision == "TOTAL" ) %>%  
  group_by(GeoIso) %>%
  summarise(CountForOrder=sum(Count)) %>%
  ungroup() %>%
  arrange(desc(CountForOrder)) 
#View(allCountries)

# Get the counts of most common and others ...
totalCount <- sum( allCountries$CountForOrder)
# Here is our sliced data
mostCommonCountries <- allCountries[1:12,]
mostCommonCount <- sum( mostCommonCountries$CountForOrder)
otherCount <- totalCount - mostCommonCount

dcvSummary <- rbind( mostCommonCountries, data.frame(GeoIso="Other", CountForOrder=otherCount))


#-------------------------------------------------------------------------------------------------------
# Now lets create our actual data cube
# Let's join it to the most common countries to pull across the CountForOrder col, which will be NA for the other countries
dcvt <- left_join(dataCubeVis, mostCommonCountries, by=c("GeoIso"="GeoIso") )
# Important - make Citizen a factor whose levels follow the order of the citizen list,
# as this is what controls the order of the facets in the charts
dcvt$Citizen <- factor(dcvt$Citizen, levels=names(citizenList))

# Then we split our dataCube into two by filtering the other countries and set the GeoIso col and a new GeoName col to "Other"
# We also want to reset the CountForOrder as we want other to appear at the end...
dataCubeTemp <- filter(dcvt, is.na(CountForOrder)) %>% 
  mutate(GeoIso="Other", GeoName="Other", CountForOrder=0)

dcvt <- filter(dcvt, is.na(CountForOrder) == FALSE)  %>% mutate(GeoName=name)
# Then we join it back together again
dcvt <- rbind( dcvt, dataCubeTemp)

# Collapse all the "other" rows by grouping the data
dcvt <- dcvt %>% 
  group_by(GeoIso, GeoName, Citizen, Decision) %>%
  summarise(Count=sum(Count), CountForOrder=max(CountForOrder)) %>%
  ungroup() %>%
  arrange(desc(CountForOrder))
# Double check that the names are gucci  
names(dcvt) <- c("GeoIso", "GeoName", "Citizen","Decision","Count", "CountForOrder")

# Filter the data summary to include just the total counts
dcvtTotal <- dcvt %>% 
  filter(Citizen == "TOTAL") %>% 
  filter(Decision == "TOTAL")
# Filter the data summary to include just the total counts for each country of origin
dcvtTotalCitizens <- dcvt %>% 
  filter(Citizen != "TOTAL") %>% 
  filter(Decision == "TOTAL")

# And for the detailed views, remove the citizens and decisions total, which is not relevant for these charts
dcvtDetails <- dcvt %>% 
  filter(Citizen != "TOTAL") %>% 
  filter(Decision != "TOTAL")

#levels(dcvtDetails$Citizen) <- citizenList

# We're going to try to show a few totals on the chart directly, so lets create a well formatted total
dcvtTotal <- dcvtTotal %>% mutate(CountLabel=comma(Count))
dcvtTotalCitizens <- dcvtTotalCitizens %>% mutate(CountLabel=comma(Count))

# Check them
#View(dcvtTotal)
#View(dcvtTotalCitizens)
#View(dcvtDetails)



#-------------------------------------------------------------------------------------------------------
# Summary 3 - the % of each decision type by country and broken out by country of origin

# Let's pull across the proper name for the Decisions from the decisionList data frame
dcvDecisions <- left_join(dcvtDetails, decisionList, by=c("Decision"="DecisionKey")) 
# And lets remove the totals as they are not necessary for this view
dcvDecisions <- filter(dcvDecisions, Decision != "TOTAL")

# This creates the percent and the percent label columns - it looks a little intense, but it is just two grouped mutates
dcvDecisions <- dcvDecisions %>% 
  group_by(GeoIso, Citizen) %>% 
  # okay, we've got the total, now we can do some math with mutate
  mutate( GeoCitizenTotal=sum(Count, na.rm=T)) %>%    
  ungroup() %>%    
  group_by(GeoIso, Citizen, Decision) %>% 
  # okay, now lets also create a label for all columns
  mutate(
    Percent=GetPercent(Count, GeoCitizenTotal, 0), 
    PercentLabel=GetPercentLabel(Percent, 15.0)) %>% 
  ungroup()

# Then strip out all the zeros - this is probably not necessary, but they will also not be shown..
dcvDecisions <- filter(dcvDecisions, Percent > 0 )
#warnings()  
# Good to have a quick look at the data here
#View(dcvDecisions)

# Let's write the three summary files to disk - so we can get back to them if needed ...
write.csv2(dcvtTotal, "../02_OutputData/Summary_Totals.csv")
write.csv2(dcvtTotalCitizens, "../02_OutputData/Summary_Totals_by_CountryOfOrigin.csv")
write.csv2(dcvDecisions, "../02_OutputData/Summary_Decisions.csv")

Note that the three summary datasets were saved to disk at the end of that step, in case you need them.
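
To see what the percentage step does without the full dataset, here is a minimal base R sketch on a toy table. The counts are invented, and it uses ave rather than the dplyr pipeline above, so treat it as an illustration of the idea rather than the code used for the dashboard.

```r
# Toy decisions table (invented counts) for one country of origin in two countries of asylum
toy <- data.frame(
  GeoIso   = c("DE", "DE", "DE", "FR", "FR"),
  Citizen  = "SY",
  Decision = c("GENCONV", "SUB_PROT", "REJECTED", "GENCONV", "REJECTED"),
  Count    = c(120, 60, 20, 10, 30),
  stringsAsFactors = FALSE
)

# Total decisions for each country of asylum / country of origin pair
toy$GeoCitizenTotal <- ave(toy$Count, toy$GeoIso, toy$Citizen, FUN = sum)

# Share of each decision type within its pair, as a rounded percentage
toy$Percent <- round(toy$Count / toy$GeoCitizenTotal * 100, 0)

print(toy)
```

Within each pair the percentages add up to 100, which is what lets the stacked bars be compared across countries.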

Visualise the data

And here we go! First, the summary bar chart showing the total number of decisions for the dozen or so countries in Europe receiving the most applications.

#-------------------------------------------------------------------------------------------------------
# Summary 1 - the total number of decisions by the top 12 countries of asylum and the others grouped together

# Maybe use a log scale here?
plot1 <- ggplot(dcvtTotal, 
    # Order the countries by the overall count; the ascending default plus coord_flip puts the largest at the top
    aes(x=fct_reorder(GeoName, CountForOrder), 
    y=Count,
    label=CountLabel)) +
  geom_bar(stat="identity") +
  # And this is to present the labels ...
  geom_text(hjust=-0.2, size = 4, colour="#505050") + 
  scale_y_continuous(limits=c(0,max(dcvtTotal$Count)*1.15)) +
  coord_flip() +
  labs(
    title="Decisions by country of asylum", 
    y="", 
    x="", 
    #    caption="Source: Eurostat",
    caption="",
    family=fontsForCharts) +
  # set a very minimal theme
  theme_minimal(base_family=fontsForCharts) +   
  # Tweak the axis text
  theme(
      axis.text.x=element_text(family=fontsForCharts, colour="#aaaaaa", size=10), 
      axis.text.y=element_text(family=fontsForCharts, size=12))

  
plot1

Not bad! We could probably lose the gridlines too without losing too much sleep.

Then here is the second chart with the multiples by country of origin

#-------------------------------------------------------------------------------------------------------
# Summary 2 - the total number of decisions by the top 12 countries of asylum and top 5 countries of origin
# Maybe use a log scale here?
plot2 <- ggplot(dcvtTotalCitizens, 
      # Order the countries by the overall count; the ascending default plus coord_flip puts the largest at the top
      aes(x=fct_reorder(GeoName, CountForOrder), 
      y=Count,
      label=CountLabel)) +
  geom_bar(stat="identity") +
  # And this is to present the labels ...
  geom_text(hjust=-0.2, size = 3, colour="#505050") + 
  coord_flip() +
  facet_wrap(~Citizen, labeller=as_labeller(LabelGetCoOName), ncol=5) +
  labs(
    title="Number of decisions by nationality", 
    y="", 
    x="", 
#    caption="Source: Eurostat",
    caption="",
    family=fontsForCharts) +
  # set a very minimal theme
  theme_minimal(base_family=fontsForCharts) +   
  # Tweak the axis text
  theme(
    axis.text.x=element_text(family=fontsForCharts, colour="#aaaaaa", size=7), 
    axis.text.y=element_text(family=fontsForCharts, size=12))

plot2

And our third chart with the protection rates shown with a multiple for the five most common countries of origin. See how the percentage labels are hidden for the smaller percentages so the chart does not get too messy - check the transform code above to see how we did that.

#-------------------------------------------------------------------------------------------------------
# Summary 3 - the % of each decision type by country and broken out by country of origin
pos <- position_fill(vjust=0.47)

plot3 <- ggplot(dcvDecisions,
    # The x axis is the names of the countries ordered by the overall count
    # (ascending, so that coord_flip puts the largest at the top)
    aes(x=fct_reorder(GeoName, CountForOrder),
    # and the y axis is the percentage based on the variable (with the zeros removed)
    y=Percent,
    # and the labels are the percentage label strings
    label=PercentLabel,    
    # and the fill is the decisions
    fill=DecisionTitle)) +
  # na.rm is a layer parameter rather than an aesthetic, so it is set on the geoms
  geom_bar(position=pos, stat="identity", na.rm=TRUE) +
  geom_text(position=pos, size = 3, colour="#ffffff", na.rm=TRUE) + 
  # Then set our colours and legend labels using the parameters of scale_fill_manual
  # note that we trim the colour list to skip the colour for the total count
  scale_fill_manual(values=as.vector(decisionList$DecisionLegend[2:5])) +
  # flip the coordinates
  coord_flip() +
  # set the labels
  labs(title="Type of decisions by nationality (%)", 
       y="", 
       x="", 
       fill="", 
       caption="Source: Eurostat", 
       family=fontsForCharts) +
  # set a very minimal theme
  theme_minimal(base_family=fontsForCharts) +   
  # These theme tweaks hide the x axis text and move the legend to the bottom - they need to go AFTER the call to theme_minimal
  theme(
    axis.text.x=element_blank(), 
    axis.text.y=element_text(family=fontsForCharts, size=12)) +
  theme(legend.position="bottom" ) +
  facet_wrap(. ~Citizen, labeller=as_labeller(LabelGetCoOName), ncol=5)  
  
plot3
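
The label hiding mentioned above comes down to a threshold test. Here is a minimal, self-contained sketch of the idea in base R, using the same 15% threshold as the transform code; the percentages are made up.

```r
# Made-up percentages for the segments of one stacked bar, including a missing value
percent <- c(60, 30, 10, NA)
threshold <- 15

# Label only the segments at or above the threshold; everything else gets an empty string
labels <- ifelse(!is.na(percent) & percent >= threshold,
                 paste0(percent, "%"),
                 "")

print(labels)
# "60%" "30%" ""    ""
```

Small segments still get drawn, they just carry no text, which keeps the narrow slices readable.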

And here is our slippy map. Note that we are using MapBox Studio to produce a nice clean basemap.

To do - find out why it is not appearing when uploaded to GitHub - following up!

#-------------------------------------------------------------------------------------------------------
dcvMap <- left_join(allCountries, countryCentroids, by=c("GeoIso"="country") )

# Scale the marker radius by the log of the count, so that the largest country gets a radius of 40 and the smaller ones stay visible
radiusFactor = (40)/log10(max(dcvMap$CountForOrder)) 

# The url to our published base map
mapBoxURL <- "https://api.mapbox.com/styles/v1/edgarscrase/cjl1c78tn37v72sofn80jo9en/tiles/256/{z}/{x}/{y}?access_token=pk.eyJ1IjoiZWRnYXJzY3Jhc2UiLCJhIjoiY2pram90c3M3MWRxdjNxcWhzOXRzY3N6ZCJ9._HQxSBcViAYVr9Bg1OWI_A"

# Try using leaflet but actually this is not going to look that clean!
m <- leaflet(dcvMap)  %>% 
  addTiles(urlTemplate = mapBoxURL,
           attribution="MapBox") %>%
  setView(5, 50, zoom = 4) %>% 
  addCircleMarkers(~longitude, ~latitude, popup=dcvMap$name, weight = 3, radius=round(radiusFactor * log10(dcvMap$CountForOrder),0), 
                   color="#e77b37", stroke = F, fillOpacity = 0.5) 


m
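
To get a feel for why the marker radius uses a log scale, this small sketch compares it with a plain linear scale. The counts are hypothetical and the variable names are just for illustration.

```r
# Hypothetical decision counts for a large, a medium and a small country
counts <- c(large = 200000, medium = 20000, small = 50)
maxRadius <- 40

# Log scaling, as in the map code above: the largest count gets the maximum radius
radiusFactor <- maxRadius / log10(max(counts))
logRadius <- round(radiusFactor * log10(counts), 0)

# Linear scaling for comparison: small countries all but disappear
linearRadius <- round(maxRadius * counts / max(counts), 0)

print(logRadius)
print(linearRadius)
```

One caveat is that log10(0) is -Inf, so a country with a count of zero would need to be filtered out or given a minimum radius before mapping.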

Now for the static dashboard - we want to save a snapshot of this map and then load that PNG file as another plot.

# Magic.  we got there finally - after 750 lines of code - lets take a screenshot and print out the map using the MapView library
mapshot(m,file="MapScreenShot.png")
# and then read it back in.
mapImage <- readPNG("MapScreenShot.png")

# Add the map image to another plot
plotMap <- qplot(1,1) + annotation_custom(rasterGrob(mapImage)) +
  labs(title="Number of decisions on asylum applications in Europe by country", 
     y="", 
     x="", 
     fill="", 
     caption="", 
     family=fontsForCharts) +
  # set a very minimal theme
  theme_minimal(base_family=fontsForCharts) +   
  # These theme tweaks hide the axes and the gridlines - they need to go AFTER the call to theme_minimal
  theme(
    axis.ticks = element_blank(),  
    axis.line = element_blank(),
    axis.text.x=element_blank(), 
    axis.text.y=element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank()
  )


plotMap

And then finally let's use the gridExtra library to arrange our charts into our dashboard. The code-generated versions below don’t look great, as the width and height have not been set correctly.

#-------------------------------------------------------------------------------------------------------
# Finally - output - lets bring it all together!

# Try summary chart and map and then the two detailed charts
# arrangeGrob builds the first row without drawing it, so only the final grid.arrange call draws
firstRow <- arrangeGrob(plot1, plotMap, ncol=2)

grid.arrange(firstRow, plot2, plot3, nrow=3)

Next steps

Next steps would include:

  1. Exploring the interactive charts in more detail. For example, some of the content in the charts at the bottom of the dashboard could also be included in a popup in leaflet too for each specific country.

  2. Adding percentages on the first chart so it is easy to see the % of decisions in Europe that were made in e.g. Germany.

  3. Making the code more flexible by using more functions to wrap repeated tasks. Then it would be easy to e.g. let the user specify how many countries of origin and / or asylum to show.
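
As a possible starting point for the second and third ideas, here is a sketch of a helper that keeps the n largest countries, groups the rest into "Other" and adds each row's share of the total. The function name and its arguments are hypothetical; the counts are the first five rows of the raw data shown earlier.

```r
# Hypothetical helper: keep the n largest rows, collapse the rest into "Other", add a % share
topNWithOther <- function(df, n, keyCol = "GeoIso", countCol = "Count") {
  df <- df[order(-df[[countCol]]), ]
  isTop <- seq_len(nrow(df)) <= n
  top <- df[isTop, c(keyCol, countCol)]
  rest <- df[!isTop, ]
  if (nrow(rest) > 0) {
    other <- data.frame("Other", sum(rest[[countCol]]), stringsAsFactors = FALSE)
    names(other) <- c(keyCol, countCol)
    top <- rbind(top, other)
  }
  # Share of all decisions, rounded to one decimal place
  top$Share <- round(100 * top[[countCol]] / sum(top[[countCol]]), 1)
  rownames(top) <- NULL
  top
}

# Total decisions for five countries, taken from the raw data table above
totals <- data.frame(
  GeoIso = c("BE", "BG", "CZ", "DK", "DE"),
  Count  = c(6110, 1365, 245, 2570, 211325),
  stringsAsFactors = FALSE
)

summaryTop2 <- topNWithOther(totals, 2)
print(summaryTop2)
```

With a helper like this, the hard-coded 12 in the transform code could become a parameter, and the Share column could feed the percentage labels on the first chart.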