Amid the steep decline in the number of refugees who have resettled in the United States, a group of researchers say an algorithm can make the process of resettlement more efficient by placing refugees in areas where they’re most likely to find work.
The algorithm was developed by researchers from Stanford University’s Immigration Policy Lab (IPL), ETH Zurich and Dartmouth College. Their recent study in the journal Science tested the algorithm on recent refugee data in the U.S. and Switzerland. Currently, refugees are assigned to locations based on local capacity in the U.S. and on proportional distribution across cantons in Switzerland.
The algorithm instead uses supervised machine learning, trained on historical data, to identify the location in the country where a given refugee is most likely to find work quickly, taking into account a range of refugee and location characteristics.
In the U.S. the model improved refugees’ probability of getting a job within 90 days – after which most government assistance stops – by between 25 and 50 percent, according to the study. In Switzerland, refugees’ probability of employment after three years improved by 73 percent.
We spoke to two of the study’s co-authors, Jens Hainmueller and Kirk Bansak, about the limits and opportunities of big data, measuring integration and reactions to the algorithm.
Refugees Deeply: How can an algorithm improve upon the judgments of refugee placement officers? Isn’t it common sense that French-speaking refugees are more likely to find employment in French-speaking cantons of Switzerland?
Jens Hainmueller: It’s a little bit more complicated than that, because as a placement officer you may have a lot of refugees coming in and need to make a lot of these decisions in short amounts of time. Then there are constraints you have – not every case can be sent to every location. These locations vary in terms of their characteristics, and the refugees vary in terms of their characteristics. There’s a gazillion different ways you could allocate refugees to the different locations.
Finding the one allocation that optimizes employment and at the same time satisfies the allocation constraints, that’s actually a very non-trivial kind of decision problem that’s very hard to do by trial and error. If you are doing this manually – let’s say with an Excel sheet – you very quickly get overwhelmed just by the sheer number of different ways to do the allocation. That’s where a machine-learning model can be very helpful to dive into this whole treasure trove of historical data and look for patterns that are very hard to detect otherwise.
The way we envision this working in the real world is a computer-assisted allocation, where the caseworker might have a sense of where they want to send the refugee based on their common sense and their expertise, and then the algorithm will make suggestions based on the historical data of the optimal location. The placement officer can override the suggested allocation of the algorithm or go with the algorithmic allocation. It basically gives them information to make better decisions.
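To make the allocation problem Hainmueller describes concrete, here is a minimal sketch in Python. It is not the researchers’ implementation: the case names, locations, capacities and predicted probabilities are all made up for illustration, and the brute-force search simply stands in for whatever optimization method a real system would use.

```python
from itertools import product

# Hypothetical predicted probabilities of early employment, of the kind a
# supervised model trained on historical data might produce (made-up numbers).
PRED = {
    "case_1": {"A": 0.50, "B": 0.30},
    "case_2": {"A": 0.40, "B": 0.10},
    "case_3": {"A": 0.20, "B": 0.25},
}
# Hypothetical capacity constraint: how many cases each location can take.
CAPACITY = {"A": 1, "B": 2}


def best_allocation(pred, capacity):
    """Try every assignment that respects capacity and keep the best one."""
    cases = list(pred)
    locations = list(capacity)
    best_plan, best_score = None, float("-inf")
    for choice in product(locations, repeat=len(cases)):
        if any(choice.count(loc) > capacity[loc] for loc in locations):
            continue  # this assignment violates a capacity constraint
        # Objective: expected number of refugees employed under this plan.
        score = sum(pred[c][loc] for c, loc in zip(cases, choice))
        if score > best_score:
            best_plan, best_score = dict(zip(cases, choice)), score
    return best_plan, best_score


plan, score = best_allocation(PRED, CAPACITY)
print(plan, round(score, 2))
```

In this toy example the single slot in location A goes to case_2, even though case_1 has the highest individual probability there, because case_2 would fare much worse in B. The number of candidate plans grows as locations to the power of cases, which is why checking them by hand, or by brute force as above, stops being feasible almost immediately.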
Kirk Bansak: If there were only two types of refugees, one that speaks language A and one that speaks language B, and there were only two different types of destinations, like a French-speaking canton and a German-speaking canton, no algorithm would be needed. But the reality is that there are many different locations. In Switzerland, you have 26 different cantons, and some of them are French-speaking, some of them are German-speaking and some of them speak both languages, while others are Italian-speaking.
And it’s not just language, it’s the local labor markets, and characteristics like different nationalities, age, gender, education level. You have interactions between variables that are discrete or continuous, and the discrete variables can have a ton of different levels. There’s no reason to expect that a single human or a set of humans sitting down with all those different variables, and 20–50 different possible locations, can handle that in a very efficient manner without computational assistance.
Hainmueller: That’s how we view the use of algorithms in general: This is not designed to replace what humans are doing, it’s really how the machine can help the human make better decisions.
Refugees Deeply: Would you expect any political backlash to introducing your algorithm over existing systems – like proportional allocation in Switzerland, which rests on the principle of fairness?
Hainmueller: The algorithm is very flexible, so if a policymaker wants to build specific constraints into the allocation – like proportional allocation to the cantons in Switzerland – we can program those in and the algorithm will try to find the best possible allocation given the constraints. The more constraints you pile on, the fewer gains you will get from the algorithm, but that’s ultimately something that has to be decided in the political process.
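The trade-off between constraints and gains can be illustrated with a small Python sketch. The numbers and the 50/50 quota are hypothetical, and the exhaustive search is only a stand-in for a real optimizer; the point is that the best achievable result under an added constraint can never exceed the unconstrained one.

```python
from itertools import product

# Hypothetical predicted employment probabilities for four cases in two
# locations, X and Y (made-up numbers).
PRED = [
    {"X": 0.6, "Y": 0.2},
    {"X": 0.5, "Y": 0.3},
    {"X": 0.7, "Y": 0.1},
    {"X": 0.4, "Y": 0.35},
]


def best_total(pred, allowed):
    """Best expected number employed over assignments that pass `allowed`."""
    locations = list(pred[0])
    totals = [
        sum(p[loc] for p, loc in zip(pred, choice))
        for choice in product(locations, repeat=len(pred))
        if allowed(choice)
    ]
    return max(totals)


def unconstrained(choice):
    return True


def proportional(choice):
    # Hypothetical quota: each location receives exactly half of the cases.
    return choice.count("X") == choice.count("Y")


free = best_total(PRED, unconstrained)
quota = best_total(PRED, proportional)
print(round(free, 2), round(quota, 2))
```

Here the unconstrained optimum sends everyone to X, while the quota forces two cases to Y and lowers the expected number employed. How much of that gain to give up for the sake of proportionality is the political decision Hainmueller refers to.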
Refugees Deeply: Have you seen any interest in piloting the algorithm as yet?
Hainmueller: We are actively in communication with resettlement agencies here in the U.S. and are planning a pilot to test the algorithm. We’re also in communication with the Swiss government on scoping out a potential pilot there.
Refugees Deeply: Could you talk about the significance of the algorithm yielding higher employment rates in areas of both high and low employment in the U.S.?
Bansak: We wanted to make sure that gains were being achieved on average and not just coming from one particular subset of refugees. So it was important that we did see consistent gains across virtually all locations and across all the different subsets. We want this algorithm to help out all refugees regardless of where they’re sent and what their individual characteristics might be, and it’s also important from a political perspective, for there to be buy-in from governments and other stakeholders.
Hainmueller: But also because of the flexibility of the algorithm, if you’re a policymaker and you do want to make sure that certain groups get help more, that’s something that you could also build into the algorithm, for example giving higher weighting to single moms.
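One way to picture such a weighting, sketched in Python under stated assumptions: each case’s predicted probability is multiplied by a policy weight before the plans are compared. The case labels, probabilities and weights below are invented for illustration and do not come from the study.

```python
from itertools import permutations

# Hypothetical predicted employment probabilities (made-up numbers).
PRED = {
    "single_parent": {"A": 0.45, "B": 0.25},
    "other": {"A": 0.60, "B": 0.35},
}
LOCATIONS = ["A", "B"]  # assumption: one open slot in each location


def best_plan(pred, weights):
    """Pick the one-to-one assignment with the highest weighted total."""
    cases = list(pred)

    def value(order):
        return sum(weights[c] * pred[c][loc] for c, loc in zip(cases, order))

    best = max(permutations(LOCATIONS, len(cases)), key=value)
    return dict(zip(cases, best))


equal = best_plan(PRED, {"single_parent": 1.0, "other": 1.0})
prioritized = best_plan(PRED, {"single_parent": 2.0, "other": 1.0})
print(equal)
print(prioritized)
```

With equal weights the slot in A goes to the case with the larger raw gain; doubling the weight on the single parent flips the assignment in that case’s favor. The size of any such weight is a policy choice, not something the data can supply.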
Refugees Deeply: How does this algorithm compare to matching algorithms, which take into account community and refugee preferences as well? What are the benefits and drawbacks of each approach?
Bansak: There are a couple of reasons why we went with our approach and not a preference-based matching framework. The first is that preference data don’t actually exist systematically. That could change in the future if governments decide to invest the money and interest into collecting those but for now, that’s a practical impediment.
The second arguable disadvantage is that preference-based matching only works under the assumption that preferences are well-informed. One would have to come up with a really good way of measuring informed preferences on the part of refugees. Right now, those data don’t exist and it’s unclear even if those data did exist, how reliable they would be.
So, with that in mind, we decided to take a different approach that is outcome-based: to figure out what outcome refugees and resettlement agencies care about. Early employment was clearly important from both perspectives. Instead of trying to match people with places that they think they want to live in, we try to match them to places where they can actually achieve better outcomes.
Hainmueller: Another concern that you might have about preference-based approaches is that once communities develop preferences over what types of refugees they want – “We only take Afghans” or “We don’t want any single males,” for example – you end up in an endless political bargaining situation that could be pretty detrimental. That’s another potential hurdle to implementing this in the real world, and our goal was to build something that you could implement within the existing institutional framework and just make it more efficient. It’s also not the case that you couldn’t combine these two approaches in a single model. If the preference data were to become available, you could include that information in the algorithm.
Refugees Deeply: Employment is just one measure of refugee integration. Do you have any concerns that the algorithm could risk obscuring other important integration metrics, such as social ties or political participation?
Bansak: That’s something that we definitely think about a lot. At the end of the day, the algorithm is designed to optimize on any variable or index of variables, so you could optimize political participation or a measure of happiness. The key constraint is that the data have to be measured. Given the fact that some things are not measured, it does make us wonder, at least theoretically, whether or not optimizing on early employment would somehow lead to perverse outcomes in other ways, and it’s something that we’d like to track. We can’t come up with any really plausible, good theoretical arguments that that would lead to systematically perverse outcomes. But if it were to become empirically apparent that this were happening then we’d have to rethink the way that this would be implemented.
Hainmueller: It would be great if you could optimize on a broader basket of outcomes, but it’s very hard to do right now. In the U.S., data collection has been really very limited – surprisingly, given that the U.S. has been running this refugee resettlement program for decades and we still don’t know much about how refugees fare beyond 90 days. In Switzerland, the data situation is better, so we can optimize on longer-term employment, for example. We do tend to find that early employment is quite predictive of longer-term employment, but it definitely shows the need to collect more data.
Refugees Deeply: Does shaping resettlement around the speed of entering employment risk reinforcing downward mobility? For example, by prioritizing refugees getting any kind of job fast rather than spending time adapting their skills and qualifications in order to maintain their professional level?
Hainmueller: There are some significant benefits to early employment in terms of creating social ties with natives, learning the language at the workplace and getting out of this situation of limbo that a lot of the refugees find themselves in. The empirical evidence is limited, especially in the U.S., but early employment does seem to be beneficial for better long-term integration.
This gets to a bigger question that we study here at the IPL about what the best model for refugee integration is. In the U.S., there’s a very heavy focus on early self-sufficiency. You get very limited support from the government, and then you’re on your own – a sink-or-swim model. That can be very beneficial, because you might land on your feet very quickly, but it can also be tough in the sense that you might be locked into a low-wage job for the rest of your life. Contrast that with what’s sometimes offered in Europe: “Let’s train refugees up and then have them enter into the labor market at a higher-skill job.” Which is better: the sink-or-swim model or the train-up-and-enter-later model? I think the jury is still pretty much out on that.
This interview has been edited for length and clarity.