Today something a bit different. We finally got the chance to speak with Lars Zeigermann, author of the Stata module for accessing our geocoding API. Since Lars published his module about a year ago, we have seen more and more academics from around the world using Stata to access our service.
1. Who are you and what is your interest in geodata?
Until very recently, I was a researcher at the Düsseldorf Institute for Competition Economics (DICE), studying competition in the German grocery retailing sector. Studies show, not surprisingly, that consumers shop in close proximity to their homes. While the big national chains have thousands of stores across the country and are available to practically everyone, smaller competitors are only present in some local markets. Although chains with fewer outlets may not seem very important from a national perspective, they often have regional strongholds.
Here, geodata comes into play: I use a comprehensive data set of German supermarket addresses to model the supplier structure specific to the locality of each consumer in my data. The results show that substitution patterns and competition between retail chains vary greatly between regions. Even retailers with only a small number of stores may exert substantial competitive pressure on national rivals in some local markets. Hence, competition authorities should carefully investigate local supplier structures when assessing mergers in the grocery retailing sector.
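(Editor's aside: to give a feel for what a "local supplier structure" can look like once addresses are geocoded, here is a minimal Python sketch. It is not Lars's actual model. The 5 km radius, the function names, and the example chains and coordinates are all assumptions made up for illustration. It simply counts, per chain, the stores within a fixed great-circle distance of a consumer.)

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def local_supplier_structure(consumer, stores, radius_km=5.0):
    """Count stores per chain within radius_km of a consumer.

    consumer: (lat, lon); stores: iterable of (chain, lat, lon).
    The radius is an illustrative assumption, not a value from the study.
    """
    counts = Counter()
    for chain, lat, lon in stores:
        if haversine_km(consumer[0], consumer[1], lat, lon) <= radius_km:
            counts[chain] += 1
    return counts


# Hypothetical data: one consumer in Düsseldorf, three made-up stores.
consumer = (51.2277, 6.7735)
stores = [
    ("ChainA", 51.2300, 6.7800),  # nearby
    ("ChainB", 51.2400, 6.7900),  # nearby
    ("ChainA", 50.9375, 6.9603),  # Cologne, well outside the radius
]
structure = local_supplier_structure(consumer, stores)
```

With this toy data, only the two Düsseldorf stores fall inside the radius, so a chain that is large nationally can still have a single outlet in a given consumer's local market.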
In my new position as an economic analyst, I work on a related project where geodata again plays a crucial role.
2. What prompted you to write your Stata module to access our API?
Well, when I started my research project I was looking for a geocoding routine for Stata (the software I use for my statistical analysis) that a) was easy to use and b) had terms of use allowing me to store the data.
While there are a number of alternatives available that satisfy the first requirement, none would allow me to store the data and use it in my statistical models. So I started looking around and came across OpenCage Data which was, thanks to its flexible terms of use, ideal for my purpose. Talking to colleagues and following discussions on the internet, I realized that a Stata module accessing OpenCage Data’s API could be useful to others.
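(Editor's aside: for readers curious what a wrapper like this does under the hood, here is a rough Python sketch of the two core steps: building a forward-geocoding request URL and pulling coordinates out of the JSON reply. It is not the Stata module's code. The helper names are hypothetical, the sample response is a hand-written, heavily trimmed illustration rather than real API output, and you should check the current API documentation for the exact parameters and response fields.)

```python
from urllib.parse import urlencode

# Forward-geocoding endpoint of the OpenCage API (JSON format).
OPENCAGE_ENDPOINT = "https://api.opencagedata.com/geocode/v1/json"


def build_geocode_url(address, api_key, country_code=None):
    """Build a request URL for one address (hypothetical helper)."""
    params = {"q": address, "key": api_key}
    if country_code:
        # Optional hint restricting results to one country, e.g. "de".
        params["countrycode"] = country_code
    return OPENCAGE_ENDPOINT + "?" + urlencode(params)


def extract_coordinates(response):
    """Return (lat, lng) of the first result, or None if there is none."""
    results = response.get("results", [])
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]


# Illustrative, trimmed-down response; a real reply has many more fields.
sample_response = {"results": [{"geometry": {"lat": 51.19, "lng": 6.79}}]}

url = build_geocode_url("Berlin Mitte", "YOUR-API-KEY")
coords = extract_coordinates(sample_response)
```

A wrapper would send the URL with whatever HTTP client the host environment offers, then store the returned coordinates alongside the original address.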
3. I have to admit Stata isn’t a language I know well. What are its strengths and weaknesses? What are the types of projects where it is a good fit?
Stata is a statistical software package used by researchers in the social sciences, but also in biomedicine and epidemiology (says Wikipedia). For any kind of statistical analysis in these fields, Stata provides many well-documented and easy-to-use commands and packages. Unfortunately, users don’t get access to built-in functions provided by StataCorp and hence cannot easily adjust them to their specific needs. This can be very frustrating. Also, Stata is not open-source and is rather expensive. For all those who do not have access to Stata (or need a more flexible environment) and are willing to acquire some basic programming skills, R is a great alternative. Fortunately, someone else made a wrapper for the OpenCage Data API available in R.
4. Any feedback on our service? What could we do better?
The OpenCage Geocoder does what it does very well. One feature you might consider adding in the future is batch geocoding. A second issue is misspelled addresses, which almost always yield no results. Many sources of address data contain typos that are sometimes hard to detect and correct. It would be extremely helpful if the OpenCage Geocoder were tolerant of minor spelling mistakes.
5. What’s the best way for people to give you feedback on the module or contribute fixes or enhancements?
By email at zeigermann @ dice.hhu.de. Suggestions or comments are highly appreciated!!
6. In 2014 OSM celebrated its 10th birthday. Where do you think the project will be in 10 years’ time?
I am relatively new to OSM and therefore cannot really say much about this. However, I can say that I am amazed by what is already possible today. I hope we will see more APIs like OpenCage Data’s in the future which open up the treasures of OSM to everyone.
Thanks Lars! For taking the time to talk with us, but also of course for making the module. Agree, it’s amazing what is possible with open data today. Also agree we need to get better at dealing with misspellings. It’s hard, but we’re working on it.
You can see all the Open Geo interviews here. Please let us know if your community would like to be part of our series. If you are, or know of, someone we should interview, please get in touch. We’re always looking to promote people doing interesting things with open geo data.