Hello everyone!
I am beginning this conversation with two goals in mind. First, I would like to share and discuss my new transformer, OpenAIGeographicCreator. Second, we at Safe would like to see how a post of the new type, “Conversation”, performs in the community. The idea is to expand conversations beyond the simple “question-answer” style and have meaningful discussions among multiple participants about problems, solutions, functionality and everything FME-related. Here, we could run our ideas by the whole FME community, get advice or inspiration, and share our successes and even failures. Together, we may be able to help.
So, the story of this transformer began with
I quickly made a working prototype and realized that OpenAI can really create geometries in the form of GeoJSON from a free-form text description of a location. For example, we can ask the API to give us a center point and a bounding box of Vancouver, or of a much smaller town in Finland called Porvoo, or of a ski resort on the North Shore of the Lower Mainland in British Columbia. I thought that would be it: a quick way to create an approximate geometry for any location on the planet without a geocoder. But then I tried more interesting inputs, and they worked, too. For example, the following text returns a box through which we see the words “Red Deer” on the background map.
the biggest city on a highway halfway between Calgary and Edmonton
I tried other languages (Київ), old names (Terijoki, Petrograd) - it all worked. This meant I couldn’t limit my output to geometry only - after all, the generated geometry needs to know the name of that city between Calgary and Edmonton. I added a few attributes, and OpenAI was happy to create the full GeoJSON for me. Now the output contains the name, the local name (if different from English), country, confidence and match level, notes about the location and AI decision making and even English pronunciation, quite a rich set!
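To make the shape of such an output concrete, here is a minimal Python sketch of how a returned feature might be checked. The property names, the helper functions missing_properties and bbox_center, and the sample feature itself are assumptions made for illustration: the transformer's real schema may differ, and the sample is hand-written with rough coordinates, not real API output.

```python
import json

# Hypothetical attribute names; the transformer's real schema may differ.
EXPECTED_PROPERTIES = (
    "name", "local_name", "country", "confidence",
    "match_level", "notes", "pronunciation",
)

def missing_properties(feature: dict) -> list[str]:
    """Return the expected attribute names absent from a GeoJSON Feature."""
    if feature.get("type") != "Feature":
        raise ValueError("not a GeoJSON Feature")
    props = feature.get("properties") or {}
    return [key for key in EXPECTED_PROPERTIES if key not in props]

def bbox_center(bbox: list[float]) -> tuple[float, float]:
    """Return (lon, lat) of the midpoint of a [west, south, east, north] box.

    Simple arithmetic midpoint; boxes crossing the antimeridian are not handled.
    """
    west, south, east, north = bbox
    return ((west + east) / 2, (south + north) / 2)

# Hand-written sample with rough coordinates, not real API output.
sample = json.loads("""
{
  "type": "Feature",
  "bbox": [-113.9, 52.2, -113.7, 52.3],
  "geometry": {"type": "Point", "coordinates": [-113.8, 52.25]},
  "properties": {
    "name": "Red Deer",
    "country": "Canada",
    "confidence": "High",
    "match_level": "City"
  }
}
""")

print(missing_properties(sample))   # attributes the sample leaves out
print(bbox_center(sample["bbox"]))  # approximate center of the box
```

A check like this could flag responses where the model silently drops an attribute, which is easy to miss when looking only at the geometry on a map.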
There was one thing I really struggled with. No matter how far I stretched my imagination in creating fictional street names in my own city (for example, Smaug the Dragon Alley, Port Moody, BC), I was getting a “Street” level match with “High” confidence. What was the problem? Well, AI does not think like we do. Combining statements, ideas and instructions in a different order may significantly influence the outcome.
In my original prompt I asked for a GeoJSON for a given location, explained the GeoJSON structure, and after that, I gave instructions for cases when a requested location is unknown (set match level to a higher level geography and set confidence to “medium”).
To get a more adequate result, I had to change the order of my instructions. Now I request a location and explain right away what to do if it is not found. Only then do I explain what the GeoJSON should look like, and that made all the difference. Smaug the Dragon Alley does not exist in my city anymore.
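The reordering described above can be sketched in Python as follows. This is only an illustration of the ordering idea: the function build_prompt and the wording of each instruction are hypothetical, not the actual prompt used inside the transformer.

```python
def build_prompt(location: str) -> str:
    """Assemble a prompt with the fallback rule placed before the format spec."""
    # 1. The request itself.
    request = f"Create a GeoJSON feature for this location: {location}"
    # 2. What to do when the location is unknown, stated right away.
    fallback = (
        "If the location cannot be found, set the match level to a "
        "higher-level geography and set confidence to \"medium\"."
    )
    # 3. Only then, the description of the expected output structure.
    format_spec = (
        "Return a single GeoJSON Feature with a geometry and a properties "
        "object holding the name, country, confidence and match level."
    )
    return "\n".join([request, fallback, format_spec])

prompt = build_prompt("Smaug the Dragon Alley, Port Moody, BC")
print(prompt)
```

Keeping the three parts as separate strings makes it easy to swap their order and compare results on the same set of test locations.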
An important part of creating a transformer is testing, and here I also used OpenAI to generate test sets for me. One example contained a very diverse set of geographies around the world. Here is a small snippet:
Machu Picchu, Cusco Region, Peru
Lake Baikal, Siberia, Russia
Niagara Falls, Ontario, Canada / New York, USA
Angel Falls, Canaima National Park, Venezuela
Salar de Uyuni, Potosi Department, Bolivia
Plitvice Lakes National Park, Croatia
Halong Bay, Quang Ninh Province, Vietnam
Serengeti National Park, Tanzania
Petra, Ma'an Governorate, Jordan
Santorini, Cyclades, Greece
Victoria Falls, Zambia/Zimbabwe
Bora Bora, French Polynesia
Aoraki Mount Cook, Canterbury, New Zealand
Bourbon Street, New Orleans, Louisiana, USA
Another dataset used crossword-like clues. Here is a small sample of this dataset:
The mountain range that separates Europe from Asia.
An island nation in the Pacific Ocean known for its unique wildlife, situated off the southeastern coast of Africa.
The highest mountain peak in North America.
The capital city located on the banks of the River Thames.
The smallest country in the world, surrounded entirely by Rome.
The largest coral reef system, located off the coast of Queensland, Australia.
A major city famously split by a canal connecting two major oceans, in Central America.
For both datasets, the API produced good results, although, as you may correctly notice, the AI was answering its own questions. This is a valid point, and it means that now it is your turn to try the OpenAIGeographicCreator and let the community know what you think. Is this a useful transformer? How can we make it better? Does it work for you on geographies you know? What other attributes would be useful in the output?
Let’s try to get this conversation going!