The University of Minnesota is no stranger to software innovation. Back before the World Wide Web took hold, the U’s invention of the Gopher protocol in 1991 allowed users to find and retrieve documents over the Internet in a more easily navigable format than its alternative, the File Transfer Protocol (FTP). More recently, U researchers teamed up with Google to pioneer next-generation smartphone mapping software that can draw 3-D maps just by scanning the interior of a building.
The trend continued in 2013 when Mohamed Mokbel, Ph.D., an associate professor of computer science and engineering with the U’s College of Science and Engineering, created a platform that could handle giant sets of spatial data more quickly and gracefully than anything that came before it. He distributed the software online under the name SpatialHadoop, allowing it to spread rapidly as an “open source” project, one that lets its users freely access and add to its code. Now the Eclipse Foundation, an international organization backed by industry leaders like IBM, Google and Oracle that encourages commercially friendly software development, has adopted the project under the name GeoJinni. Supported by the foundation’s resources, Mokbel’s system will be able to grow and develop in ways that would not otherwise have been possible.
“This software, born through university research, holds enormous potential in opening doors in a field that previously went underserved by the software available to it,” Mokbel said. “I am honored to have the interest of the Eclipse Foundation, whose efforts and resources can help this platform grow much faster. I am intrigued to see how users will apply it to fit their needs in business and in research.”
The software is based on Hadoop, a widely used software framework that links the processing power of networked computers to analyze sets of data too large for traditional spreadsheets or databases to handle. Sorting through all that data allows researchers and businesses to see new patterns that help them solve complex problems of all kinds, from designing city traffic infrastructure to finding new medical treatments to thwarting online fraud. The field continues to grow, with users finding new uses for such data systems all the time.
Using Mokbel’s system, Hadoop users can apply geographic filters for the first time. That means they can tell the computer to ignore irrelevant sections of the data, which dramatically reduces the time it takes to run an analysis. Imagine, for example, that scientists divided up the Earth’s surface into a grid, with each section covering about 300 square yards. Within each section, they measured the local temperature, humidity, vegetation cover and numerous other data points. Now imagine scientific instruments recorded the whole planet’s data in this manner twice a day for 50 years.
There would be an enormous amount of data available. But what if a researcher only wanted to analyze the portion that made up Minnesota? Mokbel’s software can limit the analysis to that region, while traditional big data systems would still need to scan through the whole world’s data for the same result. In terms of time, that could cut a half-hour process down to just a few seconds.
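For readers who want to see the idea in code, the following is a minimal Python sketch of region-based filtering, not SpatialHadoop’s or GeoJinni’s actual implementation or API. It assumes a simple uniform grid: each reading is filed under the grid cell that contains it, so a query for one region only has to open the cells that overlap that region. The names `GridIndex`, `range_query` and `full_scan`, along with the cell size, are hypothetical and chosen purely for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical names throughout; this is NOT the SpatialHadoop/GeoJinni API.
Point = Tuple[float, float, float]      # (longitude, latitude, measurement)
Box = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def in_box(p: Point, box: Box) -> bool:
    """Return True if the point lies inside the query rectangle."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= p[0] <= max_lon and min_lat <= p[1] <= max_lat


def full_scan(points: List[Point], box: Box) -> List[Point]:
    """Baseline: examine every record, wherever on Earth it was measured."""
    return [p for p in points if in_box(p, box)]


@dataclass
class GridIndex:
    """Files each point under the grid cell that contains it."""
    cell_size: float  # cell width and height in degrees (assumed value)
    cells: Dict[Tuple[int, int], List[Point]] = field(default_factory=dict)

    def _cell_of(self, lon: float, lat: float) -> Tuple[int, int]:
        return (math.floor(lon / self.cell_size),
                math.floor(lat / self.cell_size))

    def insert(self, p: Point) -> None:
        self.cells.setdefault(self._cell_of(p[0], p[1]), []).append(p)

    def range_query(self, box: Box) -> Tuple[List[Point], int]:
        """Return matching points and how many records were examined."""
        min_lon, min_lat, max_lon, max_lat = box
        x0, y0 = self._cell_of(min_lon, min_lat)
        x1, y1 = self._cell_of(max_lon, max_lat)
        hits: List[Point] = []
        scanned = 0
        # Visit only the cells that overlap the query rectangle.
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                for p in self.cells.get((cx, cy), []):
                    scanned += 1
                    if in_box(p, box):
                        hits.append(p)
        return hits, scanned
```

Given a rough bounding box for Minnesota (the coordinates would be the caller’s own approximation), `range_query` returns the same readings as `full_scan` while examining only the handful of cells around the state. In a distributed setting the same idea would presumably let whole blocks of data be skipped before any machine reads them, which is where the time savings described above would come from.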
Open-source development: A two-way street
Since Mokbel released the original version of SpatialHadoop back in March 2013, more than 80,000 users have downloaded his software from the original website. Many more users have obtained it from other users or separate sites, a practice open-source software encourages. As GeoJinni, the program will grow to reach even more users, allowing more programmers to contribute additions to the program’s code.
Before, Mokbel was limited by funding, which came partly from the U and partly from two National Science Foundation grants, and he lacked a programming team to review other users’ additions. Now, through the Eclipse Foundation, “two-way” development of GeoJinni can take off. Spatial data has a wide range of applications, and Mokbel anticipates the efforts will uncover new uses for the software that will benefit companies, government agencies and research institutions of all sizes. Over time, he expects SpatialHadoop and GeoJinni will evolve in different ways and focus on meeting different needs.
Meanwhile, he plans to continue developing SpatialHadoop specifically for the needs of research at the university. Mokbel is now in talks with various groups across the university, including the Minnesota Population Center, as to how SpatialHadoop may benefit them. In his own research, it’s already paying dividends.
“It helps me solve problems that I could not solve before,” he said.