
In this short discussion, I will recap the main points of Chapter 3, “Understanding reproducibility and replicability,” of the National Academies of Sciences, Engineering, and Medicine report Reproducibility and Replicability in Science (NASEM 2019), as well as Rey’s (2009) “Show me the code: Spatial analysis and open source.”

Additionally, I will reflect on the following, along with other considerations:

1) To what extent does open source GIS help solve the problems of the reproducibility crisis for geography? How?

2) Are there problems with reproducibility and replicability in geography that open source GIS cannot help solve?

There are more than 230 distinct scientific fields and subfields, each with its own highly specialized body of published literature, and many of them prioritize some form of statistical analysis. Growing support for the open source movement has challenged the traditional, slow-moving, corporate, and hierarchical way of “doing science.” Yet today’s shift towards open science is only the next step in a trend that was already under way (for example, the shift towards randomized experiments with masking and the introduction of rigid experimental and trial protocols in the 1970s). Overall, we are seeing a democratization of data and computation across all disciplines. Still, the pressure on researchers and scholars to publish in prestigious journals reinforces closed science.

First, I will define two important concepts in the domain of open source science:

Reproducibility = a second researcher recomputing the original results using the same data, code, and methods (transparency and reproducibility of computations)

Replicability = obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data (the researchers can be the same or different; what matters is that new data are collected)

Additionally, the National Academy of Sciences proposes the following four aspects of reproducible and replicable studies: 1) are the data laid out with sufficient transparency and clarity that the results can be checked? 2) if checked, do the data and analysis offered in support of the result actually support that result? 3) if the data and analysis are shown to support the original result, can the reported result be found again in the specific study context investigated? 4) can the reported result or the inference drawn be found again in a broader set of study contexts?
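To make the distinction between the two definitions concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything taken from the readings: `collect_sample` stands in for fieldwork or data collection, the "measurements" are simulated random numbers, and `analyze` is a deliberately trivial analysis (a mean).

```python
import random
import statistics


def collect_sample(seed, n=10_000):
    """Hypothetical stand-in for data collection: simulate n measurements."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def analyze(data):
    """The 'published' analysis: here, simply the mean of the measurements."""
    return statistics.fmean(data)


# The original study collects its data and reports a result.
original_data = collect_sample(seed=1)
original_result = analyze(original_data)

# Reproducibility: a second researcher reruns the same code on the same data
# and should recover exactly the same number.
reproduced_result = analyze(original_data)

# Replicability: a new study collects its own data to answer the same question
# and should find a consistent (not identical) result.
replication_data = collect_sample(seed=2)
replication_result = analyze(replication_data)

print("original:   ", original_result)
print("reproduced: ", reproduced_result)
print("replication:", replication_result)
```

The reproduced value matches the original exactly because nothing changed. The replication differs slightly because the data are new, so "consistent" has to be judged against some tolerance that the researchers agree on.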

Now that the conceptual groundwork is laid, I will turn to the questions at hand. Open source refers to the revolutionary collection of tools and processes through which individuals create, share, and apply new software and knowledge. Open source GIS, characterized by the collaborative development and sharing of software code, has become a beacon of hope for the geography community.

Let’s unravel how it positively addresses the reproducibility challenge:

Community Collaboration: At the heart of open source GIS lies a vibrant community of volunteers and enthusiasts. They not only contribute to the development of software but also nurture a culture of shared interests. This collaborative spirit ensures that the code continues to evolve, adapt, and stay relevant over time.

Continuous Feedback Loop: Open source GIS offers continuous feedback channels where users can report bugs, request features, and offer help to fellow users. This real-time engagement enhances the software’s robustness and usability.

Transparency: Perhaps one of the most significant advantages is the transparency of open source software. Users can delve into the source code to examine the precise implementation of spatial analytical methods. This transparency allows users to identify errors in algorithms and directly contribute to their improvement.

Accessibility: Open source GIS software is typically free, breaking down financial barriers. Students, in particular, benefit from the ability to use their own laptops for learning, freeing them from reliance on expensive lab computer licenses.

Learning and Innovation: Access to the source code empowers users to understand and modify GIS tools. This hands-on approach not only facilitates learning but also encourages innovation as users can enhance and release their modified code back to the community.

Are there problems with reproducibility and replicability in geography that open source GIS cannot help solve?

Challenges that Persist

While open source GIS offers a promising path towards reproducibility, it is not without its limitations. Let’s explore some of the persistent challenges:

Developer-Centric Nature: Open source projects often prioritize those with programming skills, inadvertently fostering technological elitism. This leaves non-technical users at a disadvantage.

Documentation Quality: Inadequate documentation is a barrier for both technical and non-technical users. Clear and comprehensive documentation is essential for effective use.

Frequent Updates: The dynamic nature of open source software can be a double-edged sword. While updates bring improvements, they can also be disruptive, especially for long-term research projects reliant on specific versions.

Quality Control: Multiple, often uncredentialed developers working on open source projects can raise concerns about security and quality control.

Dependence on Packages: Geography projects frequently rely on various packages and libraries. These dependencies can create compatibility issues and hinder reproducibility.

Ethical Concerns: Open source code can be used to create closed-source proprietary packages without proper attribution to the original authors, raising ethical questions.

Recognition and Funding: Developers of open source GIS often go unrecognized and underappreciated in academic circles. This lack of recognition can deter scholars from contributing to open source projects.

Peer Review Differences: Peer review in open source projects differs from traditional academic peer review, potentially affecting the credibility of research relying on open source tools.
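Two of the challenges above, frequent updates and dependence on packages, can be partly mitigated by recording the exact software environment alongside the results. The following is a minimal Python sketch of that idea, not a complete solution. The function names `snapshot_environment` and `find_mismatches` are hypothetical, and the package names in the usage lines are only examples of what a GIS analysis might depend on.

```python
import json
import platform
from importlib import metadata


def snapshot_environment(packages):
    """Record the Python version, platform, and installed version of each package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


def find_mismatches(recorded, current):
    """Return the names of packages whose version differs between two snapshots."""
    return sorted(
        name
        for name, version in recorded["packages"].items()
        if current["packages"].get(name) != version
    )


# Example package list (an assumption for illustration only).
snapshot = snapshot_environment(["geopandas", "shapely", "pyproj"])
print(json.dumps(snapshot, indent=2))
```

A snapshot like this could be saved next to a study's outputs. Someone attempting to reproduce the work later can then call `find_mismatches` on the saved snapshot and their own to see which versions have drifted before trusting a comparison of results.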

Conclusion

Open source GIS undeniably plays a pivotal role in addressing geography’s reproducibility crisis, fostering collaboration, promoting transparency, and making research tools accessible. However, it is essential to acknowledge its limitations and actively work towards overcoming them. As the geography community continues to navigate this terrain, the commitment to open source ideals and the recognition of its potential challenges will pave the way for more reproducible and robust research in the field.

REFERENCES

NASEM. 2019. Reproducibility and Replicability in Science. Washington, D.C.: National Academies Press. Chapter 3, Understanding reproducibility and replicability (pages 31–43). DOI: 10.17226/25303

Rey, S. J. 2009. Show me the code: Spatial analysis and open source. Journal of Geographical Systems 11 (2):191–207. DOI: 10.1007/s10109-009-0086-8
