Page 5 - i1052-5173-31-2
SOME GUIDING PRINCIPLES AND IDEAS

What are some of the needs to move forward with online information and preserve it through GSA? GSA is doing an excellent job of making its members’ work available. These contributions are at the heart of what the Society is about. We have established paths for open access for our journal articles. Along with the papers, though, we have to view the underlying data and observations as essential resources. It is being able to put our hands on the combination of ideas and the data presented by researchers that forms the infrastructure of much of modern science. We view this as a cyberinfrastructure that forms the highways for data and the onramps and offramps for ideas.

Let’s start by considering the primary data we collect, whether it is samples, maps, or measurements. Again, my discussion is partly but not wholly referenced to field data. A popular way of talking about data is asking whether it is FAIR: findable, accessible, interoperable, and reusable. These qualities and the FAIR principle have been fully articulated and written down only in the last five years (Wilkinson et al., 2016) but have guided much of the way we work with information for a long time.

I think of the idea of FAIR in somewhat different terms using statements developed during NSF-funded workshops on cyberinfrastructure and geoinformatics in the middle 2000s. Out of one workshop about evaluating a national geoinformatics community organization came the following statements about the scientific and public needs surrounding data and publications.

I can’t integrate what I can’t find;
I can’t use something I don’t understand;
I don’t want to use something I don’t trust;
I can’t use something that isn’t there anymore.

I think these statements give FAIR a more human or individual level to scientists and anyone wanting to read or understand or use scientific data. These statements also cover all cases of using cyberinfrastructure for research or teaching or self-education. They represent the concerns of the typical user.

What should GSA be doing to address these concerns and be FAIR? We should look at our current activities as a professional Society in light of the statements made above. We must also remember that a lot of our science starts with field data and products and builds from there.

1. I cannot find it. Making information findable is a fundamental goal. We need to ensure that search results are thorough and relevant and as complete as reasonable. GSA may not lead in this aspect, but we already organize data and maps, and we contribute directly and indirectly to indexing by GeoRef and Google. Along with other societies and organizations such as the USGS, GSA must continue to make our products organized and well described.

2. I cannot understand it. GSA can take the lead by bundling resources for teaching and research, along with all its data and information. Such activities in the past were singled out as education and outreach but should be integrated into publications and searches. This is an extra effort but can expand the reach of our scholarly products.

3. I do not trust it. GSA is in an excellent position to deal with trust because it is known for its peer review and publications. We cannot rest on these accomplishments but must build to the future with data-reporting standards with an eye for reusing data in the future. GSA should be a leader in setting community standards for data reporting. In that way, we serve all needs, and the GSA imprimatur assures the highest quality.

4. Data and information disappeared. This is always an area of worry that stems from the fact that no one wants to be responsible for keeping data in perpetuity (whatever that means). Some of the mechanics behind this should not be a concern. We accept on a daily basis that cloud technologies make it possible to preserve our very important information. There is also a worry about whether the data will be readable in the future. However, this seems less of a problem now that we have serialization protocols like JSON and GeoJSON that should be long-lived and easily parsed. We can deposit code and data structures in places like GitHub and schema.org.

However, geology is different from other sciences in that a critical component of our data is knowing why it was collected; we can call this the purpose. We collect data and make observations for some reason. Any specific purpose will mean that there will be some bias in data collection. For example, I was asked by a friend in grad school who was studying engineering whether I had a picture of jointed rocks. I did not remember at the time ever taking one, but told him I would look. It turns out that every picture I had taken as a geologist was of jointed rocks. GSA can take the role of understanding this observation and filtering bias. Considering a study’s purpose leads to a fifth statement:

I need to know why these data were collected.

Our activities for these five statements must not be limited to just field data and studies. GSA should be willing to take the lead in almost any area of geology. GSA has a scientific Division structure that is suited to this purpose. We have Divisions for structural geology and tectonics, geoinformatics, sedimentary geology, and geochronology, to name a few. These groups can and should take the leading roles on trust, understanding, and purpose. GSA can team with other organizations to make things findable and preserved.

What are the most problematic aspects of the process and these activities? The first is finding a way to maintain what you have, and we will call this sustainability. The second is knowing when you have done enough. GSA can play a pivotal role in addressing both issues.

Sustainability is a fundamental problem. As a learned society, how do we preserve our efforts and keep our data for a long time in the online world? This is not worrying about storage and retrieval minimized by cloud resources, standard protocols for electronic storage, and robust data structure formats, but is the process’s organizational oversight. Two things are necessary: Someone has to keep attentive to storing the information, and some group needs to ensure that the standards for data reporting change as the science and reasons for study change. GSA can take a leading role in these. We have published the Geological Society of America Bulletin for the past 133 years. Surely we can contemplate keeping electronic resources going for the next few decades. Just as we did not print our journals in-house, we will not store our information in-house but will work with experts in the field dedicated to this goal. GSA members organized under the scientific Divisions can keep up with cutting-edge data collection and reporting needs in different fields. This is essentially part of the peer-review process but could be taken on more fully and explicitly by the Society.

So who pays for storage and maintenance? Cost is always the most pressing question and one that has not found a good answer. At present, the National Science Foundation is funding research on preservation, interoperability, and community engagement in its EarthCube program and building cyberinfrastructure in computer sciences. While these programs foster cutting-edge scientific and engineering developments, they are not scalable or sustainable for
www.geosociety.org/gsatoday