2: An Introduction to Wikibase and Wikidata with Barbara Fischer and Sarah Hartmann

Nov 16, 2020 14:29 · 7387 words · 35 minute read

So greetings and welcome to our WikiCite discussion series, brought to you by the IFLA Wikidata Working Group with support from the WikiCite project and the Wikimedia Foundation. We're really pleased to come to you today, and we want to acknowledge and thank the WikiCite project and the Wikimedia Foundation for funds to support this work, as well as IFLA. The IFLA Wikidata Working Group exists to encourage and support the use and development of Wikidata and Wikibase by the professional library community, and this series is intended to let us learn about different projects in the library world using open bibliographic data and open citations. We're really pleased today to have with us two people who are leading a lot of work around Wikibase and authority data. With us today is Barbara Fischer. She's a humanist and arts manager.

01:20 - She works as a communication manager and liaison counsel at the Office for Library Standards at the German National Library, fostering cross-domain collaborations and cooperations concerning the GND, the common authority file for the German-speaking countries. Also with us today is Sarah Hartmann. She is a librarian and works at the Office for Library Standards at the German National Library. She is part of the team responsible for the authority file, the GND, which is used in libraries and numerous other institutions in the German-speaking countries. The title of their talk today is “GND meets Wikibase”, and I'm going to turn it over to them. Hello from our side, best greetings from Germany; at least we can say we are both in Germany, Sarah and me.

02:08 - Sarah is in the Frankfurt area at the moment and I'm in the Berlin area. We're very happy to be able to talk to you and give you some insights into the “GND meets Wikibase” project, which we have been running since the beginning of 2019. With this cooperation with Wikimedia Deutschland, we aim to make our free, structured authority data more easily accessible and interoperable through Wikibase. Moreover, we are testing the functionality of Wikibase as a toolkit for rule sets. Maybe you don't know the GND yet… I can't blame you. Not everybody knows the GND, but if you live in a German-speaking country and you work for a library, you surely know our authority file.

03:17 - At the moment it has 16 million identifiers that refer to persons and to names of persons. The latter we will delete shortly, so only eight million will be left. But beyond those persons, we have corporate bodies, conferences, geographical names, subject headings and work titles. It has been the authority file of librarians for a very long time, and it is now opening up towards other domains such as museums, archives and science institutions. It is cooperatively run by the GND agencies and has about 1,000 institutions as editors or active users.

04:08 - It is completely licensed under CC0 (Creative Commons Zero), which means you can really reuse it, and it has both an API and, of course, documentation on how to use it. At the moment, as I said, we are opening this handy tool of librarians up towards the GLAM field (galleries, libraries, archives and museums), towards science, and even others. There are some administrative authorities that are interested in our authority data in order to organise their data and make it more interoperable with others, which means we are looking at how to integrate these different institutions. This led us to Wikibase. For those who haven't heard about Wikibase yet: there are lots of wiki projects around the world, and Wikibase is an open-source product created on behalf of the Wikimedia Foundation, which runs Wikipedia, Wikidata, Wikisource and other Wikimedia projects.

05:22 - Wikibase is an extension to the well-known MediaWiki software, and it is developed by the staff of Wikimedia Deutschland. That extension, Wikibase, basically serves the needs of Wikidata, which is the structured data that empowers Wikipedia. We are now facing a development where it becomes a standardised product suiting the needs of other institutions or projects beyond Wikidata. This is why we stepped in with our evaluation process. As I said at the beginning, we have been running this project since the beginning of 2019.

06:18 - In the first year, we focussed on a proof of concept: whether the Wikibase software as such would be suitable for our needs. Now, in the second year, we are testing its capacity. If you have the opportunity, please follow the links in the presentation to our blog posts; you are kindly invited to read through them for further information. As I said, the proof of concept was basically our work in 2019, and we had three basic questions. One question was whether the Wikibase software is more convenient for collaborative production than the software we are actually using now.

07:16 - As I said at the beginning, the GND is already produced by lots of different institutions. They are all libraries and, of course, they have a rather common set of software. But opening up across domains means that we face lots of different software solutions out there, and we hope that using Wikibase will make it easier to organise that collaboration. We would also like to find out whether using Wikibase makes it easier to edit authority files: will it increase usability in comparison to the software we use? And the last question is more linked to the idea of linked open data and the semantic web.

08:15 - If we use Wikibase, will that give us access to other authority data and further structured data instances? Those were the three questions. Now I would like to hand over to my colleague Sarah to explain in more detail what we actually did. Thank you. Yeah, thanks. So what have we done so far in this proof of concept? First of all, we installed three GND Wikibase instances using Linux, Docker and Docker Compose. The first instance is for the data modelling that represents the actual GND or, as we sometimes call it, the GND status quo; this is also where we import existing GND data. The second instance is more for testing purposes.

09:13 - It's a kind of test system for data modelling issues and for the import of data. And the third instance is for testing the data modelling against the requirements coming from the GLAMs, so far especially from archives and museums. To give you an example: there is definitely a need to state the source for any statement, which is really difficult to realise in our current systems but can easily be fulfilled in Wikibase, because references are one of its key functions. What we did next was to define and create properties manually for the representation of the GND status quo, for example, and then add content via QuickStatements. For the GND status quo, the actual GND, we stuck to our existing data model and format where possible and reasonable, because we have in mind to import the data and to export it as well.
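As a small illustration of the QuickStatements step just mentioned: a QuickStatements batch (version 1 syntax, tab-separated) is a list of simple commands. This is only a hedged sketch; the property numbers P1 and P2 and the values are invented for illustration and are not the project's actual configuration:

```
CREATE
LAST	Lde	"Beispiel, Person"
LAST	P1	"118540238"
LAST	P2	Q42
```

`CREATE` makes a new item and `LAST` refers to it; `Lde` sets a German label, the quoted value feeds a string or external-identifier property (here P1 stands in for a hypothetical "GND ID" property), and the unquoted Q42 feeds an item-valued property pointing at an existing item of the instance.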

10:28 - That is, we want to be able to export the data established in Wikibase and synchronise it with different systems. We defined properties for descriptive metadata for different kinds of entities, such as persons, corporate bodies, conferences, works, subjects and geographic names. Additionally, we added some properties for administrative metadata, which are really necessary for some kinds of workflows within the GND. We also included mappings to our internal format, which is called Pica, and to our export format, MARC 21, as well as mappings to the GND Ontology. That's the vocabulary we use for the representation of the GND as linked data.

11:25 - We included some equivalent properties as well, such as RDA properties. Then we created some items, examples so to speak, in order to test the data model we defined, including items for controlled vocabularies, which are meant for use with properties whose data type is 'item', for example. Additionally, we tested, and are still testing, whether Wikibase is suitable for the documentation and curation of the GND data model and the rules we use, and for the maintenance of other vocabularies associated with the GND vocabulary. Then we imported a test set of approximately 19,000 existing GND data records into the Wikibase instance, and for that it was necessary to do a conceptual mapping first. We tested it in two different ways with two different mappings, one based on MARC 21 and one based on our internal format.

12:48 - This was in order to check what works best for mapping the data and for the import and export requirements. The tests haven't fully finished yet, but we can say at the moment that the internal format is sometimes easier to map to our defined properties; on the other hand, we can't easily share those mappings and data modelling decisions with the wider community, and that's probably a crucial point as well. Additionally, we had a closer look at the user and rights management within Wikibase, on a more conceptual level, I would say. We looked at whether it is possible to rebuild in Wikibase the user and access restrictions we have within the GND context.

13:54 - Now some technical issues we'd like to address. Our first approach for the import of the data was to use QuickStatements, and the second one was to use a Python script. That means we read the MARC data with pymarc, we changed and added some MARC fields and the properties, as you can imagine, and the writing is done via WikidataIntegrator. The first approach with QuickStatements took longer than the one with WDI, the WikidataIntegrator; WDI gave a speed-up of 44 percent, which is quite okay, we would say. If we use WikidataIntegrator for the import of the complete GND of, at the moment, approximately eight million records, the initial import will take around 15 days. In addition, we had to think about a plan for the import of the whole GND data, because it is highly interlinked already.
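Before coming to that interlinking problem, here is a rough sketch of the shape such a pymarc-plus-WikidataIntegrator pipeline could take. The endpoint URLs, property numbers and MARC fields are assumptions for illustration only, not the project's actual setup:

```python
# Sketch: read GND authority records from MARC 21 and create Wikibase items.
# The URLs and P1/P2 are hypothetical placeholders.
from pymarc import MARCReader
from wikidataintegrator import wdi_core, wdi_login

API_URL = "https://gnd.wikibase.example/w/api.php"    # hypothetical instance
SPARQL_URL = "https://gnd.wikibase.example/sparql"    # hypothetical endpoint

login = wdi_login.WDLogin(user="ImportBot", pwd="secret",
                          mediawiki_api_url=API_URL)

with open("gnd_test_set.mrc", "rb") as fh:
    for record in MARCReader(fh):
        gnd_id = record["024"]["a"]   # GND identifier (MARC 024 $a, assumed)
        name = record["100"]["a"]     # preferred name (MARC 100 $a, assumed)
        statements = [
            wdi_core.WDExternalID(value=gnd_id, prop_nr="P1"),
            wdi_core.WDString(value=name, prop_nr="P2"),
        ]
        item = wdi_core.WDItemEngine(data=statements,
                                     mediawiki_api_url=API_URL,
                                     sparql_endpoint_url=SPARQL_URL)
        item.set_label(name, lang="de")
        item.write(login)  # creates a new item when no existing one matches
```

In the project's measurements, a batch writer along these lines (WDI) was noticeably faster than the QuickStatements route, the 44 percent speed-up mentioned above.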

15:27 - Highly interlinked means that in a GND record there are links to other GND records. For example, an affiliation or a place of establishment is stated via an identifier pointing to another GND record, in this case the record for the corporate body or the place. And of course we would like to keep these links and rebuild them in Wikibase. So the plan is to do the import incrementally: first, initially create items, i.e. QIDs, for the GND IDs.

16:10 - Then produce a SPARQL output or another export that maps the QIDs to GND IDs, and then add the statements to these items using those QIDs. So what's the learning from this proof of concept so far? Wikibase has so far been used primarily for Wikidata or for the creation of new databases, but if you want to import existing, interlinked data, this requires some time and effort. In principle, one can say that the data modelling and the creation of data are really easy within Wikibase; I'll come back to that in a minute. And we definitely realised that, especially for import and export purposes, if one wants to synchronise the data in Wikibase with data in other systems, the data modelling definitely needs further testing.
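To give a flavour of the QID-to-GND-ID mapping step in the import plan above, here is a sketch under the same hypothetical assumptions as before (a local SPARQL endpoint and P1 as a "GND ID" property):

```python
# Sketch: build a GND-ID -> QID lookup table for the second import pass,
# in which the links between records are added as statements via QIDs.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://gnd.wikibase.example/sparql")  # hypothetical
sparql.setQuery("""
    PREFIX wdt: <https://gnd.wikibase.example/prop/direct/>
    SELECT ?item ?gndId WHERE { ?item wdt:P1 ?gndId . }
""")
sparql.setReturnFormat(JSON)

qid_for_gnd = {}
for row in sparql.query().convert()["results"]["bindings"]:
    qid = row["item"]["value"].rsplit("/", 1)[-1]  # ".../entity/Q123" -> "Q123"
    qid_for_gnd[row["gndId"]["value"]] = qid
```

With that table in hand, a statement such as an affiliation can be rewritten from a GND identifier to the corresponding QID before it is written to the Wikibase instance.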

17:21 - I just said that creating and updating data within Wikibase is really easy, and that's definitely true if you know the rules and document them within the instance for the users. But we realised that editing templates are necessary for current users and especially for new GND users. That's similar to our current systems, where there are templates showing which properties or attributes to add when establishing an authority record. And we realised that the creation of these editing templates is more complex than expected, but it can be done. Now I will hand over to Barbara again. Yeah, thank you, Sarah. Well, if you have questions at the moment we can take them, if you like, in case you haven't understood something.

18:32 - Basic questions like that. If not, I would just like to continue with what we are doing this year. All right, so this year we are testing the capacity of Wikibase as a software. As Sarah said before, Wikibase has until now not established something like a bulk-upload feature that suits various instances at the same time. Even when Wikipedia statements were integrated into Wikidata to a large extent, people had to have lots of patience and work long nights in order to upload, for instance, the headings of Wikipedia articles. Wikibase has been in use in various institutions and for various communities, but basically, until now, in order to build up a new database.

19:54 - But in our case, we already have a database, and that database is quite large: about eight million items. So this is a real challenge, not only for us but also for our partners at Wikimedia Deutschland, to really make it happen that we get all our data inside our Wikibase instance. That means not only the items but, of course, also the relations between those items. This is what we are working on right now. What we also face in this situation is that the exchange formats we use regularly in the librarian field are not standard formats in, let's say, the Wikimedia field, so you have to do some adaptations there as well.

21:03 - Sometimes it feels a little bit as if Brexit has already happened, you know, when we are back in the situation where the sockets of one country don't fit the plugs of the other. Lots of standardisation is necessary here in order to turn Wikibase into a good software platform for handling large amounts of structured data. What we are also working on this year is having our Wikibase instance, or the GND in our Wikibase instance, as a second home. We will keep the master of the GND in the CBS, where we have it right now.

21:59 - It stays in the software structure where we have it right now, because extracting it from there would mean changing so many of our workflows that it would not be worthwhile. So our idea is to have the Wikibase instance for some users alongside the master copy, in order to facilitate the revision of the authority file within our normal workflows as a library. The third focus will be, as Sarah mentioned, templates: we need interfaces where our users, our editors, will find it easier to edit the record of a person, a work title or a geographical name. Each of these entity types involves different properties, and in order not to confuse editors, we would really like to provide templates that only show the properties they have to fill in for a certain type of entity. This takes some time to develop, because the software does not provide a set of modular building blocks you can simply put together, as you may know from other database software; you still have to do a lot of coding yourself. Concerning user roles and access management: it is feasible, as we saw in the first year of our evaluation. We can do it.

23:55 - But of course, two different concepts meet each other here. The Wikimedia concept is that everybody can do everything, and only in a second step maybe not. In the library field it is just the other way around: only a few can do everything, and then it would be nice if more people could do more. So, you see, it's a different approach, and we are working on making it feasible for our specific needs.

24:34 - I would like to come back to the synchronisation problem, or the synchronisation idea, that we have. We have not yet been able to code all of that; right now we are looking for coders to help us here. But the idea is visible on this slide. On the right-hand side, you see our Wikibase instance, where the user, the editor, will be able to enter new entries, modify entries and get feedback on entries, and then the data is carried over to the left side, to our GND master instance.

25:32 - There it first arrives as a suggestion, which needs to be approved in order to be added to our GND master copy. So imagine you would like to edit an authority record on Stacey. You do that in the Wikibase instance and fill in all the fields needed to make it unambiguous, so we won't confuse Stacey with another Stacey who lives somewhere else, does something else and publishes something else; we will know it's Stacey Allison-Cassin and no other Stacey. Then the suggestion is transferred to the green part here and gets approved, and Stacey Allison-Cassin's GND record gets an identifier and becomes part of the GND master.

26:38 - This is a rather basic concept, but realising it technically is far more complicated, because those two software databases are not normally open to each other. That means we have to focus a lot on APIs, developing interfaces and making the system run as automated as possible. Coming to the next slide: the other point we are focussing on is the GND documentation. As experienced librarians, you all know that authority files rely heavily on standards, and those standards are defined by rules. Many of you may use RDA, or rule sets derived from RDA, and so do the German-speaking countries. A lot of this documentation, all these rules and application profiles, is dispersed across many different documents, which makes it a science of its own to be really sure that you are doing the right thing when you are editing an authority record.

28:27 - So what we are doing now, opening up for other domains, is bringing all that documentation, placed in different spaces and in different documents, together on one platform. As the documentation is used both for cataloguing and for creating authority files, we will also try to specify when there is actually a rule requiring you to create a new authority record and when it is more a matter of your cataloguing work. So we thought, and still think, that Wikibase might be a very suitable place to document everything that is needed to create authority files, and then to derive from that documentation instance the editing templates that will help us edit authority files in the different categories, such as persons, geographical names or corporate bodies.
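One way to picture such a per-entity-type template is as a simple lookup from entity type to the properties an editor should see. This is only a hedged sketch; the entity types come from the talk, but the property lists are invented for illustration:

```python
# Sketch: per-entity-type editing templates derived from the documentation.
# The property names are hypothetical placeholders, not the GND data model.
EDITING_TEMPLATES = {
    "person": ["preferred name", "date of birth", "affiliation"],
    "geographic_name": ["preferred name", "geo coordinates", "point or polygon"],
    "corporate_body": ["preferred name", "place of establishment"],
}

def properties_to_show(entity_type: str) -> list[str]:
    """Show editors only the properties relevant for this entity type."""
    return EDITING_TEMPLATES[entity_type]
```

The point of the template is exactly this reduction: an editor creating a geographical name sees geo coordinates, not date of birth.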

29:37 - We also want to use that documentation to derive exactly which statements, which kinds of properties, are needed when you create a new authority record. So if you are creating an authority record on, let's say, a geographical name, you won't have to bother about how to create a statement on a time span. You will know: okay, I need to deal with geo coordinates, and I will have to define whether it is just a point or a polygon, but I don't have to think about all the other properties as well. Sarah told me the other day that she has already developed more than 50 properties for some entity types in the GND, and there are far more to come. I think people can be really happy if they get a reduced set of properties within their Wikibase. So the idea is that, in the long run, you can access our Wikibase instance of the GND by entity type.

31:06 - That means by person, geographical name, corporate body or work title. But you can also go through the properties, such as date of birth or geo coordinates, or relations like "this person is married to that person", as you know from Wikidata, which gives you an incredible range of possible queries. You can combine different relation types, property types and entity types to create queries that give you results we would currently not be able to get within the GND. Next slide, maybe. As we have mentioned several times now, Wikibase is, somehow, I don't know if there is an idiomatic way to say it in English, but in German we would say it's still in its baby shoes. It is not yet a full commercial product.

32:24 - I mean, it will never be commercial because it's open source, but you understand what I mean. Wikimedia Deutschland is working on a governance model for that software product, to take it beyond a community-driven extension of MediaWiki. And when I speak of community-driven: right now the default community member is not an institution, the default is an individual, and working with individuals is quite different from working with institutions; it's not just the same. Throughout the year we are working closely together with Wikimedia, finding out together what that governance model means, and we are happy that they share their thoughts so openly with us and that we can tell them our requirements and our thoughts. What we also found is that it is really helpful to look out for other institutions that are interested in the use of Wikibase and would like to deepen the exchange of knowledge and skills in using Wikibase, or in applying Wikibase to their own needs.

33:55 - So here, too, we are very interested in getting feedback in order to find out what you are doing with the Wikibase software, because we feel we are somehow part of something greater. I mentioned before that, by using Wikibase, we see the opportunity to become part of a data ecosystem going far beyond libraries. It also goes beyond the idea of linking our data only to Wikidata: we want to link it to various different data hubs, not pretending to be able to suck everything into our own data hub, but rather building bridges from one data hub to the other, creating a very large but solid network of information throughout the world. I reckon there's still another slide. That's the slide saying thank you, and we are very, very delighted and curious to hear your remarks and questions, so please don't hesitate to come up with them.

35:20 - Great, so thank you very much for that presentation. There is a lot of really fascinating and exciting content. I'm speaking as a metadata geek, probably, but I think we're all in that place, so thank you. We do have a few questions we'd like to ask. I will start. Speaking to the world of professional librarians, people working in libraries around the world: what do you see as the area of greatest need? What would you like to say to people about where or how they can help with the project of bringing bibliographic data, and especially authority data, into this linked data ecosystem? Well, I don't know, Sarah, you will certainly want to add things to that, but I believe that, for one thing, librarians have been sharing metadata across their infrastructures for such a long time that they are far ahead in everything that concerns standards. That's something that museums and archives, because their working routines were different, have not developed yet, or not to that extent, let's say. That's one thing.

37:09 - There are a lot of standardisation skills within libraries that the GLAM field as a whole would welcome them sharing. The other part is that, in recent years, we have digitised lots and lots of content, and this content is now online. If you can't find it, it is as if it weren't there, so we need to provide tools, to provide the means to find this content. Of course, we could all rely on algorithms provided by some nice enterprises. But they will provide those services according to their business models, and if we would like an alternative to that, we need to be more independent, providing the means to interconnect our content and increase its retrieval beyond, or alongside, those algorithms.

38:27 - It's not so much about going against each other or being opponents; it is more about having an alternative way of finding things. I see that more and more GLAM institutions and more and more science institutions go for that, and they find that metadata is becoming something very useful to share. So they are looking around for people who have the skills to create metadata that you can share, and that's where librarians come in. I really hope that librarians are willing, and find it easy, to share their skills with the other GLAM folks.

39:21 - I would just like to add one thing about the algorithms and all the linked data representations out there: that's definitely something to improve, I would say. But at the moment I definitely see real potential within the Wikibase world to ease linking when establishing a record, an authority record so to speak, by giving you hints like "here is another representation of the entity you are just adding something to". That's probably a pain point in the systems we use at the moment. We can add other identifiers for the same entity, but it's not that easy; you often have to look in another system and then add the identifier, or copy and paste it, or something like that. Hopefully this could be easy to solve, so that cataloguers, to use the old term, or editors, can simply add links to other entities and to representations of the same entities as well.

40:46 - May I just add a specific point there: if you use an open-source software product instead of software run under licensing systems, it may be easier to build these links and those bridges, which in the long run will be more sustainable for our system as well. When we all started with the digital transformation, most people thought the best idea would be to have it all in one, right? Get it all into Wikidata, or get it all into Wikipedia or another system. But then you see that it's not only data, it's also persons. It is humans who create this data, and it's humans who care for that data.

41:57 - Humans are needed to increase the quality of the data, and humans need motivation to work. It's not only their paycheque that makes them work; it's also their love for the data they care for. But then they need to be able to identify with that data, and when the data hub gets too large, the distance between me and my data set gets too large as well. So it's probably better to have smaller data hubs where you, as a community, decide how the data is created and cared for, which rules apply to it and which properties you would like to have, instead of one huge "one size fits all" data hub. Here we really hope that using the open-source software Wikibase will make it easier to collaborate while still keeping our own space. Yeah, that's great, thank you.

43:13 - I think that's really wonderful to think about, and I think we've seen it in some of the other talks we've had too: it's data, but it is related to humans, to human culture and human relationships. So something to think about is the relationality of the data and what that means for us as caretakers, thinking about how we create this data in a lot of ways to make the world a better place, to bring more visibility to the things we care about. And something I think is really wonderful about the possibilities of this project is that, as cataloguers, and I'll even use the old terminology, when we create authority files or authority records we frequently add in lots and lots of content, which often doesn't get used. There's a lot of research that happens when we're creating entities in an original way around people or places or topics. So the ability to connect up some of those more granular pieces of data around, let's say, a person entity with an entity for the same person somewhere else, and then to be able to query that data to bring about really interesting pictures of some of those relationships, is really fascinating and wonderful, and I think very motivating.

44:46 - And so being able to bring it into an open-source system which allows bibliographic authority data to build those relationships in an international context is really exciting. I think it's a really great way of thinking about this idea of caring, or caretaking, for the data sets we produce, while also allowing for these interconnections with all different kinds of data stores beyond the library as well, which again is really exciting. So thank you for that elaboration. I'm going to turn it over to Miguel, who I think has a question. Yeah, I do. You mentioned in your presentation the problem of templates to describe data, right?

45:37 - So one of the things I was thinking about is this: you also spoke about the Wikibase governance, so we are also depending on the evolution of the software itself. My question is a very straightforward one: how can libraries, or librarians in this case, who have many years of experience with data modelling, help the Wikimedia projects to fit our problems? Because, as you mentioned, in the GLAM sector you probably have a lot of descriptions that are already structured, so you have to find out which of the many properties that exist on Wikidata, for example, you already had. So probably we would need to go to the coders and say: okay, I need, for example, a template or a model for a person, and these are the fields that are necessary to describe it, of course with the flexibility to add extra items.

46:49 - So my question is: how do you see the library community working together with the Wikimedia projects, mainly Wikidata and Wikibase? Shall I start? I do believe there are editing templates for Wikidata at the moment, but I would say they are kind of hard-coded and strictly tied to Wikidata, and not very open to other Wikibase instances. At the moment it would mean a lot of scripting to establish such templates. We heard, I think it was last week or so, that the Wikimedia Foundation established a subgroup of librarians. This would be a chance to discuss these issues and bring forward all the requirements we have.

48:03 - And I do believe it's on the roadmap for Wikimedia to ask for requirements, especially in this field of editing templates. I would also like to remark that the specific good thing about open source is that you can add code snippets to it. It's not like working with, say, SAP, where you have to approach SAP with a requirement procedure ("we would like to have this kind of alteration in the software code") and then maybe a few years later it is done. You can do it yourself. What you then need to find out is whether your extension of the extension, the template you build, will still work when there is a new version of the Wikibase software.

49:07 - So this is the specific interface where we will have to see how the Wikimedia governance works. But the good thing is that we can build our own templates as soon as we have the capacity to do so. I mean, I can't do that coding myself, and neither can Sarah, but the point is that we don't have to rely on the Wikimedia staff for it. It's not like sending a Christmas letter to Wikimedia saying "please give us a new template"; we can do it ourselves. What we are debating right now is how to make sure that the templates we create keep working with Wikibase as it gets further developed. That's one thing.

50:00 - The other thing is that there are lots of tools that make it easier to work with Wikidata, but they are specific to Wikidata. So now we are looking at that long list of tools, finding out which tools are applicable to Wikibase straight away, what is needed to make the others applicable, and which of those tools are the ones we would really love to have. And then again, I think creating this subgroup of librarians within Wikimedia also means not only hearing and listening to our requirements but also gathering skills: there are lots of people around the world who work on this topic, and in order not to have the same thing recreated over and over again, it is very good to have it gathered together in a structured way. So, yeah, it's a little bit like…

51:12 - Being a pioneer. If you start to work with Wikibase, it's a little bit like being a pioneer in a new field, which has many, many advantages because you're rather free, but also some disadvantages because you don't know what you will end up with. Thank you. That's great. I think, Joachim, you have a question next. Hi, my name is Joachim Neubert from ZBW in Germany. One question, perhaps related to the last one by Miguel: you mentioned entity types in the GND, like corporate bodies or work titles.

52:06 - Do you have some learnings on how to implement these entity types? And do you have plans for using the relatively new schema extension of Wikidata, which would perhaps allow you to describe the shape of each entity type and even to base some tooling on this shape description, as is done experimentally with Cradle, for example? That could perhaps make it easier to adapt these entity types to Wikidata, as well as to GND conventions. You're definitely right, Joachim. We haven't had the chance to take a deeper look into the schema extension so far in our proof of concept, but that's definitely on our agenda, for example, as you mentioned, with regard to the entity types and the constraints, so to speak. May I add something to that answer? I think one of the challenges in creating the linked open data ecosystem we talked about is finding a way of doing federation in a rather, let's say, smooth way. For an authority file, it is very hard to say that something is not exactly matching.

54:04 - Normally you create an authority file in order to be able to say "it's exactly that"; that's what authority files are for. But if you are going to link it to other data hubs, you might find that it is not that easy to link the semantics of one data hub to the exact semantics of the other. So we will have to find solutions for, let's say, a fuzzy way of matching those two or three or four or even more concepts. I think that is also valid for the data models: the schemas we find in Wikidata for defining categories may not be fully identical with the schemas we use within the GND, but we can say they fit each other more or less, and to balance this you need to be a little bit flexible, both technically and conceptually. And this is something we are working on. Sorry. I suppose there was a bit of a misunderstanding: my idea was not to use any of the existing schemas in Wikidata, which would make no sense because things are different conceptually, but to use the mechanism of schema building for describing the GND structures. I think that could be a potential tool to use in this way.

56:12 - Rather than currently just defining classes for the different entity types, it could be a good start for connecting more semantics to how these classes should be defined. As Sarah said, this is on our agenda; we will have to look deeper into it, and maybe we can take up the topic in a year or so and then give you some deeper insights. Great. So I think we just have time for one more question, so I'm going to turn it over to Carla to ask the last question of our session. Thank you so much for your presentation and talk. I belong to the scientific, academic community, because I work in a university, and, you know, there are a lot of issues there, because there is always the peer review, the data, etc.

57:32 - So I would like to have an insight from you, or a reply, on whether you are facing the same situation. This is an interesting meeting because, within the scientific community, they might say that you are an academic librarian, so this work will be too demanding and too much work, so we should leave it. Can you reply to that? Thank you. Sarah, would you like to start? Or do you want me to start? Go ahead. Okay. Right now, across Europe, you will find a lot of infrastructure being built for scientific research data.

58:26 - I believe that standardisation within the research field is even less developed than within the GLAM field as a whole. But we also see that research is done more and more across domains, across disciplines: it's not only the historian working with historic data, but also the other humanities and maybe even the life sciences that would like to share that data. And in order to be able to share that data, they need reference points where they can link it up. So I think that could be a very basic motivation for scientists, too, to include authority records in the metadata on their research data, in order to make it easier to link other data to it. And it's not only about linking data; it also reduces your workload, because if you use authority records, you don't have to repeat everything that is already within the authority record.

59:50 - That's why they were created in the first place. Librarians are very efficient workers; they established this idea of the authority record in order to reduce the workload. And this is something that scientists especially, who very often work on their own, like one-woman shows, should be keen to hear, and keen to learn how they could reduce their workload while gaining at the same time all the features of linked open data. So I think your question is going in that direction: how to motivate scientists to use authority records. There is a two-fold advantage. The first advantage is that you gain more connection and visibility for your data.

01:01 - And the second advantage is that you reduce your workload, because you don't have to add all the information that is already out there; you can reuse it. There might be a third one: by cooperating with others you make your data more standardised, following the standards, and thus make it easier for this data to be reused. So you live up to the FAIR data principles that are very often linked to the funding of research across Europe as well. I'd just like to add one short thing, I promise.

01:53 - You've probably heard of the activities going on right now within Germany: we had, and have, a project called ORCID DE for Germany. The approach there is to link the identifiers, ORCID and GND. It is possible, within the ORCID profile, to look up the GND and establish a linkage between ORCID and GND, and we also import the identifiers if we get them from scientific publishers. For example, we get ORCID identifiers, match them with GND identifiers, and establish a link between our bibliographic data and the authority records; I just wanted to mention that project. Well, I think that's a really wonderful place to end, considering this is a WikiCite project: thinking about those linkages between authority data created by libraries, data in ORCID, the data needed for citations, and the visibility of publications and research around the world.

03:14 - I mean, these are really important ways of leveraging and reusing the data that is created for all of these things, rather than having to recreate it. Of course, I do love efficiencies in metadata creation. So how do we avoid duplicating all of those efforts in different databases while also allowing for really important interlinking to make all of those things more visible? That's excellent. So we are at the end of our time, but thank you both very much, Sarah and Barbara, for joining us today and talking about the work that's happening. It's very, very exciting, and we really do look forward to hearing more about it in the future. So thank you. Oh, you're so welcome.

04:03 - We really enjoyed being with you and hearing all your questions. And please, when you see this recording, don't hesitate to contact us; reach out to us in order to link up. Thank you. Thank you.