I have an instance of Wikibase, and I am working on a project in which I study university professors and the courses they taught (19th-20th centuries).

I would like to be able to encode information such as:

prof_1 taught course_A from semester 1 of academic year 1890-1891 to semester 2 of academic year 1893-1894

My objective is to encode the data as directly and easily as possible so that I can build SPARQL queries to ask questions like

who taught course_A from semester 1 of 1891-1892 to semester 1 of 1892-1893?

I guess the natural way to encode that would be to use qualifiers, such as Wikidata's start time (P580) and end time (P582).

But here the "natural" chronological granularity is the academic semester, which is a priori not a "Point in time" value.

What is the best solution to encode this?

  1. creating an entity for each academic semester, and encoding start and end dates for each of them (which will be approximate, as we do not know them exactly), so that I can simply rely on the dates themselves, using properties like "start semester" and "end semester" (I guess this is the easy way, but I feel I'm getting around a problem that could be solved thanks to RDF's flexibility);
  2. finding a way to make academic semesters "Point in time" values, so that I can directly search in the "ordered" semesters (I don't even know if that makes sense);
  3. any other solution I have not thought of?

asked Dec 29, 2024 at 11:16 by Instanton

  • There exist also academic trimesters. – Stanislav Kralin, Dec 29, 2024 at 16:34
  • Indeed! The problem would be similar, it seems to me. – Instanton, Dec 29, 2024 at 18:18

1 Answer

One way to encode this is to think of the period abstractly: instead of encoding separate semesters or trimesters, encode the time period during which the person was teaching. If someone taught a subject every semester between 1890 and 1910, you would encode the start date and the end date of the period during which they taught that subject, along with a subject ID or code. So if you search for

who taught course_A from semester 1 of 1891-1892 to semester 1 of 1892-1893?

then you are effectively searching for an intersection between the teaching periods and the interval spanned by those semesters. Presuming, of course, that we are talking about the same subject, the two intervals intersect when the teaching period starts on or before the end of semester 1 1892-1893 and ends on or after the start of semester 1 1891-1892.
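The intersection test can be sketched in a few lines of Python. This is only an illustration: the `Period` type, the `overlaps` helper and all dates below are hypothetical, and the intervals are assumed to be closed (both end dates included).

```python
from datetime import date
from typing import NamedTuple


class Period(NamedTuple):
    """A closed date interval (hypothetical helper type)."""
    start: date
    end: date


def overlaps(a: Period, b: Period) -> bool:
    """True when the two closed intervals share at least one day."""
    return a.start <= b.end and b.start <= a.end


# Hypothetical dates: prof_1 taught course_A over this period.
teaching = Period(date(1890, 10, 1), date(1894, 7, 31))
# Start of semester 1 1891-1892 to end of semester 1 1892-1893 (assumed dates).
query = Period(date(1891, 10, 1), date(1893, 2, 28))

taught_in_range = overlaps(teaching, query)
```

In a SPARQL query the same condition would become a `FILTER` comparing the start and end qualifiers of the statement with the two boundaries of the search interval.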

This tells you who taught that subject at some point during this period, but you can of course vary the search.

You could search for an interval subset instead: if you only want to find teachers who taught the subject for the entirety of that period, check that the search interval is a subset of, or equal to, the teacher's teaching period.

Or, the other way around, you can search for teachers who did not teach outside the interval, that is, whose teaching period is a subset of the search interval.
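Both stricter searches reduce to a single containment check with the arguments swapped. A minimal sketch, again with a hypothetical `Period` type, a hypothetical `contains` helper and made-up dates:

```python
from datetime import date
from typing import NamedTuple


class Period(NamedTuple):
    """A closed date interval (hypothetical helper type)."""
    start: date
    end: date


def contains(outer: Period, inner: Period) -> bool:
    """True when `inner` lies entirely inside `outer` (equal bounds allowed)."""
    return outer.start <= inner.start and inner.end <= outer.end


teaching = Period(date(1890, 10, 1), date(1894, 7, 31))  # hypothetical
query = Period(date(1891, 10, 1), date(1893, 2, 28))     # hypothetical

# Taught for the entirety of the search interval:
# the search interval sits inside the teaching period.
taught_throughout = contains(teaching, query)

# Never taught outside the search interval:
# the teaching period sits inside the search interval.
taught_only_within = contains(query, teaching)
```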

Now, how do you define the start and end points of your intervals? This is the last question to solve. I would say that by default you use a fixed start date (like September 1st) for a semester, stored for every semester that could plausibly have started on that date when the actual start is not known; wherever you know the exact date, you can override it.
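The "default date with override" idea could look like the sketch below. The default month/day values, the `KNOWN_BOUNDS` entry and the function name are all assumptions made for illustration, not historical facts about any university.

```python
from datetime import date

# Assumed defaults as (month, day) pairs; adjust to the institution studied.
DEFAULT_BOUNDS = {
    1: ((9, 1), (1, 31)),   # semester 1: 1 Sep of the start year to 31 Jan of the next
    2: ((2, 1), (6, 30)),   # semester 2: 1 Feb to 30 Jun of the year after the start year
}

# Hypothetical override for a semester whose real dates are known.
KNOWN_BOUNDS = {
    (1891, 1): (date(1891, 10, 15), date(1892, 2, 20)),
}


def semester_bounds(start_year: int, semester: int) -> tuple[date, date]:
    """Return (start, end) of a semester of the academic year start_year/start_year+1."""
    if (start_year, semester) in KNOWN_BOUNDS:
        return KNOWN_BOUNDS[(start_year, semester)]
    (start_month, start_day), (end_month, end_day) = DEFAULT_BOUNDS[semester]
    # Semester 1 begins in the start year; semester 2 begins in the following year.
    first_year = start_year if semester == 1 else start_year + 1
    return (date(first_year, start_month, start_day),
            date(start_year + 1, end_month, end_day))
```

In Wikibase terms, the defaults would be the approximate dates stored on each semester item, and an override is simply a more precise value entered where the sources allow it.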

As for searching: if you search for a specific university's teachers, you can use the actual start and end dates of the semesters. If you search more broadly, across universities whose semesters may have started and ended at different times, you could define the concept of a teaching year, filled with data like 1890 (the start year of the teaching year), and a semester with values 1 and 2. You would then define functions that take the start year and the semester and work out, for each teaching period, whether it overlaps, contains, or is contained in the search period.
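One possible shape for such functions is to map each (start year, semester) pair to a single integer, so that semesters are ordered without any calendar dates. The names `semester_index` and `relation` and the labels returned are hypothetical; `per_year=3` would cover trimesters.

```python
Semester = tuple[int, int]  # (start year of the teaching year, semester number)


def semester_index(start_year: int, semester: int, per_year: int = 2) -> int:
    """Map a semester to an integer that increases chronologically."""
    return start_year * per_year + (semester - 1)


def relation(teaching: tuple[Semester, Semester],
             query: tuple[Semester, Semester]) -> str:
    """Classify how a teaching period relates to a search period.

    Both arguments are (first semester, last semester) pairs, bounds included.
    Identical periods are reported as "covers".
    """
    t0, t1 = semester_index(*teaching[0]), semester_index(*teaching[1])
    q0, q1 = semester_index(*query[0]), semester_index(*query[1])
    if t1 < q0 or q1 < t0:
        return "disjoint"
    if t0 <= q0 and q1 <= t1:
        return "covers"      # teaching spans the whole search period
    if q0 <= t0 and t1 <= q1:
        return "within"      # teaching stays inside the search period
    return "overlaps"        # partial intersection only


# prof_1: semester 1 of 1890-1891 to semester 2 of 1893-1894.
prof_1 = ((1890, 1), (1893, 2))
# Search: semester 1 of 1891-1892 to semester 1 of 1892-1893.
search = ((1891, 1), (1892, 1))
result = relation(prof_1, search)
```

Stored as a plain numeric qualifier, such an index could be compared directly in a SPARQL `FILTER`, which sidesteps the differing calendar dates between universities.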

Tags: date. Source: "How to smartly encode academic semesters in RDF/Wikibase", Stack Overflow