Issue
Problem description
I am currently working on a project which requires a relational database for storage. After thinking about the data and its relations for a while I ran into a quite repetitive problem:
I encountered a common data schema for entity A which contains some fields e.g. name, description, value. This entity is connected with entity B in multiple n-1 relations. So entity B has n entities A in relation rel1 and n entities A in relation rel2.
Now I am trying to break down this datamodel into a schema for a relational database (e.g. Postgres, MySQL). After some research, I have not really found "the best" solution for this particular problem.
Some similar questions I have found so far:
My ideas
So I have thought about possible solutions which I am going to present here:
1. Duplicate table
The relationship from entity B to entity A has a certain meaning to it. So it is possible to create multiple tables (1 per relationship). This would solve all immediate problems but essentially duplicate the tables which means that changes now have to be reflected to multiple tables (e.g. a new column).
2. Introduce a type column
Instead of multiple relationships, I could just say "Entity B is connected with n entity A". Additionally, I would add a type column that then tells me to which relation entity A belongs. I am not exactly sure how this is represented with common ORMs like Spring-Hibernate and if this introduces additional problems that I am currently unaware of.
3. Abstract the common attributes of entity A
Another option is to create a ADetails entity, which bundles all attributes of entity A. Then I would create two entities that represent each relationship and which are connected to the ADetails entity in a 1-to-1 relationship. This would solve the interpretation problem of the foreign key but might be too much overhead.
My Question
- In the context of a medium-large-sized project, are any of these solutions viable?
- Are there certain Cons that rule out one particular approach?
- Are there other (better) options I haven't thought about?
I appreciate any help on this matter.
Edit 1 - PPR (Person-Party-Role)
Thanks for the suggestion from AntC. PPR Description
I think the described situation matches my problem. Let's break it down:
Entity B is an event. There exists only one event for the given participants to make this easier. So the relationship from event to participant is 1-n.
Entity A can be described as Groups, People, Organization but given my situation they all have the same attributes. Hence, splitting them up into separate tables felt like the wrong idea.
To explain the situation with the class diagram: An Event (Entity B) has a collection of n Groups (Entity A), n People (Entity A) and n Organizations (Entity A).
If I understand correctly the suggestion is the following:
- In my case the relationship between Event and Participant is 1-n
- The RefRoles table represents the ParticipantType column that descibes to which relationship the Participant belongs (is it a customer or part of the service for the event for example)
- Because all my Groups, People and Organizations have the same attributes the only table required at this point is the Participant table
- If there are individual attributes in the future I would introduce a new table (e.g. People) that references the Participant in a 1-1 relationship.
- If there are multiple tables going to be added, the foreign key of the multiple 1-1 relationship is mutually exclusive (so there can only be one Group/Person/Organization for a participant)
Solution suggested by AntC and Christian Beikov
Splitting up the tables does make sense while keeping the common attributes in one table. At the moment there are no individual attributes but the type column is not required anymore because the foreign keys can be used to see which relationship the entity belongs to.
I have created a small example for this:
- There exist 3 types (previously type column) of people for an event: Staff, VIP, Visitor
- The common attributes are mapped in a 1-1-relationship to the person table.
- To make it simple: Each Person (Staff, VIP, Visitor) can only participate in one event. (Would be n-m-relationship in a more advanced example)
The database schema would be the following:
This approach is better than the type column in my opinion. It also solves having to interprete the entity based on its type in the application later on. It is also possible to resolve a type column in an ORM (see this question) but this approach avoids the struggle if the ORM you are using does not support resolving it.
Solution
IMO since you already use dedicated terms for these objects, they probably will diverge and splitting up a table afterwards is quite some work, also on the code side, so I would suggest you map dedicated entities/tables from the beginning.
Answered By - Christian Beikov