Issue
I'm looking for the fastest way to merge two unsorted collections based on a common id key.
Below is my current O(N^2) implementation:
    for (Person per : pers) {
        for (Data data : datas) {
            if (per.getId().equals(data.getId())) {
                per.getData().add(data);
            }
        }
    }
I'm looking for the fastest possible way (with the lowest possible memory footprint) to achieve this result, ideally O(N). Duplicates should be removed from per.getData(). For now, per.getData() is a HashSet.
Any idea how this could be optimized? I'm using Java 11.
Solution
Do one pass over the persons to collect them into a map for later O(1) lookup, then do one pass over the data, adding each item to its person:
    Map<Object, Person> people = pers.stream()
        .collect(Collectors.toMap(Person::getId, p -> p));
    // Assumes every data item has a matching person; otherwise get() returns null.
    datas.forEach(d -> people.get(d.getId()).getData().add(d));
If it's possible for a data item to have no matching person, filter out the unmatched data:
    datas.stream()
        .filter(d -> people.containsKey(d.getId()))
        .forEach(d -> people.get(d.getId()).getData().add(d));
Both ways are O(m+n) (m people, n datas), because hash map lookups and insertions are O(1) on average.
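For reference, here is a minimal self-contained sketch of the map-based merge that runs on Java 11. The Person and Data classes are hypothetical stand-ins for the ones in the question (the Long id and the value field are assumptions), and MergeById and merge are names made up for this example. It uses a single get() with a null check rather than containsKey() followed by get(), which skips unmatched data with one lookup per item. Note that Collectors.toMap throws an IllegalStateException if two persons share the same id.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class MergeById {

    // Hypothetical stand-in for the question's Data class.
    static class Data {
        private final Long id;
        private final String value;

        Data(Long id, String value) {
            this.id = id;
            this.value = value;
        }

        Long getId() { return id; }
        String getValue() { return value; }
    }

    // Hypothetical stand-in for the question's Person class.
    static class Person {
        private final Long id;
        private final Set<Data> data = new HashSet<>();

        Person(Long id) { this.id = id; }

        Long getId() { return id; }
        Set<Data> getData() { return data; }
    }

    // One pass over pers to build the index, one pass over datas to attach.
    static void merge(List<Person> pers, List<Data> datas) {
        Map<Long, Person> people = pers.stream()
                .collect(Collectors.toMap(Person::getId, Function.identity()));
        for (Data d : datas) {
            Person p = people.get(d.getId()); // null when no person has this id
            if (p != null) {
                p.getData().add(d);
            }
        }
    }

    public static void main(String[] args) {
        List<Person> pers = List.of(new Person(1L), new Person(2L));
        List<Data> datas = List.of(
                new Data(1L, "a"),
                new Data(2L, "b"),
                new Data(1L, "c"),
                new Data(3L, "orphan")); // no person with id 3, so it is skipped
        merge(pers, datas);
        for (Person p : pers) {
            System.out.println(p.getId() + " -> " + p.getData().size());
        }
        // prints "1 -> 2" then "2 -> 1"
    }
}
```

The extra memory is one map entry per person, so the footprint grows with m only; the data items themselves are not copied.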
You mentioned that duplicates should be removed from each person's data. Since per.getData() is a HashSet (or any kind of Set), duplicates are removed automatically, provided equals() and hashCode() are implemented properly for Data.
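As an illustration of what that could look like, here is a sketch of a Data class with equals() and hashCode() overridden. Which fields define equality is an assumption here: the question does not show Data, so the id and value fields and the DataEqualityDemo wrapper are hypothetical. The important part is that both methods use the same fields.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class DataEqualityDemo {

    // Hypothetical Data class; the fields used for equality are an assumption.
    static final class Data {
        private final Long id;
        private final String value;

        Data(Long id, String value) {
            this.id = id;
            this.value = value;
        }

        Long getId() { return id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Data)) return false;
            Data other = (Data) o;
            return Objects.equals(id, other.id)
                    && Objects.equals(value, other.value);
        }

        @Override
        public int hashCode() {
            // Must be computed from the same fields that equals() compares.
            return Objects.hash(id, value);
        }
    }

    public static void main(String[] args) {
        Set<Data> set = new HashSet<>();
        set.add(new Data(1L, "a"));
        set.add(new Data(1L, "a")); // equal to the first, so not added again
        set.add(new Data(1L, "b")); // same id, different value: kept
        System.out.println(set.size()); // prints 2
    }
}
```

Without these overrides, Data inherits identity-based equality from Object, and two separately constructed items with the same contents would both stay in the set.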
Answered By - Bohemian
Answer Checked By - Dawn Plyler (JavaFixing Volunteer)