Issue
I have a CSV file as input, with values like:
99,TEST_ABDC,AB01,0000001
99,TEST_ABDC,AB01,0000002
99,TEST_ABDC,AB01,0000003
99,TEST_ABDC,AB01,0000004
99,TEST_ABDC,AB01,0000005
01,TEST_ABDC,AB01,0000006
01,TEST_ABDC,AB01,0000007
02,TEST_ABDC,AB01,0000007
The file name looks like: AB01_TEST_ABDC_YYYYMMDD.csv
I successfully read the file and put it into a list of Bar, like:
[Bar[99,TEST_ABDC,AB01,0000001],
Bar[99,TEST_ABDC,AB01,0000002],
Bar[99,TEST_ABDC,AB01,0000003],
Bar[99,TEST_ABDC,AB01,0000004],
Bar[99,TEST_ABDC,AB01,0000005],
Bar[01,TEST_ABDC,AB01,0000006],
Bar[01,TEST_ABDC,AB01,0000007],
Bar[02,TEST_ABDC,AB01,0000007]]
and I need to put them into a map like:
baz = HashMap<Foo, List<Bar>>
Bar is the bean for one line of the CSV, and Foo is a bean created from some elements of Bar.
So far, I have managed to split the list into a map keyed by the value of the first CSV column. The output is:
Foo[99, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[99,TEST_ABDC,AB01,0000001], Bar[99,TEST_ABDC,AB01,0000002], Bar[99,TEST_ABDC,AB01,0000003], Bar[99,TEST_ABDC,AB01,0000004], Bar[99,TEST_ABDC,AB01,0000005]]
Foo[01, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[01,TEST_ABDC,AB01,0000006], Bar[01,TEST_ABDC,AB01,0000007]]
Foo[02, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[02,TEST_ABDC,AB01,0000007]]
Here is how I do it:
baz = listBar.stream().distinct()
.collect(Collectors.groupingBy(b -> b.getFooFromBar(fileAbsolutePath)));
However, baz must be split by the first column of the CSV and additionally by a running chunk index within each first-column value, because the list for any key must not contain more than N elements (N = 2 in the example below):
Foo[99, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[99,TEST_ABDC,AB01,0000001], Bar[99,TEST_ABDC,AB01,0000002]]
Foo[99, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 1]=[Bar[99,TEST_ABDC,AB01,0000003], Bar[99,TEST_ABDC,AB01,0000004]]
Foo[99, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 2]=[Bar[99,TEST_ABDC,AB01,0000005]]
Foo[01, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[01,TEST_ABDC,AB01,0000006], Bar[01,TEST_ABDC,AB01,0000007]]
Foo[02, TEST_ABDC, AB01_TEST_ABDC_YYYYMMDD.csv, 0]=[Bar[02,TEST_ABDC,AB01,0000007]]
I need to do it with Java streams.
I need to split the list into at most N elements per key. I have a huge amount of data (something like 10000 k+), and it must be split into lists of, for example, 2k elements each.
How can I do it, please?
Solution
I would suggest first chunking your lists into sublists of the desired length, which gives you a Map<Foo,List<List<Bar>>>. To do so, just add the following method to your main class:
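// Splits a list into sublists of at most n elements each.
// The shared counter assumes the stream is processed sequentially.
static <T> List<List<T>> chunk(List<T> list, int n) {
    final AtomicInteger counter = new AtomicInteger();
    return new ArrayList<>(list.stream()
            .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / n))
            .values());
}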
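The counter-based grouping above depends on a mutable counter inside the lambda, so it is only suitable for a sequential stream. As an alternative sketch (not part of the original answer), the same chunking can be done by index with subList, which keeps the lambda stateless. The class name ChunkSketch is purely illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ChunkSketch {

    // Splits list into consecutive slices of at most n elements, preserving order.
    static <T> List<List<T>> chunk(List<T> list, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        int chunkCount = (list.size() + n - 1) / n; // ceiling division
        return IntStream.range(0, chunkCount)
                // each slice is a view over [i * n, min(size, (i + 1) * n))
                .mapToObj(i -> list.subList(i * n, Math.min(list.size(), (i + 1) * n)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(chunk(List.of(1, 2, 3, 4, 5), 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```

Note that subList returns views backed by the original list, so wrap each slice in new ArrayList<>(...) if the source list may be modified afterwards.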
Then add another constructor to your Foo class, which accepts a Foo and an int and creates a new Foo by copying the first three fields and assigning the new parameter to the fourth field:
public Foo(final Foo foo, int field4) {
    this.field1 = foo.field1;
    this.field2 = foo.field2;
    this.field3 = foo.field3;
    this.field4 = field4;
}
Now you can use the above to stream over your baz map, chunk the values into sublists, create a new key-value pair for each sublist with flatMap, and finally collect the pairs into a map. Example:
import java.io.IOException;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import lombok.AllArgsConstructor;
import lombok.Getter;

public class Example {

    public static void main(String[] args) throws IOException {
        // Sample input: the map as it looks after grouping by the first CSV column
        Map<Foo,List<Bar>> baz =
                Map.of(new Foo("99", "TEST_ABDC", "AB01_TEST_ABDC_YYYYMMDD.csv", 0),
                       List.of(new Bar("99","TEST_ABDC","AB01","0000001"),
                               new Bar("99","TEST_ABDC","AB01","0000002"),
                               new Bar("99","TEST_ABDC","AB01","0000003"),
                               new Bar("99","TEST_ABDC","AB01","0000004"),
                               new Bar("99","TEST_ABDC","AB01","0000005")),
                       new Foo("01", "TEST_ABDC", "AB01_TEST_ABDC_YYYYMMDD.csv", 0),
                       List.of(new Bar("01","TEST_ABDC","AB01","0000006"),
                               new Bar("01","TEST_ABDC","AB01","0000007")),
                       new Foo("02", "TEST_ABDC", "AB01_TEST_ABDC_YYYYMMDD.csv", 0),
                       List.of(new Bar("02","TEST_ABDC","AB01","0000007")));

        Map<Foo,List<Bar>> result =
                baz.entrySet()
                   .stream()
                   // Step 1: Map<Foo, List<List<Bar>>> with at most 2 elements per sublist
                   .collect(Collectors.toMap(Map.Entry::getKey, e -> chunk(e.getValue(),2)))
                   .entrySet()
                   .stream()
                   // Step 2: one new entry per sublist, keyed by a copy of Foo carrying the chunk index
                   .flatMap(e -> IntStream.range(0, e.getValue().size())
                                          .mapToObj(i -> new AbstractMap.SimpleEntry<>(new Foo(e.getKey(),i),e.getValue().get(i))))
                   .collect(Collectors.toMap(Map.Entry::getKey,Map.Entry::getValue));

        // The result is a HashMap, so the print order of the entries is not guaranteed
        result.entrySet().forEach(System.out::println);
    }

    // Splits a list into sublists of at most n elements each (sequential stream assumed)
    static <T> List<List<T>> chunk(List<T> list, int n) {
        final AtomicInteger counter = new AtomicInteger();
        return new ArrayList<>(list.stream()
                .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / n))
                .values());
    }

    // Note: this Foo defines no equals/hashCode, so keys are compared by identity
    @AllArgsConstructor
    @Getter
    static class Foo {
        String field1;
        String field2;
        String field3;
        int field4;

        // Copies the first three fields and sets the chunk index as the fourth
        public Foo(final Foo foo, int field4) {
            this.field1 = foo.field1;
            this.field2 = foo.field2;
            this.field3 = foo.field3;
            this.field4 = field4;
        }

        @Override
        public String toString() {
            return "Foo[" + field1 + ", " + field2 + ", " + field3 + ", " + field4 + ']';
        }
    }

    @AllArgsConstructor
    @Getter
    static class Bar {
        String field1;
        String field2;
        String field3;
        String field4;

        @Override
        public String toString() {
            return "Bar[" + field1 + ", " + field2 + ", " + field3 + ", " + field4 + ']';
        }
    }
}
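For comparison, here is a Lombok-free sketch that goes from the flat List<Bar> straight to the chunked map, assuming Java 16+ for records. The names SplitSketch, split, sample and the record component names (code, name, origin, id, fileName, index) are hypothetical stand-ins for the beans in the question, and the file name is simply passed in as a string. Records provide equals and hashCode, so the resulting keys can be looked up by value.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SplitSketch {

    // Hypothetical stand-ins for the question's beans
    record Bar(String code, String name, String origin, String id) {}
    record Foo(String code, String name, String fileName, int index) {}

    static Map<Foo, List<Bar>> split(List<Bar> bars, String fileName, int n) {
        // Step 1: group rows by the first column, keeping encounter order
        Map<String, List<Bar>> byCode = bars.stream()
                .collect(Collectors.groupingBy(Bar::code, LinkedHashMap::new, Collectors.toList()));

        // Step 2: cut each group into slices of at most n rows, one Foo key per slice
        return byCode.values().stream()
                .flatMap(group -> IntStream.range(0, (group.size() + n - 1) / n)
                        .mapToObj(i -> Map.entry(
                                new Foo(group.get(0).code(), group.get(0).name(), fileName, i),
                                group.subList(i * n, Math.min(group.size(), (i + 1) * n)))))
                // keys are unique per (code, index), so the merge function is never invoked
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    // The eight rows from the question
    static List<Bar> sample() {
        return List.of(
                new Bar("99", "TEST_ABDC", "AB01", "0000001"),
                new Bar("99", "TEST_ABDC", "AB01", "0000002"),
                new Bar("99", "TEST_ABDC", "AB01", "0000003"),
                new Bar("99", "TEST_ABDC", "AB01", "0000004"),
                new Bar("99", "TEST_ABDC", "AB01", "0000005"),
                new Bar("01", "TEST_ABDC", "AB01", "0000006"),
                new Bar("01", "TEST_ABDC", "AB01", "0000007"),
                new Bar("02", "TEST_ABDC", "AB01", "0000007"));
    }

    public static void main(String[] args) {
        split(sample(), "AB01_TEST_ABDC_YYYYMMDD.csv", 2)
                .forEach((key, rows) -> System.out.println(key + "=" + rows));
    }
}
```

With the sample rows and n = 2, this produces five entries: three for code 99 (indexes 0 to 2) and one each for 01 and 02. Whether this fits the real Foo depends on how getFooFromBar builds its key, which the question does not show.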
Answered By - Eritrean