Is the following behavior possible in AWS Glue? I am trying to create a single AVRO file by joining two DynamicFrames in a one-to-many fasion.
For example I have a DyF with many Teacher types: teacher_id teacher_name
and a Dyf with many Student types: student_id teacher_id student_name
I am trying to combine these so that a teacher may have many students for example:
[
{
teacher_id: 1,
teacher_name: 'John',
students: [
{
student_id: 100,
teacher_id: 1
student_name: 'Sally'
},
{
student_id: 200,
teacher_id: 1,
student_name: 'Jack'
}
]
},
...
]
Using Join.apply(teacher, student, 'teacher_id', 'teacher_id') only results in repeated rows as such:
[
{
teacher_id: 1,
teacher_name: 'John',
student_id: 100,
teacher_id: 1
student_name: 'Sally'
},
{
teacher_id: 1,
teacher_name: 'John',
student_id: 200,
teacher_id: 1
student_name: 'Jack'
}
...
]
]
There are probably better ways to do this than what I am suggesting below, but I expect that the following approach would work:
After this you should end up with a row with all the teacher data, and the teacher data will have struct column containing the students that are related to the teacher. If you were to serialize or output into a hierarchical format (e.g. JSON), it should show each student as a child of the teacher.