Convert dynamic JSON format to CSV using ChoETL

734 Views Asked by At

I would receive JSON in various formats and hierarchies, Need to know the possibility to be able to convert any JSON format into CSV by forming columns dynamically.As an example,i have provided the required details,below is an multi level JSON and the expected CSV output

JSON

{
   "getUsers":[
      {
         "UserInformation":{
            "Id":1111122,
            "firstName":"*****1",
            "UserType":{
               "name":"CP"
            },
            "primaryState":"MA",
            "otherState":[
               "MA",
               "BA"
            ],
            "createdAt":null
         }
      },
      {
         "UserInformation":{
            "Id":3333,
            "firstName":"*****3",
            "UserType":{
               "name":"CPP"
            },
            "primaryState":"MPA",
            "otherState":[
               "KL",
               "TN"
            ],
            "createdAt":null
         }
      }
   ]
}

CSV Output expected CSV output

2

There are 2 best solutions below

1
On

Here is how you can do it using Cinchoo ETL programatically.

First construct configuration object, along with fields configurations at runtime and pass it to reader for loading JSON

StringBuilder csv = new StringBuilder();

var config = new ChoJSONRecordConfiguration();
config.JSONPath = "$..getUsers[*].UserInformation";
config.AllowComplexJSONPath = true;

config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("Id"));
config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("FirstName"));
config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("UserType", "$.UserType.name"));
config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("primaryState"));
config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("otherState", "$.otherState[*]") { FieldType = typeof(string[]) });
config.JSONRecordFieldConfigurations.Add(new ChoJSONRecordFieldConfiguration("createdAt"));

using (var r = ChoJSONReader.LoadText(json, config))
{
    using (var w = new ChoCSVWriter(csv).WithFirstLineHeader()
        .UseNestedKeyFormat(false)
        )
        w.Write(r);
}

Console.WriteLine(csv.ToString());

Fiddle: https://dotnetfiddle.net/Gfw3r7

2
On

A workable code like this:

Read json string and write CSV file

 StringBuilder sb = new StringBuilder();
            sb.Append(@"
{
   ""getUsers"":[
      {
         ""UserInformation"":{
            ""Id"":1111122,
            ""firstName"":""*****1"",
            ""UserType"":{
               ""name"":""CP""
            },
            ""primaryState"":""MA"",
            ""otherState"":[
               ""MA"",
               ""BA""
            ],
            ""createdAt"":null
         }
      },
      {
         ""UserInformation"":{
            ""Id"":3333,
            ""firstName"":""*****3"",
            ""UserType"":{
               ""name"":""CPP""
            },
            ""primaryState"":""MPA"",
            ""otherState"":[
               ""KL"",
               ""TN""
            ],
            ""createdAt"":null
         }
      }
   ]
}
");

var reader = ChoJSONReader<User>.LoadText(sb.ToString());
var user = reader.Read();

using (var parser = new ChoCSVWriter<UserInformation>("User.csv").WithFirstLineHeader())
{
    parser.Write(user.getUsers.Select(x => x.UserInformation));
}

Serialization Classes

class User
{
    public List<UserDetail> getUsers { get; set; }
}

class UserDetail
{
    public UserInformation UserInformation { get; set; }
}
class UserInformation
{
    public int Id { get; set; }
    public string firstName { get; set; }
    public UserType UserType { get; set; }
    public string primaryState { get; set; }
    public List<string> otherState { get; set; }
    public DateTime? createdAt { get; set; }

}

class UserType
{
    public string name { get; set; }
}

Edit

Dynamic parsing all format only with JSON reader is impossible. You need some tricky steps to parse it.

  1. Get a json string and minimize it.
  2. Use regular expression or something, recognize what the json pattern is.
  3. Use the pattern to get correct Model Class to deserialize it.