This question is about Parquet.Net:
https://github.com/aloneguid/parquet-dotnet
I am trying to use Parquet.Net to build a parquet file from a couple of database tables that I have joined together. The table columns can be dynamic, so I can create a dynamic query using
// The @TableName can be read in from my list of tables and a loop
// can then get all the names in a string of columnNames
// (e.g. TableId, ColumnName1, ... , ColumnNameN
SELECT name
FROM sys.columns
WHERE object_id = OBJECT_ID(@TableName)
AND name <> N'ID';
Once I have the column names in a string, I am stuck on how to dynamically create a schema for Parquet.Net to write to, as the examples below has it predefined:
//create data columns with schema metadata and the data you need
var idColumn = new DataColumn(
new DataField<int>("id"),
new int[] { 1, 2 });
var cityColumn = new DataColumn(
new DataField<string>("city"),
new string[] { "London", "Derby" });
// create file schema
var schema = new Schema(idColumn.Field, cityColumn.Field);
using (Stream fileStream = System.IO.File.OpenWrite("c:\\test.parquet"))
{
using (var parquetWriter = new ParquetWriter(schema, fileStream))
{
// create a new row group in the file
using (ParquetRowGroupWriter groupWriter = parquetWriter.CreateRowGroup())
{
groupWriter.WriteColumn(idColumn);
groupWriter.WriteColumn(cityColumn);
}
}
}
Is this possible to do in .NET, perhaps using a reflection class or something else?