Update to my original problem
Well, maybe I didn't describe my problem completely before. Sorry about that! The following is the real problem:
I have a txt file containing patent data, such as:
1/1523 DWPI
AP - JP29446999A 19991015
PN - JP2000188399 A 20000704 DW200044 JP4568930B2 B2 20101027 DW201071
AN - 2000495116
PA - (NPDE ) DENSO CORP
PR - JP1998000295406 19981016
MC - U11-C18A3,U12-D02A
OPD - 1998.10.16
ICAI - H01L29/12,H01L29/78,H01L21/265,H01L21/336
TI - Planar type metal oxide semiconductor field effect transistor
AB - <-contents eliminated for simplicity->
CPY - NPDE
FN - JP2000188399
There are 1523 items in a similar format. I want to analyze the patent data, so I have to parse it. I have defined a data type for every field, such as:
data AP = AP String Day String
data PN = PN String Day String
data AN = AN String
data PD = PD Day
.... -- many other data types are not shown just for simplicity.
Now I have written the parser for every field with megaparsec, such as apField, pnField, anField, etc.
However, not every record has the same fields. For example, the 2nd item may contain only the fields AP, PN, PA, PR, OPD, TI, AB, CPY and FN, with AN, MC, and ICAI missing. Besides, someone may be interested in different fields and export a txt file whose records contain only the fields AP, PN, PA, OPD and CPY.
Now I want to write generic code that can parse records containing whichever fields people are interested in, and write the parsing results into a SQLite database.
For example, if I want to parse records with the fields AP, PN, PA, OPD and CPY, I can construct a record parser from the input, such as toParser "ap,pn,pa,opd,cpy" or toParser "ap,pa,cpy", which I have figured out. The parsed result should be Record AP PN PA OPD CPY or Record AP PA CPY respectively. Then I'd like to write the parsed results into a database. Since every record in the data corresponds to a Record data type, and the records to be parsed may differ, I have to construct a Record data type with different fields depending on the user's input. This is the problem I have run into.
I can work around it by defining all the field data types as data Field = Field [String] and the record as data Record = Record [Field]. However, I want more control over the data types, such as representing a day as a Day and an id number as an Int.
If constructing a Record data type with different fields depending on the input is impossible, maybe there are other ways to solve my problem. I appreciate any advice! And sorry for the long description of my problem, and for my ambiguous descriptions before!
Well, if I got your question right: no, you can't write a single function that returns different data types depending on its input. What you can do is write a function that returns a single data type that can be constructed in different ways depending on the input, i.e. a sum type with one constructor per field, such as a PatentRecord type with constructors PN, AN, and so on.
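A minimal sketch of what such a type could look like. The payload types are assumptions for illustration, not your actual definitions; dates are kept as String here only to avoid extra dependencies, and the describe function is a hypothetical helper that shows how consumers pattern-match on the constructor:

```haskell
-- Sketch only: one constructor per field kind, so a single type
-- covers every line of the export. Payload types are assumptions.
data PatentRecord
  = AP String String   -- application number and date, e.g. "JP29446999A" "19991015"
  | PN String          -- raw publication info
  | AN Int             -- accession number
  | PA String          -- assignee
  deriving (Show, Eq)

-- Hypothetical consumer: code that uses a PatentRecord
-- pattern-matches on whichever constructor was used to build it.
describe :: PatentRecord -> String
describe (AP num date) = "application " ++ num ++ " filed " ++ date
describe (PN raw)      = "publication " ++ raw
describe (AN n)        = "accession " ++ show n
describe (PA name)     = "assignee " ++ name
```

A whole record can then be represented as a list such as [PatentRecord], which may hold any subset of the fields while each field keeps its own typed payload.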
With such a type you can write a function, for example
parseRecord :: String -> Maybe PatentRecord
which parses your input and, depending on what it matches, returns a PatentRecord built using the PN constructor, or the AN constructor, etc.
PS: Implementation tip: use Either SomeErrorType rather than Maybe to provide richer information on parse errors ;-)
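As a rough sketch of the Either variant: the ParseError type, its constructors, and the splitField helper below are hypothetical names, the PatentRecord type is cut down to three fields with assumed payloads, and plain string functions from base stand in for your megaparsec field parsers.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Hypothetical error type; the constructor names are placeholders.
data ParseError
  = UnknownField String      -- the line's tag is not one we handle
  | BadValue String String   -- tag and the text that failed to parse
  deriving (Show, Eq)

-- Cut-down sum type with assumed payload types.
data PatentRecord
  = AN Int
  | PA String
  | TI String
  deriving (Show, Eq)

-- Split a line of the form "TAG - value" into (tag, value).
splitField :: String -> (String, String)
splitField line =
  let (tag, rest) = break isSpace line
      value       = fromMaybe rest (stripPrefix " - " rest)
  in (tag, value)

-- Choose the constructor from the tag; report why parsing failed otherwise.
parseRecord :: String -> Either ParseError PatentRecord
parseRecord line =
  case splitField line of
    ("AN", v) -> maybe (Left (BadValue "AN" v)) (Right . AN) (readMaybe v)
    ("PA", v) -> Right (PA v)
    ("TI", v) -> Right (TI v)
    (tag, _)  -> Left (UnknownField tag)
```

Applying traverse parseRecord to the lines of one item gives either the first error or the full list of parsed fields, which is where the extra detail in the Left case becomes useful.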