I am trying to retrieve specific cpc codes AND assignees via SQL in the Google public patent data. I am trying to search for the term "VOLKSWAGEN" and cpc.code "H01M8".
But I got the error:
No matching signature for operator = for argument types: ARRAY <STRUCT<name STRING, country_code STRING>>, STRING. Supported signature: ANY = ANY at [15:3]
code:
SELECT
publication_number application_number,
family_id,
publication_date,
filing_date,
priority_date,
priority_claim,
ipc,
cpc.code,
inventor,
assignee_harmonized,
FROM
`patents-public-data.patents.publications`
WHERE
assignee_harmonized = "VOLKSWAGEN" AND cpc.code = "H01M8"
LIMIT
1000
I'm also interested in searching multiple assignees such as:
in ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")
I have recently started to work with SQL and do not see the mistake :/
Many thanks for your help!
In Google BigQuery
UNNESTis needed to access ARRAY elements. This is described here:https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays
The following query works for me.
The following are changes I made to generate results:
UNNEST(assignee_harmonized) as assignee_harmonized__uto accessassignee_harmonized__u.name.UNNEST(cpc) as cpc__uto accesscpc__u.code.assignee_harmonized__u.name = "VOLKSWAGEN AG"as"VOLKSWAGEN"returns no results.cpc__u.code LIKE "H01M8%"as"H01M8"returns no results. An example value isH01M8/10.This returns the following:
If you want to screen multiple assignee names,
INwill work like the following, however, you need to have an exact match likeVOLKSWAGEN AGorAUDI AG.assignee_harmonized__u.name IN ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")If you want to do a
LIKEstyle match with multiple strings, you can tryREGEXP_CONTAINS:https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_contains