25 February 2021

What about patent data?

You may be wondering whether CIPO has any patent data that could be analyzed in a similar fashion to the trademark data analysis examples you can see by clicking the tabs along the top of this page. The answer is yes, but no.

CIPO publishes two patent data sets in XML form: bibliographic and full text XML data, and patent administrative status XML data. Both data sets encompass all Canadian patent filings that are open to public inspection, going right back to 1869. Almost 2.5 million patent documents are represented in each data set. Unfortunately, neither is as richly detailed as CIPO’s trademark XML data.

The bibliographic and full text XML data set contains details such as inventor / applicant / assignee names and nationalities, invention titles, IPC classifications and “critical application processing dates” (e.g. filing dates, request for examination dates, grant dates, etc.), plus the full text of the abstract, specification and claims.

The patent administrative status XML data set also contains inventor / applicant / assignee names and nationalities, invention titles, IPC classifications and “critical application processing dates”, plus “administrative status codes”, which sound promising but aren’t actually very helpful. (The administrative status codes correspond to what you’ll see in the Administrative Status Help section of CIPO’s online patent database.)
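To make the shape of such a record concrete, here is a minimal Python sketch that pulls these kinds of fields out of a single XML document. The element and attribute names (patent, applicant, nationality, ipc, filing-date, status-code) and the sample values are invented for illustration. They are not CIPO’s actual schema, and only the kinds of fields come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: every element/attribute name and value below is
# invented for illustration and is NOT CIPO's actual XML schema.
SAMPLE = """
<patent number="0000000">
  <title>Example invention</title>
  <applicant nationality="CH">Example AG</applicant>
  <ipc>C12N 15/00</ipc>
  <filing-date>2019-06-14</filing-date>
  <status-code>00</status-code>
</patent>
"""

def parse_record(xml_text):
    """Flatten one (hypothetical) patent record into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "number": root.get("number"),
        "title": root.findtext("title"),
        # A record may list several applicants, so collect all of them.
        "applicants": [(a.text, a.get("nationality"))
                       for a in root.findall("applicant")],
        "ipc": [e.text for e in root.findall("ipc")],
        "filing_date": root.findtext("filing-date"),
        "status_code": root.findtext("status-code"),
    }

record = parse_record(SAMPLE)
print(record)
```

Note what a record like this lacks: there is no dated list of events and no deadlines, just a handful of attributes per filing.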

Unlike CIPO’s trademark XML data, which provides a complete date-specific record of the prosecution, opposition, post-registration and cancellation events applicable to each mark (including applicable response deadlines), CIPO’s patent XML data provides no such record. Bottom line: you can’t do much with the patent data beyond making simple “who filed what when from where and how” type queries, e.g. “Show me a list of Swiss nationality applicants who filed French language PCT/CA applications for inventions classified in IPC C12N between 01-Apr-2019 and 31-Mar-2020.” As interesting as that might be (e.g. if you’re CIPO), it’s probably not as useful (e.g. if you’re a patent practitioner) as something like “Show me a list of deadlines for responding to examination reports that have issued in respect of applications being handled by my firm’s Vancouver office” or “Show me a list of maintenance fee payment deadlines upcoming within the next month for all cases currently assigned to ABC Enterprises Ltd.” You can’t run those types of queries against CIPO’s patent XML data, but you can against CIPO’s trademark XML data.
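For illustration, the Swiss-applicant query quoted above amounts to a simple attribute filter. The sketch below assumes the XML has already been flattened into records. The Filing class, its field names and the sample rows are all hypothetical and do not reflect CIPO’s actual data layout.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Filing:
    # Hypothetical flattened record; field names are illustrative only.
    number: str
    nationality: str   # applicant nationality, e.g. "CH"
    language: str      # filing language, e.g. "fr"
    route: str         # "PCT" for PCT/CA national entries, else "direct"
    ipc: str           # IPC classification, e.g. "C12N 15/00"
    filed: date

# Made-up sample rows, not real filings.
FILINGS = [
    Filing("A1", "CH", "fr", "PCT", "C12N 15/00", date(2019, 6, 14)),
    Filing("A2", "CH", "en", "PCT", "C12N 9/00", date(2019, 7, 1)),
    Filing("A3", "CH", "fr", "PCT", "C12N 5/00", date(2020, 4, 2)),
    Filing("A4", "FR", "fr", "PCT", "C12N 1/00", date(2019, 9, 9)),
    Filing("A5", "CH", "fr", "direct", "C12N 7/00", date(2019, 10, 3)),
]

def swiss_french_pct_c12n(filings, start, end):
    """Who filed what, when, from where and how: a pure attribute filter."""
    return [
        f for f in filings
        if f.nationality == "CH"
        and f.language == "fr"
        and f.route == "PCT"
        and f.ipc.startswith("C12N")
        and start <= f.filed <= end
    ]

matches = swiss_french_pct_c12n(FILINGS, date(2019, 4, 1), date(2020, 3, 31))
print([f.number for f in matches])
```

A deadline query, by contrast, would need a dated event history for each application (e.g. when an examination report issued), which is exactly what the patent data does not provide.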

To investigate the possibilities, I built a multi-dimensional data warehouse based on CIPO’s patent administrative status XML data (all of it, going back to 1869: almost 2.5 million XML files). Here’s a dashboard derived from that data warehouse, based on CIPO’s data as of 18-Feb-2021:

Canadian Patent Data Dashboard

The dashboard reflects the results of several “who filed what when from where and how” type queries.
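As a rough idea of what sits behind a dashboard tile, each one is essentially a count of filings rolled up along one or more dimensions. The sketch below uses made-up fact rows and a hypothetical roll_up helper. It is not the warehouse I actually built, just a minimal stand-in for the idea.

```python
from collections import Counter

# Hypothetical fact rows: (filing year, applicant country, IPC section).
# The values are invented and do not come from CIPO's data.
FACTS = [
    (2019, "CA", "C"),
    (2019, "US", "C"),
    (2019, "US", "H"),
    (2020, "US", "H"),
    (2020, "CH", "C"),
]

def roll_up(facts, *dims):
    """Count filings grouped by the chosen dimension positions."""
    return Counter(tuple(row[d] for d in dims) for row in facts)

by_year = roll_up(FACTS, 0)             # one dimension: year
by_year_country = roll_up(FACTS, 0, 1)  # two dimensions: year and country
print(by_year)
print(by_year_country)
```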

I’m hoping CIPO will improve its patent XML data to bring it into line with its excellent trademark XML data.