This article examines the state of artificial intelligence (AI) as it relates to patent searching. This 2023 release continues the original article from 2022 (*1). Reading the original article is not required, but it may be helpful if you are unfamiliar with general AI terms and definitions.
As a backdrop to this new article, let’s first review where we stand with AI in general and what we mean when we say AI. Referring to the 2022 release of this article series, we are again using the following definition for AI in 2023:
Anytime a computer algorithm does something that we think requires or emulates human intelligence. (*1)
Why is AI advancing so quickly all of a sudden?
Well, it has always been advancing quickly. While it may feel like there’s been a recent explosion of new AI tools and applications, the field has been marching along at a steady pace of advancement for decades.
If we scan a chronological list of milestone events in the development of AI tools (such as this timeline, *2), we see new major developments every few years since the beginning of the 2000s. Peppered between those are many incremental advances, some cataloged in these 43 volumes of AI Magazine going back to 1980. (*3)
…and it’s already everywhere.
Ignoring that some form of AI is incorporated behind-the-scenes into almost every system we use, here’s a short list of more direct interactions most of us have with AI every day:
- Facial recognition to open your phone/unlock apps
- Social media personalizing feeds and making suggestions
- Email spam filters
- Search engines – and the ads associated with them
- Digital voice assistants
- Smart home devices
- GPS, with real-time traffic information and route suggestions
- Banking
- Amazon and Netflix recommendations
Whether directly or indirectly, nearly all of us are AI-saturated, at all times. (*4)
What is different now?
Despite the ubiquitous status of AI in our lives, it sure feels like we’ve been hearing about it a lot more lately. What’s going on that makes now so special?
There are a few reasons the subject is more prominently discussed, and why new applications and businesses are popping up in the AI space. This linked article (*5) is a great overview of the developments contributing to AI growth, such as the ever-increasing amount of high-quality data available for model training (*6) and the way open-source sharing may be accelerating new AI model development (*7). In this blog we will expand on two of the more impactful recent developments: inexpensive GPUs and the arrival of the Transformer model.
Graphics Processing Units (GPUs) and Cloud Access
Generally speaking, AI models analyze datasets to find patterns and make predictions. As AI models become more complex, they also demand greater processing power for implementation. In the past, hardware costs and access were significant barriers in the development, testing, and implementation of AI. Within the last few years, however, powerful central processing units (CPUs), powerful graphics processing units (GPUs), and increased storage capacities have become much more accessible at a reasonable cost.
Within the High-Performance Computing (HPC) space, GPUs, in particular, have been found in recent times to provide the computational power and speed required for the large-scale mathematical operations that underlie most AI and deep learning models. This, combined with the on-demand availability of online HPC computing power implementing such GPUs (e.g., Amazon Web Services ParallelCluster), has resulted in a significant acceleration in the research, development, and application of AI technologies.
The Google Paper, and the Transformer Model
Arguably the biggest reason we’ve been hearing about AI in headlines almost every day for the last year is a 2017 paper titled “Attention Is All You Need” by a group of researchers from the Google Brain team. The paper is sometimes just called “The Google Paper.”
The Google Paper introduces a deep learning architecture referred to as the “transformer model,” which revolutionized the field of natural language processing (NLP). Historically, the models used to process long strings of data (like natural language) were limited in a number of ways and, as a consequence, not especially effective. The Google team built the transformer around “attention mechanisms” that allow the model to weigh the importance of different words in a sequence relative to each other, which lets it capture contextual information efficiently, even from distant words. As a result, the transformer model can process the words in a sentence simultaneously instead of one by one, both speeding up the process and capturing context more effectively. (*8)
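To make the idea of “weighing words against each other” more concrete, here is a minimal sketch of the scaled dot-product attention calculation described in the paper. It is a toy illustration only: the three words and their two-number vectors are made up, and the same vectors are reused as queries, keys, and values, whereas a real transformer learns separate projections for each.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over small lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Compare this word's query against every word's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy vectors for three words (numbers are invented).
tokens = ["patent", "search", "tool"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

# Simplification: reuse the same vectors as queries, keys, and values.
outputs, weights = attention(vectors, vectors, vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much one word “attends” to every word in the sequence. Because every row is computed independently of the others, all words can be handled at once, which is the parallelism the paragraph above refers to.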
The ChatGPT-Era
Transformer models are useful in a number of applications, and one of the most popular is the generative AI known as ChatGPT, which is built on OpenAI’s GPT (Generative Pre-trained Transformer) models. Generative AI is a category of models and tools that can create novel text, images, video, music and even computer code. Traditionally, AI is used to identify patterns in large amounts of data, while generative AI goes a step further by producing new output based on a prompt. The “P” in GPT stands for Pre-trained: the model first learns language patterns from a very large body of existing text. The final component of ChatGPT is human feedback; put simply, a human reviewer will “grade” the generated text, giving feedback on the quality of the output. (*9) With this feedback, the ChatGPT AI model “learns” to produce more and more natural-sounding novel output.
This is what makes ChatGPT and tools like it so remarkable: how natural-sounding the outputs are. Some responses are essentially indistinguishable from normal human speech and writing. Since being released, ChatGPT has been able to pass a Biology Olympiad, an MBA exam, and even a Uniform Bar Exam. (*10) Since the release of The Google Paper, hundreds of new AI tools have also been released across myriad industries, and thousands of companies are incorporating those tools into their current workflow and service offerings. (*11)
What Does This Mean for IP and the AI Patent Search?
The Rate of Publication Accumulation
A clear challenge in patent and literature searching, which will only worsen with time, is the sheer volume of published materials. According to the WIPO IP Facts and Figures 2022 report, 3.4 million new patent applications were filed worldwide in 2021 by applicants in the 193 member States of WIPO, up from 3.3 million the year prior. (*12) According to the National Science Foundation’s Science & Engineering Indicators reporting on Publication Output, about 2.9 million scientific articles were published in 2020. (*13) All signs indicate that the annual rate of new patent and scientific publication will increase in the years to come.
As time progresses, the number of new publications which crowd every technology space will make traditional searching a more difficult task. This is a matter we, at TPR, take very seriously. We are continuously evaluating the most reliable and cost-efficient tools and methods to deal with this ever-growing body of searchable literature.
Patent Searching – What Can AI Do?
There are a number of patent search firms that claim to either supplement traditional searching with AI tools, or to completely replace the traditional searcher. Researchers have spent time evaluating these options, and their feedback tends to converge on a few points regarding how current AI tools can improve searching:
- AI can suggest keywords. An important part of developing an effective search strategy is identifying the most relevant keywords, and also developing a list of keyword synonyms. Modern AI tools are well-suited to the task of extracting relevant keywords and suggesting expanded keywords.
- AI can suggest classifications. As a starting point for the searcher, AI tools can suggest CPC and IPC classification codes where the searcher may focus their search.
- AI can rank search results. Essentially all modern search databases will provide a “relevancy ranking,” by which results are presented in a ranked order according to an algorithmic assessment of relevance to the query used. This can help guide a searcher in the early stages of a search, and help to quickly identify some “low hanging fruit.”
- AI can categorize results. This was covered more extensively in our previous blog, but with the training of a subject matter expert and professional searcher, AI models can appropriately categorize different types of references. See part 1 to learn more. (*1)
- AI can provide visualizations of datasets. There is an abundance of data-analytics tools that will generate an array of visuals from a given dataset, meant to help identify and understand trends. For patent searching, this could take the form of heat maps, top assignee charts, key jurisdiction and classification field tables, concept and keyword maps, and so on. AI can make these visuals available at the click of a button.
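As an illustration of the “relevancy ranking” idea in the list above, here is a minimal sketch that scores documents against a query using TF-IDF weights and cosine similarity. This is a classic, simple technique chosen for clarity; it is not a description of how any particular commercial patent database ranks results, and the document IDs and texts are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def rank(query, documents):
    """Return (score, doc_id) pairs, most relevant first."""
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(documents)
    # Number of documents each term appears in.
    doc_freq = Counter(t for tokens in doc_tokens.values() for t in set(tokens))

    def vectorize(tokens):
        # Term count times a smoothed inverse document frequency.
        counts = Counter(tokens)
        return {
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, c in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = vectorize(tokenize(query))
    scored = [
        (cosine(query_vec, vectorize(tokens)), doc_id)
        for doc_id, tokens in doc_tokens.items()
    ]
    return sorted(scored, reverse=True)

# Hypothetical abstracts, invented for this example.
documents = {
    "DOC-A": "Battery pack with liquid cooling plate for electric vehicle",
    "DOC-B": "Method for brewing coffee using a pressurized capsule",
    "DOC-C": "Thermal management system for cooling a vehicle battery",
}
query = "battery cooling for electric vehicle"

ranked = rank(query, documents)
for score, doc_id in ranked:
    print(doc_id, round(score, 3))
```

A ranking like this only reflects word overlap. It has no notion of whether a reference actually discloses the claimed features, which is one reason the results still need review by a searcher.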
Patent Searching – What Can’t AI Do?
One point that both academic researchers and expert patent analysts tend to converge on is that for now, the most reliable and efficient methodology is “human-in-the-loop” searching.
From a presentation on AI applied to prior art searching for patent offices, given at the International Conference on Computational Intelligence and Sustainable Engineering Solutions earlier this year:
Having analyzed the method of patent search, the model suggested is not fully automatic; it should have certain intervention stages by the examiner to streamline the AI search process and keep a valid check on the keywords and syntax developed by the AI for patent search. In the framing of statement queries and input of valid statements to AI human intervention is a must. (*14)
According to the assessment of experts in the field of patent and literature searching, as well as TPR’s own patent search analyst experts:
- AI cannot reliably “approve/deny” keywords. A frequent failure in the way the transformer model operates is finding a grammatical relationship between terms, but not necessarily a meaningful one. As a consequence, terms may be recommended that are commonly found within proximity of the core concepts, but don’t actually have a meaningful relationship to the topic of interest. An expert patent search analyst with an understanding of the technology in question should be involved to cull these suggested terms to those most likely to return relevant results. Additionally, as technology evolves, what was once the “term of art” may evolve into another term or phrase. For now, such evolutions in terminology can only be reliably studied by a researcher, by reviewing prior art over a historical period of time, so they can formulate keywords that cover all variations in such terminology.
- AI cannot reliably recommend a comprehensive listing of CPC/IPC classification codes for a given search. As contemporary innovations are multidisciplinary by their nature, relevant patents can be found in a plurality of CPC categories depending on the specific application or domain of the particular invention. An expert searcher is best suited to identify such a comprehensive listing of CPCs/IPCs that may be worth exploring. Additionally, AI cannot recommend CPC/IPC classification codes for “analogous” art.
- AI cannot reliably check syntax and build effective search queries. In ongoing testing, expert patent search analysts unquestionably outperform even the most sophisticated AI models in generating effective search queries. Best practices are for an expert patent search analyst, familiar with the various patent and non-patent literature databases, classifications, and indexes involved, to validate the syntax to be applied.
- AI cannot reliably select the most relevant answers. As anyone who has ever spent time Googling can attest, even the most sophisticated AI will not always rank the best result at the top of the reporting list, and sometimes it may miss the target altogether. In these cases, a professional patent search analyst who also has a thorough understanding of the underlying technology is often required to steer these tools and provide the appropriate perspective to sort results effectively.
- Potential Bias: AI models are understood to have biases that can negatively affect their efficacy. Sources of bias can be varied, and extend beyond just the data and computational models used (*15). A patent search expert may be able to apply “quality control” strategies to capture the most relevant art in spite of their own biases, in ways that AI models may not yet be capable of emulating.
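The keyword point in the list above can be sketched in a few lines. The example below suggests terms purely because they appear near a seed word, then applies a human approval step. It is a deliberately simple co-occurrence counter standing in for a real AI model, and the sample sentences, stopword list, and the analyst's approved set are all hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "for", "and", "with", "is", "in", "to"}

def suggest_terms(seed, texts, window=3, top_n=5):
    """Suggest words that appear within `window` tokens of the seed term."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, token in enumerate(tokens):
            if token != seed:
                continue
            nearby = tokens[max(0, i - window): i + window + 1]
            counts.update(
                t for t in nearby if t != seed and t not in STOPWORDS
            )
    return [term for term, _ in counts.most_common(top_n)]

def analyst_review(candidates, approved):
    """Keep only the candidates a human analyst has approved."""
    return [c for c in candidates if c in approved]

# Hypothetical snippets written in typical patent drafting language.
texts = [
    "The drone propeller is shown in figure two",
    "A drone rotor shown in figure three provides lift",
    "The unmanned drone aircraft is shown in figure four",
]

candidates = suggest_terms("drone", texts)
# The analyst's judgment, supplied by hand for this example.
approved = {"propeller", "rotor", "unmanned", "aircraft"}
kept = analyst_review(candidates, approved)

print("suggested:", candidates)
print("kept:", kept)
```

With this toy data the most frequent suggestion is “shown,” a word that sits near “drone” only because of drafting conventions. The automated step cannot tell that it is meaningless for the search; the review step removes it.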
Conclusions and Looking Ahead at AI Patent Searching
It seems reasonable to infer that AI tools will continue to become more sophisticated with time, as they have done over the last several decades, and will continue to be incorporated into nearly every product and service we engage with. The rate of improvement and incorporation into every workflow will likely continue to increase for the reasons discussed in this blog. GPUs and cloud access allow many researchers and entrepreneurs to tap into computing power that previously was only available to organizations with enormous resources. Those working with tools like the transformer model will almost certainly find new and more efficient applications.
The possibilities and future promise of AI are encouraging.
While the current tools can’t replace an expert patent search analyst, they can absolutely supplement their efforts. AI is an assistive technology, and AI functionality has already been incorporated into all of the commercial patent searching tools that TPR routinely uses. One potential future application is to incorporate transformer models like those used by ChatGPT into search databases to summarize long patent disclosures and scientific publications, allowing searchers to scan documents more efficiently. As a pioneer in professional IP research, TPR has been asked to assess this functionality for a major patent database producer. Going through this process will arm TPR with an even greater understanding of the cutting-edge best practices for incorporating AI tools into the patent search workflow.
We will continue to keep an eye on the development of these tools, and to apply best practices in patent and literature searching.
If you’d like to discuss any of these points, or have more questions on this topic, please contact us.
© Copyright 2023, Technology & Patent Research International, Inc.
References
1. 2022 State of Artificial Intelligence (AI) and Patent Searching; TPR; Aug 2022; https://www.tprinternational.com/2022-state-of-artificial-intelligence-ai-and-patent-searching/
2. The Timeline of Artificial Intelligence – From the 1940s; Verloop.io; Nov 2022; https://verloop.io/blog/the-timeline-of-artificial-intelligence-from-the-1940s/
3. Archives – AI Magazine; PKP Publishing Services Network; updated quarterly; https://ojs.aaai.org/aimagazine/index.php/aimagazine/issue/archive
4. The 10 Best Examples Of How AI Is Already Used In Our Everyday Life; Forbes; Dec 2019; https://www.forbes.com/sites/bernardmarr/2019/12/16/the-10-best-examples-of-how-ai-is-already-used-in-our-everyday-life/?sh=108439881171
5. See why AI like ChatGPT has gotten so good, so fast; The Washington Post; May 2023; https://www.washingtonpost.com/business/interactive/2023/artificial-intelligence-tech-rapid-advances/
6. Data Quality in AI: Challenges, Importance & Best Practices; AIMultiple; Apr 2023; https://research.aimultiple.com/data-quality-ai/
7. The Impact of Open-Source AI on Scientific Research; TS2; May 2023; https://ts2.space/en/the-impact-of-open-source-ai-on-scientific-research/
8. Attention Is All You Need; Vaswani et al; Google Brain; https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
9. What developers need to know about generative AI; Damian Brady; Github.blog; April 2023; https://github.blog/2023-04-07-what-developers-need-to-know-about-generative-ai/
10. OpenAI.com; GPT-4; Aug 2023; https://openai.com/gpt-4
11. Beyond ChatGPT: 14 Mind-Blowing AI Tools Everyone Should Be Trying Out Now; Bernard Marr; Forbes; Feb 2023; https://openai.com/gpt-4
12. WIPO IP Facts and Figures 2022; WIPO; 2022; https://www.wipo.int/edocs/pubdocs/en/wipo-pub-943-2022-en-wipo-ip-facts-and-figures-2022.pdf
13. Publications Output: U.S. Trends and International Comparisons; National Science Foundation; Aug 2023; https://ncses.nsf.gov/pubs/nsb20214/publication-output-by-country-region-or-economy-and-scientific-field
14. Artificial Intelligence Reducing the Intricacies of Patent Prior Art Search; Rawat et al; 2023 International Conference on Computational Intelligence and Sustainable Engineering Solutions (CISES); Jul 2023; https://www.researchgate.net/publication/372568925_Artificial_Intelligence_Reducing_the_Intricacies_of_Patent_Prior_Art_Search
15. There’s More to AI Bias Than Biased Data, NIST Report Highlights; National Institute of Standards and Technology, US Gov; Mar 2022; https://www.nist.gov/news-events/news/2022/03/theres-more-ai-bias-biased-data-nist-report-highlights