Technology · 16 min read

NLP Patents: From Word Embeddings to Large Language Models

Expert guide to the NLP patent landscape covering Word2Vec, transformers, BERT and LLM patents. Analysis of patent strategies, prior art and licensing.

WeAreMonsters Technical Team · 2026-02-03


Introduction

Natural Language Processing (NLP) has undergone a revolutionary transformation over the past two decades, evolving from simple bag-of-words models to sophisticated large language models that demonstrate human-like linguistic capabilities. This evolution has been accompanied by an equally dramatic transformation in the patent landscape, as companies race to protect their innovations in embedding technologies, transformer architectures, and generative AI systems.

We have witnessed a pivotal shift in the NLP patent ecosystem, particularly after 2016, as the field moved from modular, domain-specific innovations toward integrated generative frameworks and API-driven platforms (1). The transformer architecture's introduction in 2017 catalysed explosive growth in generative AI patents, with patent families increasing from 733 in 2014 to over 14,000 in 2023, growth of more than 800% since 2017 (2). This transformation represents not just technological advancement, but a fundamental reimagining of how machines process, understand, and generate human language.

The patent landscape in NLP now encompasses everything from foundational word embedding algorithms to cutting-edge multimodal AI systems. More than half a million AI patent applications have been filed in the past five years, with generative AI comprising 22% of them (3). As we examine this evolution, we see a clear trajectory from statistical methods to neural architectures to transformer-based models, each phase bringing new challenges and opportunities for intellectual property protection.

Important: This article provides general information about NLP patent landscapes for educational purposes. It is not legal advice and should not be relied upon as such. Patent strategy involves complex legal considerations that require qualified legal counsel. Always consult a patent attorney before making intellectual property decisions.

The Evolution of NLP: From Bags to Transformers

The Bag-of-Words Era

The earliest approaches to NLP relied heavily on bag-of-words models and statistical methods, establishing crucial foundations for computational linguistics patent protection throughout the 1990s and early 2000s. A representative 2002 patent application (US20020022956) demonstrates this era's approach: an automated text classification system that parsed documents into tokens with frequency counts, assigned statistical weights indicating feature-category associations, and computed category scores as sums of feature count-weight products (4).
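To make the mechanics concrete, the sketch below implements that kind of count-and-weight classifier in a few lines of Python. The categories, tokens, and weights are invented for illustration and are not taken from the patent itself.

```python
from collections import Counter

# Illustrative count-and-weight text classifier in the style described above.
# The categories and weights are made-up examples, not values from the patent.
CATEGORY_WEIGHTS = {
    "networking": {"packet": 2.0, "router": 1.5, "protocol": 1.2},
    "biotech":    {"protein": 2.0, "assay": 1.5, "genome": 1.8},
}

def tokenize(text: str) -> Counter:
    """Parse a document into lowercase tokens with frequency counts."""
    return Counter(text.lower().split())

def score(text: str) -> dict:
    """Score each category as the sum of feature count * feature weight."""
    counts = tokenize(text)
    return {
        category: sum(counts[tok] * w for tok, w in weights.items())
        for category, weights in CATEGORY_WEIGHTS.items()
    }

print(score("The router drops each packet when the protocol handshake fails"))
# {'networking': 4.7, 'biotech': 0.0}
```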

IBM's foundational patent US6233575B1 (filed 1998) exemplifies the sophisticated statistical classification methodologies emerging in the late 1990s, covering multilevel taxonomy-based document classification using Fisher values as discrimination values (5). Earlier patents like Xerox's US5675819A (filed 1994) on "Document information retrieval using global word co-occurrence patterns" demonstrated vector-space approaches to information retrieval, treating documents and terms as vectors in statistical space (6).

Additional early patents established core statistical NLP techniques: TEXTWISE LLC's US6026388A (filed 1995) covered natural language information retrieval systems with user interface enhancements (7), while US6847966B1 (filed 2002) described methods for optimally searching document databases using representative semantic space through vector-based retrieval approaches (8).

The limitations of bag-of-words approaches—particularly their inability to capture word order and semantic similarity—created the impetus for more sophisticated methods. This technological gap drove innovation toward embedding-based approaches that could represent words as dense vectors in high-dimensional spaces.

The Embedding Revolution

The breakthrough came with word embeddings, particularly Google's Word2Vec technology, originally published in the seminal paper "Efficient Estimation of Word Representations in Vector Space" by Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean (9). Submitted to arXiv on January 16, 2013, and published at ICLR 2013, this work demonstrated that high-quality word vectors could be derived in less than a day from a 1.6 billion word dataset, achieving state-of-the-art performance on word similarity tasks at significantly lower computational cost than previous neural network approaches (9, 10).

Google secured patent protection for this innovation through US Patent 10922488B1, titled "Computing numeric representations of words in a high-dimensional space" (11). Filed on March 25, 2019, with a priority date reaching back to January 15, 2013, this patent covers the fundamental algorithms developed by the same research team. The patent's scope encompasses both the continuous bag-of-words (CBOW) and skip-gram approaches to learning word representations, as well as optimisation techniques like hierarchical softmax and negative sampling (11).
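As a rough illustration of what those claims describe, the snippet below trains skip-gram embeddings with negative sampling using the open-source gensim library, which is an independent implementation rather than Google's patented code; the toy corpus and hyperparameters are placeholders.

```python
from gensim.models import Word2Vec

# Toy corpus; a real training run would use billions of tokens.
sentences = [
    ["patent", "claims", "protect", "word", "embeddings"],
    ["skip", "gram", "predicts", "context", "words"],
    ["cbow", "predicts", "a", "word", "from", "its", "context"],
]

# sg=1 selects skip-gram (sg=0 gives CBOW); negative=5 enables negative
# sampling, while hs=1 would switch to hierarchical softmax instead.
model = Word2Vec(
    sentences,
    vector_size=50,   # dimensionality of the learned vectors
    window=3,         # context window size
    sg=1,
    negative=5,
    min_count=1,
    epochs=50,
)

print(model.wv["embeddings"][:5])      # first few vector components
print(model.wv.most_similar("word"))   # nearest neighbours in the vector space
```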

The Word2Vec patent represents a watershed moment in NLP intellectual property, as it established legal protection for the core concept of learning distributed representations of words. The algorithm's development within Google allowed the company to file patent applications before the research was widely published, creating a strong intellectual property position around word embeddings (12).

Following Word2Vec, the field saw rapid development in embedding techniques. Stanford's GloVe (Global Vectors for Word Representation), introduced by Jeffrey Pennington, Richard Socher, and Christopher Manning at EMNLP 2014, combined the strengths of global matrix factorisation methods with local context window methods, achieving 75% accuracy on word analogy tasks (13, 14). Facebook Research's FastText, published in "Enriching Word Vectors with Subword Information" by Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov (2017), extended embeddings to subword information using character n-grams, enabling computation of word representations for out-of-vocabulary terms (15, 16).
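The subword idea behind FastText is easy to see in isolation: each word is decomposed into character n-grams whose vectors are summed to form the word representation, so even unseen words receive usable embeddings. The sketch below shows only the n-gram extraction step, with boundary markers and the paper's default n-gram lengths; the vector lookup and summation are omitted.

```python
def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Character n-grams with boundary markers, as used by FastText-style
    subword models (illustrative; lengths follow the paper's defaults)."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(marked[i:i + n] for i in range(len(marked) - n + 1))
    return grams

# An out-of-vocabulary word still yields subword units whose vectors can be
# summed to form a word representation.
print(char_ngrams("transformer")[:8])
# ['<tr', 'tra', 'ran', 'ans', 'nsf', 'sfo', 'for', 'orm']
```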

Each of these innovations contributed to a growing patent portfolio around distributed representations of language, creating a complex landscape of overlapping claims and prior art.

Sequence-to-Sequence and Attention Mechanisms

The evolution toward modern transformers began with the encoder-decoder paradigm introduced by Ilya Sutskever, Oriol Vinyals, and Quoc V. Le in "Sequence to Sequence Learning with Neural Networks" (NIPS 2014). This foundational work used multilayered LSTMs to solve the fundamental limitation that standard DNNs required fixed-dimensionality inputs and outputs, achieving a BLEU score of 34.8 on English-to-French translation using the WMT-14 dataset (17, 18).
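For readers unfamiliar with the encoder-decoder pattern, the minimal PyTorch sketch below shows its shape: an LSTM encoder compresses a variable-length source sequence into a fixed-size state, from which an LSTM decoder unrolls the target sequence. The vocabulary sizes, dimensions, and random inputs are illustrative only.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Toy encoder-decoder: the encoder's final state seeds the decoder."""
    def __init__(self, src_vocab=1000, tgt_vocab=1000, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_emb(src_ids))   # fixed-size summary
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)                          # logits per position

model = Seq2Seq()
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1000, (2, 5))   # teacher-forced target prefix, length 5
print(model(src, tgt).shape)           # torch.Size([2, 5, 1000])
```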

The next major leap came with attention mechanisms and the revolutionary transformer architecture. Google's patent US10452978B2, "Attention-based sequence transduction neural networks," protects the fundamental transformer architecture (19). This patent, with inventors including Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Nicholas Gomez, Łukasz Kaiser, and Illia Polosukhin, covers the attention mechanism that would become the foundation for modern language models.

The transformer architecture was detailed in the groundbreaking paper "Attention Is All You Need," published at NIPS 2017. This work achieved extraordinary academic impact with 161,544 citations, including 18,995 highly influential citations, making it one of the most-cited papers in machine learning and artificial intelligence (20, 21). The paper demonstrated 28.4 BLEU score on WMT 2014 English-to-German translation and 41.8 BLEU on English-to-French translation after 3.5 days of training on eight GPUs (20).

The transformer patent, filed on June 28, 2018, with a priority date of May 23, 2017, remains active with an anticipated expiration date of May 23, 2038 (19). This long patent life ensures Google's continued influence over transformer-based innovations well into the next decade. The attention mechanism represented a fundamental shift from recurrent architectures to parallel processing capabilities. The patent claims cover the mathematical formulations of self-attention, multi-head attention, and the positional encoding schemes that make transformers so effective at processing sequential data (19).
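The core formulation those claims describe is compact enough to write out. The sketch below implements scaled dot-product attention directly and then calls PyTorch's built-in multi-head attention module for comparison; the shapes and values are arbitrary examples, not a reproduction of any claimed embodiment.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, the formulation at
    the heart of the transformer paper."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    return F.softmax(scores, dim=-1) @ v

# Illustrative shapes: batch of 2, sequence length 4, model width 8.
q = k = v = torch.randn(2, 4, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 4, 8])

# Multi-head attention runs this in parallel over several learned projections;
# PyTorch ships a reference module for it.
mha = torch.nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)
attn_out, _ = mha(q, k, v)
print(attn_out.shape)  # torch.Size([2, 4, 8])
```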

Key Patent Milestones in NLP

Google's Dominance in Core Technologies

Google has established a commanding position in NLP patents through strategic protection of fundamental technologies. Beyond Word2Vec and transformers, Google holds numerous patents covering BERT and related bidirectional transformer technologies.

The BERT (Bidirectional Encoder Representations from Transformers) architecture, detailed in the paper by Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova and published at NAACL 2019 (receiving the Best Long Paper award), introduced novel bidirectional training approaches that revolutionised language understanding tasks (22, 23). BERT achieved state-of-the-art results across eleven NLP tasks, including 80.5% GLUE score (7.7 point absolute improvement), 86.7% MultiNLI accuracy (4.6% improvement), 93.2 F1 on SQuAD v1.1 (1.5 point improvement), and 83.1 F1 on SQuAD v2.0 (5.1 point improvement) (22).
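The masked language modelling objective behind that bidirectional training is easy to demonstrate with an off-the-shelf checkpoint. The sketch below uses the Hugging Face transformers library and the public bert-base-uncased model to fill in a masked token; the sentence and model choice are illustrative and carry no claim about any patent's scope.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = f"The patent covers an attention {tokenizer.mask_token} for sequence transduction."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and read off the most likely replacement tokens.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
top_ids = logits[0, mask_pos].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))  # e.g. ['mechanism', 'model', ...]
```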

Google's patent strategy has effectively created a moat around bidirectional encoder representations, making it challenging for competitors to develop similar technologies without licensing agreements. Patent US10956819B2 extends Google's transformer portfolio with additional claims on attention-based sequence transduction (24).

BERT's influence also extends to multimodal applications, as seen in Salesforce's patent application US20210232773A1, "Unified Vision and Dialogue Transformer with BERT" (25). This filing demonstrates industry efforts to protect integrated approaches that combine visual and textual understanding using transformer architectures.

The OpenAI Challenge

OpenAI has pursued a different patent strategy, focusing on application-specific innovations rather than foundational architectures. OpenAI's granted patent US12061880B2 covers "Systems and methods for generating code using language models trained on computer code" (46). This patent, filed in 2023 and approved in August 2024, protects AI-assisted code generation capabilities that have become central to modern software development tools.

OpenAI's patent applications also extend to semantic search and SEO applications, indicating a broader strategy to protect commercial applications of language model technology (47). This approach suggests OpenAI is building patent defences around specific use cases rather than competing directly with Google's foundational architecture patents.

Meta's Virtual Assistant Patents

Meta has focused its patent strategy on interactive and conversational AI applications. Patent application US20250053430A1 covers "Large language model-based virtual assistant for high-level goal contextualized action recommendations" (48). Filed in 2024 and published in February 2025, this application remains pending but represents Meta's efforts to protect AI assistant technologies that compete with Google Assistant and OpenAI's ChatGPT.

Meta's patent strategy appears focused on the user interaction layer rather than the underlying model architectures, creating complementary intellectual property that could be valuable in cross-licensing negotiations with other major players.

Microsoft's RAG Patents

Microsoft has carved out a niche in retrieval-augmented generation (RAG) technologies. Patent application US20240346256A1, "Response generation using retrieval augmented AI model," was filed in April 2023 and published in October 2024 (49). This patent addresses how language models can incorporate retrieved information to generate more accurate responses, a critical capability for enterprise AI applications.
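A minimal sketch of that retrieve-then-generate pattern appears below: passages are embedded, the nearest ones to a query are selected by cosine similarity, and the result is stitched into a prompt for a downstream generator. The sentence-transformers model name and the toy document store are assumptions for illustration, not details from Microsoft's filing.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "US10452978B2 covers attention-based sequence transduction networks.",
    "US10922488B1 covers computing numeric representations of words.",
    "US12061880B2 covers generating code with language models.",
]
doc_vecs = encoder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the query by cosine similarity."""
    q_vec = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

query = "Which patent protects the transformer attention mechanism?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only the context below.\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be passed to a language model
```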

Microsoft's RAG patents complement their broader AI strategy, particularly their partnership with OpenAI and integration of language models into Microsoft 365 and Azure services. These patents provide defensive protection for enterprise AI applications that rely on combining language models with proprietary data sources.

Prior Art and Foundation Research

The Academic Foundation

The patent landscape in NLP is built upon decades of academic research that established the theoretical foundations for modern language processing. Key papers that serve as prior art include the original Word2Vec publications by Mikolov et al. (2013), the transformer architecture paper "Attention Is All You Need" by Vaswani et al. (2017), and the BERT paper by Devlin et al. (2018) (9, 20, 22).

These foundational papers create important prior art constraints on patent claims, ensuring that basic research concepts remain in the public domain while allowing for patent protection of specific implementations and applications. The interplay between academic publication and patent filing strategies has become increasingly important as the time between research breakthroughs and commercial applications has shortened.

The Word2Vec Legacy

The Word2Vec algorithm, first released by Google on July 29, 2013, represents a classic example of how academic research translates into patent protection (12). Because the algorithm was developed in-house, Google was able to file patent applications before the research was widely published, securing an early intellectual property position around word embeddings.

The Word2Vec patent claims cover both the continuous bag-of-words (CBOW) and skip-gram approaches to learning word representations, as well as optimisation techniques such as hierarchical softmax and negative sampling (11). These broad claims have influenced subsequent research and commercial development in embedding technologies.

Transformer Prior Art

The transformer architecture builds upon extensive prior art in attention mechanisms, dating back to early work on neural machine translation and sequence-to-sequence models. However, the specific combination of self-attention, positional encoding, and layer normalisation claimed in Google's transformer patent represents the novel contribution on which protection was granted (19).

The careful balance between leveraging existing research and claiming novel combinations has allowed Google to secure broad patent protection while respecting the prior art established by decades of NLP research. This strategy demonstrates the sophistication required in modern patent prosecution for rapidly evolving AI technologies.

Current LLM Patent Activity

The New Generation of Patents

The emergence of large language models has created a new wave of patent applications focused on training methodologies, inference optimisation, and application-specific implementations. Patent publications are expected to accelerate further in 2024 and 2025 due to the standard 18-month publication lag following the post-2017 transformer boom (2).

Contrary to public perception, OpenAI is not a major patent filer and did not rank in the top 25 applicants for generative AI patents (3). Instead, IBM leads with 1,591 patent applications, followed by Google with roughly 1,000, and Microsoft with approximately 700 (3). This distribution reflects different strategic approaches: while Google focuses on foundational architecture patents, companies like IBM pursue broad application-specific patent portfolios.

Recent patent applications show increasing focus on retrieval-augmented generation, multimodal integration, and specialised training techniques. These patents reflect the maturing of LLM technology from research curiosities to commercial products requiring robust intellectual property protection. The World Intellectual Property Organization (WIPO) has recognised this trend by issuing comprehensive patent landscape reports specifically addressing generative artificial intelligence (56).

OpenAI's Strategic Focus

OpenAI's patent strategy has evolved significantly since the release of GPT models. The company's patent applications increasingly focus on specific applications and optimisation techniques rather than foundational architectures. This approach acknowledges Google's strong position in transformer patents while building defensible positions in high-value applications (46).

OpenAI's code generation patent represents a particularly strategic filing, as it protects one of the most commercially valuable applications of language model technology. The patent covers methods for training models on code datasets and generating syntactically correct code in response to natural language prompts (46).

Google's Continued Innovation

Google continues to file patents covering improvements to transformer architectures and new applications of language models. Recent applications include patents on efficient attention mechanisms, improved training procedures, and novel applications in search and knowledge retrieval.

Google's patent US12086552B2, "Generating semantic vector representation of natural language data," represents continued innovation in embedding technologies beyond the original Word2Vec patent (50). This patent covers more sophisticated methods for generating contextual embeddings that can capture nuanced semantic relationships.

The Enterprise AI Patent Rush

Beyond the major technology companies, we are seeing increased patent activity from enterprise software companies developing AI-powered applications. These patents typically focus on domain-specific applications of language models rather than core architectural innovations.

Companies like Salesforce, with patents on unified vision and dialogue transformers, are building patent portfolios that protect their specific implementations of multimodal AI systems (25). These application-focused patents create valuable defensive positions without directly challenging the foundational architecture patents held by Google and others.

Technical Claim Analysis for NLP Patents

Claim Structure in NLP Patents

NLP patents present unique challenges in claim drafting due to the mathematical nature of the underlying algorithms and the need to balance breadth with enforceability. Effective NLP patent claims typically include several key components: the neural network architecture, the training methodology, the specific mathematical operations, and the intended applications.

For example, Google's transformer patent includes claims that cover the multi-head attention mechanism, the specific mathematical formulations for attention weights, and the combination of attention with feed-forward networks (19). These claims are carefully structured to capture the essential innovations while providing sufficient detail to enable implementation.

Method vs. System Claims

NLP patents often include both method claims and system claims to provide comprehensive protection. Method claims cover the specific steps involved in training or using neural networks for language processing, while system claims protect the hardware and software configurations that implement these methods.

The distinction between method and system claims becomes particularly important in NLP patents because the same algorithm can be implemented in various computational environments, from cloud-based services to edge devices. Comprehensive patent protection requires claims that cover these different implementation contexts.

Data Structure and Training Claims

Modern NLP patents increasingly include claims covering training data structures and dataset preparation methods. These claims recognise that the effectiveness of language models depends heavily on the quality and structure of training data, making dataset preparation techniques valuable intellectual property.

For instance, patents covering domain-specific language models often include claims on data preprocessing methods, tokenization approaches, and curriculum learning strategies. These claims protect not just the model architectures but the entire pipeline required to create effective language processing systems.
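A toy version of such a pipeline is sketched below: raw claim text is normalised and tokenised, then ordered from short to long as a simple length-based curriculum. Every detail here is an illustrative assumption rather than a description of any specific patented method.

```python
import re

# Example raw corpus: fragments of patent claim and abstract text.
raw_corpus = [
    "Claim 1: A method comprising receiving a natural-language query...",
    "Claim 12: The system of claim 10, wherein the encoder applies multi-head attention...",
    "Abstract: Techniques for retrieval-augmented generation are described.",
]

def preprocess(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split into whitespace tokens."""
    text = re.sub(r"[^a-z0-9\s-]", " ", text.lower())
    return text.split()

tokenised = [preprocess(t) for t in raw_corpus]

# Length-based curriculum: present shorter, simpler examples first.
curriculum = sorted(tokenised, key=len)
for example in curriculum:
    print(len(example), example[:5])
```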

Novelty and Non-Obviousness Challenges

Establishing novelty and non-obviousness in NLP patents requires careful navigation of extensive prior art from both academic research and existing patents. Patent prosecutors must demonstrate that claimed combinations of techniques produce unexpected results or solve previously unsolved technical problems.

The rapid pace of NLP research creates additional challenges, as prior art landscapes can change significantly between filing and examination. Successful NLP patents often focus on specific technical improvements or novel combinations of existing techniques rather than claiming entirely new conceptual approaches.

Patent Classification and Retrieval in the NLP Domain

The Challenge of Patent Classification

The application of NLP techniques to patent analysis itself has become an active area of research and patent protection. Fine-tuning BERT for patent classification has demonstrated superior performance to CNN-based approaches, particularly on large datasets containing over 2 million patents (51). These advances in patent processing create new opportunities for patent protection in the intersection of AI and legal technology.

Patent classification systems using transformer-based models can achieve state-of-the-art results using only patent claims, without requiring full specification text (51). This finding has significant implications for patent prosecution and prior art searching, as it suggests that claim language alone contains sufficient information for accurate classification.
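In practice, a claims-only classifier of this kind can be prototyped with a standard fine-tuning loop. The sketch below loads a generic BERT checkpoint for sequence classification and runs a single illustrative training step on two made-up claims; the label set, checkpoint, and data are assumptions, not the cited study's setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["G06F", "G06N", "H04L"]                 # example CPC-style classes
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)

claims = [
    "A method comprising training a neural network on tokenised claim text.",
    "A system for routing packets according to a learned policy.",
]
targets = torch.tensor([1, 2])                    # indices into `labels`

batch = tokenizer(claims, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=targets)          # loss computed internally

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs.loss.backward()                           # one illustrative update step
optimizer.step()
print(float(outputs.loss))
```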

Domain-Specific BERT Models for Patents

Google has developed BERT models specifically trained on patent text, recognising that patents represent an ideal domain for specialised language models (52). With over 20 million active patents worldwide, averaging approximately 10,000 words each, the patent corpus provides a substantial training dataset for domain adaptation (52).

Patent-specific BERT models excel at understanding the unique syntactic structures and technical terminology found in patent documents. These models demonstrate superior performance in generating contextual synonyms and identifying related technologies, capabilities that are valuable for both patent prosecutors and patent examiners (52).

Retrieval-Augmented Generation in Patent Search

Recent developments in patent search systems have incorporated retrieval-augmented generation techniques to improve search accuracy and relevance. LLM-RAG systems specifically designed for patent retrieval achieve 80.5% semantic matching accuracy and 92.1% recall on Google Patents data, outperforming baseline LLM methods by 28 percentage points (53).

These advances in patent search and analysis represent a new frontier for patent protection, as companies develop proprietary systems for navigating the complex patent landscape. The intersection of AI and legal technology creates opportunities for patent protection that extend beyond the underlying NLP technologies to the systems and methods for applying these technologies to legal problems.

Multimodal NLP Patents

The Convergence of Vision and Language

The evolution toward multimodal AI systems has created new patent opportunities at the intersection of computer vision and natural language processing. Recent patent applications demonstrate increasing sophistication in protecting integrated systems that process both textual and visual information.

Google filed patent application US20250005293A1 in January 2025, covering multimodal dialogs combining large language models with visual language models (26). Salesforce published patent US20240160853A1 in May 2024 on vision-language pretraining frameworks, with prior art dating to November 2022 (27). A Korean-origin application (US20240176959), published in May 2024, addresses cross-modal training specifically, describing methods for generating language models using crossmodal information by converting both language-based and non-language-based modality information into embeddings and using a crossmodal transformer to generate semantic associations (28).

Patent applications like US20210232773A1 represent early examples of multimodal patent protection, covering systems that combine BERT-based language understanding with visual processing capabilities (25). These patents establish important precedents for protecting the integration of multiple AI modalities within single systems.

Design Patent Applications

The application of multimodal AI to design patent analysis represents a particularly innovative area of development. DesignCLIP and similar technologies leverage vision-language models like CLIP for design patent understanding, incorporating class-aware classification and contrastive learning with generated captions (29). This 2025 research framework achieves significant performance improvements for patent classification and retrieval tasks through multi-view image learning approaches.

Complementary research has developed language-informed multimodal approaches using LLMs to enhance patent image retrieval, achieving remarkable performance improvements of +53.3% mAP and +41.8% Recall@10 (30). These applications demonstrate how NLP technologies are expanding beyond traditional text processing to encompass visual understanding tasks. Patent protection for these multimodal applications requires claims that cover both the language processing components and their integration with visual analysis systems.

Cross-Modal Training and Alignment

Patents covering multimodal NLP systems increasingly focus on the training methodologies required to align representations across different modalities. These training approaches often involve novel loss functions, contrastive learning techniques, and attention mechanisms that operate across modalities.

The technical challenges of multimodal training create opportunities for patent protection that extend beyond simple combinations of existing technologies. Successful multimodal patents typically claim novel approaches to cross-modal alignment and representation learning that produce superior performance compared to single-modal systems.
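One widely used alignment objective in this family is a symmetric contrastive (InfoNCE) loss over matched image-text pairs. The sketch below shows a generic CLIP-style version in PyTorch; it is a common textbook formulation, not the specific loss claimed in any of the patents discussed here.

```python
import torch
import torch.nn.functional as F

def clip_style_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss that pulls matching image/text pairs together
    and pushes non-matching pairs apart (illustrative formulation only)."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # pairwise similarities
    targets = torch.arange(image_emb.size(0))          # i-th image matches i-th text
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy batch: 4 aligned image/text embedding pairs of width 16.
img = torch.randn(4, 16)
txt = torch.randn(4, 16)
print(clip_style_contrastive_loss(img, txt))
```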

Future Trends in NLP Patents

Privacy-Preserving AI Technologies

The growing importance of data privacy has created new patent opportunities in privacy-preserving NLP technologies. IBM's privacy-preserving federated learning patent (US12160504B2) was granted in December 2024, covering encryption key aggregation and participant-based federated learning approaches (31). Microsoft filed a federated learning system patent (US20240211633A1) in December 2022, published in June 2024, addressing personal information protection in distributed machine learning training (32).

Advanced research has produced practical implementations for language models. Meta researchers published PrE-Text at ICLR 2024's Private ML workshop, proposing a method to generate differentially private synthetic textual data from federated client data for centralised language model training. This approach eliminates on-device resource constraints while achieving better privacy-utility trade-offs, requiring 7x less client computation and 40x less communication than traditional on-device federated learning with DP-SGD (33).

The distributed matrix mechanism (DMM), presented at ICML 2025, combines secure aggregation protocols with constant-overhead linear secret resharing to improve differential privacy in federated learning. This cryptographic approach enables distributed DP while maintaining better model utility, with support for dynamic user participation and dropout handling (34).
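Stripped of the cryptography, the underlying pattern these filings build on resembles the toy sketch below: clients clip their local updates, add calibrated noise, and the server averages the results. The clipping norm, noise scale, and learning rate are illustrative placeholders, and real deployments pair this with secure aggregation and a formal privacy accountant.

```python
import numpy as np

rng = np.random.default_rng(0)
CLIP_NORM, NOISE_STD = 1.0, 0.5   # illustrative values, not a calibrated budget

def client_update(global_model: np.ndarray) -> np.ndarray:
    """One client's differentially private contribution to a training round."""
    local_gradient = rng.normal(size=global_model.shape)     # stand-in for real training
    norm = np.linalg.norm(local_gradient)
    clipped = local_gradient * min(1.0, CLIP_NORM / norm)    # bound per-client sensitivity
    return clipped + rng.normal(scale=NOISE_STD, size=clipped.shape)  # add Gaussian noise

global_model = np.zeros(8)
for _ in range(3):                                            # three federated rounds
    updates = [client_update(global_model) for _ in range(10)]
    global_model -= 0.1 * np.mean(updates, axis=0)            # federated averaging step
print(global_model)
```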

These privacy-focused patents address real market needs while creating new areas for intellectual property protection. As privacy regulations become more stringent globally, patents covering privacy-preserving AI techniques will likely become increasingly valuable.

Personalised Interactive Systems

The trend toward personalised AI systems has generated patent applications covering techniques for customising language models to individual users or specific domains. These patents typically cover methods for efficient fine-tuning, user modelling, and adaptive response generation.

Patent protection for personalised AI systems must address both the technical challenges of customisation and the privacy concerns associated with user-specific modelling. Successful patents in this area often claim novel approaches to balancing personalisation with privacy protection.

API-Driven Platform Patents

The shift toward API-driven AI platforms has created new patent opportunities in system architecture and service delivery methods. These patents cover not just the underlying AI technologies but the systems and methods for delivering AI capabilities through scalable web services.

Platform-focused patents often claim novel approaches to model serving, load balancing, and API design that enable efficient delivery of AI capabilities to diverse applications. These patents create defensive positions around business models rather than just technical innovations.

Quantum-Enhanced NLP

Emerging research in quantum-enhanced natural language processing represents a frontier area for patent protection, transitioning from theoretical work in the late 2000s through quantum-inspired word embeddings in the mid-2010s, to current quantum circuit-based models and hybrid approaches (35).

Researchers have demonstrated the first text-level quantum NLP implementation using the QDisCoCirc model on Quantinuum's H1-1 trapped-ion processor, emphasising interpretability through compositional structure and enabling "compositional generalisation", in which components trained classically are composed into larger test instances that require quantum evaluation (36). Among emerging hybrid frameworks, Adaptive Quantum-Classical Fusion (AQCF), introduced in 2025, dynamically orchestrates transitions between classical and quantum processing with entropy-driven adaptive circuits and quantum memory banks (37).

Practical implementations include Quantum Recurrent Neural Networks (QRNNs) and Quantum Convolutional Neural Networks (QCNNs) successfully trained end-to-end on real IBM Quantum processors, demonstrating the first empirical generative language modelling on actual quantum hardware using hardware-optimised parametric quantum circuits with lightweight classical projection layers (38).
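To give a flavour of that hybrid pattern, the sketch below uses Qiskit to bind two classical feature values as rotation angles in a tiny parameterised circuit and reads out basis-state probabilities that a classical layer could consume. The circuit layout and feature values are invented for illustration and do not correspond to any published model or patent claim.

```python
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import Statevector

theta = [Parameter(f"theta_{i}") for i in range(2)]

qc = QuantumCircuit(2)
qc.ry(theta[0], 0)      # encode the first classical feature as a rotation on qubit 0
qc.ry(theta[1], 1)      # encode the second feature on qubit 1
qc.cx(0, 1)             # entangling gate mixes the two features

# Pretend these angles came from a classical embedding of a token or sentence.
features = {theta[0]: 0.3, theta[1]: 1.1}
state = Statevector.from_instruction(qc.assign_parameters(features))
print(state.probabilities_dict())   # basis-state probabilities for a classical head
```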

These quantum NLP patents typically claim hybrid classical-quantum systems that leverage quantum computing for specific components of language processing pipelines. As quantum computing hardware matures, these patents may become increasingly important for protecting next-generation AI technologies.

Patent Strategy Implications

Building Patent Portfolios

Companies developing NLP technologies must carefully consider their patent strategy in light of the existing landscape dominated by Google's foundational patents. Successful strategies often focus on application-specific innovations, implementation optimisations, or novel combinations of existing techniques.

The key to effective patent protection in NLP lies in identifying specific technical contributions that provide measurable improvements over existing approaches. These contributions might involve training efficiency, inference speed, accuracy improvements, or novel applications to specific domains.

Licensing and Cross-Licensing

The concentration of foundational NLP patents among a few major companies has created a complex licensing landscape exemplified by recent high-profile agreements. Microsoft's exclusive IP rights to OpenAI's models extend through 2032 and now include post-AGI models, with Azure API exclusivity until AGI is declared. However, OpenAI can now jointly develop some products with third parties, though API products remain exclusive to Azure (39).

In a surprising 2025 development, OpenAI signed an unprecedented cloud deal with Google to access computing capacity, reducing its dependency on Microsoft despite their competitive AI rivalry (40). This demonstrates how massive computational demands are reshaping competitive dynamics and forcing strategic partnerships even among rivals.

Cross-licensing agreements have become increasingly important as companies recognise their mutual dependence on each other's innovations. The licensing landscape extends beyond core technologies to commercial applications: OpenAI can now release open-weight models meeting requisite capability criteria and provide API access to US government national security customers (39). These agreements enable continued innovation while providing some protection against patent litigation risks.

Open Source and Patent Risks

The widespread use of open source NLP frameworks creates potential patent risks for companies building on these technologies. While open source licences provide copyright protection, they typically do not address patent licensing issues.

Companies must carefully evaluate the patent landscape when choosing open source NLP technologies and consider obtaining appropriate patent licences when building commercial products. This evaluation process requires understanding both the specific patents that might be implicated and the licensing terms offered by patent holders.

Enforcement and Litigation Trends

The Absence of Major NLP Patent Litigation

Despite the valuable patent portfolios held by major technology companies, we have not yet seen significant litigation around core NLP patents. This absence of litigation likely reflects several factors, including the mutual dependence of companies on each other's innovations and the ongoing uncertainty about patent validity under Section 101 eligibility standards following the Supreme Court's Alice Corp. v. CLS Bank decision (2014) (41).

The Alice/Mayo framework established a two-step test for patent eligibility: courts must first determine whether claims are directed to patent-ineligible concepts (laws of nature, natural phenomena, or abstract ideas), then assess whether claim elements "transform" the exception into a patent-eligible application (42). In July 2024, the USPTO issued comprehensive guidance on AI patent subject matter eligibility, emphasising that Step 2A arguments—determining whether claims are directed to abstract ideas—are most effective for AI inventions (43, 44).

The USPTO's 2025 memorandum reminded examiners in software and AI arts to carefully evaluate whether claims are directed to judicial exceptions, distinguishing genuine technological improvements from abstract ideas with computer implementation (45). This ongoing legal uncertainty may contribute to the current litigation détente.

The current détente may not persist as the commercial stakes continue to grow and as patents begin to mature. Companies should prepare for potential litigation by building strong defensive patent portfolios and establishing clear prior art records for their innovations.

International Patent Strategies

NLP patent protection extends beyond the United States to major markets including Europe, China, and Japan. Companies must consider international filing strategies that provide protection in key commercial markets while managing the costs and complexity of global patent prosecution.

Different jurisdictions may have varying standards for patentability of software and AI innovations, requiring tailored approaches to claim drafting and prosecution strategy. Successful global patent protection often requires coordination among patent attorneys with expertise in local requirements and practices.

Conclusion

The evolution of NLP from bag-of-words models to large language models represents one of the most dramatic technological transformations in recent history. This evolution has been accompanied by an equally dramatic transformation in the patent landscape, as companies have raced to protect their innovations in embedding technologies, transformer architectures, and generative AI systems.

We have seen how Google's early patent filings for Word2Vec and transformers established a dominant position in foundational NLP technologies, while companies like OpenAI, Meta, and Microsoft have pursued strategies focused on application-specific innovations and complementary technologies. The resulting patent landscape creates both opportunities and challenges for continued innovation in NLP.

The future of NLP patents will likely be shaped by emerging trends in multimodal AI, privacy-preserving technologies, and personalised interactive systems. Companies seeking to participate in this evolution must carefully navigate the existing patent landscape while building their own intellectual property positions around novel technical contributions.

As we look toward the future, the intersection of artificial intelligence and intellectual property law will continue to evolve. The patent landscape in NLP provides a fascinating case study in how legal frameworks adapt to rapid technological change, and how companies balance innovation with intellectual property protection in competitive markets.

The ongoing evolution of NLP technologies ensures that patent activity in this domain will remain vibrant and strategically important. Success in this landscape requires not just technical innovation, but sophisticated understanding of patent strategy, prior art landscapes, and the competitive dynamics that shape intellectual property decisions in the AI industry.


References

  1. Nature Portfolio. "Mapping the technological evolution of generative AI: a patent network analysis." Nature Scientific Reports, 2025. https://www.nature.com/articles/s41598-025-26810-7

  2. AI Business. "OpenAI, Microsoft or Google: Who's Winning the Gen AI Patent Battle?" 2024. https://aibusiness.com/nlp/openai-microsoft-or-google-who-s-winning-the-gen-ai-patent-battle-

  3. AI Business. "OpenAI, Microsoft or Google: Who's Winning the Gen AI Patent Battle?" 2024. https://aibusiness.com/nlp/openai-microsoft-or-google-who-s-winning-the-gen-ai-patent-battle-

  4. Justia Patents. "System and method for automatically classifying text." US Patent Application 20020022956, February 21, 2002. https://patents.justia.com/patent/20020022956

  5. Google Patents. "Multilevel taxonomy based on features derived from training documents classification using fisher values as discrimination values." US Patent 6233575B1, 2001. https://patents.google.com/patent/US6233575B1/en

  6. Google Patents. "Document information retrieval using global word co-occurrence patterns." US Patent 5675819A, 1997. https://patents.google.com/patent/US5675819A/en

  7. Google Patents. "User interface and other enhancements for natural language information retrieval system and method." US Patent 6026388A, 2000. https://patents.google.com/patent/US6026388A/en

  8. Google Patents. "Method and system for optimally searching a document database using a representative semantic space." US Patent 6847966B1, 2005. https://patents.google.com/patent/US6847966B1/en

  9. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781.

  10. Google Research. "Efficient Estimation of Word Representations in Vector Space." 2013. https://research.google/pubs/efficient-estimation-of-word-representations-in-vector-space/

  11. Google Patents. "Computing numeric representations of words in a high-dimensional space." US Patent 10922488B1, 2021. https://patents.google.com/patent/US10922488B1/en

  12. Wikipedia. "Word2vec." https://en.wikipedia.org/wiki/Word2vec

  13. Pennington, J., Socher, R., & Manning, C. D. (2014). "GloVe: Global vectors for word representation." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532-1543. https://aclanthology.org/D14-1162/

  14. Stanford NLP. "GloVe: Global Vectors for Word Representation." https://www-nlp.stanford.edu/projects/glove/

  15. Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). "Enriching word vectors with subword information." Transactions of the Association for Computational Linguistics, 5, 135-146. https://aclanthology.org/Q17-1010/

  16. Meta Research. "Enriching Word Vectors with Subword Information." 2017. https://research.facebook.com/publications/enriching-word-vectors-with-subword-information-2/

  17. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). "Sequence to sequence learning with neural networks." Advances in Neural Information Processing Systems, 27. https://papers.nips.cc/paper_files/paper/2014/hash/a14ac55a4f27472c5d894ec1c3c743d2-Abstract.html

  18. NIPS Proceedings. "Sequence to Sequence Learning with Neural Networks." 2014. https://proceedings.neurips.cc/paper_files/paper/2014/file/a14ac55a4f27472c5d894ec1c3c743d2-Paper.pdf

  19. Google Patents. "Attention-based sequence transduction neural networks." US Patent 10452978B2, 2019. https://patents.google.com/patent/US10452978B2/en

  20. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). "Attention is all you need." Advances in Neural Information Processing Systems, 30. https://arxiv.org/abs/1706.03762

  21. Semantic Scholar. "Attention is All you Need." https://www.semanticscholar.org/paper/Attention-is-All-you-Need-Vaswani-Shazeer/204e3073870fae3d05bcbc2f6a8e263d9b72e776

  22. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). "BERT: Pre-training of deep bidirectional transformers for language understanding." Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, 4171-4186. https://aclanthology.org/N19-1423/

  23. Google Research. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." https://research.google/pubs/bert-pre-training-of-deep-bidirectional-transformers-for-language-understanding

  24. Google Patents. "Attention-based sequence transduction neural networks." US Patent 10956819B2, 2021. https://patents.google.com/patent/US10956819B2/en

  25. Google Patents. "Unified Vision and Dialogue Transformer with BERT." US Patent Application 20210232773A1, 2021. https://patents.google.com/patent/US20210232773A1/en

  26. Google Patents. "Multimodal dialogs using large language model(s) and visual language model(s)." US Patent Application 20250005293A1, 2025. https://patents.google.com/patent/US20250005293A1/en

  27. Google Patents. "Systems and methods for a vision-language pretraining framework." US Patent Application 20240160853A1, 2024. https://patents.google.com/patent/US20240160853A1/en

  28. Justia Patents. "Method and apparatus for generating language model using crossmodal information." US Patent Application 20240176959, May 30, 2024. https://patents.justia.com/patent/20240176959

  29. arXiv. "DesignCLIP: Class-aware Design Pattern Recognition using Vision-Language Models." Computer Science > Computer Vision and Pattern Recognition, 2024. https://arxiv.org/abs/2508.15297

  30. arXiv. "Language-Informed Multimodal Patent Retrieval." Computer Science > Computer Vision and Pattern Recognition, 2024. https://arxiv.org/abs/2404.19360

  31. Google Patents. "Privacy-preserving federated learning." US Patent 12160504B2, December 2024. https://patents.google.com/patent/US12160504B2/en

  32. Google Patents. "System and Method for Federated Learning." US Patent Application 20240211633A1, 2024. https://patents.google.com/patent/US20240211633A1/en

  33. OpenReview. "PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs." ICLR 2024 Workshop on Private ML. https://openreview.net/attachment?id=SbyqM0cJM0&name=pdf

  34. IACR ePrint. "Distributed Matrix Mechanism for Differentially Private Federated Learning." 2024. https://eprint.iacr.org/2024/1665.pdf

  35. PatSnap Eureka. "Quantum Computing in Enhancing Natural Language Processing." 2024. https://eureka.patsnap.com/report-quantum-computing-in-enhancing-natural-language-processing

  36. arXiv. "Scalable and interpretable quantum natural language processing: an implementation on trapped ions." 2024. https://arxiv.org/abs/2409.08777

  37. arXiv. "Adaptive Quantum-Classical Fusion (AQCF): A Hybrid Framework for NISQ-era Language Processing." Quantum Physics, 2024. https://arxiv.org/abs/2508.07026

  38. arXiv. "Training Quantum Recurrent Neural Networks on IBM Quantum Processors." Quantum Physics, 2024. https://arxiv.org/abs/2512.12710

  39. SEC Filing. Microsoft Corporation Form 8-K. https://www.sec.gov/Archives/edgar/data/789019/000119312525256310/msft-ex99_2.htm

  40. Reuters. "Exclusive: OpenAI taps Google in unprecedented cloud deal despite AI rivalry." June 10, 2025. https://www.reuters.com/business/retail-consumer/openai-taps-google-unprecedented-cloud-deal-despite-ai-rivalry-sources-say-2025-06-10/

  41. Supreme Court. "Alice Corp. v. CLS Bank International." 573 U.S. 208 (2014). https://www.law.cornell.edu/supremecourt/text/13-298

  42. Reuters Legal. "Navigating patent eligibility for AI inventions after the USPTO's AI guidance update." October 8, 2024. https://www.reuters.com/legal/legalindustry/navigating-patent-eligibility-ai-inventions-after-usptos-ai-guidance-update-2024-10-08/

  43. Bitlaw. "2024 Guidance Update on Patent Subject Matter Eligibility, Including on Artificial Intelligence." July 2024. https://www.bitlaw.com/source/pto/AI-Subject-Matter-Eligibility.html

  44. USPTO. "2024 Guidance Update on Patent Subject Matter Eligibility, Including on Artificial Intelligence." July 2024. https://www.uspto.gov/sites/default/files/documents/memo-101-20250804.pdf

  45. USPTO. "Memorandum on Section 101 Eligibility Guidance for Examiners." August 4, 2025. https://www.uspto.gov/sites/default/files/documents/memo-101-20250804.pdf

  46. Google Patents. "Systems and methods for generating code using language models trained on computer code." US Patent 12061880B2, 2024. https://patents.google.com/patent/US20240020096A1/en

  47. Go Fish Digital. "OpenAI's Latest Patents Point Straight to Semantic SEO." 2024. https://gofishdigital.com/blog/openai-patent-semantic-search/

  48. Google Patents. "Large language model-based virtual assistant for high-level goal contextualized action recommendations." US Patent Application 20250053430A1, 2025. https://patents.google.com/patent/US20250053430A1/en

  49. Google Patents. "Response generation using a retrieval augmented AI model." US Patent Application 20240346256A1, 2024. https://patents.google.com/patent/US20240346256A1/en

  50. Google Patents. "Generating semantic vector representation of natural language data." US Patent 12086552B2, 2024. https://patents.google.com/patent/US12086552/en

  51. Roudsari, A. H., Afshar, J., Lee, W., & Lee, S. (2019). "Patent classification by fine-tuning BERT language model." World Patent Information, 54, 101-111.

  52. Google AI. "BERT for Patents White Paper." 2019. https://services.google.com/fh/files/blogs/bert_for_patents_white_paper.pdf

  53. arXiv. "LLM-RAG for Patent Retrieval Systems." Computer Science > Information Retrieval, 2024. https://arxiv.org/abs/2508.14064

  54. arXiv. "A Survey on Patent Analysis: From NLP to Multimodal AI." 2024. https://arxiv.org/html/2404.08668v3

  55. Academia. "Leveraging the BERT algorithm for Patents with TensorFlow and BigQuery." 2024. https://www.academia.edu/144926665/

  56. WIPO. "WIPO Issues A Patent Landscape Report On Generative Artificial Intelligence (GenAI)." Mondaq, 2024. https://www.mondaq.com/unitedstates/patent/1526956/wipo-issues-a-patent-landscape-report-on-generative-artificial-intelligence-genai

  57. CB Insights. "AI content licensing deals: Where OpenAI, Microsoft, Google, and others see opportunity." 2024. https://www.cbinsights.com/research/ai-content-licensing-deals/
