A joint letter to graph customers, the graph curious, and the Cypher community:
Last week, the database world reached a significant milestone. The International Organization for Standardization (ISO) published GQL, a new database language standard designed for property graphs. GQL, which stands for Graph Query Language, is the first new ISO database language since the introduction of SQL in 1987. This milestone has been eagerly anticipated by the graph community for many years, with many companies, including Neo4j and Amazon, actively advocating and contributing to its development.
A GQL standard makes it straightforward to use graphs
We’ve often quipped that there are far more graph problems than customers who realize that their problem is best handled as a graph. As we enter into the world of generative artificial intelligence (AI), there is an even greater explosion of applications where graphs are critical for getting accurate, reliable, and explainable results quickly, like GraphRAG. With GQL as a standard, practitioners and buyers alike will be able to use graph technologies with even greater confidence.
Cypher is the best and fastest path to GQL
The question you are probably asking is: What does this mean for my skills and code, and what does it mean for Cypher? We’re here with some good news: all of you who use Cypher now have a well-paved onramp to GQL. Because the two languages have been on a natural and deliberate convergence course, your best path to GQL is to simply keep using Cypher as it evolves. You also have our commitment that we will continue supporting Cypher for many years. In other words, you can put away your forklift!
The core syntax for GQL and Cypher is largely identical
Many aspects of GQL are identical to Cypher. Most critically, the query structure is the same. GQL supports the familiar MATCH … RETURN statements, and uses ASCII art to describe graph patterns. Likewise, GQL uses the same basic expressions, linear composition, and more.
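To make the shared shape concrete, here is a minimal sketch of a query that should read the same way in Cypher and in GQL. The Person label, KNOWS relationship type, and name property are hypothetical and chosen only for illustration.

```cypher
// Find the names of the people that Alice knows.
// (p)-[:KNOWS]->(f) is the "ASCII art" pattern: nodes sit in parentheses,
// and the relationship is an arrow with its type in square brackets.
MATCH (p:Person)-[:KNOWS]->(f:Person)
WHERE p.name = 'Alice'
RETURN f.name AS friend
```

The query is a linear pipeline: MATCH binds the pattern, WHERE filters it, and RETURN projects the result.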
Where differences exist, support for the new GQL form will be added
Certain things are different in GQL than in Cypher. For example, GQL uses the keyword INSERT to add a node or relationship to the graph, whereas Cypher uses CREATE. Similarly, a new FOR statement in GQL does the equivalent of UNWIND. In cases such as these, the current Cypher language will remain supported, and we will also be adding support for the GQL variation, so that you can shift over to the GQL syntax in your own time.
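As a rough illustration of the two differences named above, the sketch below puts the Cypher form next to its GQL counterpart. The labels and property values are hypothetical, and the GQL forms are written as comments because support for them varies by product and version.

```cypher
// Cypher today: add two nodes and a relationship with CREATE.
CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'});

// GQL form of the same statement:
// INSERT (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'})

// Cypher today: turn a list into one row per element with UNWIND.
UNWIND [1, 2, 3] AS x
RETURN x;

// GQL form of the same statement:
// FOR x IN [1, 2, 3]
// RETURN x
```

In both pairs the pattern and expression syntax is unchanged; only the leading keyword (and, for FOR, the position of the variable name) differs.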
Cypher functionality not yet in GQL remains supported as vendor extensions
The v1 GQL standard is substantial: roughly equivalent in size and scope to the SQL-92 standard. (For reference, the first version of ISO SQL was SQL-87.) Even so, not everything in Cypher could make it into the v1 GQL standard. That’s OK. GQL will get there. In the meantime, GQL allows for vendor extensions, so much of what you use today that falls outside the standard remains valid. Over time, we expect that commands currently in our products but not in GQL will be proposed for the GQL standard. Examples of commands in this category are MERGE, FOREACH, and LOAD CSV.
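For readers less familiar with these commands, here is a small sketch of MERGE, one of the Cypher features that continues to work as a vendor extension. The Person label and the createdAt property are hypothetical.

```cypher
// MERGE matches the node if it already exists and creates it otherwise,
// so running this statement twice still leaves a single Alice node.
MERGE (p:Person {name: 'Alice'})
ON CREATE SET p.createdAt = timestamp()  // only applied when the node is new
RETURN p.name, p.createdAt
```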
New GQL functionality offers opportunities for new capabilities
Finally, some great new functionality is available in GQL, such as quantified path patterns, which provide advanced pattern matching. Again, these are great additions for graph querying, and some vendors have already implemented parts of this. This work will continue, and we will add all of this to openCypher over time so that users looking for a straightforward path to GQL can gain access to this great functionality.
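As a sketch of what a quantified path pattern looks like, the query below repeats a one-hop pattern between one and three times. It assumes an engine that already supports this syntax, and the Person label and KNOWS relationship type are hypothetical.

```cypher
// The parenthesized sub-pattern ()-[:KNOWS]->() is repeated 1 to 3 times,
// so b is anyone reachable from Alice within three KNOWS hops.
MATCH (a:Person {name: 'Alice'}) (()-[:KNOWS]->()){1,3} (b:Person)
RETURN DISTINCT b.name AS reachable
```

The quantifier {1,3} applies to the whole parenthesized pattern, which is what lets the repeated unit contain more than a single relationship.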
As for differences, there are some, but they are few, and the ones that exist aren’t that significant. These will be handled on a vendor-by-vendor basis, with clear compatibility flags, lots of advance notice, and deprecation flags in the way that’s typical for major versions.
Paving the way with openCypher
Many vendors who implement Cypher are doing so with openCypher. This open source framework provides an existing foundation of tools and tests for implementing Cypher inside of a product, be it a database or a tool. To help smooth and democratize the transition, openCypher will be providing new artifacts to align with GQL. These will then make their way into our product roadmaps in alignment with the needs of you, our customers and community.
This path is possible for a few reasons. First, it continues a trajectory of convergence work that has been ongoing for years. Not only was Cypher a major input into GQL, but we have been evolving the language concurrently with GQL so that the two can closely align. Another reason this makes sense is that many of the core individuals actively involved in the ISO GQL standard were (and are) also active in openCypher. All of this puts us in a great position to give you a smooth path to convergence.
Getting to GQL
We will do this in alignment with your priorities and needs, working backward from the customer, to make Cypher an implementation of GQL, both in our products and in openCypher. You will see the term GQL show up increasingly often where it makes sense. You will retain the flexibility to use both Cypher and GQL syntax styles when you want, and the term Cypher isn’t going away anytime soon. You will get the best of both worlds: a seamless multi-year transition with lots of optionality, and a strong and familiar path towards all the benefits of a formal international standard.
Cypher to GQL: Practical considerations
By definition, a database language standard transcends any and all details about an implementation. However, as the world continues its rapid transition to the cloud, we would be remiss in not saying some words about this increasingly common deployment scenario. For the large and growing number of you who are using, or who want to use, a managed graph database service, the transition to GQL will be even more seamless. The nature of a managed platform is that new syntaxes and features will automatically show up for you to use, speeding your path to GQL.
Generative AI impacts
A blog post would not be complete in 2024 without a mention of generative AI. As knowledge graphs increasingly become a part of the generative AI stack, a soft transition lets you get better usage from large language models (LLMs). They have, after all, already been trained on over 10 years of Cypher examples that are spread throughout the internet. As the term GQL becomes increasingly used alongside Cypher, models will gradually evolve to understand both.
Wrapping it all up
Thanks for being on the journey with us! With a formal ISO GQL standard as an added wind to all of our sails and a clear and smooth transition path for skills, certification, and query code, you will be able to continue to use your investments, while also benefiting from the power of the new standard. We are excited for this big step, which makes the graph community an even more exciting place to be.
“This post is a joint collaboration between Neo4j and AWS and is being cross-published on both the Neo4j blog and the AWS Database Blog.”
About the authors
Philip Rathle is the Chief Technology Officer (CTO) of Neo4j, a graph database and analytics company that has enabled organizations worldwide to solve complex problems through data connections. Before taking on the CTO role, Philip led Product Management at the company for over ten years, transforming it from a single database product to a comprehensive portfolio and helping to build the category.
Brad is a Director at AWS, and the General Manager of Amazon Neptune and Timestream, AWS’s fully managed graph database (Neptune) and managed time-series database (Timestream). He believes that graphs and time series are awesome, and they help customers use the relationships in their data to gain insights. Prior to joining AWS in 2016, he was based in Washington, DC, where he was the CEO of Blazegraph and an active open-source contributor on the Blazegraph platform.