In PostgreSQL, object binding (or dependencies) encompasses the relationships existing among various database elements. These interdependencies hold significant importance when it comes to the management and modification of objects within the database. They ensure that adjustments made to one object don’t inadvertently disrupt other dependent objects.
For instance, when a view relies on a table, any alterations to the table structure, such as modifying utilized columns, altering data types, or even dropping the table, can directly influence the functionality of the associated view. To make informed choices and preempt potential issues, it’s imperative to grasp the intricacies of these dependencies.
In this blog series, we explore the concept of object dependencies in PostgreSQL and their significance in managing a database effectively. Understanding and managing object dependencies is crucial for ensuring data integrity and making changes to the database without causing unexpected issues.
In this first post, we provide a comprehensive introduction to object dependencies and discuss various types of dependencies with examples. We also provide the queries that help you find the dependencies based on which you can take appropriate actions.
Prerequisites
To follow along with this post, complete the following prerequisites:
Create an Amazon Aurora PostgreSQL-Compatible Edition cluster or Amazon Relational Database Service (Amazon RDS) for PostgreSQL instance if you don’t already have one. For instructions, refer to Create an Amazon Aurora PostgreSQL-Compatible DB cluster or Create a PostgreSQL DB instance, respectively.
Create an Amazon Elastic Compute Cloud (Amazon EC2) instance to install the PostgreSQL client to access the Aurora PostgreSQL or RDS for PostgreSQL instance. For instructions, refer to Create your EC2 resources and launch your EC2 instance. Or you can set up connectivity between your RDS database and EC2 compute instance in one click.
On Amazon Linux 2023, you can use the following commands to download the psql command line tool to run all the sql statements:
Types of object dependencies
In this section, we introduce the different types of object dependencies and provide example code snippets.
Columns
Column dependencies in PostgreSQL can occur when one column relies on another column within the same table. This is commonly seen in scenarios where a calculated or derived column is based on the values of other columns. By establishing column dependencies, PostgreSQL ensures that the derived column is automatically updated whenever the values of the dependent columns change. This helps maintain data consistency and accuracy within the table.
Let’s consider the following example.
Create a sales table:
In this example, the sales table has a calculated column, total_amount, which depends on the values of unit_price and quantity. The total_amount column is generated as the product of unit_price and quantity. This is a typical use case where the values of one column depend on the values of other columns in the same table.
The GENERATED ALWAYS AS clause is used to define the derived column, and the STORED keyword ensures that the value is physically stored, allowing for efficient querying.
This is just one example of column dependencies in PostgreSQL, showcasing how you can create calculated or derived columns that rely on other columns within the same table.
Indexes
When an index is created on a table, it becomes dependent on that table. Any changes made to the table, such as inserting, updating, or deleting rows, can impact the index. For example, if a row is inserted into the table, the index is updated to include the new row for efficient data retrieval.
Let’s consider an example to illustrate this concept:
In this example, we create a table called customers with columns for customer information. We then create an index called idx_customers_email on the email column. The index is dependent on the customers table.
Now, suppose we insert a new row into the customers table:
This insertion triggers an update to the idx_customers_email index, ensuring that it includes the new row for efficient email-based searches.
Sequences
A sequence is an object that generates a sequence of unique values. It is often used to generate unique identifiers for tables.
When a column in a table is defined as a sequence, it establishes a dependency between the column and the sequence object. The column depends on the sequence to generate the next unique value when inserting new records into the table.
Let’s consider the following example:
In this example, we first create a sequence named employee_id_seq using the CREATE SEQUENCE statement. This sequence will be used to generate unique values for the employee_id column.
Next, we create a table called employees with three columns: employee_id, first_name, and last_name. The employee_id column is defined with a default value of nextval(’employee_id_seq’), which means that it will automatically obtain the next value from the sequence when a new record is inserted. The employee_id column is also declared as the primary key of the table.
User-defined data type
A user-defined data type (UDT) represents a structure or record containing single or multiple fields or attributes. It allows you to group related data together into a single object.
UDTs are used to define columns in tables or as the return type of functions. When a table column or a function returns a UDT, it establishes a dependency between the UDT and the table or function.
Let’s consider following example:
In this example, the employees table has a dependency on the composite data type employee_info. Any changes made to the definition of the employee_info type will affect the employees table.
Views
Views are virtual tables that are generated from the result of a query. They depend on the underlying tables and columns used in the query that defines the view. Any changes made to the referenced tables or columns can impact the views built on them. Views provide a dynamic and user-friendly way to interact with complex data structures, making them an essential aspect of database design.
Let’s consider the following example:
The orders_summary view relies on the orders table for its operation. Any adjustments applied to the orders table columns utilized in the view, whether it involves adding or removing columns or altering the data types of columns, can have an effect on how the orders_summary view functions.
In this specific case, the orders_summary view selects the order_id, customer_id, and order_date columns from the orders table. Therefore, any changes to these columns in the orders table will be reflected in the orders_summary view.
Functions and Procedures
Functions in PostgreSQL can have dependencies on tables, views, or other functions. Understanding these dependencies is crucial when developing complex database functions, because it ensures their reliability and consistency.
Let’s consider the following example:
This function, calculate_salary, depends on a table named employees. It retrieves the wage and hours values from the employees table based on the employee_id parameter. The function then calculates and returns the salary by multiplying the hourly_wage and hours_worked values.
This function demonstrates a dependency on the employees table because it relies on the existence of the table and the availability of the wage and hours columns within that table. Any changes to the table structure or the column names or types that were directly referenced in the function could potentially impact the functionality of this function.
Constraints
Constraints, such as primary key or unique constraints, establish dependencies on associated columns or tables. For example, a foreign key constraint on a table creates a direct dependency on another table. This ensures that the data in the referencing table adheres to specific rules defined in the referenced table. Constraints play a critical role in enforcing data integrity within a database.
Let’s consider the following example:
In this example, the orders_tab table has a dependency on the customers table through the customer_id foreign key constraint.
Triggers
Triggers are database objects that automatically run in response to specific events, such as data modifications. They can depend on tables, views, or other triggers. When a trigger is fired, it may run actions that depend on the data or structure of the database, creating a dynamic and event-driven aspect of database management.
Let’s consider the following example:
In this example, the trigger function update_total_sales() multiplies the quantity by 10 (assuming that the price per unit is 10) and updates the total_sales column. The trigger products_trigger is set to run this function before an INSERT or UPDATE operation on the products table.
Inspecting and analyzing object dependencies
In PostgreSQL, it is possible to determine what objects, such as tables, views, or functions, have dependencies on each other. This information is stored in the PostgreSQL system catalog, a set of tables and views that contain metadata about all the objects in the database. By querying these system catalog views with the appropriate SQL, you can gain insights into object dependencies. These dependencies can be important to understand when making changes to database objects, as changes can have cascading effects on dependent objects. Thus, understanding these dependencies can help you maintain the integrity and functionality of your database.
Here are some of the basic catalog views that PostgreSQL provides:
pg_attribute – This table stores detailed information regarding attributes for tables, views, composite types, and indices.
pg_namespace – This table holds information related to database schemas.
pg_class – It functions as a directory for various database objects.
pg_rewrite – This table contains rules for transforming queries and views.
pg_depend – It is used to track information about dependencies between different database objects.
pg_shdepend – This table is used to monitor dependencies on objects that are utilized across multiple databases.
With these tools, you can view the inner workings of your database and understand how different parts are linked. This knowledge helps you make smart decisions when you’re managing your PostgreSQL database, almost like being an investigator in a puzzle, figuring out how all the pieces fit together.
Queries to find the object dependencies
In this section, we walk you through the queries to find the object dependencies for each object type.
Find foreign key dependencies
To find the table dependencies with respect to referential integrity constraint references, use the following query:
From previous examples, the orders and order_items tables are dependent. The order_id column of the order_items table is dependent on the order_id column of the orders table. Let’s run the query for the orders table and see the dependent table:
Following is the output of the query:
The query shows the order_items table and order_id column as dependent in the table created.
Find table type dependencies
The following is a query to find the dependent table type for a table:
From our other example, the car table has a dependency on Person table because one of the column’s data types in the car table is the person table.
Following is the output of the query:
Find column dependencies on another column
The following is the query to find the dependent columns for a column in the same table:
From the example, the sales table has a total_amount column, which is a derived column from the quantity and unit_price columns. The following query shows those dependent columns:
Following is the output of the query:
Find indexes of a table
The following is a query to find the indexes for a table:
From the example, the customers table has a unique key and an index. The following is a query to find the indexes for the customers table:
Following is the output of the query:
Find sequence dependencies on the tables
The following is the query to find the sequence dependencies:
From the example, the employees table has an employee_id column with a default value to the next value from a sequence. The following is the query to find the sequence that refers to the employees table:
Following is the output of the query:
Find UDT dependencies on tables and columns
The following is the query to find UDTs defined to a table’s column:
From the example, the employees_info table’s info column is defined as the employee_info UDT. The following query finds the dependency:
Following is the output of the query:
Find view dependencies
The following is the query to find the dependent views of a table:
From the example, the orders_summary view is based on the orders table. The following query finds the dependent views of the orders table:
Following is the output of the query:
Find triggers
The following is the query to find the trigger dependencies of a table:
From the example, the products table has products_trigger. The following query finds the trigger dependencies of the products table:
Following is the output of the query:
Conclusion
In this first part of our series on object dependencies in PostgreSQL, we introduced the concept and discussed different types of dependencies. We also explored the importance of understanding object dependencies in managing a database effectively. Additionally, we highlighted some of the views and functions available in PostgreSQL for querying object dependencies.
Stay tuned for the next part of our series, where we will dive deep into the tools and techniques for managing object dependencies in PostgreSQL. We will explore how to use DROP and ALTER statements to handle dependencies and discuss best practices for avoiding potential problems.
By gaining a deeper understanding of object dependencies, you can optimize your PostgreSQL database operations, ensure data integrity, and make informed decisions when making changes to your database.
About the authors
Baji Shaik is a Sr. Database Consultant with AWS Professional Services, Global Competency Centre. He is a Database Migration Expert and has developed many successful database solutions addressing challenging business requirements for moving databases from on premises Oracle, SQL Server to Amazon RDS and Amazon Aurora PostgreSQL/MySQL. He is an eminent author, having written several books on PostgreSQL. A few of his recent works include “
Procedural Programming with PostgreSQL PL/pgSQL†“
PostgreSQL Configuration,†“
Beginning PostgreSQL on the Cloud,†and “
PostgreSQL Development Essentials.†Furthermore, he has delivered several conference and workshop sessions.
Wajid Ali Mir is a Database Consultant at AWS and works as a database migration specialist, helping and enabling customers to build highly available, cost-effective database solutions and migrate their database workloads to the AWS Cloud. Wajid works with AWS customers and partners to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.
Viswanatha Shastry Medipalli is a Consultant with the AWS ProServe team. His background spans a wide depth and breadth of expertise and experience in database migrations. He has architected and designed many successful database solutions addressing challenging business requirements. He has provided solutions using Oracle, SQL Server and PostgreSQL for reporting, BI, applications, and development support. He also has a good knowledge of automation, and orchestration. His focus area is homogeneous and heterogeneous migrations of on-premise databases to Amazon.
Yaser Raja is a Principal Strategic Tech Advisor with Professional Services team at Amazon Web Services. He works with customers to build scalable, highly available and secure solutions in AWS cloud. His focus area includes database migrations and modernization, and leveraging Generative AI capabilities to build innovative solutions for the customers.â€
Source: Read More