Engineering Resources

An Overview of Relational Databases with AlgoJS

An in-depth look at relational databases, including what they are, how they're unique, and when to use them.

AlgoJS is streamlining the learning process for interview preparation for beginner Software Engineers one step at a time, making engineering roles more accessible to beginners from all backgrounds. You can find more resources from AlgoJS on YouTube.

In the simplest terms, a database is a systematic collection of data.

Its significance lies in its ability to provide organized and scalable storage, ensuring data integrity and easily supporting complex data retrieval. Databases can range from small systems that run on personal computers to massive ones that operate on servers.

Choosing the right database system is a critical strategic decision that impacts cost optimization and resource management. It's a bit like selecting the heart of your digital infrastructure—the choice dictates the immediate functionality and sets the stage for future growth and adaptability.

Investing time in selecting the most compatible and scalable database system is crucial for long-term technological and business success.

Relational databases vs. NoSQL

Relational databases are the go-to for structured data and transactional consistency, while NoSQL offers flexibility and scalability for varied data types and massive volumes. Choosing which database to use depends on your application's specific needs.

Here’s a breakdown of these common options:

Relational Databases (RDBMS): These are the most traditional form of databases. They store data in tables, which are interconnected through relationships. This type of database uses Structured Query Language (SQL) for data manipulation. Examples include MySQL, PostgreSQL, and Oracle.

Non-Relational Databases (NoSQL): These databases are more flexible in terms of data storage. They don’t require a fixed schema and can handle unstructured data like documents, key-value pairs, wide-column stores, or graphs. Popular NoSQL databases include MongoDB, Cassandra, and Redis.

Relational Databases

Today, we’re focusing on relational databases. A relational database organizes data into one or more tables (or "relations") of columns and rows, with each row representing a unique instance or record and each column representing a specific field. This format allows for the structured management of data, making it easier to establish relationships between different data points.

Relational databases provide a highly organized and intuitive system for managing data.

They’re popular for many reasons, including their

Proven reliability and stability: Relational databases have been around for decades, proving their reliability and stability in managing data across various industries.
Use of SQL: SQL is a powerful and standardized language widely understood and used in the industry. This universality makes it easier for database management and manipulation.
Strong transactional support: They provide strong transactional support (following ACID properties), which is essential for applications that require high levels of data integrity and consistency, such as financial systems.
Widespread adoption and community support: Due to their long history, relational databases have a vast community of developers and users who provide extensive support, tools, and resources.
Versatility and compatibility: They are well-suited for a wide range of applications and compatible with numerous other technologies and software systems, making them a versatile choice for many organizations.

Like any design choice, relational databases have benefits and trade-offs.

Benefits:

Faster read performance since a defined, predictable schema allows for more internal optimizations.
Enforces data integrity through primary keys and foreign keys.
Lookups can be done on non-primary key columns; however, they will take longer than O(1) complexity.
Natively supports complex table joins and filter clauses.

Trade-offs:

The rigid schema makes alterations difficult, storing null fields wastes disk space, and your application code must explicitly set all required fields.
Unnecessary data integrity checks slow down performance
Getting data from multiple tables requires resource-intensive joining to piece together the whole picture.

How relational databases work

The components of relational databases—tables, rows, and columns—work together to store data in an organized manner.

The schema then acts as the database's architectural plan, defining its structure and the rules governing data storage and relationships, ensuring efficient and consistent data handling within the database.

Let’s take a look at each component.

Tables

Function: A table in a relational database is like a spreadsheet. It's a collection of related data held in a structured format within the database.

Purpose: Each table represents a single entity type. For instance, in a database for a bookstore, you might have separate tables for Books, Authors, Customers, etc.

Rows

Function: Each row in a table represents a single record or instance of the entity defined by the table.

Purpose: Continuing with the bookstore example, in the Customers table, each row would represent a specific customer, containing their unique data.

Columns

Function: Columns in a table represent an entity's attributes or properties. Each column holds a specific type of data.

Purpose: In the Books table, columns might include Title, Author, Genre, Price, etc.

Schema

The schema in a relational database is essentially the blueprint or structure of the database. It defines how data is organized and how the relationships among data are associated.

It includes:

Tables definition: The definition of tables and their structure—which columns they contain and the type of data each column holds (like text, number, date).
Constraints: The schema also specifies rules known as constraints. These could be primary keys (unique identifiers for each record in a table), or other restrictions like not allowing null values in certain columns.
Relationships: It outlines how tables are related to each other. For instance, a Book table might be linked to an Author table, indicating which author wrote each book.

By defining the structure and rules, the schema ensures that data entered into the database is accurate and consistently formatted. The schema provides a clear model of the database's structure, which is crucial for database designers, developers, and administrators in understanding and working with the data effectively.

A well-designed schema makes it easier to manage, update, and query data efficiently.

Understanding table relationships and keys

In relational databases, table relationships are fundamental for organizing and linking data efficiently and meaningfully. These relationships reflect how different pieces of data relate to each other, like how chapters in a book are interconnected to tell a complete story.

There are different types of relationships within a database:

One-to-One: Each record in one table corresponds to exactly one record in another table. For example, a table of employees and a table of employee ID cards.
One-to-Many: A single record in one table is linked to multiple records in another. Consider a table of publishers linked to a table of books; one publisher can write several books.
Many-to-Many: Records in one table relate to multiple records in another, and vice versa. An example would be a table of students and a table of classes; students can enroll in multiple classes, and each class can have multiple students.

The role of Primary Keys (PK) and Foreign Keys (FK)

Table relationships are communicated through primary and foreign keys.

A primary key is a unique identifier for each record in a table. It ensures that each record can be distinctly identified. Imagine if you couldn't distinguish between two chapters because they had the same title. Similarly, primary keys prevent that, ensuring that each record is unique.

A foreign key is a column (or set of columns) in one table that uniquely identifies a row of another table. Essentially, it's a reference to a primary key in another table.

This setup is vital for preserving data accuracy and consistency. Primary keys prevent duplicate entries and ambiguities within a table, while foreign keys ensure referential integrity across tables.

This means that the relationships defined by these keys remain reliable and consistent.

Manage relational databases with SQL

Developed in the 1970s, SQL has become the cornerstone for all relational database management systems (RDBMS), like MySQL, PostgreSQL, Oracle, and SQL Server. It provides a systematic way of creating, retrieving, updating, and managing data in a structured format. SQL's power lies in its ability to handle complex data queries and operations, making it an indispensable tool for database administrators, developers, and analysts.

Bring it all together: online bookstore database

In this simplified example of an online bookstore, the relational database organizes and connects different types of data—books, authors, customers, and orders—in a clear, structured way. Each table holds specific information, and keys are used to link these tables and pull comprehensive information when needed.

Database Schema and Table Structure

In our online bookstore, the relational database schema might consist of several tables: ‘Books’, ‘Authors’, ‘Customers’, and ‘Orders’.

Books Table: This table has columns like ’BookID’, ‘Title’, ‘Genre’, ‘AuthorID’, and ’Price’. Each row represents a different book.
Authors Table: It includes ’AuthorID’, ‘Name’, and ‘Bio’. Each row is a unique author.
Customers Table: Contains ‘CustomerID’, ‘Name’, ‘Email’, and ‘Address’.
Orders Table: Includes ‘OrderID’, ‘CustomerID’, ‘BookID’, ‘Quantity’, and ‘OrderDate’.

Data Integrity and Column Constraints

In this schema, each table has a primary key (like ’BookID’ for the ’Books’ table), ensuring unique identification. If we want to add a new column to the ’Books’ table, say ’PageCount’, we would need to define a default value for this new column for all existing records.

Also, since relational databases are rigid in their schema, every record in the ‘Books’ table must have a value for each column, and storage space is allocated accordingly.

Joining Tables to Aggregate Data

Now, suppose we want to generate a report of all orders along with book titles and customer names. This requires combining data across the ‘Books’, ‘Customers’, and ‘Orders’ tables.

Here’s where the concept of JOINs comes in:

SQL JOIN Query: We can execute an SQL query that joins these tables on their related columns. For instance, we JOIN ‘Orders’ with ‘Books’ on ‘BookID’, and Orders with ‘Customers’ on ‘CustomerID’. This allows us to fetch a combined view that includes customer names, book titles, and order details in a single query.

Handling Relationships

Foreign Keys: The ‘Orders’ table has foreign keys like ‘CustomerID’ and ‘BookID’, which reference the primary keys in the ‘Customers’ and ‘Books’ tables, respectively. This maintains referential integrity, ensuring, for example, that each order is linked to a valid book and customer.

Updating and Expanding the Database

If the business model changes (like introducing audiobooks), we might add a new table ’Audiobooks’ or modify the existing ‘Books’ table to accommodate this new category. This change would need to be reflected in the application code to handle the new data structure or table.

Want to learn more?

You can find deeper dives into all things data structures, algorithms, and system design on the AlgoJS YouTube channel.

You can also join the Formation Fellowship to gain access to unconditional support from a team of engineering mentors, technical recruiters, career coaches, and more. Take the assessment to find out where you could use support.