(nlsql-example-product)=

# NLSQL with product orders

Let's use a basic products / orders / customers database.

```sql
CREATE TABLE customers (customer_id INTEGER, name VARCHAR, city VARCHAR, email_address VARCHAR, gender_code VARCHAR);
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER);
CREATE TABLE products (product_id INTEGER, name VARCHAR, price NUMERIC(2), size VARCHAR);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER);
```

## Basic JOINs and filtering

:::{rubric} Provision
:::

Create table and insert data.
Populate the table using a few records worth of example data.

```sql
-- customers
INSERT INTO customers (customer_id, name, city) VALUES
(1, 'Alice', 'Berlin'),
(2, 'Bob', 'Munich'),
(3, 'Charlie', 'Hamburg');

-- products
INSERT INTO products (product_id, name) VALUES
(1, 'Laptop'),
(2, 'Phone'),
(3, 'Headphones');

-- orders
INSERT INTO orders (order_id, customer_id, amount) VALUES
(101, 1, 1200),
(102, 2, 800),
(103, 1, 200),
(104, 3, 150);

-- order_items
-- Alice bought Laptop, Bob bought Phone, Alice bought Headphones,
-- Charlie bought Headphones, Charlie also bought Phone.
INSERT INTO order_items (order_id, product_id) VALUES
(101, 1),  
(102, 2),
(103, 3),
(104, 3),
(104, 2);
```

:::{rubric} Query
:::

Submit a typical query in human language.

```shell
ctk query nlsql "List all customers with orders over €500."
```

:::{rubric} Response
:::

The model figures out the SQL statement, the engine runs it, and
uses the model again to come back with an answer in human language:

> The query results show that the customers 'Alice' from Berlin
> and 'Bob' from Munich have placed orders over €500.

The SQL statement was:
```sql
SELECT customers.name, customers.city
FROM customers JOIN orders ON customers.customer_id = orders.customer_id
WHERE orders.amount > 500;
```

## Advanced JOINs and filtering

:::{rubric} Provision
:::

Create table and insert data.
Add a few customers in New York and others elsewhere.
Synthesize orders with amounts both above and below the average.

```sql
INSERT INTO customers (customer_id, name, city) VALUES
(1, 'Alice Johnson', 'New York'),
(2, 'Bob Smith', 'Los Angeles'),
(3, 'Carol Lee', 'New York'),
(4, 'David Brown', 'Chicago');

INSERT INTO orders (order_id, customer_id, amount) VALUES
(101, 1, 500),   -- NY, high
(102, 1, 150),   -- NY, low
(103, 2, 300),   -- non-NY
(104, 3, 700),   -- NY, high
(105, 4, 200);   -- non-NY

INSERT INTO products (product_id, name) VALUES
(1001, 'Laptop'),
(1002, 'Phone'),
(1003, 'Tablet'),
(1004, 'Headphones');

INSERT INTO order_items (order_id, product_id) VALUES
(101, 1001),
(101, 1004),
(102, 1002),
(103, 1003),
(104, 1001),
(104, 1002),
(105, 1004);
```

:::{rubric} Query
:::

Submit a typical query in human language.

```shell
ctk query nlsql "Get the names of products that were ordered by customers in New York who spent more than the average amount."
```

:::{rubric} Response
:::

The model figures out the SQL statement, the engine runs it, and
uses the model again to come back with a synthesized response
based on the provided SQL query and its result:

> The query identifies the top 10 product names ordered by customers in New York
> who spent more than the average order amount.
> The results show that "Laptop", "Phone", and "Headphones" were among the most
> popular products purchased by New York customers with high spending.

The SQL statement was:
```sql
SELECT
    p.name FROM products AS p
    JOIN order_items AS oi ON p.product_id = oi.product_id
    JOIN orders AS o ON oi.order_id = o.order_id
    JOIN customers AS c ON o.customer_id = c.customer_id
WHERE
    c.city = 'New York'
ORDER BY
    o.amount DESC LIMIT 10;
```

## JOINs and grouping

:::{rubric} Provision
:::

```sql
INSERT INTO customers (customer_id, name, city, email_address, gender_code) VALUES
(1, 'Alice Johnson', 'New York', 'alice@example.com', 'F'),
(2, 'Bob Smith', 'Los Angeles', 'bob@example.com', 'M'),
(3, 'Carol Lee', 'Chicago', 'carol@example.com', 'F'),
(4, 'David Brown', 'Houston', 'david@example.com', 'M'),
(5, 'Eva Green', 'Phoenix', 'eva@example.com', 'F'),
(6, 'Frank Miller', 'Miami', 'frank@example.com', 'M'),
(7, 'Grace Kim', 'Seattle', 'grace@example.com', 'F'),
(8, 'Henry Davis', 'Boston', 'henry@example.com', 'O'); -- least common gender

INSERT INTO orders (order_id, customer_id, amount) VALUES
(101, 1, 120),
(102, 2, 200),
(103, 3, 150),
(104, 4, 300),
(105, 6, 80);

INSERT INTO products (product_id, name, price, size) VALUES
(1001, 'T-Shirt', 20, 'M'),
(1002, 'Jeans', 50, 'L'),
(1003, 'Jacket', 80, 'XL'),
(1004, 'Sneakers', 60, '42'),
(1005, 'Hat', 15, 'S');

INSERT INTO order_items (order_id, product_id) VALUES
(101, 1001),
(101, 1005),
(102, 1002),
(103, 1003),
(104, 1004),
(105, 1001);
```

:::{rubric} Q & A
:::

- Q: What are the email address and town of the customers who are of the least common gender?
  SQL: `SELECT email_address, city FROM customers GROUP BY gender_code ORDER BY count(*) ASC LIMIT 1`
- Q: What are the product price and the product size of the products whose price is above average?
  SQL: `SELECT products.price, products.size FROM products WHERE products.price > (SELECT AVG(price) FROM products)`
- Q: Which customers did not make any orders?
  SQL: `SELECT c.name FROM customers AS c LEFT JOIN orders AS o ON c.customer_id = o.customer_id WHERE o.order_id IS NULL;`