Case Study: Danny’s Diner

Context

Danny seriously loves Japanese food so in the beginning of 2021, he decides to embark upon a risky venture and opens up a cute little restaurant that sells his 3 favourite foods: sushi, curry and ramen.

Danny’s Diner is in need of your assistance to help the restaurant stay afloat – the restaurant has captured some very basic data from their few months of operation but have no idea how to use their data to help them run the business.

Problem Statement

Danny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers.

He plans on using these insights to help him decide whether he should expand the existing customer loyalty program – additionally he needs help to generate some basic datasets so his team can easily inspect the data without needing to use SQL.

Questions
  1. What is the total amount each customer spent at the restaurant?
  2. How many days has each customer visited the restaurant?
  3. What was the first item from the menu purchased by each customer?
  4. What is the most purchased item on the menu and how many times was it purchased by all customers?
  5. Which item was the most popular for each customer?
  6. Which item was purchased first by the customer after they became a member?
  7. Which item was purchased just before the customer became a member?
  8. What is the total items and amount spent for each member before they became a member?
  9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier – how many points would each customer have?
  10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi – how many points do customer A and B have at the end of January

Answers & Solutions


SELECT sales_customer, SUM(menu.price) AS total_spent_per_customer
FROM dannys_diner.sales
INNER JOIN dannys_diner.menu
ON sales.product_id = menu.product_id
GROUP BY sales.customer_id
ORDER BY total_spent_per_customer DESC;
Q1. What is the total amount each customer spent at the restaurant?

Answer:

Customer_idtotal_spent_per_customer
A76
B74
C36

Solution: since we are trying to determine the total amount each customer spent, we should use the SUM aggregate function on the price column in the menu schema to add up the values representing how much each customer (customer_id) spent on purchased menu items. We use AS to assign the result of the clause to the alias total_spent_per_customer. The ORDER BY clause organizes the rows by total_spent_per customer values from greatest to least.


SELECT sales.customer_id, COUNT(DISTINCT sales.order_date) AS days_visited_restaurant
FROM dannys_diner.sales
GROUP BY sales.cusomter_id
ORDER BY days_visited_restaurant DESC;
Q2. How many days has each customer visited the restaurant?

Answer:

customer_iddays_visited_restaurant
B6
A4
C2

Solution: We select the customer_id column from the sales table and perform the aggregate COUNT function to add up the distinct occurrences of order_date values in the sales table. The DISTINCT clause will prevent us from counting duplicate values in the order_date column and is important because customers could potentially make multiple purchases within a given day. The result of this clause will be categorized under days_visited_restaurant using the AS function. We group by sales.customer_id and order by days_visited_restaurant from greatest to least.


WITH sales.CTE AS(SELECT
sales.customer_id,
menu.product_name,
sales.order_date,
RANK() OVER (PARTITION BY sales.customer_id ORDER BY sales.order_date ASC) AS purchase_order
FROM dannys_diner.sales
INNER JOIN dannys_diner.menu
ON sales.product_id = menu.product_id)

SELECT *
FROM sales_CTE
WHERE purchase order = 1;
Q3. What was the first item from the menu purchased by each customer?

Answer:

customer_idproduct_nameproduct_idorder_datepurchase_order
Acurry221-01-011
Asushi121-01-011
Bcurry221-01-011
Cramen321-01-011
Cramen321-01-011

Solution: We need to SELECT the columns containing customer id’s, product names, product_id, and order dates. The RANK clause allows us to create a new column and assign numbers to outputs based on order_date in ascending order (this is especially important since we want to return the row(s) with the earliest order_date). This ranking-output is given under the alias purchase_order by using the AS function. Since RANK is a window function, we need to designate our code as a Common Table Expression (CTE) so we can conduct further conduct additional functions. We wish to select all the rows from the sales_CTE common table expression where the value in the purchase_order column is equal to 1.


SELECT menu.product_name, COUNT(sales.order_date) AS times_purchased
FROM dannys_diner.menu
INNER JOIN dannys_diner.sales
ON menu.product_id = sales.product_id
GROUP BY menu.product_name
ORDER BY times purchased DESC
LIMIT 1;
Q4. What is the most purchased item on the menu and how many times was it purchased?

Answer:

product_nametimes_purchased
ramen8

Solution: We SELECT column product_name from the menu table and aggregate COUNT of the order_date column under the alias times_purchased using the AS clause. This works because we are counting the frequency in which the dates are assigned per product name. We ORDER BY the times_purchased alias in descending order, from greatest value to least. We use the limit clause to return only 1 row, in this case the highest value in the times_purchased column since we ordered by DESC.


SELECT sales.customer_id, menu.product_name, COUNT(sales.order_date) AS times_purchased
FROM dannys_diner.sales
INNER JOIN dannys_diner.menu
ON sales.product_id = menu.product_id
GROUP BY sales.customer_id, menu.product_name
ORDER BY times_purchased DESC;
Q5. Which item was the most popular for each customer?
customer_idproduct_nametimes_purchased
Cramen3
Aramen3
Bramen2
Acurry2
Bsushi2
Bcurry2
Asushi1

Answer:

Solution: Operating in the dannys_diner schema, we SELECT columns sales.customer_id, menu.product_name, and COUNT the frequency of values in the order_date column. We want to count the frequency of sales order date values per given customer because this will reveal how many times an item was purchased when grouped by product names. Thus, we can assign the alias times_purchased using AS. The sales table contains the customer_id and order_dates for customer orders, but we need to use an INNER JOIN to join the menu table and retrieve product names from the product_name column. We group by customer id’s and product name and then sort the results by the alias times_purchased in descending order


WITH customer_cte AS(SELECT
members.customer_id,
members.join_date,
sales.order_date,
menu.product_name,
sales.product_id,
RANK() OVER(PARITION BY members.customer_id ORDER BY sales.order_date ASC) AS rnk
FROM dannys_diner.members
INNER JOIN dannys_diner.sales
ON members.customer_id = sales.customer_id
INNER JOIN dannys_diner.menu
ON menu.product_id = sales.product_id
WHERE sales.order_date >= members.join_date)

SELECT *
FROM customer_cte
WHERE rnk = 1;
Q6. 6. Which item was purchased first by the customer after they became a member?

WITH customer_cte AS(SELECT
members.customer_id,
members.join_date,
sales.order_date,
menu.product_name,
sales.product_id,
RANK() OVER(PARTITION BY members.customer_id ORDER BY sales.order_date DESC) AS rnk
FROM dannys_diner.sales
ON members.customer_id = sales.customer_id
INNER JOIN dannys.diner.menu
ON menu.product_id = sales.product_id
WHERE sales.order_date < members.join_date)

SELECT customer_id, product_name
FROM customer_cte
WHERE rnk = 1;
Q7. Which item was purchased just before the customer became a member?

Solution: Operating in the dannys_diner schema, we need to SELECT columns customer_id, join_date, order_date, product_name, and product_id. These columns contain the customer id (which serves as a unique identifier), the join date for each customer, the date each product was purchased, name of each product purchased, and the product’s identifier, respectively. We use RANK to assign values based on the order_date column date in descending order (greatest to least). We assign the alias “rnk” to simplify. We join other tables sales and menu from the dannys_diner schema. The WHERE clause will allow us to direct our query to pull results where the date in which a product was purchased is less than (before) the date in which a customer became a member. Since we’ve used RANK in our statement earlier, which is a window function, we need to assign the query to a common table expression using WITH customer_cte AS (customer_cte is the name of the common table expression). This allows us to perforsm a subquery with customer_cte to pull rows where the rank (or “rnk”) is equal to 1. This is the first item purchased by customers before becoming a member. Notice how sushi and curry share a rank of 1 for customer A, this is because both products were purchased on the same day.

Answer:

customer_idproduct_name
Asushi
Acurry
Bsushi
C N/A (never became a member)

WITH customer_spending_cte AS(SELECT
sales.customer_id,
COUNT(sales.product_id) AS total_items_purchased,
SUM(menu.price) AS total_amount_spent
FROM dannys_diner.sales
INNER JOIN dannys_diner.menu ON sales.product_id = menu.product_id
INNER JOIN dannys_diner.members ON members.customer_id = sales.customer_id
WHERE sales.order_date < members.join_date
GROUP BY sales.customer_id)

SELECT customer_id, total items_purchased, total_amount_spent
FROM customer_spending_cte
ORDER BY total_amount_spent DESC;
Q8. What is the total items and amount spent for each member before they became a member?

SELECT sales.customer_id, SUM(CASE
WHEN menu.product_name = 'sushi' THEN price * 10 * 2 ELSE price * 10 END) AS points
FROM dannys_diner.menu
INNER JOIN dannys_diner.sales
ON sales.product_id = menu.product_id
GROUP BY customer_id
ORDER BY points DESC;
Q9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier, how many points would each customer have?