I’ve been a long-term fan of oracle model by clause, talking about here and there, doing some small&funny examples (like eight queens solution in one select) but lacked some real-life necessity to use it. It always turned out that partition by was enough, or some tricky analytical function was already built-in (like trend calculations I wrote about).
And at last, I’ve encountered a suitable problem.
Task is quite simple&straightforward: calculate number of periods in which current stock will be sold in a store. Like: today we have sales enough for 3 weeks. That’s a very common sales demand analysis parameter (Months of Sales (MoS), Weeks of Sales (WoS), DoS and etc).
Formula is quite simple: Current Stock / Average Sales over some period
But there were some difficulties in my case.
I’ve got only partial data, some periods were completely missing, and the requirement was to treat missing sales data as zeros. And I also had to get the last known stock as current stock, because stock data could be missing too.
Last stock is pretty easy to get using last_value
oracle function, but getting average sales is hard. You cannot use fixed window width in partition by, since data can be missing. Moreover period varied other the years, so I’ve ended up adding up a field in time dimension called periods_ago
, that contained time_key reference from which to sum sales.
So I’ve started with simple subqueries
select
time_key,
product,
company,
sales_qty,
(select sum(sales_qty)
from dwh.sales sales2
where
sales2.product = sales.product,
sales2.company = sales.company,
sales2.time_key between weeks.periods_ago and sales.time_key
) as running_sales_sum
from
dwh.time_dim weeks, dwh.sales sales
where
weeks.key = sales.time_key
and that is terribly slow on 10^6+ rows. ~20 minutes.
But with model by it’ll be like:
select
time,
product,
company,
sales_qty,
last_stock_qty
from
dwh.time_dim weeks, dwh.sales sales
where
weeks.key = sales.time_key
model
IGNORE NAV
partition by (product,company)
dimension by (time_key)
measures
(
periods_ago,
sales_qty,
0 eight_weeks_sales_sum_qty,
nvl(LAST_VALUE(STOCK IGNORE NULLS)
OVER (PARTITION BY COMPANY, PRODUCT, SEGMENT, DISTRIB_LEVEL, SYSTEM
ORDER BY TIME ASC),0) AS last_stock_qty
rules UPDATE
(
running_sales_sum[any] = sum(sales_qty)[time between periods_ago[cv()] and cv(time_key)]
end
)
)
around 3 minutes ) Notice the last stock calculation.
Here’s a couple links to good model by examples:
A wonderful tutorial at SQLSnippets
Official documentation restriction section — read that carefully