PSQL Functions and Django
FREQ=WEEKLY;BYDAY=TU,TH
You can probably infer just from reading something along the lines “weekly on Tuesdays and Thursdays”.
Correct! A recurrence rule describe a repeating pattern for events. The recurrence rule is part of the standard specification iCalendar [1]
The formal format definition follows recur-rule-part *( ";" recur-rule-part )
.
The structure consists of recurrence grammar parts and comes in NAME=VALUE pair. Each part is separated by semicolon character “;”. And different parts have different constraints as well - must only be defined once, unordered, sequences separated by comma character, dependencies over the other, etc. The rules are well defined in the RFC document. For now, we’ll stick with the simple rrule above.
Our goal is to find a way to ease filtering records based on rrule parts. Being simple and flexible is always welcome.
Parsing RRULE
For now, the rrule is simply a CharField
or a TextField
. In Django one might have defined a custom field to support validation using rrule
of the dateutil
module [2]. So let’s just assume that every rrule we have as input are valid.
We do not want to always rely on substring search as it is very limiting.
# Get events that happen weekly.
weekly_events = Events.objects.filter(recurrence__contains='FREQ=WEEKLY') # OK
# Now which of these events happen on tuesdays?
events = weekly_events.filter(recurrence__contains='BYDAY=TU') # OK
# what about thursdays?
events = weekly_events.filter(recurrence__contains='BYDAY=TH') # OK
# Tuesdays and Thursdays?
events = weekly_events.filter(recurrence__contains='BYDAY=TU,TH')
| weekly_events.filter(recurrence__contains='BYDAY=TH,TU')
# Mondays, Tuesdays, Thursdays?
events = weekly_events.filter(recurrence__contains='BYDAY=MO,TU,TH')
| weekly_events.filter(recurrence__contains='BYDAY=TU,TH,MO')
| weekly_events.filter(recurrence__contains='BYDAY=TH,MO,TU')
# Because the order doesn't matter, each question we can come up with
# requires a combination of the valid queries
-- Query
SELECT "events_event"."id",
"events_event"."name",
"events_event"."recurrence"
FROM "events_event"
WHERE "events_event"."recurrence"::text LIKE FREQ=WEEKLY%
-- Explain
Seq Scan on events_event (cost=0.00..31.50 rows=99 width=46)
Filter: (recurrence ~~ 'FREQ=WEEKLY%'::text)
It might be a good idea to leverage PostgreSQL JSON(b) support. We can make a function that takes in an rrule and returns a JSON object which we can conveniently use for filter expressions.
Something similar to '{"FREQ": "WEEKLY", "BYDAY=["TU","TH"]}'::jsonb->'FREQ' = '"DAILY"'
.
To achieve that we need to create a SQL function.
-- Parses rrule into json parts
CREATE OR REPLACE FUNCTION extract_rrule(rrulestr text) RETURNS jsonb AS $$
DECLARE
rout text[];
rrulparam text;
part text[];
BEGIN
IF rrulestr <> '' IS NOT TRUE THEN
RETURN TO_JSONB(rout);
END IF;
FOREACH rrulparam in ARRAY REGEXP_SPLIT_TO_ARRAY(rrulestr, ';')
LOOP
part := REGEXP_SPLIT_TO_ARRAY(rrulparam, '=');
-- check if expecting datetime
IF part[1] = 'UNTIL' THEN
part[2] := to_timestamp(RTRIM(part[2], 'Z'), 'YYYYMMDDTHH24MISS');
-- check if sequence
ELSEIF POSITION(',' IN part[2]) > 1 THEN
part[2] := ARRAY_TO_JSON(REGEXP_SPLIT_TO_ARRAY(part[2], ','));
END IF;
rout := rout || part;
END LOOP;
RETURN JSONB_OBJECT(rout);
END
$$ LANGUAGE plpgsql;
Basically the function does the following.
- First, check if rrule is empty. If it is, return right away.
- Identify each rrule parameter. Split the string by semi-colon “;”.
- For each parameter, identify which is the grammar key and which is the value by splitting it further by the equal sign “=”.
- These are extra bits on the grammar value. Identify if the value is intended to be date time.
- If it isn’t, check if it’s a collection splitting by comma “,”.
- Otherwise, accept it as is.
--- Show SQL runs
SELECT "events_event"."id",
extract_rrule(recurrence) AS frequency,
"events_event","recurrence"
FROM "events_event";
id | frequency | events_event | recurrence
------+------------------------------------------------------------------------+------------------------------------------------------------------------------+-----------------------------------------------
1075 | {"FREQ": "WEEKLY", "BYDAY": "TH"} | (1075,"beach day 75",FREQ=WEEKLY;BYDAY=TH) | FREQ=WEEKLY;BYDAY=TH
1076 | {"FREQ": "WEEKLY", "BYDAY": "TH"} | (1076,"dinner 76",FREQ=WEEKLY;BYDAY=TH) | FREQ=WEEKLY;BYDAY=TH
...
1001 | {"FREQ": "WEEKLY", "BYDAY": "[\"TU\",\"TH\"]"} | (1001,"party 1","FREQ=WEEKLY;BYDAY=TU,TH") | FREQ=WEEKLY;BYDAY=TU,TH
1002 | {"FREQ": "WEEKLY", "BYDAY": "[\"TU\",\"TH\"]"} | (1002,"beach day 2","FREQ=WEEKLY;BYDAY=TU,TH") | FREQ=WEEKLY;BYDAY=TU,TH
1003 | {"FREQ": "WEEKLY", "BYDAY": "[\"TU\",\"TH\"]"} | (1003,"baptism 3","FREQ=WEEKLY;BYDAY=TU,TH") | FREQ=WEEKLY;BYDAY=TU,TH
SELECT "events_event"."id",
extract_rrule(recurrence) as frequency,
"events_event","recurrence"
FROM "events_event" WHERE extract_rrule("events_event"."recurrence")::jsonb->'FREQ' = '"DAILY"';
id | frequency | events_event | recurrence
------+-------------------+-------------------------------------------+------------
1100 | {"FREQ": "DAILY"} | (1100,"baptism 100",FREQ=DAILY) | FREQ=DAILY
1101 | {"FREQ": "DAILY"} | (1101,"trip to Jerusalem 101",FREQ=DAILY) | FREQ=DAILY
1102 | {"FREQ": "DAILY"} | (1102,"party 102",FREQ=DAILY) | FREQ=DAILY
1103 | {"FREQ": "DAILY"} | (1103,"baptism 103",FREQ=DAILY) | FREQ=DAILY
1104 | {"FREQ": "DAILY"} | (1104,"party 104",FREQ=DAILY) | FREQ=DAILY
Filtering RRULE
Now it’s time to load the function in Django.
./manage.py makemigrations --name func_extract_rrule --empty events
Migrations for 'redemption':
events/migrations/0002_func_extract_rrule.py
# migrations/0002_func_extract_rrule.py
class Migration(migrations.Migration):
dependencies = [
('events', '0001_initial'),
]
operations = [
migrations.AlterModelManagers(
name='event',
managers=[
('recurring', django.db.models.manager.Manager()),
],
),
migrations.RunSQL(
sql=CREATE_FUNC_SQL, # the exact SQL function above
reverse_sql=migrations.RunSQL.noop
)
]
./manage.py migrate events
Operations to perform:
Apply all migrations: admin, auth, contenttypes, events, sessions
Running migrations:
Applying events.0002_func_extract_rrule... OK
And to complete our fat model that uses our custom function.
# events/models/model.py
from dateutil.rrule import rrulestr
from django.contrib.postgres.fields import JSONField
from django.db import models
from django.db.models import F, Func
from django.db.models.functions import Cast
def validate_rrule(value):
try:
rrulestr(value)
except ValueError:
return False
return True
class ExtractRRule(Func):
function = 'extract_rrule'
class RecurringEventQuerySet(models.QuerySet):
def daily(self):
return self.filter(rrule__FREQ='DAILY')
def weekly(self):
return self.filter(rrule__FREQ='WEEKLY')
def monthly(self):
return self.filter(rrule__FREQ='MONTHLY')
def yearly(self):
return self.filter(rrule__FREQ='YEARLY')
class RecurringEventsManager(models.Manager):
def get_queryset(self):
return RecurringEventQuerySet(self.model, using=self._db).annotate(
rrule=Cast(
ExtractRRule(F('recurrence')),
output_field=JSONField()
)
)
def daily(self):
return self.get_queryset().daily()
def weekly(self):
return self.get_queryset().weekly()
def monthly(self):
return self.get_queryset().monthly()
def yearly(self):
return self.get_queryset().yearly()
class Event(models.Model):
name = models.CharField(max_length=200, blank=True)
recurrence = models.TextField(
validators=[validate_rrule]
)
recurring = RecurringEventsManager()
def __str__(self):
return f"{self.name} - {self.recurrence}"
Event.recurring.weekly()
<RecurringEventQuerySet [<Event: beach day 75 - FREQ=WEEKLY;BYDAY=TH>, <Event: dinner 76 - FREQ=WEEKLY;BYDAY=TH>, <Event: trip to Jerusalem 77 - FREQ=WEEKLY;BYDAY=TH>, <Event: beach day 78 - FREQ=WEEKLY;BYDAY=TH>, <Event: baptism 79 - FREQ=WEEKLY;BYDAY=TH>, <Event: party 80 - FREQ=WEEKLY;BYDAY=TH>, <Event: baptism 81 - FREQ=WEEKLY;BYDAY=TH>, <Event: trip to Jerusalem 82 - FREQ=WEEKLY;BYDAY=TH>, <Event: dinner 83 - FREQ=WEEKLY;BYDAY=TH>, <Event: party 84 - FREQ=WEEKLY;BYDAY=TH>, <Event: baptism 85 - FREQ=WEEKLY;BYDAY=TH>, <Event: party 86 - FREQ=WEEKLY;BYDAY=TH>, <Event: party 87 - FREQ=WEEKLY;BYDAY=TH>, <Event: dinner 88 - FREQ=WEEKLY;BYDAY=TH>, <Event: beach day 89 - FREQ=WEEKLY;BYDAY=TH>, <Event: trip to Jerusalem 90 - FREQ=WEEKLY;BYDAY=TH>, <Event: dinner 91 - FREQ=WEEKLY;BYDAY=TH>, <Event: dinner 92 - FREQ=WEEKLY;BYDAY=TH>, <Event: party 93 - FREQ=WEEKLY;BYDAY=TH>, <Event: beach day 94 - FREQ=WEEKLY;BYDAY=TH>, '...(remaining elements truncated)...']>
Event.recurring.weekly().filter(rrule__BYDAY__contained_by=['TH']).count()
25
Event.recurring.weekly().filter(rrule__BYDAY__contained_by=['TU']).count()
25
Event.recurring.weekly().filter(rrule__BYDAY__contained_by=['TU','TH']).count()
50
print(Event.recurring.weekly().filter(rrule__BYDAY__contained_by=['TH','TU']).only('id').query)
SELECT "events_event"."id", (extract_rrule("events_event"."recurrence"))::jsonb AS "rrule" FROM "events_event" WHERE (((extract_rrule("events_event"."recurrence"))::jsonb -> FREQ) = '"WEEKLY"' AND ((extract_rrule("events_event"."recurrence"))::jsonb -> BYDAY) <@ '["TH", "TU"]')
Scope and Limitations
Would it have been better to have the rrule as a JSONField initially? With GinIndex?
Would it have been better to split each grammar part as it’s own field?
References:
- Internet Calendaring and Scheduling Core Object Specification (iCalendar) iCalendar.org retrieved from https://tools.ietf.org/html/rfc5545
- dateutil - rrule. Retrieved from https://dateutil.readthedocs.io/en/stable/rrule.html#module-dateutil.rrule
- PostgreSQL specific model fields. Retrieved from https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/fields/