# prawo-jazdy-resources

`db/main.py` is an *extremely* badly written script that reads files from the Ministry of Infrastructure and imports them into a database. It reads data from `pytania.csv` and `punkty.csv`:
- `db/pytania.csv` is the `katalog_pytania_egzminacyjne_kandydat__14112024.xlsx` file from the [Ministry of Infrastructure website](https://www.gov.pl/web/infrastruktura/prawo-jazdy), opened in LibreOffice Calc and saved as CSV
- `db/punkty.csv` isn't published by the Ministry, so you have to file a freedom of information request to obtain it; it is processed the same way as `pytania.csv`

The database structure is as follows: 
```sql
CREATE TABLE tasks (
    id integer NOT NULL,
    correct_answer boolean,
    media_url text,
    weight smallint
);

CREATE TABLE questions (
    task_id integer,
    lang character(2),
    text text
);

CREATE TABLE tasks_advanced (
    id integer NOT NULL,
    correct_answer character(1),
    media_url text,
    weight smallint
);

CREATE TABLE questions_advanced (
    task_id integer,
    lang character(2),
    text text,
    answer_a text,
    answer_b text,
    answer_c text
);

CREATE TABLE categories (
    name text,
    task_id integer
);
```
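
The schema above suggests that each task has one `questions` row per language and one `categories` row per licence category. As a rough illustration of that relationship (not the way `db/main.py` actually does it), here is a minimal Python sketch. It assumes the tables already exist and uses the standard-library `sqlite3` module as a stand-in, since the target database isn't named here; other drivers use a different placeholder style than `?`. The function name `insert_basic_task` and its parameters are hypothetical.

```python
import sqlite3
from typing import Iterable, Mapping, Optional


def insert_basic_task(
    conn: sqlite3.Connection,
    task_id: int,
    correct_answer: bool,
    media_url: Optional[str],
    weight: int,
    texts: Mapping[str, str],      # language code -> question text
    categories: Iterable[str],     # licence categories, e.g. "B"
) -> None:
    """Insert one basic task with its question texts and categories."""
    with conn:  # commits on success, rolls back if any insert fails
        conn.execute(
            "INSERT INTO tasks (id, correct_answer, media_url, weight) "
            "VALUES (?, ?, ?, ?)",
            (task_id, correct_answer, media_url, weight),
        )
        # One row per language the question is available in.
        conn.executemany(
            "INSERT INTO questions (task_id, lang, text) VALUES (?, ?, ?)",
            [(task_id, lang, text) for lang, text in texts.items()],
        )
        # One row per category the task belongs to.
        conn.executemany(
            "INSERT INTO categories (name, task_id) VALUES (?, ?)",
            [(name, task_id) for name in categories],
        )
```

Wrapping the three inserts in a single transaction means a task never ends up in `tasks` without its question texts or categories.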

The basic tasks can be queried like so (this picks 20 random category B tasks with Polish question text):
```sql
SELECT tasks.correct_answer, tasks.media_url, tasks.weight, questions.text
FROM tasks
    JOIN questions ON tasks.id = questions.task_id
    JOIN categories ON tasks.id = categories.task_id
WHERE
    categories.name = 'B' AND
    questions.lang = 'PL'
ORDER BY random()
LIMIT 20;
```
And the advanced tasks like this:
```sql
SELECT tasks_advanced.correct_answer, tasks_advanced.media_url, tasks_advanced.weight,
       questions_advanced.text, questions_advanced.answer_a, questions_advanced.answer_b, questions_advanced.answer_c
FROM tasks_advanced
    JOIN questions_advanced ON tasks_advanced.id = questions_advanced.task_id
    JOIN categories ON tasks_advanced.id = categories.task_id
WHERE
    categories.name = 'B' AND
    questions_advanced.lang = 'PL'
ORDER BY random()
LIMIT 20;
```
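
To run these queries from code with the category, language, and number of tasks as parameters, something like the following Python sketch could work. It is only an illustration: it uses the standard-library `sqlite3` module as a stand-in for whatever database you actually import into, and the names `BASIC_QUERY` and `random_basic_tasks` are hypothetical. It uses a plain `JOIN`, which returns the same rows here because the `WHERE` clause already discards tasks without a matching question or category.

```python
import sqlite3

# Same query as the basic-task example, with the literals turned into parameters.
BASIC_QUERY = """
SELECT tasks.correct_answer, tasks.media_url, tasks.weight, questions.text
FROM tasks
    JOIN questions ON tasks.id = questions.task_id
    JOIN categories ON tasks.id = categories.task_id
WHERE
    categories.name = ? AND
    questions.lang = ?
ORDER BY random()
LIMIT ?
"""


def random_basic_tasks(
    conn: sqlite3.Connection,
    category: str = "B",
    lang: str = "PL",
    limit: int = 20,
) -> list:
    """Return up to `limit` random basic tasks for one category and language."""
    return conn.execute(BASIC_QUERY, (category, lang, limit)).fetchall()
```

Passing the values as query parameters rather than formatting them into the SQL string avoids quoting problems if the category or language ever comes from user input. The advanced tasks can be handled the same way by swapping in `tasks_advanced` and `questions_advanced` and selecting the three answer columns.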