db-prawo-jazdy/README.md
2025-03-17 13:09:42 +01:00

# prawo-jazdy-resources
`db/main.py` is an *extremely* badly written script that reads two files originating from the Ministry of Infrastructure (`pytania.csv` and `punkty.csv`) and imports them into a database.
- `db/pytania.csv` is the `katalog_pytania_egzminacyjne_kandydat__14112024.xlsx` file from [here](https://www.gov.pl/web/infrastruktura/prawo-jazdy), opened in LibreOffice Calc and saved as CSV
- `db/punkty.csv` isn't published by the Ministry, so you have to send a freedom of information request to obtain it; it is processed the same way as `pytania.csv`

The database structure is as follows:
```sql
CREATE TABLE tasks (
    id integer NOT NULL,
    correct_answer boolean,
    media_url text,
    weight smallint
);
CREATE TABLE questions (
    task_id integer,
    lang character(2),
    text text
);
CREATE TABLE tasks_advanced (
    id integer NOT NULL,
    correct_answer character(1),
    media_url text,
    weight smallint
);
CREATE TABLE questions_advanced (
    task_id integer,
    lang character(2),
    text text,
    answer_a text,
    answer_b text,
    answer_c text
);
CREATE TABLE categories (
    name text,
    task_id integer
);
```
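As a rough illustration of how one CSV row could end up in these tables, here is a minimal sketch using an in-memory SQLite database as a stand-in. It is not the logic of `db/main.py`: the column names (`id`, `text_pl`, `answer`, `media`, `categories`), the `T`/`N` answer encoding, and the sample row are all made up for the example, and the real headers in `pytania.csv` may look quite different. `weight` is left `NULL` because the link between `punkty.csv` and the tasks isn't described here.

```python
import csv
import io
import sqlite3

# Hypothetical CSV layout; the real pytania.csv headers may differ.
SAMPLE_CSV = """id,text_pl,answer,media,categories
1,Made-up question text,T,example.jpg,"A,B"
"""


def import_basic_tasks(conn, csv_file):
    """Insert rows from a CSV file object into tasks, questions and categories."""
    for row in csv.DictReader(csv_file):
        task_id = int(row["id"])
        conn.execute(
            "INSERT INTO tasks (id, correct_answer, media_url) VALUES (?, ?, ?)",
            (task_id, row["answer"] == "T", row["media"] or None),
        )
        conn.execute(
            "INSERT INTO questions (task_id, lang, text) VALUES (?, 'PL', ?)",
            (task_id, row["text_pl"]),
        )
        # One categories row per category the task belongs to.
        for name in row["categories"].split(","):
            conn.execute(
                "INSERT INTO categories (name, task_id) VALUES (?, ?)",
                (name.strip(), task_id),
            )
    conn.commit()


# In-memory stand-in database with the three tables used by the basic tasks.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id integer NOT NULL, correct_answer boolean, media_url text, weight smallint);
CREATE TABLE questions (task_id integer, lang character(2), text text);
CREATE TABLE categories (name text, task_id integer);
""")
import_basic_tasks(conn, io.StringIO(SAMPLE_CSV))
```

The point of the sketch is the shape of the data: one `tasks` row per question, one `questions` row per language, and one `categories` row per category the task belongs to.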
The basic tasks can be queried like so:
```sql
SELECT tasks.correct_answer, tasks.media_url, tasks.weight, questions.text
FROM tasks
    LEFT JOIN questions ON tasks.id = questions.task_id
    LEFT JOIN categories ON tasks.id = categories.task_id
WHERE
    categories.name = 'B' AND
    questions.lang = 'PL'
ORDER BY random()
LIMIT 20;
```
And the advanced tasks like this:
```sql
SELECT tasks_advanced.correct_answer, tasks_advanced.media_url, tasks_advanced.weight,
    questions_advanced.text, questions_advanced.answer_a, questions_advanced.answer_b, questions_advanced.answer_c
FROM tasks_advanced
    LEFT JOIN questions_advanced ON tasks_advanced.id = questions_advanced.task_id
    LEFT JOIN categories ON tasks_advanced.id = categories.task_id
WHERE
    categories.name = 'B' AND
    questions_advanced.lang = 'PL'
ORDER BY random()
LIMIT 20;
```
```
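From application code, the category, language and number of tasks would normally be passed as parameters rather than pasted into the SQL. A minimal sketch for the basic tasks follows, again assuming an in-memory SQLite stand-in and made-up rows; `draw_basic_tasks` is a hypothetical helper, not something in this repository. It uses plain `JOIN`, which returns the same rows as the `LEFT JOIN` version because the `WHERE` clause already filters on columns of the joined tables. SQLite uses `?` placeholders; other drivers (e.g. psycopg) use `%s`.

```python
import sqlite3

BASIC_QUERY = """
SELECT tasks.correct_answer, tasks.media_url, tasks.weight, questions.text
FROM tasks
    JOIN questions ON tasks.id = questions.task_id
    JOIN categories ON tasks.id = categories.task_id
WHERE categories.name = ? AND questions.lang = ?
ORDER BY random()
LIMIT ?
"""


def draw_basic_tasks(conn, category="B", lang="PL", limit=20):
    """Return up to `limit` random basic tasks for one category and language."""
    return conn.execute(BASIC_QUERY, (category, lang, limit)).fetchall()


# Stand-in database with a few made-up rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id integer NOT NULL, correct_answer boolean, media_url text, weight smallint);
CREATE TABLE questions (task_id integer, lang character(2), text text);
CREATE TABLE categories (name text, task_id integer);
""")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?)",
    [(1, True, None, 3), (2, False, "clip.mp4", 2)],
)
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?)",
    [(1, "PL", "Made-up question 1"), (2, "PL", "Made-up question 2"), (2, "EN", "Made-up question 2 (EN)")],
)
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [("B", 1), ("B", 2), ("A", 2)],
)

for correct_answer, media_url, weight, text in draw_basic_tasks(conn, limit=1):
    print(text, correct_answer, media_url, weight)
```

The same pattern should carry over to the advanced tasks by swapping in `tasks_advanced` and `questions_advanced` and selecting the three answer columns.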