Commit Graph

164 Commits

Author SHA1 Message Date
b9ac713cb2 minor 2025-11-11 11:41:48 +01:00
05bee9b718 mrg 2025-11-11 10:54:36 +01:00
34589efbde [Fix] Mitigate XSS attack vector by cleaning input before it hits our Database (#171)
## Changes

- All text input fields are now cleaned with nh3 to remove HTML tags. The tags we allow are listed under `settings.py/ALLOWED_HTML_TAGS`, so the allowlist is easy to update in the future.
- All names and descriptions now use the template tag `nh_safe` in all HTML files.
- Usernames and emails are an exception: no HTML tags are allowed in them.
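
The allowlist idea above can be sketched as follows. This is a minimal, standard-library stand-in for illustration only: the commit itself uses nh3, and the tag set, the `clean_text` helper and the `_AllowlistCleaner` class are hypothetical names, not code from the repository.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; the commit keeps the real one in settings.py
# as ALLOWED_HTML_TAGS.
ALLOWED_HTML_TAGS = {"b", "i", "em", "strong"}


class _AllowlistCleaner(HTMLParser):
    """Keeps allowlisted tags (without attributes) and escapes all text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.allowed:
            self.parts.append(f"<{tag}>")  # attributes are always dropped

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def clean_text(value, allowed=frozenset(ALLOWED_HTML_TAGS)):
    """Return `value` with every tag outside `allowed` removed."""
    cleaner = _AllowlistCleaner(allowed)
    cleaner.feed(value)
    cleaner.close()  # flush any text still buffered by the parser
    return "".join(cleaner.parts)


# Names and descriptions: allowlisted tags survive, attributes do not.
description = clean_text('<b onclick="x()">Name</b><img src=x onerror=alert(1)>')

# Usernames and emails: pass an empty allowlist so no tags survive.
username = clean_text("<b>bob</b>", allowed=frozenset())
```

Cleaning on input means the stored value is already safe, so every consumer of the database benefits, not only the templates. A hand-rolled parser like this is not a substitute for a maintained sanitizer such as nh3.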

Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Co-authored-by: jebus <lorsbach@envipath.com>
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#171
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
pre_frontpage_update
2025-11-11 22:49:55 +13:00
1cccefa991 [Feature] Basic Test Workflow (#186)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#186
Reviewed-by: liambrydon <lbry121@aucklanduni.ac.nz>
Reviewed-by: Tobias O <tobias.olenyi@envipath.com>
2025-11-11 21:07:25 +13:00
0d9947e6ce chore: add tailwindcss autosort 2025-11-10 18:36:00 +13:00
e3b381ab41 chore: ignore code-workspace 2025-11-10 18:34:09 +13:00
97626337aa chore: add prettier formatting to html 2025-11-10 18:30:07 +13:00
2aded2ddd7 Merge remote-tracking branch 'origin/develop' into feature/frontend_update 2025-11-10 17:52:00 +13:00
e26d5a21e3 [Enhancement] Refactor Dataset (#184)
# Summary
I have introduced a new base `class Dataset` in `ml.py` which all datasets should subclass. It stores the dataset as a polars DataFrame, with the column names and number of columns determined by the subclass. It implements generic methods such as `add_row`, `at`, `limit` and dataset saving. It also declares the abstract methods that subclasses must implement: `X`, `y` and `generate_dataset`.

Two subclasses currently exist: `RuleBasedDataset` for the MLRR models and `EnviFormerDataset` for the enviFormer models.
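
The base-class/subclass split described above can be sketched roughly as below. This is a toy illustration, not the code in `ml.py`: rows are kept in a plain list of lists instead of a polars DataFrame so the example has no dependencies, and `ToyRuleDataset` with its `feat_`/`obs_` column prefixes is hypothetical.

```python
from abc import ABC, abstractmethod


class Dataset(ABC):
    """Toy stand-in for the base class; rows live in a list of lists, not polars."""

    def __init__(self, columns, data=None):
        self.columns = list(columns)
        self.rows = []
        if data:
            self.add_rows(data)

    def add_rows(self, rows):
        # Assumption: only the row length is checked; column order must match.
        for row in rows:
            if len(row) != len(self.columns):
                raise ValueError(f"expected {len(self.columns)} values, got {len(row)}")
            self.rows.append(list(row))

    def add_row(self, row):
        self.add_rows([row])

    def block_indices(self, prefix):
        # Indices of the columns whose name starts with the given prefix.
        return [i for i, name in enumerate(self.columns) if name.startswith(prefix)]

    @abstractmethod
    def X(self):
        """Input to an ML model."""

    @abstractmethod
    def y(self):
        """Target for an ML model."""

    @staticmethod
    @abstractmethod
    def generate_dataset(*args, **kwargs):
        """Return an initialised subclass of Dataset."""


class ToyRuleDataset(Dataset):
    """Hypothetical subclass: 'feat_' columns are inputs, 'obs_' columns are targets."""

    def X(self):
        idx = self.block_indices("feat_")
        return [[row[i] for i in idx] for row in self.rows]

    def y(self):
        idx = self.block_indices("obs_")
        return [[row[i] for i in idx] for row in self.rows]

    @staticmethod
    def generate_dataset(records):
        ds = ToyRuleDataset(["feat_a", "feat_b", "obs_r1"])
        ds.add_rows(records)
        return ds


ds = ToyRuleDataset.generate_dataset([[1, 0, 1], [0, 1, 0]])
inputs, targets = ds.X(), ds.y()
```

Because `X`, `y` and `generate_dataset` are abstract, instantiating `Dataset` directly raises `TypeError`, which forces every new dataset type to define its own model inputs and targets.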

# Old Dataset to New RuleBasedDataset Functionality Translation

- [x] \_\_init\_\_
    - self.columns and self.num_labels moved to base Dataset class
    - self.data moved to base class with name self.df along with initialising from list or from another DataFrame
    - struct_features, triggered and observed remain the same
- [x] \_block\_indices
    - function moved to base Dataset class
- [x] structure_id
    - stays in RuleBasedDataset, now requires an index for the row of interest
- [x] add_row
    - moved to base Dataset class, now calls add_rows so one or more rows can be added at a time
- [x] times_triggered
    - stays in RuleBasedDataset, now does a look up using polars df.filter
- [x] struct_features (see init)
- [x] triggered (see init)
- [x] observed (see init)
- [x] at
    - removed in favour of indexing with getitem
- [x] limit
    - removed in favour of indexing with getitem
- [x] classification_dataset
    - stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] generate_dataset
    - stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] X
    - moved to base Dataset as an abstract method; the RuleBasedDataset implementation is functionally the same but uses polars
- [x] trig
    - stays in RuleBasedDataset, functionally the same but uses polars
- [x] y
    - moved to base Dataset as an abstract method; the RuleBasedDataset implementation is functionally the same but uses polars
- [x] \_\_getitem\_\_
    - moved to base Dataset, now passes item to the dataframe for polars to handle
- [x] to_arff
    - stays in RuleBasedDataset, functionally the same but uses polars
- [x] \_\_repr\_\_
    - moved to base Dataset
- [x] \_\_iter\_\_
    - moved to base Dataset, now uses polars iter_rows

# Base Dataset class Features
The following functions are available in the base Dataset class:

- init - Creates the dataset from a list of column names and data given as a list of lists, or from an existing polars DataFrame; the latter is essential for the dataset to recreate itself during indexing. Passing only column names creates an empty dataset.
- add_rows - Adds rows to the Dataset. The length of the new data is checked against the existing dataframe, but the column order is presumed to match.
- add_row - Add one row, see add_rows
- block_indices - Returns the column indices that start with the given prefix
- columns - Property, returns dataframe.columns
- shape - Property, returns dataframe.shape
- X - Abstract method to be implemented by the subclasses, it should represent the input to a ML model
- y - Abstract method to be implemented by the subclasses, it should represent the target for a ML model
- generate_dataset - Abstract and static method to be implemented by the subclasses, should return an initialised subclass of Dataset
- iter - returns the iterable from dataframe.iter_rows()
- getitem - passes the item argument to the dataframe. If the result of indexing the dataframe is another dataframe, the new dataframe is packaged into a new Dataset of the same subclass. If the result of indexing is something else (int, float, polars Series), the result is returned as is.
- save - Pickle and save the dataframe to the given path
- load - Static method to load the dataset from the given path
- to_numpy - returns the dataframe as a numpy array. Required for compatibility with training of the ECC model
- repr - return a representation of the dataset
- len - return the length of the dataframe
- iter_rows - Returns dataframe.iter_rows with arguments passed through. Mainly used to get the named iterable, which yields each row as a dict of column name to column value instead of a tuple of column values.
- filter - passes to dataframe.filter and recreates self with the result
- select - passes to dataframe.select and recreates self with the result
- with_columns - passes to dataframe.with_columns and recreates self with the result
- sort - passes to dataframe.sort and recreates self with the result
- item - passes to dataframe.item
- fill_nan - fills NaN values in the dataframe with the given value
- height - Property, returns the height (number of rows) of the dataframe
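
The re-wrapping behaviour of getitem, filter and iter_rows listed above can be illustrated with a small sketch. It is an assumption-laden toy: a list of lists stands in for the polars DataFrame, and `ToyDataset`, `ToyRuleDataset` and the supported index forms (a slice, or a `(row, column)` pair) are hypothetical simplifications of what polars accepts.

```python
class ToyDataset:
    """Toy stand-in: column names plus a list of row lists instead of polars."""

    def __init__(self, columns, rows=()):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, item):
        if isinstance(item, slice):
            # Table-like result: package it into a new dataset of the same subclass.
            return type(self)(self.columns, self.rows[item])
        if isinstance(item, tuple) and len(item) == 2:
            row, col = item
            # Scalar result: return the raw value, not a dataset.
            return self.rows[row][self.columns.index(col)]
        raise TypeError(f"unsupported index: {item!r}")

    def filter(self, predicate):
        # Mirrors "passes to dataframe.filter and recreates self with the result".
        kept = [r for r in self.rows if predicate(dict(zip(self.columns, r)))]
        return type(self)(self.columns, kept)

    def iter_rows(self, named=False):
        # named=True yields dicts of column name to value; otherwise tuples.
        for r in self.rows:
            yield dict(zip(self.columns, r)) if named else tuple(r)


class ToyRuleDataset(ToyDataset):
    """Hypothetical subclass used to show that the subclass type is preserved."""


ds = ToyRuleDataset(["id", "hits"], [[1, 0], [2, 3], [3, 5]])
head = ds[:2]                               # still a ToyRuleDataset
value = ds[1, "hits"]                       # plain value
triggered = ds.filter(lambda row: row["hits"] > 0)
first = next(ds.iter_rows(named=True))
```

Using `type(self)(...)` rather than a hard-coded class name is what lets slicing and filtering on a subclass return that subclass, which is why the old `at` and `limit` helpers could be replaced by plain indexing.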

- [x] App domain
- [x] MACCS alternatives

Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#184
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
2025-11-07 08:46:17 +13:00
f5133c1980 fix: remove obsolete page id 2025-11-06 10:36:04 +13:00
7fbc49afd3 chore: update citations 2025-11-05 17:50:56 +13:00
a087a518f6 chore: remove incorrect license header 2025-11-05 17:39:21 +13:00
881e0e6798 chore: fix typo 2025-11-05 17:38:52 +13:00
2eab66e9ee refactor: added meta.site_id for matomo 2025-11-05 17:37:44 +13:00
ab927b11a2 refactor: remove dependency-groups 2025-11-05 17:36:43 +13:00
fde60c3ad3 refactor: remove optional stubs 2025-11-05 17:35:45 +13:00
61a43da822 refactor: set enviformer to main 2025-11-05 17:34:31 +13:00
211ebfd19b refactor: remove enviformer loading in settings 2025-11-05 17:33:41 +13:00
06a6c23d05 fix: add tailwindcss/cli 2025-11-05 17:30:15 +13:00
3536a14e47 Merge remote-tracking branch 'origin/develop' into feature/frontend_update 2025-11-05 17:25:27 +13:00
98d62e1d1f [Feature] Make Matomo Site ID configurable via .env (#183)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#183
2025-11-05 10:19:07 +13:00
7eb4029ac9 refactor: add public_mode for static pages to remove nav elements 2025-11-04 19:34:04 +13:00
7b38fc2e37 fix: remove jobs clash 2025-11-04 19:33:31 +13:00
4834348454 Merge remote-tracking branch 'origin/develop' into feature/frontend_update 2025-10-30 14:02:57 +13:00
13ed86a780 [Feature] Identify Missing Rules (#177)
Fixes #97
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#177
2025-10-30 00:47:45 +13:00
f1b4c5aadb [Feature] Adding list_display to various django admin sites (#180)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#180
2025-10-29 22:26:28 +13:00
0a52b12f02 fix: handle line-clamp issue with news 2025-10-29 19:59:45 +13:00
14571d23a6 docs: add pnpm note 2025-10-29 18:23:28 +13:00
ea8475f0e2 docs: update README regarding dev command 2025-10-29 18:07:56 +13:00
442d139217 chore: remove obsolete doc 2025-10-29 18:06:21 +13:00
1ba511a31d chore: minimize fallback data 2025-10-29 18:02:30 +13:00
5d89341955 chore: delete obsolete runserver command 2025-10-29 18:01:21 +13:00
5f390ac2d2 fix: reenable modal showing 2025-10-29 17:52:10 +13:00
46d21e60d2 chore: add example input to search 2025-10-29 16:36:01 +13:00
13be240226 feat: working search redirect 2025-10-29 16:30:00 +13:00
167a72f5a3 fix: remove obsolete menu list 2025-10-29 16:01:13 +13:00
1736319bd7 style: update navbar and add browse back 2025-10-29 15:58:17 +13:00
e87aae6bf7 style: add legal footers on login 2025-10-29 12:16:42 +13:00
253523c81f feat: add mock legal (impressum page) 2025-10-29 12:16:24 +13:00
15809a4ccf style: update login pages 2025-10-29 12:01:35 +13:00
b7e1dac66a feat: add mockup for static pages 2025-10-29 11:13:31 +13:00
849ebbe7f8 style: update hero 2025-10-29 11:13:07 +13:00
c5dcb36452 fix: dev command working 2025-10-29 10:59:22 +13:00
37e0e18a28 [Fix] Fixed Incremental Prediction Typo (#176)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#176
2025-10-28 23:29:08 +13:00
de44c22606 [Migration] Added missing Migration for JobLog (#175)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#175
2025-10-27 22:41:16 +13:00
a952c08469 [Feature] Basic logging of Jobs, Model Evaluation (#169)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#169
2025-10-27 22:34:05 +13:00
551cfc7768 [Enhancement] Create ML Models (#173)
## Changes

- Ability to change the threshold from a command line argument.
- Names of data packages included in model name
- Names of data, rule and eval packages included in the model description
- EnviFormer models are now viewable on the admin site
- Ignore CO2 for training and evaluating EnviFormer

Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#173
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
2025-10-23 06:20:22 +13:00
16a991220a Slim down Navbar 2025-10-22 12:10:46 +13:00
05c8e130b1 Add documentation link 2025-10-22 12:08:31 +13:00
4fd7856043 Remove fields from navbar 2025-10-22 12:07:55 +13:00