Commit Graph

19 Commits

Author SHA1 Message Date
901de4640c [Fix] Stereochemistry prediction handling (#228 and #238) (#250)
**This pull request will need a separate migration pull-request**

I have added an alert box in two places when the user tries to predict with stereo chemistry.

When a user predicts a pathway with stereo chemistry an alert box is shown in that node's hover.
To do this I added two new fields. Pathway now has a "predicted" BooleanField indicating whether it was predicted or not. It is set to True if the pathway mode for prediction is "predict" or "incremental" and False if it is "build". I think it is a flag that could be useful in the future, perhaps for analysing how many predicted pathways are in enviPath?
Node now has a `stereo_removed` BooleanField which is set to True if the Node's parent Pathways has "predicted" as true and the node SMILES has stereochemistry.
<img width="500" alt="{927AC9FF-DBC9-4A19-9E6E-0EDD3B08C7AC}.png" src="attachments/69ea29bc-c2d2-4cd2-8e98-aae5c5737f69">

When a user does a prediction on a model's page it shows at the top of the list. This did not require any new fields as the entered SMILES does not get saved anywhere.
<img width="500" alt="{BED66F12-5F07-419E-AAA6-FE1FE5B4F266}.png" src="attachments/5fcc3a9b-4d1a-4e48-acac-76b7571f6507">

I think the alert box is an alright solution but if you have a great idea for something that looks/fits better please change it or let me know.

Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#250
Co-authored-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
2025-12-03 10:19:34 +13:00
e8ae494c16 [Feature] Implemented SMARTS filtering for Rules (#246)
Reactant Filter SMARTS as well as Product Filter SMARTS are now reflected when applying rules.

Fixes #245

Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#246
2025-11-28 23:28:41 +13:00
67b1baa5b0 [Feature] Legacy API (#224)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#224
2025-11-19 20:45:16 +13:00
34589efbde [Fix] Mitigate XSS attack vector by cleaning input before it hits our Database (#171)
## Changes

- All text input fields are now cleaned with nh3 to remove html tags. We allow certain html tags under `settings.py/ALLOWED_HTML_TAGS` so we can easily update the tags we allow in the future.
- All names and descriptions now use the template tag `nh_safe` in all html files.
- Usernames and emails are a small exception and are not allowed any html tags

Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Co-authored-by: jebus <lorsbach@envipath.com>
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#171
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
2025-11-11 22:49:55 +13:00
e26d5a21e3 [Enhancement] Refactor Dataset (#184)
# Summary
I have introduced a new base `class Dataset` in `ml.py` which all datasets should subclass. It stores the dataset as a polars DataFrame with the column names and number of columns determined by the subclass. It implements generic methods such as `add_row`, `at`, `limit` and dataset saving. It also details abstract methods required by the subclasses. These include `X`, `y` and `generate_dataset`.

There are two subclasses that currently exist. `RuleBasedDataset` for the MLRR models and `EnviFormerDataset` for the enviFormer models.

# Old Dataset to New RuleBasedDataset Functionality Translation

- [x] \_\_init\_\_
    - self.columns and self.num_labels moved to base Dataset class
    - self.data moved to base class with name self.df along with initialising from list or from another DataFrame
    - struct_features, triggered and observed remain the same
- [x] \_block\_indices
    - function moved to base Dataset class
- [x] structure_id
    - stays in RuleBasedDataset, now requires an index for the row of interest
- [x] add_row
    - moved to base Dataset class, now calls add_rows so one or more rows can be added at a time
- [x] times_triggered
    - stays in RuleBasedDataset, now does a look up using polars df.filter
- [x] struct_features (see init)
- [x] triggered (see init)
- [x] observed (see init)
- [x] at
    - removed in favour of indexing with getitem
- [x] limit
    - removed in favour of indexing with getitem
- [x] classification_dataset
    - stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] generate_dataset
    - stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] X
    - moved to base Dataset as @abstract_method, RuleBasedDataset implementation functionally the same but uses polars
- [x] trig
    - stays in RuleBasedDataset, functionally the same but uses polars
- [x] y
    - moved to base Dataset as @abstract_method, RuleBasedDataset implementation functionally the same but uses polars
- [x] \_\_get_item\_\_
    - moved to base dataset, now passes item to the dataframe for polars to handle
- [x] to_arff
    - stays in RuleBasedDataset, functionally the same but uses polars
- [x] \_\_repr\_\_
    - moved to base dataset
- [x] \_\_iter\_\_
    - moved to base Dataset, now uses polars iter_rows

# Base Dataset class Features
The following functions are available in the base Dataset class

- init - Create the dataset from a list of columns and data in format list of list. Or can create a dataset from a polars Dataframe, this is essential for recreating itself during indexing. Can create an empty dataset by just passing column names.
- add_rows - Add rows to the Dataset, we check that the new data length is the same but it is presumed that the column order matches the existing dataframe
- add_row - Add one row, see add_rows
- block_indices - Returns the column indices that start with the given prefix
- columns - Property, returns dataframe.columns
- shape - Property, returns dataframe.shape
- X - Abstract method to be implemented by the subclasses, it should represent the input to a ML model
- y - Abstract method to be implemented by the subclasses, it should represent the target for a ML model
- generate_dataset - Abstract and static method to be implemented by the subclasses, should return an initialised subclass of Dataset
- iter - returns the iterable from dataframe.iter_rows()
- getitem - passes the item argument to the dataframe. If the result of indexing the dataframe is another dataframe, the new dataframe is  packaged into a new Dataset of the same subclass. If the result of indexing is something else (int, float, polar Series) return the result.
- save - Pickle and save the dataframe to the given path
- load - Static method to load the dataset from the given path
- to_numpy - returns the dataframe as a numpy array. Required for compatibility with training of the ECC model
- repr - return a representation of the dataset
- len - return the length of the dataframe
- iter_rows - Return dataframe.iterrows with arguments passed through. Mainly used to get the named iterable which returns rows of the dataframe as dict of column names: column values instead of tuple of column values.
- filter - pass to dataframe.filter and recreates self with the result
- select - pass to dataframe.select and recreates self with the result
- with_columns - pass to dataframe.with_columns and recreates self with the result
- sort - pass to dataframe.sort and recreates self with the result
- item - pass to dataframe.item
- fill_nan - fill the dataframe nan's with value
- height - Property, returns the height (number of rows) of the dataframe

- [x] App domain
- [x] MACCS alternatives

Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#184
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
2025-11-07 08:46:17 +13:00
13ed86a780 [Feature] Identify Missing Rules (#177)
Fixes #97
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#177
2025-10-30 00:47:45 +13:00
d5ebb23622 [Fix] AppDomain Leftovers (#161)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#161
2025-10-16 08:17:39 +13:00
afeb56622c [Chore] Linted Files (#150)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#150
2025-10-09 07:25:13 +13:00
b757a07f91 [Misc] Performance improvements, SMIRKS Coverage, Minor Bugfixes (#132)
Bump Python Version to 3.12
Make use of "epauth" optional
Cache `srs` property of rules to speed up apply
Adjust view names for use of `reverse()`
Fix Views for Scenario Attachments
Added Simply Compare View/Template to identify differences between rdkit and ambit
Make migrations consistent with tests + compare
Fixes #76
Set default year for Scenario Modal
Fix html tags for package description
Added Tests for Pathway / Rule
Added remove stereo for apply

Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#132
2025-09-26 19:33:03 +12:00
50db2fb372 [Feature] MultiGen Eval (Backend) (#117)
Fixes #16

Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#117
2025-09-18 18:40:45 +12:00
4158bd36cb [Feature] Legacy API Layer (#80)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#80
2025-09-03 01:35:51 +12:00
52931526c1 If Molecule contains a ~ load it as SMARTS (#71)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#71
2025-08-29 09:14:24 +12:00
fc8192fb0d Fix App Domain Bug when a Rule can be applied more than once (#49)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#49
2025-08-19 22:10:18 +12:00
3308d47071 Fix bond breaking (#46)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#46
2025-08-15 09:06:07 +12:00
ec52b8872d Functional Group Calculation (#44)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#44
2025-08-11 09:07:07 +12:00
579cd519d0 Experimental App Domain (#43)
Backend App Domain done, Frontend missing

Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#43
2025-08-08 20:52:21 +12:00
df896878f1 Basic System (#31)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#31
2025-07-23 06:47:07 +12:00
9323a9f7d7 Implement Compound CRUD (#22)
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#22
2025-07-03 07:17:04 +12:00
ded50edaa2 Current Dev State 2025-06-23 20:13:54 +02:00