Initial bayer app
Show Pack Classification
Adjusted docker compose to bayer specifics
Adjusted Dockerfile for Bayer
Adding secret flags to group, add secret pools to packages
Adjusted View for Package creation
Prep configs, added Package Create Modal
wip
More on PES
wip
wip
This is an initial implementation that creates a working minimal .i6z document.
It passes schema validation and can be imported into IUCLID.
Caveat:
IUCLID files target individual compounds.
Pathway is not actually covered by the format.
It can be added in either soil or water and soil OECD endpoints.
**I currently only implemented the soil endpoint for all data.**
This sort of works, and I can report all degradation products in a pathway (not a nice view, but we can report many transformation products and add a diagram attachment in the future).
Adding additional information is an absolute pain, as we need to explicitly map each type of information to the relevant OECD field.
I use the XSD scheme for validation, but unfortunately the IUCLID parser is not fully compliant and requires a specific order, etc.
The workflow is: finding the AI structure from the XSD scheme -> make the scheme validation pass -> upload to IUCLID to get obscure error messages -> guess what could be wrong -> repeat 💣
New specifications get released once per year, so we will have to update accordingly.
I believe that this should be a more expensive feature, as it requires significant effort to uphold.
Currently implemented for root compound only in SOIL:
- Soil Texture 2
- Soil Texture 1
- pH value
- Half-life per soil sample / scenario (mapped to disappearance; not sure about that).
- CEC
- Organic Matter (only Carbon)
- Moisture content
- Humidity
<img width="2123" alt="image.png" src="attachments/d29830e1-65ef-4136-8939-1825e0959c62">
<img width="2124" alt="image.png" src="attachments/ac9de2ac-bf68-4ba4-b40b-82f810a9de93">
<img width="2139" alt="image.png" src="attachments/5674c7e6-865e-420e-974a-6b825b331e6c">
Reviewed-on: enviPath/enviPy#338
Co-authored-by: Tobias O <tobias.olenyi@envipath.com>
Co-committed-by: Tobias O <tobias.olenyi@envipath.com>
The scenarios lists both in /scenarios and /package/<id>/scenario no longer show related scenarios (children).
All related scenarios are shown on the scenario page under Related Scenarios if there are any.
<img width="500" alt="{C2D38DED-A402-4A27-A241-BC2302C62A50}.png" src="attachments/1371c177-220c-42d5-94ff-56f9fbab761f">
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#323
Co-authored-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
Add API key authentication to v1 API
Also includes:
- management command to create keys for users
- Improvements to API tests
Minor:
- more robust way to start docker dev container.
Reviewed-on: enviPath/enviPy#327
Co-authored-by: Tobias O <tobias.olenyi@envipath.com>
Co-committed-by: Tobias O <tobias.olenyi@envipath.com>
**Warning depends on Timeseries feature to be merged**
Implements a way to display OECD 301F data on the pathway view.
This is mostly a PoC and needs to be improved once the pathway rendering is updated.

Co-authored-by: jebus <lorsbach@envipath.com>
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#319
Co-authored-by: Tobias O <tobias.olenyi@envipath.com>
Co-committed-by: Tobias O <tobias.olenyi@envipath.com>
This implements a version of #274, relying on Pydantics built in JSON schema and JSON rendering.
Requires additional UI tagging in the ai model repo but will remove HTML tags.
Example scenario with filled information: 5882df9c-dae1-4d80-a40e-db4724271456/scenario/3a4d395a-6a6d-4154-8ce3-ced667fceec0
Reviewed-on: enviPath/enviPy#282
Co-authored-by: Tobias O <tobias.olenyi@envipath.com>
Co-committed-by: Tobias O <tobias.olenyi@envipath.com>
**This pull request will need a separate migration pull-request**
I have added an alert box in two places when the user tries to predict with stereo chemistry.
When a user predicts a pathway with stereo chemistry an alert box is shown in that node's hover.
To do this I added two new fields. Pathway now has a "predicted" BooleanField indicating whether it was predicted or not. It is set to True if the pathway mode for prediction is "predict" or "incremental" and False if it is "build". I think it is a flag that could be useful in the future, perhaps for analysing how many predicted pathways are in enviPath?
Node now has a `stereo_removed` BooleanField which is set to True if the Node's parent Pathways has "predicted" as true and the node SMILES has stereochemistry.
<img width="500" alt="{927AC9FF-DBC9-4A19-9E6E-0EDD3B08C7AC}.png" src="attachments/69ea29bc-c2d2-4cd2-8e98-aae5c5737f69">
When a user does a prediction on a model's page it shows at the top of the list. This did not require any new fields as the entered SMILES does not get saved anywhere.
<img width="500" alt="{BED66F12-5F07-419E-AAA6-FE1FE5B4F266}.png" src="attachments/5fcc3a9b-4d1a-4e48-acac-76b7571f6507">
I think the alert box is an alright solution but if you have a great idea for something that looks/fits better please change it or let me know.
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#250
Co-authored-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: Liam Brydon <lbry121@aucklanduni.ac.nz>
Reactant Filter SMARTS as well as Product Filter SMARTS are now reflected when applying rules.
Fixes#245
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#246
## Changes
- All text input fields are now cleaned with nh3 to remove html tags. We allow certain html tags under `settings.py/ALLOWED_HTML_TAGS` so we can easily update the tags we allow in the future.
- All names and descriptions now use the template tag `nh_safe` in all html files.
- Usernames and emails are a small exception and are not allowed any html tags
Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Co-authored-by: jebus <lorsbach@envipath.com>
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#171
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
# Summary
I have introduced a new base `class Dataset` in `ml.py` which all datasets should subclass. It stores the dataset as a polars DataFrame with the column names and number of columns determined by the subclass. It implements generic methods such as `add_row`, `at`, `limit` and dataset saving. It also details abstract methods required by the subclasses. These include `X`, `y` and `generate_dataset`.
There are two subclasses that currently exist. `RuleBasedDataset` for the MLRR models and `EnviFormerDataset` for the enviFormer models.
# Old Dataset to New RuleBasedDataset Functionality Translation
- [x] \_\_init\_\_
- self.columns and self.num_labels moved to base Dataset class
- self.data moved to base class with name self.df along with initialising from list or from another DataFrame
- struct_features, triggered and observed remain the same
- [x] \_block\_indices
- function moved to base Dataset class
- [x] structure_id
- stays in RuleBasedDataset, now requires an index for the row of interest
- [x] add_row
- moved to base Dataset class, now calls add_rows so one or more rows can be added at a time
- [x] times_triggered
- stays in RuleBasedDataset, now does a look up using polars df.filter
- [x] struct_features (see init)
- [x] triggered (see init)
- [x] observed (see init)
- [x] at
- removed in favour of indexing with getitem
- [x] limit
- removed in favour of indexing with getitem
- [x] classification_dataset
- stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] generate_dataset
- stays in RuleBasedDataset, largely the same just with new dataset construction using add_rows
- [x] X
- moved to base Dataset as @abstract_method, RuleBasedDataset implementation functionally the same but uses polars
- [x] trig
- stays in RuleBasedDataset, functionally the same but uses polars
- [x] y
- moved to base Dataset as @abstract_method, RuleBasedDataset implementation functionally the same but uses polars
- [x] \_\_get_item\_\_
- moved to base dataset, now passes item to the dataframe for polars to handle
- [x] to_arff
- stays in RuleBasedDataset, functionally the same but uses polars
- [x] \_\_repr\_\_
- moved to base dataset
- [x] \_\_iter\_\_
- moved to base Dataset, now uses polars iter_rows
# Base Dataset class Features
The following functions are available in the base Dataset class
- init - Create the dataset from a list of columns and data in format list of list. Or can create a dataset from a polars Dataframe, this is essential for recreating itself during indexing. Can create an empty dataset by just passing column names.
- add_rows - Add rows to the Dataset, we check that the new data length is the same but it is presumed that the column order matches the existing dataframe
- add_row - Add one row, see add_rows
- block_indices - Returns the column indices that start with the given prefix
- columns - Property, returns dataframe.columns
- shape - Property, returns dataframe.shape
- X - Abstract method to be implemented by the subclasses, it should represent the input to a ML model
- y - Abstract method to be implemented by the subclasses, it should represent the target for a ML model
- generate_dataset - Abstract and static method to be implemented by the subclasses, should return an initialised subclass of Dataset
- iter - returns the iterable from dataframe.iter_rows()
- getitem - passes the item argument to the dataframe. If the result of indexing the dataframe is another dataframe, the new dataframe is packaged into a new Dataset of the same subclass. If the result of indexing is something else (int, float, polar Series) return the result.
- save - Pickle and save the dataframe to the given path
- load - Static method to load the dataset from the given path
- to_numpy - returns the dataframe as a numpy array. Required for compatibility with training of the ECC model
- repr - return a representation of the dataset
- len - return the length of the dataframe
- iter_rows - Return dataframe.iterrows with arguments passed through. Mainly used to get the named iterable which returns rows of the dataframe as dict of column names: column values instead of tuple of column values.
- filter - pass to dataframe.filter and recreates self with the result
- select - pass to dataframe.select and recreates self with the result
- with_columns - pass to dataframe.with_columns and recreates self with the result
- sort - pass to dataframe.sort and recreates self with the result
- item - pass to dataframe.item
- fill_nan - fill the dataframe nan's with value
- height - Property, returns the height (number of rows) of the dataframe
- [x] App domain
- [x] MACCS alternatives
Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#184
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
## Changes
- Ability to change the threshold from a command line argument.
- Names of data packages included in model name
- Names of data, rule and eval packages included in the model description
- EnviFormer models are now viewable on the admin site
- Ignore CO2 for training and evaluating EnviFormer
Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#173
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
The caching is now finished. The cache is created in `settings.py` giving us the most flexibility for using it in the future.
The cache is currently updated/accessed by `tasks.py/get_ml_model` which can be called from whatever task needs to access ml models in this way (currently, `predict` and `predict_simple`).
This implementation currently caches all ml models including the relative reasoning. If we don't want this and only want to cache enviFormer, i can change it to that. However, I don't think there is a harm in having the other models be cached as well.
Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#156
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
## Changes
- I have finished the backend integration of EnviFormer (#19), this includes, dataset building, model finetuning, model evaluation and model prediction with the finetuned model.
- `PackageBasedModel` has been adjusted to be more abstract, this includes making the `_save_model` method and making `compute_averages` a static class function.
- I had to bump the python-version in `pyproject.toml` to >=3.12 from >=3.11 otherwise uv failed to install EnviFormer.
- The default EnviFormer loading during `settings.py` has been removed.
## Future Fix
I noticed you have a little bit of code in `PackageBasedModel` -> `evaluate_model` for using the `eval_packages` during evaluation instead of train/test splits on `data_packages`. It doesn't seem finished, I presume we want this for all models, so I will take care of that in a new branch/pullrequest after this request is merged.
Also, I haven't done anything for a POST request to finetune the model, I'm not sure if that is something we want now.
Co-authored-by: Liam Brydon <62733830+MyCreativityOutlet@users.noreply.github.com>
Reviewed-on: enviPath/enviPy#141
Reviewed-by: jebus <lorsbach@envipath.com>
Co-authored-by: liambrydon <lbry121@aucklanduni.ac.nz>
Co-committed-by: liambrydon <lbry121@aucklanduni.ac.nz>
Bump Python Version to 3.12
Make use of "epauth" optional
Cache `srs` property of rules to speed up apply
Adjust view names for use of `reverse()`
Fix Views for Scenario Attachments
Added Simply Compare View/Template to identify differences between rdkit and ambit
Make migrations consistent with tests + compare
Fixes#76
Set default year for Scenario Modal
Fix html tags for package description
Added Tests for Pathway / Rule
Added remove stereo for apply
Co-authored-by: Tim Lorsbach <tim@lorsba.ch>
Reviewed-on: enviPath/enviPy#132