Welcome to VICTA’s documentation!¶
Vegetation Information Classification Tool Automator (VICTA) core library¶
This is a minimal implementation of a core library that can be built on to develop a new VICTA application.
Introduction¶
The VICTA core library consists of a classification key built from a directed graph (NetworkX DiGraph). The key is constructed from a set of couplets (graph nodes) and associated rules (graph edges). The classification key exposes a well-defined API for passing in data, key and rule logic; requesting the classification of a data record and the decision path of that classification; and reporting application exceptions and validation errors. No business logic is hard-coded into the core library: it is purely a business- and data-agnostic decision tree.
A simple diagram of the minimal implemented core library:
    +------------------------------------------------+
    |Command line script/GUI/Web App/Jupyter Notebook|
    |                                                |
    | +-----------------+  +------------------+      |
    | | INPUTS          |  | OUTPUTS          |      |
    | | Data/Rules/Keys |  | Classification   |      |
    | |                 |  | Decision path    |      |
    | +--------------+--+  +--^---------------+      |
    |                |        |                      |
    |         +------v--------+-------+              |
    |         | CORE API              |              |
    |         | Classification Tree   |              |
    |         | Rule and Key parsers  |              |
    |         +-----------------------+              |
    +------------------------------------------------+
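The couplet-and-rule graph described in the introduction can be illustrated with a small traversal. This is only a sketch using plain Python dictionaries rather than the NetworkX DiGraph the library is built on; the couplet numbers, class names, record fields and the `classify_record` helper are all hypothetical and are not part of the VICTA API.

```python
# Hypothetical key: each couplet (node) maps to a list of outgoing edges.
# An edge is (test, target): `test` is a predicate on the record and
# `target` is either another couplet number or a final class name.
KEY = {
    1: [
        (lambda rec: rec["growth_form"] == "tree", 2),
        (lambda rec: rec["growth_form"] != "tree", "Non-forest"),
    ],
    2: [
        (lambda rec: rec["cover"] >= 30, "Forest"),
        (lambda rec: rec["cover"] < 30, "Woodland"),
    ],
}


def classify_record(key, record, start=1):
    """Walk the key from `start`; return (class_name, decision_path)."""
    path = []
    node = start
    while node in key:  # a target that is not a couplet is a final class
        matches = [target for test, target in key[node] if test(record)]
        if len(matches) != 1:
            raise ValueError(
                f"couplet {node}: expected exactly 1 match, got {len(matches)}"
            )
        path.append((node, matches[0]))
        node = matches[0]
    return node, path


result, steps = classify_record(KEY, {"growth_form": "tree", "cover": 45})
print(result)  # Forest
print(steps)   # [(1, 2), (2, 'Forest')]
```

Keeping the tests on the edges and the couplets on the nodes means the decision path falls out of the traversal itself, which is the property the introduction relies on.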
Data Model¶
The key (couplets) and rules are passed in as pandas DataFrames, a tabular, iterable in-memory data structure. The format is similar to the existing “Key to Key” and “Key to MVG” formats, but merged into a single table.
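As a rough illustration of the single-table layout, the sketch below holds a key as plain Python rows using the column names from the Key table in the next section, and checks the mutual exclusivity of OUTPUT_COUPLET and OUTPUT_CLASS. The row values and the `find_invalid_rows` helper are hypothetical, and requiring exactly one of the two outputs per row is an assumption here. Passing the rows to `pandas.DataFrame(key_rows)` would produce a DataFrame of the same shape.

```python
REQUIRED_COLUMNS = {
    "INPUT_COUPLET", "RULES", "OUTPUT_COUPLET", "OUTPUT_CLASS", "OUTPUT_NAME",
}

# Hypothetical key rows; rule ids 101 and 102 refer to rows of a rules table.
key_rows = [
    {"INPUT_COUPLET": 1, "RULES": "101", "OUTPUT_COUPLET": 2,
     "OUTPUT_CLASS": None, "OUTPUT_NAME": "Treed vegetation"},
    {"INPUT_COUPLET": 1, "RULES": "not 101", "OUTPUT_COUPLET": None,
     "OUTPUT_CLASS": "NF", "OUTPUT_NAME": "Non-forest"},
    {"INPUT_COUPLET": 2, "RULES": "102", "OUTPUT_COUPLET": None,
     "OUTPUT_CLASS": "F", "OUTPUT_NAME": "Forest"},
    {"INPUT_COUPLET": 2, "RULES": "not 102", "OUTPUT_COUPLET": None,
     "OUTPUT_CLASS": "W", "OUTPUT_NAME": "Woodland"},
]


def find_invalid_rows(rows):
    """Return the indices of rows that lack a required column or that do
    not set exactly one of OUTPUT_COUPLET / OUTPUT_CLASS (an assumption)."""
    bad = []
    for index, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - set(row)
        has_couplet = row.get("OUTPUT_COUPLET") is not None
        has_class = row.get("OUTPUT_CLASS") is not None
        if missing or has_couplet == has_class:
            bad.append(index)
    return bad


print(find_invalid_rows(key_rows))  # []
```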
Key¶
The key dataframe must have the following column structure:
Attribute Name | Description |
---|---|
INPUT_COUPLET | Unique integer identifying the parent couplet. |
RULES | String containing the expression** to test. |
OUTPUT_COUPLET | Couplet to output if the rules expression is True (mutually exclusive with OUTPUT_CLASS) |
OUTPUT_CLASS | Class to output if the rules expression is True (mutually exclusive with OUTPUT_COUPLET) |
OUTPUT_NAME | Output couplet/class name |
COMMENTS | Additional comments [optional] |
** The RULES expression format must be valid Python syntax and conform to the following grammar:

    [not] rule_id [[and|or][not][rule_id]]

Where: rule_id is an integer identifying each rule to be tested.

Examples:

    NNN
    not NNN
    NNN or NN
    NNN or NN or N
    not (NNN or NN)
    (NNN or NN) or (N and NNNN)
    NNN and not NN
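Because a RULES expression is valid Python syntax, one way to evaluate it without calling `eval` is to parse it with the standard `ast` module and walk the tree. The sketch below is not VICTA's own parser; `evaluate_rules_expression` and the rule ids are hypothetical, and it assumes each rule has already been tested and its True/False result stored by id.

```python
import ast


def evaluate_rules_expression(expression, rule_results):
    """Evaluate a RULES expression against a {rule_id: bool} mapping.

    Only integer rule ids, `not`, `and`, `or` and parentheses are accepted;
    anything else raises ValueError.
    """
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        # type(...) is int rejects True/False, which are bool constants
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return rule_results[node.value]
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not visit(node.operand)
        if isinstance(node, ast.BoolOp):
            values = [visit(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        raise ValueError(f"unsupported syntax in RULES expression: {ast.dump(node)}")

    return visit(ast.parse(expression.strip(), mode="eval"))


# Hypothetical results of testing rules 101, 22 and 3 against one record
results = {101: False, 22: True, 3: False}
print(evaluate_rules_expression("(101 or 22) and not 3", results))  # True
```

Restricting the walk to these node types keeps arbitrary Python (function calls, attribute access) out of the key, at the cost of rejecting anything outside the grammar.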
Rules¶
The rules dataframe must have the following column structure:
Attribute Name | Description |
---|---|
ID | Unique integer identifying the rule. |
ATTRIBUTE | Attribute/column to use when rule is tested (i.e. in the record to be classified by the key) |
OPERATOR | Positive comparison operator: in, =, >=, >, <=, <, regex (where regex is a valid regular expression) |
VALUE | Text string to look for in ATTRIBUTE |
NAME | Rule name |
COMMENTS | Additional comments [optional] |
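To make the rule columns concrete, the sketch below tests one rule row against one record. It is an illustration only, not the library's implementation: `apply_rule`, the attribute names and the values are hypothetical. It also assumes that `in` means a substring test, that `regex` uses `re.search`, and that the comparison operators compare numerically when both sides parse as numbers and as strings otherwise.

```python
import operator
import re

# Comparison operators from the rules table mapped to Python functions
COMPARISONS = {
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def apply_rule(rule, record):
    """Return True if `record` satisfies `rule` (a dict with the
    ATTRIBUTE, OPERATOR and VALUE columns)."""
    actual = record[rule["ATTRIBUTE"]]
    op = rule["OPERATOR"].strip()
    value = rule["VALUE"]
    if op == "in":  # assumption: substring test
        return str(value) in str(actual)
    if op == "regex":  # assumption: match anywhere in the attribute
        return re.search(str(value), str(actual)) is not None
    if op in COMPARISONS:
        try:
            return COMPARISONS[op](float(actual), float(value))
        except (TypeError, ValueError):
            # Not numeric: fall back to comparing as text
            return COMPARISONS[op](str(actual), str(value))
    raise ValueError(f"unknown operator: {op!r}")


record = {"GROWTH_FORM": "low tree", "COVER": 45, "GENUS": "Eucalyptus"}
rules = [
    {"ID": 101, "ATTRIBUTE": "GROWTH_FORM", "OPERATOR": "in", "VALUE": "tree"},
    {"ID": 102, "ATTRIBUTE": "COVER", "OPERATOR": ">=", "VALUE": "30"},
    {"ID": 103, "ATTRIBUTE": "GENUS", "OPERATOR": "regex", "VALUE": "^Acacia"},
]
rule_results = {rule["ID"]: apply_rule(rule, record) for rule in rules}
print(rule_results)  # {101: True, 102: True, 103: False}
```

A mapping of rule id to result, like `rule_results` here, is the kind of input a RULES expression in the key table would then be evaluated against.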
Code Example¶
    import os
    import pandas as pd
    from victa import Key, ClassificationError, MultipleMatchesError

    if __name__ == '__main__':

        id_field = 'NVIS_ID'

        output_results = '../data/mvgs_nvis_results.xlsx'
        output_steps = '../data/mvgs_nvis_steps.xlsx'

        # Remove any output left over from a previous run
        for output in (output_results, output_steps):
            if os.path.exists(output):
                os.unlink(output)

        # Read couplets & rules
        # Here we read from a spreadsheet, but you could get these from anywhere,
        # a database, url, etc...
        # All we need is pandas.DataFrame objects conforming to the structures
        # documented in victa.rules.build_rules and victa.key.build_key
        ruledf = pd.read_excel('../data/rules_nvis.xlsx')
        keydf = pd.read_excel('../data/keys_nvis.xlsx')

        # Build key
        key = Key(keydf, 'MVG Key', ruledf)

        # Read in the records
        # Here we read from a spreadsheet, but you could get the data from anywhere,
        # a database, url, etc... All we need is a pandas.DataFrame object
        recsdf = pd.read_excel('../data/FLATNVIS_VEG_DESC5.xlsx')

        # Iterate over the records yourself, classifying each one
        all_results = []
        all_steps = []
        for idx, record in recsdf.iterrows():
            try:
                # Perform the classification
                result, steps = key.classify(record, id_field=id_field)
                all_results.append(result)
                all_steps.append(steps)
            except ClassificationError as e:
                print(e)
                # Can also do something with e.record and e.steps
            except MultipleMatchesError as e:
                print(e)
                # Can also do something with e.record, e.couplet and e.rulesets

        # Write out the results
        all_results = pd.DataFrame(all_results)
        all_steps = pd.concat(all_steps, ignore_index=True)
        all_results.to_excel(output_results, index=False)
        all_steps.to_excel(output_steps, index=False)
Installation¶
conda env create -f victa.yml
conda activate victa
API Reference¶
Tests¶
Some basic tests of the rules have been started; more test coverage is needed.
Contributors¶
Luke Pinner
License¶
Apache 2.0