Naval Propulsion (UCI, Regression, n=11934, d=16)

Loading The Data

In [1]:
from kxy_datasets.uci_regressions import NavalPropulsion # pip install kxy_datasets
In [2]:
dataset = NavalPropulsion()
df = dataset.df # Retrieve the dataset as a pandas dataframe
y_column = dataset.y_column # The name of the column corresponding to the target
problem_type = dataset.problem_type # 'regression' or 'classification'
In [3]:
df.kxy.describe() # Visualize a summary of the data

-----------
Column: GGn
-----------
Type:   Continuous
Max:    9,797
p75:    9,132
Mean:   8,200
Median: 8,482
p25:    7,058
Min:    6,589

---------------------------
Column: GT Compressor Decay
---------------------------
Type:   Continuous
Max:    1.0
p75:    1.0
Mean:   1.0
Median: 1.0
p25:    1.0
Min:    0.9

------------------------
Column: GT Turbine Decay
------------------------
Type:   Continuous
Max:    1.0
p75:    1.0
Mean:   1.0
Median: 1.0
p25:    1.0
Min:    1.0

-----------
Column: GTT
-----------
Type:   Continuous
Max:    72,784
p75:    39,001
Mean:   27,247
Median: 21,630
p25:    8,375
Min:    253

-----------
Column: GTn
-----------
Type:   Continuous
Max:    3,560
p75:    2,678
Mean:   2,136
Median: 1,924
p25:    1,386
Min:    1,307

----------
Column: LP
----------
Type:   Continuous
Max:    9.3
p75:    7.1
Mean:   5.2
Median: 5.1
p25:    3.1
Min:    1.1

----------
Column: P1
----------
Type:   Continuous
Max:    1.0
p75:    1.0
Mean:   1.0
Median: 1.0
p25:    1.0
Min:    1.0

----------
Column: P2
----------
Type:   Continuous
Max:    23
p75:    15
Mean:   12
Median: 11
p25:    7.4
Min:    5.8

-----------
Column: P48
-----------
Type:   Continuous
Max:    4.6
p75:    3.0
Mean:   2.4
Median: 2.1
p25:    1.4
Min:    1.1

------------
Column: Pexh
------------
Type:   Continuous
Max:    1.1
p75:    1.0
Mean:   1.0
Median: 1.0
p25:    1.0
Min:    1.0

----------
Column: T1
----------
Type:   Continuous
Max:    288
p75:    288
Mean:   288
Median: 288
p25:    288
Min:    288

----------
Column: T2
----------
Type:   Continuous
Max:    789
p75:    693
Mean:   646
Median: 637
p25:    578
Min:    540

-----------
Column: T48
-----------
Type:   Continuous
Max:    1,115
p75:    834
Mean:   735
Median: 706
p25:    589
Min:    442

-----------
Column: TIC
-----------
Type:   Continuous
Max:    92
p75:    44
Mean:   33
Median: 25
p25:    13
Min:    0.0

----------
Column: Tp
----------
Type:   Continuous
Max:    645
p75:    332
Mean:   227
Median: 175
p25:    60
Min:    5.3

----------
Column: Ts
----------
Type:   Continuous
Max:    645
p75:    332
Mean:   227
Median: 175
p25:    60
Min:    5.3

---------
Column: V
---------
Type:   Continuous
Max:    27
p75:    21
Mean:   15
Median: 15
p25:    9.0
Min:    3.0

----------
Column: mf
----------
Type:   Continuous
Max:    1.8
p75:    0.9
Mean:   0.7
Median: 0.5
p25:    0.2
Min:    0.1
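
At the displayed precision, the summary above shows T1 and P1 with equal minimum and maximum, and Tp and Ts with identical statistics. That suggests, but does not prove, that some columns are constant or duplicated. The following is a minimal pandas sketch of how such columns could be detected before any analysis. The helpers `constant_columns` and `duplicate_column_pairs` are hypothetical names introduced here for illustration (they are not part of the kxy package), and the toy frame contains made-up rows rather than the actual dataset.

```python
import pandas as pd


def constant_columns(frame):
    """Return the names of columns that take a single distinct value."""
    return [c for c in frame.columns if frame[c].nunique(dropna=False) <= 1]


def duplicate_column_pairs(frame):
    """Return (a, b) pairs of columns whose values are identical row by row."""
    cols = list(frame.columns)
    return [
        (a, b)
        for i, a in enumerate(cols)
        for b in cols[i + 1:]
        if frame[a].equals(frame[b])
    ]


# Toy data (hypothetical rows, only the column names mirror the summary above).
toy = pd.DataFrame({
    "T1": [288.0, 288.0, 288.0],
    "Tp": [5.3, 175.0, 645.0],
    "Ts": [5.3, 175.0, 645.0],
    "V": [3.0, 15.0, 27.0],
})

print(constant_columns(toy))
print(duplicate_column_pairs(toy))
```

On the real data, the same helpers could be applied to `df` directly. A constant column carries no information about the target, and an exact duplicate adds nothing beyond its twin.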

Data Valuation

In [4]:
df.kxy.data_valuation(y_column, problem_type=problem_type)
Out[4]:
   Achievable R-Squared  Achievable Log-Likelihood Per Sample  Achievable RMSE
0                  0.01                                 -4.42         1.47e-02
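
As a rough sanity check on the figures above, the achievable R-Squared and achievable RMSE can be related to each other. The sketch below assumes the relationships R² = 1 - exp(-2 I) and RMSE = std(y) · exp(-I), where I is the mutual information between the target and the explanatory variables, in nats. These formulas are an assumption made here for illustration. The Theoretical Foundation section of the KXY documentation is the authoritative reference. The function names are hypothetical, and the inputs are the rounded values printed in the table, so the results are approximate.

```python
import math


def implied_target_std(r_squared, rmse):
    """Standard deviation of y implied by RMSE = std(y) * sqrt(1 - R^2)."""
    return rmse / math.sqrt(1.0 - r_squared)


def implied_mutual_information(r_squared):
    """Mutual information (nats) implied by R^2 = 1 - exp(-2 * I)."""
    return -0.5 * math.log(1.0 - r_squared)


# Rounded values copied from the data valuation output above.
r2, rmse = 0.01, 1.47e-02

print(implied_target_std(r2, rmse))
print(implied_mutual_information(r2))
```

Under these assumptions, an achievable R-Squared of 0.01 means the achievable RMSE is barely smaller than the standard deviation of the target itself.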

Automatic (Model-Free) Variable Selection

In [5]:
df.kxy.variable_selection(y_column, problem_type=problem_type)
Out[5]:
Selection Order  Variable          Running Achievable R-Squared  Running Achievable RMSE
0                No Variable       0.00                          0.0147
1                GGn               0.01                          0.0147
2                P48               0.01                          0.0147
3                T2                0.01                          0.0147
4                GTT               0.01                          0.0147
5                T48               0.01                          0.0147
6                LP                0.01                          0.0147
7                Tp                0.01                          0.0147
8                GT Turbine Decay  0.01                          0.0147
9                Ts                0.01                          0.0147
10               V                 0.01                          0.0147
11               Pexh              0.01                          0.0147
12               TIC               0.01                          0.0147
13               GTn               0.01                          0.0147
14               mf                0.01                          0.0147
15               P2                0.01                          0.0147
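
The running columns above are cumulative, so the contribution of each variable is the difference between its row and the previous one. The following sketch shows one way to compute those marginal gains with pandas. The frame is transcribed by hand from the first four rows of the table above, and the `Marginal R-Squared` column name is introduced here for illustration. Because the printed values are rounded to two decimals, gains smaller than the rounding step show up as zero.

```python
import pandas as pd

# First four rows of the variable selection output, transcribed by hand.
selection = pd.DataFrame({
    "Variable": ["No Variable", "GGn", "P48", "T2"],
    "Running Achievable R-Squared": [0.00, 0.01, 0.01, 0.01],
})

# Marginal gain of each variable over the variables selected before it.
selection["Marginal R-Squared"] = (
    selection["Running Achievable R-Squared"].diff().fillna(0.0)
)

# Variables whose addition visibly raises the running achievable R-Squared.
useful = selection.loc[selection["Marginal R-Squared"] > 0, "Variable"].tolist()
print(useful)
```

Applied to the DataFrame returned by `df.kxy.variable_selection`, the same `diff` pattern would work on unrounded values, assuming the output keeps the column names shown above.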

© Copyright 2021, KXY Technologies, Inc.
