Yacht Hydrodynamics (UCI, Regression, n=308, d=6)
Loading The Data
In [1]:
import kxy # pip install kxy; importing it registers the df.kxy accessor used below
from kxy_datasets.uci_regressions import YachtHydrodynamics # pip install kxy_datasets
In [2]:
dataset = YachtHydrodynamics()
df = dataset.df # Retrieve the dataset as a pandas dataframe
y_column = dataset.y_column # The name of the column corresponding to the target
problem_type = dataset.problem_type # 'regression' or 'classification'
In [3]:
df.kxy.describe() # Visualize a summary of the data
--------------------------
Column: Beam-Draught Ratio
--------------------------
Type: Continuous
Max: 5.3
p75: 4.2
Mean: 3.9
Median: 4.0
p25: 3.8
Min: 2.8
---------------------
Column: Froude Number
---------------------
Type: Continuous
Max: 0.5
p75: 0.4
Mean: 0.3
Median: 0.3
p25: 0.2
Min: 0.1
-------------------------
Column: Length-Beam Ratio
-------------------------
Type: Continuous
Max: 3.6
p75: 3.5
Mean: 3.2
Median: 3.1
p25: 3.1
Min: 2.7
---------------------------
Column: Length-Displacement
---------------------------
Type: Continuous
Max: 5.1
p75: 5.1
Mean: 4.8
Median: 4.8
p25: 4.8
Min: 4.3
-----------------------------
Column: Longitudinal Position
-----------------------------
Type: Continuous
Max: 0.0
p75: -2.3
Mean: -2.4
Median: -2.3
p25: -2.4
Min: -5.0
------------------------------
Column: Prismatic Coeefficient
------------------------------
Type: Continuous
Max: 0.6
p75: 0.6
Mean: 0.6
Median: 0.6
p25: 0.5
Min: 0.5
----------------------------
Column: Residuary Resistance
----------------------------
Type: Continuous
Max: 62
p75: 12
Mean: 10
Median: 3.1
p25: 0.8
Min: 0.0
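For readers without the `kxy` accessor installed, a similar per-column summary can be approximated with plain pandas. The sketch below is only an approximation of the layout printed above, not the library's own implementation; the `toy` frame and the `summarize` helper are hypothetical and the values are illustrative, not the real dataset.

```python
import pandas as pd

# Hypothetical toy frame standing in for `df`; the values are illustrative only.
toy = pd.DataFrame({
    "Froude Number": [0.1, 0.2, 0.3, 0.4, 0.5],
    "Residuary Resistance": [0.0, 1.0, 3.0, 12.0, 62.0],
})

def summarize(frame: pd.DataFrame) -> pd.DataFrame:
    """Return Max/p75/Mean/Median/p25/Min per column, one row per column."""
    stats = frame.describe(percentiles=[0.25, 0.5, 0.75]).T
    stats = stats.rename(columns={
        "max": "Max", "75%": "p75", "mean": "Mean",
        "50%": "Median", "25%": "p25", "min": "Min",
    })
    # Keep the same ordering as the summary printed above.
    return stats[["Max", "p75", "Mean", "Median", "p25", "Min"]]

summary = summarize(toy)
print(summary)
```

Replacing `toy` with the real `df` should give one row per column with the same six statistics.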
Data Valuation
In [4]:
df.kxy.data_valuation(y_column, problem_type=problem_type)
Out[4]:
| | Achievable R-Squared | Achievable Log-Likelihood Per Sample | Achievable RMSE |
|---|---|---|---|
| 0 | 0.99 | -1.30 | 1.46 |
Automatic (Model-Free) Variable Selection
In [5]:
df.kxy.variable_selection(y_column, problem_type=problem_type)
Out[5]:
| Selection Order | Variable | Running Achievable R-Squared | Running Achievable RMSE |
|---|---|---|---|
| 0 | No Variable | 0.00 | 1.51e+01 |
| 1 | Froude Number | 0.98 | 1.90 |
| 2 | Beam-Draught Ratio | 0.99 | 1.46 |
| 3 | Longitudinal Position | 0.99 | 1.46 |
| 4 | Length-Displacement | 0.99 | 1.46 |
| 5 | Prismatic Coeefficient | 0.99 | 1.46 |
| 6 | Length-Beam Ratio | 0.99 | 1.46 |
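The two running columns in the table above can be cross-checked against each other. The sketch below assumes the standard identity R² = 1 − (RMSE / baseline RMSE)², where the baseline is the "No Variable" RMSE (the spread of the target when no input is used). Whether the library computes its figures exactly this way is an assumption; `implied_r_squared` is a hypothetical helper, and the numbers are copied from the table.

```python
# Values copied from the variable-selection table above.
baseline_rmse = 15.1  # "No Variable" row (1.51e+01)
running_rmse = {
    "Froude Number": 1.90,
    "Beam-Draught Ratio": 1.46,
}

def implied_r_squared(rmse: float, baseline: float) -> float:
    """R-squared implied by an RMSE, assuming R^2 = 1 - (rmse / baseline)^2."""
    return 1.0 - (rmse / baseline) ** 2

implied = {
    name: implied_r_squared(rmse, baseline_rmse)
    for name, rmse in running_rmse.items()
}
for name, r2 in implied.items():
    print(f"{name}: implied R-squared = {r2:.2f}")
```

Under that assumption the implied values round to 0.98 and 0.99, in line with the reported Running Achievable R-Squared column. This also shows why the remaining four variables add nothing visible: the RMSE stays at 1.46 after the second selection.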